Difference between revisions of "EMPWR: Knowledge Graph Development Platform"
Line 1: | Line 1: | ||
− | = | + | =Background and Motivation= |
− | + | A '''Knowledge Graph''' (KG) encapsulates structured knowledge in a graph-based representation and is used for a variety of information processing and management tasks, such as |
− | + | * Integrating data and knowledge from diverse sources |
+ | * Improving automation | ||
+ | * Enabling a new generation of applications | ||
+ | * Empowering machine learning (ML) and NLP techniques with domain knowledge | ||
+ | and in applications such as question answering, summarization, text simplification, and Named Entity Recognition (NER). | ||
− | 1. Develop a KG development platform capable of instantiating KGs in any domains from '''structured''', '''semi-structured''', and '''unstructured''' data | + | Most existing KG platforms & tools are limited in |
+ | * Provenance | ||
+ | * Dynamicity (i.e., static schema vs. schema generation) | ||
+ | * Temporality | ||
+ | * Domain specificity | ||
+ | * Modularity | ||
+ | |||
+ | The '''AIISC Knowledge Graph''' (KG) '''EMPWR''' effort involves the development of a comprehensive tool and platform for KG development with the following aims: | ||
+ | |||
+ | 1. Develop a KG development platform capable of instantiating KGs in any domain from '''structured''', '''semi-structured''', and '''unstructured''' data: | ||
+ | ** Biomedical & pharmaceutical domain with '''Percuro''' | ||
2. Improve & address the limitations of existing KG platforms | 2. Improve & address the limitations of existing KG platforms | ||
Line 11: | Line 25: | ||
** Construct a Knowledge Graph out of given entities '''(Bottom-up data driven)''' | ** Construct a Knowledge Graph out of given entities '''(Bottom-up data driven)''' | ||
− | == | + | =Goals & Use-Cases= |
− | [ | + | The goals of KGs are to provide |
− | + | * Contextualization | |
+ | ** [https://www.semanticscholar.org/paper/Context-Enriched-Learning-Models-for-Aligning-in/650e79de9f4a4a123d559240387db0e3c3d1f867 Context-Enriched Learning Models for Aligning Biomedical Vocabularies in the UMLS Metathesaurus] | ||
+ | * Personalization | ||
+ | * Abstraction | ||
+ | * Explainability | ||
− | == | + | =Collaborations= |
− | + | '''Percuro''' is a collaborative research project involving Wipro, the AI Institute at the University of South Carolina (AIISC), and IIT-Patna |
+ | (IIT-P). It involves developing a semantic (i.e., knowledge-graph-enhanced) approach to natural language processing (NLP), natural language generation (NLG), and natural language understanding (NLU) targeted at the pharmaceutical domain. It will apply NLP/NLG/NLU techniques to biomedical and clinical documents relevant to pharmaceutical markets. | ||
− | + | '''Percuro''' aims to solve tasks such as (a) text simplification, (b) summarization, and (c) question answering. These tasks are not straightforward and require more information than the text provides. |
− | + | ||
− | + | '''An example:''' Which hormone reduces blood sugar level? | |
− | + | Answering this question first requires additional context on what the word "hormone" means. |
− | + | ||
+ | =Overview= | ||
+ | <html> | ||
<center> | <center> | ||
− | + | <iframe src="https://docs.google.com/presentation/d/1EOqyl5fK6tUecrjOpslcd9IY8US3fGqKgRZQIttGD3A/embed?start=false&loop=false&delayms=3000" frameborder="0" width="600" height="375" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe> | |
</center> | </center> | ||
− | |||
− | |||
− | |||
<center> | <center> | ||
− | + | <iframe src="https://docs.google.com/presentation/d/1SFLzGCcTM8oYP0VsiWkvzclJ3zHKTGgY8VWeFw5kqtc/embed?start=false&loop=false&delayms=3000" frameborder="0" width="600" height="375" allowfullscreen="true" mozallowfullscreen="true" webkitallowfullscreen="true"></iframe> | |
</center> | </center> | ||
+ | </html> | ||
+ | =Toolkit= | ||
− | + | The Knowledge Graph Toolkit (EMPWR) V1.0 currently supports knowledge sources from: | |
− | + | * PharmKG (base) | |
− | + | * Open-domain: '''DBpedia''' & '''Wikidata''' | |
− | + | * Biomedical & pharmaceutical domain: '''DrugBank''' & '''UMLS''' |
+ | and has extensive coverage of '''Drug''', '''Chemical''', and '''Disease''' entities and their associated relations: | ||
+ | * '''Chemical''': Drug interactions, diseases cured, etc. | ||
+ | * '''Physical''': Lethal dosage, boiling point, pressure, solubility, etc. | ||
+ | * '''Disease''': Symptoms, treatments, differential diagnosis, etc. | ||
+ | * '''Aliases''': Common names, chemical names and external identifiers. | ||
+ | =GitHub= | ||
+ | * [https://github.com/Anirudh-Sundar/APGC AIISC Knowledge Graph (EMPWR) Development Tool] | ||
− | + | =Demo= | |
+ | <embedvideo service="youtube">https://www.youtube.com/watch?v=ggvfAo-yp5g</embedvideo> | ||
− | = | + | <embedvideo service="youtube">https://www.youtube.com/watch?v=fUf8A48r8G0</embedvideo> |
− | + | ||
− | + | =People= | |
+ | *'''Artificial Intelligence Institute, University of South Carolina''' | ||
+ | **[https://www.linkedin.com/in/joeyyip/ Hong Yung (Joey) Yip] | ||
+ | **[https://www.linkedin.com/in/thilini-w/ Thilini Wijesiriwardene] | ||
+ | **[https://sc.edu/study/colleges_schools/engineering_and_computing/faculty-staff/amitsheth.php Dr. Amit P. Sheth] | ||
− | [ | + | *'''Development Team''' |
− | + | **[https://www.linkedin.com/in/joeyyip/ Hong Yung (Joey) Yip] | |
− | + | **[https://github.com/Anirudh-Sundar/APGC Anirudh Sundar] | |
− | + | *'''Wipro''' |
+ | **[http://www.amitavadas.com/ Amitava Das] |
Revision as of 14:28, 21 March 2023
Background and Motivation
A Knowledge Graph (KG) encapsulates structured knowledge in a graph-based representation and is used for a variety of information processing and management tasks, such as
- Integrating data and knowledge from diverse sources
- Improving automation
- Enabling a new generation of applications
- Empowering machine learning (ML) and NLP techniques with domain knowledge
and in applications such as question answering, summarization, text simplification, and Named Entity Recognition (NER).
Most existing KG platforms & tools are limited in
- Provenance
- Dynamicity (i.e., static schema vs. schema generation)
- Temporality
- Domain specificity
- Modularity
The AIISC Knowledge Graph (KG) EMPWR effort involves the development of a comprehensive tool and platform for KG development with the following aims:
1. Develop a KG development platform capable of instantiating KGs in any domain from structured, semi-structured, and unstructured data:
- Biomedical & pharmaceutical domain with Percuro
2. Improve & address the limitations of existing KG platforms
3. Construct a Knowledge Graph based on a combination of two approaches:
- Enrich an existing Knowledge Graph (Top-down declarative)
- Construct a Knowledge Graph out of given entities (Bottom-up data driven)
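The two construction modes in aim 3 can be illustrated with a minimal Python sketch. It assumes a KG is simply a set of (subject, predicate, object) triples. The function names `enrich_top_down` and `construct_bottom_up`, the schema format, and the sample data are hypothetical illustrations and are not taken from the EMPWR codebase.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def enrich_top_down(
    graph: Set[Triple],
    schema: Dict[str, Tuple[str, str]],
    entity_types: Dict[str, str],
    candidates: Iterable[Triple],
) -> Set[Triple]:
    """Top-down (declarative) enrichment, as assumed for this sketch.

    The hypothetical schema maps each predicate to the (subject type,
    object type) pair it allows. Only candidate triples that conform to
    the schema are added to a copy of the existing graph.
    """
    enriched = set(graph)
    for s, p, o in candidates:
        if schema.get(p) == (entity_types.get(s), entity_types.get(o)):
            enriched.add((s, p, o))
    return enriched


def construct_bottom_up(
    entities: Iterable[str], facts: Iterable[Triple]
) -> Set[Triple]:
    """Bottom-up (data-driven) construction, as assumed for this sketch.

    Start from the given entities and keep every observed fact that
    mentions at least one of them as subject or object.
    """
    seeds = set(entities)
    return {(s, p, o) for s, p, o in facts if s in seeds or o in seeds}


# Illustrative data only.
schema = {"treats": ("Drug", "Disease")}
entity_types = {"metformin": "Drug", "type 2 diabetes": "Disease"}
existing = {("metformin", "is_a", "drug")}
candidates = [
    ("metformin", "treats", "type 2 diabetes"),  # conforms to the schema
    ("type 2 diabetes", "treats", "metformin"),  # wrong types, rejected
]
print(enrich_top_down(existing, schema, entity_types, candidates))
print(construct_bottom_up(["metformin"], candidates))
```

In this sketch the top-down path is constrained by a declared schema, while the bottom-up path lets the observed facts decide what ends up in the graph; a combined approach would apply both.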
Goals & Use-Cases
The goals of KGs are to provide
- Contextualization
- Personalization
- Abstraction
- Explainability
Collaborations
Percuro is a collaborative research project involving Wipro, the AI Institute at the University of South Carolina (AIISC), and IIT-Patna (IIT-P). It involves developing a semantic (i.e., knowledge-graph-enhanced) approach to natural language processing (NLP), natural language generation (NLG), and natural language understanding (NLU) targeted at the pharmaceutical domain. It will apply NLP/NLG/NLU techniques to biomedical and clinical documents relevant to pharmaceutical markets.
Percuro aims to solve tasks such as (a) text simplification, (b) summarization, and (c) question answering. These tasks are not straightforward and require more information than the text provides.
An example: Which hormone reduces blood sugar level?
Answering this question first requires additional context on what the word "hormone" means.
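A minimal Python sketch of how a KG could supply that context: the type constraint ("hormone") and the relation ("reduces blood sugar level") are resolved separately against the graph and then intersected. The triples, the predicate names `is_a`, `reduces`, and `increases`, and the function `answer_typed_question` are hand-written assumptions for illustration, not EMPWR or Percuro data or APIs.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Toy triples written by hand for this example.
TRIPLES: Set[Triple] = {
    ("insulin", "is_a", "hormone"),
    ("glucagon", "is_a", "hormone"),
    ("metformin", "is_a", "drug"),
    ("insulin", "reduces", "blood sugar level"),
    ("metformin", "reduces", "blood sugar level"),
    ("glucagon", "increases", "blood sugar level"),
}


def answer_typed_question(
    triples: Set[Triple], entity_type: str, relation: str, target: str
) -> List[str]:
    """Return entities of `entity_type` that stand in `relation` to `target`."""
    # Context from the KG: which entities count as the requested type?
    of_type = {s for s, p, o in triples if p == "is_a" and o == entity_type}
    # Facts from the KG: which entities have the requested effect?
    acting = {s for s, p, o in triples if p == relation and o == target}
    return sorted(of_type & acting)


# "Which hormone reduces blood sugar level?"
print(answer_typed_question(TRIPLES, "hormone", "reduces", "blood sugar level"))
# prints ['insulin']
```

Without the `is_a` triples, metformin would also match the relation, which is the kind of error the added type context is meant to prevent.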
Overview
Toolkit
The Knowledge Graph Toolkit (EMPWR) V1.0 currently supports knowledge sources from:
- PharmKG (base)
- Open-domain: DBpedia & Wikidata
- Biomedical & pharmaceutical domain: DrugBank & UMLS
and has extensive coverage of Drug, Chemical, and Disease entities and their associated relations:
- Chemical: Drug interactions, diseases cured, etc.
- Physical: Lethal dosage, boiling point, pressure, solubility, etc.
- Disease: Symptoms, treatments, differential diagnosis, etc.
- Aliases: Common names, chemical names and external identifiers.
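One way to picture the relation groups above is as categorized facts attached to an entity, each tagged with the source it came from. The following Python sketch is an assumption made for illustration: the `Fact` and `Entity` classes, the relation names, and the `"example-source"` label are hypothetical and do not reflect the toolkit's actual data model.

```python
from dataclasses import dataclass, field
from typing import List

# Relation groups as listed in the toolkit description.
CATEGORIES = ("Chemical", "Physical", "Disease", "Aliases")


@dataclass(frozen=True)
class Fact:
    category: str  # one of CATEGORIES
    relation: str  # hypothetical relation name, e.g. "interacts_with"
    value: str
    source: str    # which knowledge source contributed the fact


@dataclass
class Entity:
    name: str
    facts: List[Fact] = field(default_factory=list)

    def add_fact(self, category: str, relation: str, value: str, source: str) -> None:
        """Attach a fact, rejecting categories outside the known groups."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.facts.append(Fact(category, relation, value, source))

    def by_category(self, category: str) -> List[Fact]:
        """Return the entity's facts in one relation group, in insertion order."""
        return [f for f in self.facts if f.category == category]


# Illustrative usage with a placeholder source label.
aspirin = Entity("aspirin")
aspirin.add_fact("Aliases", "chemical_name", "acetylsalicylic acid", "example-source")
aspirin.add_fact("Chemical", "interacts_with", "warfarin", "example-source")
for fact in aspirin.by_category("Aliases"):
    print(fact.relation, fact.value, fact.source)
```

Keeping the source on every fact is one simple way to retain provenance when the same entity is described by several knowledge sources.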
GitHub
- AIISC Knowledge Graph (EMPWR) Development Tool: https://github.com/Anirudh-Sundar/APGC
Demo
People
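- Artificial Intelligence Institute, University of South Carolina
  - Hong Yung (Joey) Yip
  - Thilini Wijesiriwardene
  - Dr. Amit P. Sheth
- Development Team
  - Hong Yung (Joey) Yip
  - Anirudh Sundar
- Wipro
  - Amitava Das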