EMPWR: Knowledge Graph Development Platform


Background and Motivation

A Knowledge Graph (KG) encapsulates structured knowledge in a graph representation and is used for a variety of information processing and management tasks, such as:

  • Integrating data & knowledge from diverse sources
  • Improving automation
  • Enabling a new generation of applications
  • Empowering machine learning (ML) & NLP techniques with domain knowledge

KGs also support applications such as question answering, summarization, text simplification, and Named Entity Recognition (NER).

Most existing KG platforms & tools are limited in:

  • Provenance
  • Dynamicity (i.e., static schema vs. schema generation)
  • Temporality
  • Domain specificity
  • Modularity

The AIISC Knowledge Graph (KG) EMPWR effort involves the development of a comprehensive tool and platform for KG development with the following aims:

1. Develop a KG development platform capable of instantiating KGs in any domain from structured, semi-structured, and unstructured data:

    • Biomedical & pharmaceutical domain with Percuro

2. Improve & address the limitations of existing KG platforms

3. Construct a Knowledge Graph based on a combination of two approaches:

    • Enrich an existing Knowledge Graph (Top-down declarative)
    • Construct a Knowledge Graph out of given entities (Bottom-up data driven)
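The two construction modes can be contrasted with a small Python sketch. This is an illustration under stated assumptions, not the EMPWR implementation: the triple format, the `SOURCE` facts, and the helper names `enrich` and `construct` are all hypothetical, and "top-down" is read here as extending a graph that already exists while "bottom-up" is read as building a graph from a list of entities.

```python
# Hypothetical source facts as (subject, relation, object) triples.
SOURCE = {
    ("insulin", "is_a", "hormone"),
    ("insulin", "reduces", "blood sugar level"),
    ("glucagon", "is_a", "hormone"),
}


def enrich(graph, source):
    """Top-down: extend an existing graph with source facts about
    entities the graph already mentions."""
    known = {s for s, _, _ in graph} | {o for _, _, o in graph}
    return graph | {t for t in source if t[0] in known}


def construct(entities, source):
    """Bottom-up: build a new graph from source facts about the given entities."""
    return {t for t in source if t[0] in entities}


existing = {("insulin", "is_a", "hormone")}
enriched = enrich(existing, SOURCE)      # adds the "reduces" fact about insulin
built = construct({"glucagon"}, SOURCE)  # starts from an entity list only
print(sorted(enriched))
print(sorted(built))
```

A combined workflow could, for instance, call `construct` for new entities and then `enrich` the result from further sources.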

Goals & Use-Cases

The goals of KGs are to provide


Percuro is a collaborative research project involving WIPRO, The AI Institute at University of South Carolina (AIISC), and IIT-Patna (IIT-P). It involves the development of a semantic (i.e., knowledge graph enhanced) approach to natural language processing (NLP), natural language generation (NLG), and natural language understanding (NLU) targeted at the pharmaceutical domain. The work applies NLP/NLG/NLU techniques to biomedical and clinical documents relevant to pharmaceutical markets.

Percuro aims to solve tasks such as (a) text simplification, (b) summarization, and (c) question answering. These tasks are not straightforward and require more information than the text itself provides.

An example: Which hormone reduces blood sugar level?

Answering this question requires additional context on what the word "hormone" means.
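The role of the knowledge graph in this example can be sketched in a few lines of Python. This is a minimal illustration, not the Percuro or EMPWR implementation: the triples, the relation names (`is_a`, `reduces`, `increases`), and the helpers `subjects` and `answer` are assumptions made up for this sketch.

```python
# Toy knowledge graph as (subject, relation, object) triples.
# Relation names are illustrative assumptions, not EMPWR's schema.
TRIPLES = {
    ("insulin", "is_a", "hormone"),
    ("glucagon", "is_a", "hormone"),
    ("metformin", "is_a", "drug"),
    ("insulin", "reduces", "blood sugar level"),
    ("metformin", "reduces", "blood sugar level"),
    ("glucagon", "increases", "blood sugar level"),
}


def subjects(relation, obj, triples=TRIPLES):
    """Return every subject linked to `obj` through `relation`."""
    return {s for (s, r, o) in triples if r == relation and o == obj}


def answer(entity_type, relation, target, triples=TRIPLES):
    """Answer 'Which <entity_type> <relation> <target>?' by intersecting
    the type constraint with the relation constraint."""
    typed = subjects("is_a", entity_type, triples)
    related = subjects(relation, target, triples)
    return sorted(typed & related)


print(answer("hormone", "reduces", "blood sugar level"))  # ['insulin']
```

Without the `is_a` facts, metformin would also match, since it reduces blood sugar level as well. The type knowledge is what narrows the answer to a hormone.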



The Knowledge Graph Toolkit (EMPWR) V1.0 currently supports knowledge sources from:

  • PharmKG (base)
  • Open-domain: DBpedia & Wikidata
  • Biomedical & pharmaceutical domain: DrugBank & UMLS

and has extensive coverage of drug, chemical, and disease entities and their associated relations:

  • Chemical: Drug interactions, diseases cured, etc.
  • Physical: Lethal dosage, boiling point, pressure, solubility, etc.
  • Disease: Symptoms, treatments, differential diagnosis, etc.
  • Aliases: Common names, chemical names and external identifiers.
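One way to picture how aliases and external identifiers attach to an entity is the following Python sketch. It is an assumption-laden illustration rather than EMPWR's data model: the `Entity` class, its field names, the `build_alias_index` helper, and the placeholder identifier `EXAMPLE:0001` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity record; field names are assumptions."""
    name: str
    aliases: set = field(default_factory=set)
    external_ids: dict = field(default_factory=dict)


def build_alias_index(entities):
    """Map every lower-cased name or alias to its entity record."""
    index = {}
    for entity in entities:
        for label in {entity.name, *entity.aliases}:
            index[label.lower()] = entity
    return index


aspirin = Entity(
    name="aspirin",
    aliases={"acetylsalicylic acid", "ASA"},
    # Placeholder identifier, not a real database accession.
    external_ids={"example_source": "EXAMPLE:0001"},
)

index = build_alias_index([aspirin])
print(index["acetylsalicylic acid"].name)  # aspirin
```

Lower-casing the labels lets a common name, a chemical name, or an abbreviation resolve to the same record regardless of capitalization.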