Obvio
Obvio (Spanish for "obvious") is a graph-based framework for exploring biomedical literature to facilitate Literature-Based Discovery (LBD) based on rich knowledge representations. Its broader goal is to uncover hidden and complex associations between concepts in biomedical texts. To achieve this, Obvio uses several tools and resources developed at the National Library of Medicine (NIH-NLM), including MetaMap, SemRep, MEDLINE, SemMedDB, MeSH, UMLS, BKR and the UMLS Semantic Navigator. Obvio has rediscovered 8 out of 9 existing discoveries from the scientific literature. The project encapsulates the PhD dissertation<ref>D. Cameron, A Context-Driven Subgraph Model for Literature-Based Discovery, Ph.D. Thesis, Wright State University, 2014</ref> (video on YouTube) by Delroy Cameron, presented on August 18, 2014.
People
Graduate Students: Delroy Cameron, Swapnil Soni, Nishita Jaykumar, Vishnu Bompally
External Collaborators: Thomas C. Rindflesch, Ramakanth Kavuluru, Olivier Bodenreider
Faculty: Amit P. Sheth (Advisor), Krishnaprasad Thirunarayan
Past Members: Pablo N. Mendes, Tu Danh, Sreeram Vallabhaneni, Hima Yalamanchili, Drashti Dave
PhD Dissertation
Obvio was presented as the core of the PhD dissertation by Delroy Cameron in Summer 2014. Videos of the dissertation defense and the dissertation proposal are available on YouTube. The dissertation presentation is also available on SlideShare.
Overview
Obvio is driven by assertions extracted from biomedical literature (called semantic predications) as well as statements obtained from structured knowledge sources (such as the UMLS and MeSH). Semantic predications are extracted from MEDLINE using SemRep and made publicly available through the Semantic MEDLINE Database (SemMedDB). These predications support a range of tasks, including: 1) Information Retrieval; 2) Question Answering (QA); 3) Document Summarization; and 4) Literature-Based Discovery (LBD). Obvio uses semantic predications specifically for Question Answering and LBD.
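As a rough illustration of the data Obvio works with, a semantic predication can be modeled as a subject-PREDICATE-object triple and indexed as a directed graph. The sketch below is only an assumption about one convenient in-memory representation; the `Predication` type and `build_graph` helper are hypothetical names, not part of SemRep or SemMedDB. The two example predications are taken from the rediscovery tables on this page.

```python
from collections import defaultdict
from typing import NamedTuple


class Predication(NamedTuple):
    """One subject-PREDICATE-object assertion (field names are illustrative)."""
    subject: str
    predicate: str
    obj: str


def build_graph(predications):
    """Index predications by subject so each concept maps to its outgoing edges."""
    graph = defaultdict(list)
    for p in predications:
        graph[p.subject].append((p.predicate, p.obj))
    return dict(graph)


# Two predications from Scenario 1 on this page.
predications = [
    Predication("Dietary Fish Oils", "INHIBITS", "Platelet Aggregation"),
    Predication("Platelet Aggregation", "CAUSES", "Raynaud Syndrome"),
]
graph = build_graph(predications)
```

Indexing by subject keeps lookups of a concept's outgoing assertions cheap, which is the operation that path-based exploration relies on most.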
Question Answering
Reachability
Semantic predications were first used in Obvio for biomedical QA based on data from the 2006 TREC Challenge. The approach was based on the notion of reachability: determining whether documents that answer complex biomedical questions can be meaningfully connected using assertions from the literature. When semantic predications alone proved insufficient, structured background knowledge was used to gain additional insights for connecting biomedical texts. The presentation below, together with our BIBM 2011 paper<ref>D. Cameron, R. Kavuluru, O. Bodenreider, P. N. Mendes, A. P. Sheth, K. Thirunarayan, Semantic Predications for Complex Information Needs in Biomedical Literature, 5th International Conference on Bioinformatics and Biomedicine (BIBM2011), Atlanta GA, November 12-15, 2011 (acceptance rate=19.4%)</ref> on applying predications and background knowledge for QA, provides more details on this approach.
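The idea of reachability can be sketched as a breadth-first search over predication edges. This is a minimal illustration under simplifying assumptions, not the method from the paper: nodes here are concepts rather than documents, edges are followed in the subject-to-object direction only, and the function name `is_reachable` is hypothetical. The example edges come from Scenario 2 on this page.

```python
from collections import deque


def is_reachable(edges, start, goal):
    """Breadth-first search over directed (subject, predicate, object) edges."""
    neighbors = {}
    for subject, _predicate, obj in edges:
        neighbors.setdefault(subject, set()).add(obj)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Predications from Scenario 2 on this page.
edges = [
    ("Magnesium", "INHIBITS", "Serotonin"),
    ("Serotonin", "CAUSES", "Migraine"),
]
forward = is_reachable(edges, "Magnesium", "Migraine")
backward = is_reachable(edges, "Migraine", "Magnesium")
```

In this toy graph Magnesium reaches Migraine through Serotonin, while the reverse direction has no path because the edges are directed.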
Literature-Based Discovery (LBD)
Rediscovery
The semantic predications were subsequently used for Literature-Based Discovery (LBD). Specifically, they were used to determine whether existing knowledge from the scientific literature could be effectively recovered. We developed a graph-based approach that was successfully applied to rediscover and decompose Don R. Swanson's Raynaud Syndrome - Dietary Fish Oils Hypothesis (RS-DFO) from 1986. Much of the early research aimed at rediscovering Swanson's hypotheses focused on distributional statistics and Information Retrieval (IR) techniques, such as term and concept co-occurrence, to find intermediates. Only recently has significant attention been devoted to semantics-based techniques that exploit the meaning of associations between concepts. While generally more intuitive, the feasibility of such semantics-based approaches has not been fully established. Our article published in JBI<ref>D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch, A Graph-Based Recovery and Decomposition of Swanson’s Hypothesis using Semantic Predications, Journal of Biomedical Informatics 46(2): 238-251, (2013). ScienceDirect, PMID </ref> shows that semantics-based techniques can be used effectively to recover and decompose Swanson's Raynaud Syndrome - Dietary Fish Oils hypothesis using semantic predications, background knowledge and graph algorithms. It is reasonable to expect that if semantics-based techniques are adequate for rediscovering existing knowledge, they ought to be sufficient for discovering new knowledge.
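To make the notion of an intermediate concrete, the sketch below enumerates two-hop paths source → B → target over a small set of predications. It is a simplified illustration only, assuming that an intermediate is any concept that is the object of a source predication and the subject of a target predication; the published approach also uses background knowledge and longer paths. The function name `find_intermediates` is hypothetical, and the predications are copied from Scenarios 1 and 2 on this page.

```python
def find_intermediates(predications, source, target):
    """Return (intermediate, source-side predicate, target-side predicate)
    for every two-hop path source -> intermediate -> target."""
    from_source = {}
    for s, p, o in predications:
        if s == source:
            from_source.setdefault(o, []).append(p)
    results = []
    for s, p, o in predications:
        if o == target and s in from_source:
            for p1 in from_source[s]:
                results.append((s, p1, p))
    return sorted(results)


predications = [
    ("Dietary Fish Oils", "INHIBITS", "Blood Viscosity"),
    ("Blood Viscosity", "CAUSES", "Raynaud Syndrome"),
    ("Dietary Fish Oils", "INHIBITS", "Platelet Aggregation"),
    ("Platelet Aggregation", "CAUSES", "Raynaud Syndrome"),
    ("Dietary Fish Oils", "INHIBITS", "Vasoconstriction"),
    ("Vasoconstriction", "CAUSES", "Raynaud Syndrome"),
    # Unrelated predication from Scenario 2; it should not appear in the result.
    ("Magnesium", "INHIBITS", "Vasoconstriction"),
]
intermediates = find_intermediates(
    predications, "Dietary Fish Oils", "Raynaud Syndrome"
)
```

With this input the function returns the three intermediates of the RS-DFO scenario (Blood Viscosity, Platelet Aggregation and Vasoconstriction), each reached via INHIBITS and linked to the target via CAUSES.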
RS-DFO Hypothesis
The following presentation gives more details about the approach for knowledge rediscovery and decomposition. Various datasets and experimental results are also provided.
Datasets and Experimental Results
- Dataset
- Baseline (B1)
- Original PDFs of the 65 articles cited by Swanson's RS-DFO paper (30.5MB)
- ASCII text with end-of-line text wrapping fixed
- Text in Medline format for parsing by SemRep
- SemRep Relations Output
- SemRep Relations Output (vascular reactivity)
- SemRep Extracted Predications
- Manually Identified Predications (vascular reactivity)
- Baseline (B2)
- Baseline (B1)
- Experimental Results
The main limitation of the approach for rediscovery and decomposition using semantic predications is that subgraphs were created manually. To address this, we developed an approach that automatically clusters paths based on a specification of context. The next section provides details on this approach for automatic subgraph creation.
Automatic Subgraph Creation
Following our experiments on knowledge rediscovery, the semantic predications were used to automatically generate subgraphs that capture complex associations between two concepts<ref>D. Cameron, R. Kavuluru, T. C. Rindflesch, A. P. Sheth, K. Thirunarayan, O. Bodenreider, Context-Driven Automatic Subgraph Creation for Literature-Based Discovery. Journal of Biomedical Informatics 54: 141-157 (2015) </ref> (i.e., closed discovery). We developed a method that creates complex associations in the form of subgraphs along different thematic dimensions of association between such concepts. The generated subgraphs were shown to facilitate the rediscovery of 8 out of 9 existing scientific discoveries, including the RS-DFO scenario from our article in JBI. Each rediscovery scenario is covered in detail in the following tables. The associations from each subgraph in each rediscovery scenario can be explored using our live web application (http://knoesis-hpco.cs.wright.edu/obvio/), and a video demo is also available online.
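The grouping of paths into subgraphs by context can be sketched as follows. This is a deliberately simplified stand-in, not the published algorithm: it assumes each path already carries a set of context terms, measures overlap with Jaccard similarity, and groups paths greedily in a single pass. The names `jaccard` and `cluster_paths`, the threshold value, and the path labels and context terms are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_paths(path_contexts, threshold=0.5):
    """Greedily group paths whose context-term sets overlap.

    path_contexts maps a path label to a set of context terms. A path joins
    the first cluster whose pooled context has Jaccard similarity >= threshold
    with it; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"paths": [...], "context": set(...)}
    for path, context in path_contexts.items():
        for cluster in clusters:
            if jaccard(cluster["context"], context) >= threshold:
                cluster["paths"].append(path)
                cluster["context"] |= context
                break
        else:
            clusters.append({"paths": [path], "context": set(context)})
    return [c["paths"] for c in clusters]


# Hypothetical paths and context terms, for illustration only.
path_contexts = {
    "path A": {"blood", "platelets"},
    "path B": {"blood", "platelets", "clotting"},
    "path C": {"vessels"},
}
subgraphs = cluster_paths(path_contexts)
```

Here paths A and B share most of their context and end up in one subgraph, while path C forms a subgraph of its own, which corresponds to what the legend below calls a singleton.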
Legend
Not Found | |
Found | |
subgraph x | x (subgraph number) |
singleton y | y (singleton number), where a singleton is a subgraph consisting of only one path |
zero rarity singleton | a single-path subgraph (or singleton) whose concepts never occur together in any article in MEDLINE |
Scenario 1
Source | Target | Details
---|---|---
Dietary Fish Oils | Raynaud Syndrome | Cut-off Date: November 1985; By: Don R. Swanson

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Blood Viscosity | Dietary Fish Oils INHIBITS Blood Viscosity | Blood Viscosity CAUSES Raynaud Syndrome | zero rarity singleton 15
Platelet Aggregation | Dietary Fish Oils INHIBITS Platelet Aggregation | Platelet Aggregation CAUSES Raynaud Syndrome | subgraph 1
Vascular Reactivity | Dietary Fish Oils INHIBITS Vasoconstriction | Vasoconstriction CAUSES Raynaud Syndrome |
Scenario 2
Source | Target | Details
---|---|---
Magnesium | Migraine | Cut-off Date: April 1987; By: Don R. Swanson

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Calcium Channel Blockers | Magnesium ISA Calcium Channel Blocker | Calcium Channel Blockers TREATS Migraine | subgraph 22
Epilepsy | Magnesium AFFECTS Epilepsy | Epilepsy COEXISTS_WITH Migraine | subgraph 9
Hypoxia | Magnesium INHIBITS Hypoxia | Hypoxia ASSOCIATED_WITH Migraine |
Inflammation (Brain Edema, Hydrocephalus) | Magnesium INHIBITS Inflammation | Inflammation CAUSES Migraine | zero rarity singleton 3
Platelet Activity | Magnesium INHIBITS Platelet Aggregation | Platelet Aggregation CAUSES Migraine | subgraph 1
Prostaglandins | Magnesium STIMULATES Prostaglandins | Prostaglandins DISRUPTS Migraine Disorders | subgraph 4
Stress/Type A Personality | Stress INHIBITS Magnesium | Stress ASSOCIATED_WITH Migraine |
Serotonin | Magnesium INHIBITS Serotonin | Serotonin CAUSES Migraine | subgraph 1
Cortical Depression | Magnesium INHIBITS Spreading Cortical Depression | Spreading Cortical Depression CAUSES Migraine |
Substance P | Magnesium INHIBITS Substance P | Substance P CAUSES Migraine |
Vascular Mechanisms | Magnesium INHIBITS Vasoconstriction | Vasoconstriction CAUSES Migraine | subgraph 9
Scenario 3
Source | Target | Details
---|---|---
Somatomedin C | Arginine | April 1989; Don R. Swanson (PubMed Central)

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Growth Hormone | Arginine STIMULATES Growth Hormone | Growth Hormone STIMULATES Somatomedins | subgraph 5
Body Weight (body mass) | Somatomedins (IGF1) STIMULATES Growth | Arginine STIMULATES Growth | subgraph 7
Malnutrition | Somatomedins TREATS Malnutrition | Arginine TREATS Malnutrition | subgraph 7
Wound Healing (NK activity) | Somatomedin STIMULATES Wound Healing | Arginine STIMULATES Wound Healing |
Scenario 4
Source | Target | Details
---|---|---
Indomethacin | Alzheimer's Disease | July 1995; Neil R. Smalheiser/Don R. Swanson (J. Neurol)

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Acetylcholine | Indomethacin INHIBITS Acetylcholine | Acetylcholine CAUSES Alzheimer's Disease | subgraph 4
Lipid peroxidation | Indomethacin INHIBITS Lipid peroxidation | Lipid peroxidation CAUSES Alzheimer's Disease | subgraph 2
M2-muscarinic | Indomethacin INHIBITS M2-muscarinic | M2-muscarinic CAUSES Alzheimer's Disease |
Membrane Fluidity | Indomethacin INHIBITS Membrane Fluidity | Membrane Fluidity CAUSES Alzheimer's Disease |
Lymphocytes | Indomethacin STIMULATES natural killer T-Cell Activity | T-Cell Activity INHIBITS Alzheimer's Disease | subgraph 14
Thyrotropin | Indomethacin STIMULATES Thyrotropin | Thyrotropin AFFECTS Alzheimer's Disease | zero rarity singleton 20
T-lymphocytes (T-Cells) | Indomethacin STIMULATES T-lymphocytes | T-lymphocytes Activity INHIBITS Alzheimer's Disease | subgraph 3
Scenario 5
Source | Target | Details
---|---|---
Estrogen | Alzheimer's Disease | July 1995; Neil R. Smalheiser/Don R. Swanson (PubMed)

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Antioxidant activity | Estrogen INHIBITS Antioxidant activity | Antioxidant activity CAUSES Alzheimer's Disease | subgraph 4
Apolipoprotein E (ApoE) | Estrogen INHIBITS ApoE | ApoE CAUSES Alzheimer's Disease | subgraph 3
Calbindin D28k | Estrogen REGULATES Calbindin D28k | Calbindin D28k AFFECTS Alzheimer's Disease | subgraph 4
Cathepsin D | Estrogen STIMULATES Cathepsin D | Cathepsin D PREVENTS Alzheimer's Disease |
Cytochrome C oxidase subunit III | Estrogen STIMULATES Cytochrome C oxidase subunit III | Cytochrome C oxidase subunit III AFFECTS Alzheimer's Disease |
Glutamate | Estrogen STIMULATES Glutamate | Glutamate AFFECTS Alzheimer's Disease |
Receptor Polymorphism | Estrogen EXHIBITS Receptor Polymorphism | Receptor Polymorphism AFFECTS Alzheimer's Disease |
Scenario 6
Source | Target | Details
---|---|---
Calcium-Independent PLA2 | Schizophrenia | 1997; Neil R. Smalheiser/Don R. Swanson (PubMed)

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Oxidative Stress | Oxidative Stress INHIBITS Calcium-Independent PLA2 | Oxidative Stress CAUSES Schizophrenia | singleton 2
Selenium | Selenium INHIBITS Calcium-Independent PLA2 | Selenium PREVENTS Schizophrenia | singleton 2
Vitamin E | Vitamin E INHIBITS Calcium-Independent PLA2 | Vitamin E PREVENTS Schizophrenia | singleton 2
Scenario 7
Source | Target | Details
---|---|---
Chlorpromazine | Cardiac Hypertrophy (Cardiomegaly) | 2002; Jonathan D. Wren (PubMed)

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Calcineurin | Chlorpromazine INHIBITS Calcineurin | Calcineurin CAUSES Cardiac Hypertrophy | subgraph 5
Isoproterenol | Chlorpromazine INHIBITS Isoproterenol | Isoproterenol CAUSES Cardiomegaly | subgraph 12
Scenario 8
Source | Target | Details
---|---|---
Testosterone | Sleep | 2011; Christopher M. Miller/Thomas C. Rindflesch (PubMed)

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
Cortisol/Hydrocortisone | Testosterone INHIBITS Hydrocortisone | Hydrocortisone DISRUPTS Sleep | subgraph 7
Scenario 9
Source | Target | Details
---|---|---
Diethylhexyl phthalate (DEHP) | Sepsis | 2013; Michael J. Cairelli/Thomas C. Rindflesch (PubMed Central)

Intermediate | Association 1 | Association 2 | Status
---|---|---|---
PPAR gamma | DEHP STIMULATES PPAR gamma | PPAR gamma INHIBITS Sepsis |
Demos
Web Application: http://knoesis-hpco.cs.wright.edu/obvio/
Video Demo: http://bit.ly/obviodemo
Publications
<references/>
SWLBD Workshop
Kno.e.sis and the National Library of Medicine (NLM) organized the First International Workshop on the Role of Semantic Web in Literature-Based Discovery (SWLBD2012) in conjunction with the IEEE Conference on Bioinformatics and Biomedicine (BIBM2012) in Philadelphia, PA, USA.
- Due date for full workshop paper submission: August 6, 2012
- Notification of paper acceptance to authors: August 28, 2012
- Camera-ready version of accepted papers: September 4, 2012
- Workshop: October 4, 2012
Internal
Obvio Web App
Automatic Subgraph Creation
Recovery and Decomposition
Reachability
Contact: Delroy Cameron