Obvio

From Knoesis wiki
Revision as of 06:04, 8 March 2012 by W007dhc (Talk | contribs) (Publications)

Obvio (Spanish for "obvious") is a project on semantics-based techniques for Literature-Based Discovery (LBD) over biomedical literature. The goal of Obvio is to uncover hidden connections between concepts in text, thereby supporting hypothesis generation from publicly available scientific knowledge sources.

Overview

Obvio is driven by assertions extracted from unstructured text (called semantic predications) as well as assertions obtained from structured knowledge sources (such as the UMLS). The fundamental notion is that LBD could be greatly facilitated by the semantic integration of assertions extracted from scientific literature with well-curated background knowledge from heterogeneous data sources.
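As a rough illustration of this integration, a semantic predication can be treated as a subject-predicate-object assertion and merged with background assertions into a single graph. The sketch below is not Obvio's implementation: the `Predication` type, the `integrate` helper, and all example assertions are hypothetical and chosen only to show the idea.

```python
from collections import namedtuple

# A semantic predication modeled as a (subject, predicate, object) assertion,
# tagged with where it came from. This type is a hypothetical illustration.
Predication = namedtuple("Predication", ["subject", "predicate", "object", "source"])

# Illustrative assertions of the kind that might be extracted from literature.
literature = [
    Predication("Fish Oils", "AFFECTS", "Blood Viscosity", "literature"),
    Predication("Blood Viscosity", "ASSOCIATED_WITH", "Raynaud Disease", "literature"),
]

# Illustrative background-knowledge assertion (e.g. a hierarchical relation).
background = [
    Predication("Eicosapentaenoic Acid", "ISA", "Fish Oils", "background"),
]


def integrate(*assertion_sets):
    """Merge several assertion sets into one adjacency map keyed by subject."""
    graph = {}
    for assertions in assertion_sets:
        for p in assertions:
            graph.setdefault(p.subject, []).append((p.predicate, p.object, p.source))
    return graph


graph = integrate(literature, background)
```

Keeping the `source` tag on every edge makes it possible to tell later whether a connection came from the literature or from background knowledge.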

Project Team

Graduate Students: Delroy Cameron, Tu Danh, Sreeram Vallabhaneni, Hima Yalamanchili
External Collaborators: Olivier Bodenreider, Thomas C. Rindflesch, Ramakanth Kavuluru, Pablo N. Mendes
Faculty: Krishnaprasad Thirunarayan, Amit P. Sheth (Advisor)

Applications

We recognize three potential applications of semantic predications for computer science and biomedical science alike. These are:

  1. Question Answering (QA)
  2. Literature-based Discovery (LBD)
  3. Information Retrieval (IR)

Question Answering

We applied semantic predications for biomedical QA to the TREC 2006 Challenge. We exploited the notion of reachability to determine whether answer documents could be connected using the predications and background knowledge.
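One simple reading of reachability is graph reachability: a concept is reachable from another if a chain of assertions links them. The sketch below uses a breadth-first search over directed edges. It assumes predications are plain (subject, predicate, object) tuples and that direction matters; the `reachable` function and the example edges are hypothetical, and the notion of reachability used in the project may be defined differently.

```python
from collections import deque


def reachable(edges, start, goal):
    """Return True if goal can be reached from start by following directed edges."""
    adjacency = {}
    for subject, _predicate, obj in edges:
        adjacency.setdefault(subject, set()).add(obj)

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Illustrative edges only; not taken from the project's data.
edges = [
    ("Fish Oils", "AFFECTS", "Blood Viscosity"),
    ("Blood Viscosity", "ASSOCIATED_WITH", "Raynaud Disease"),
]
found = reachable(edges, "Fish Oils", "Raynaud Disease")
```

Because the edges are directed in this sketch, swapping `start` and `goal` gives a different answer; an undirected variant would add each edge in both directions.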

Reachability

The QA task put forth by the Text REtrieval Conference (TREC) offers an opportunity to determine whether semantic predications can yield relevant information given complex information needs.

Literature-based Discovery (LBD)

A second application of semantic predications is to the field of Literature-based Discovery (LBD). LBD refers to uncovering conclusions that have never been made explicit before, but are implicit in publicly available literature. Semantic predications have been demonstrated to be important constructs in facilitating LBD by providing context among associated concepts.
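A common way to picture implicit connections is Swanson's ABC pattern: if A is linked to B and B is linked to C, but A and C are never linked directly, then A-C is a candidate hypothesis. The sketch below applies that pattern to predications given as (subject, predicate, object) tuples. It is a minimal illustration under the assumption that links can be treated as undirected; the `abc_candidates` function and the example predications are hypothetical and do not represent the project's algorithm.

```python
def abc_candidates(predications, a):
    """Return {C: sorted list of B} for concepts C linked to `a` only through some B."""
    neighbors = {}
    for s, _p, o in predications:
        # Links are treated as undirected for this sketch.
        neighbors.setdefault(s, set()).add(o)
        neighbors.setdefault(o, set()).add(s)

    direct = neighbors.get(a, set())
    candidates = {}
    for b in direct:
        for c in neighbors.get(b, set()):
            # Keep only concepts with no direct link to `a`.
            if c != a and c not in direct:
                candidates.setdefault(c, set()).add(b)
    return {c: sorted(bs) for c, bs in candidates.items()}


# Illustrative predications only.
predications = [
    ("Fish Oils", "AFFECTS", "Blood Viscosity"),
    ("Fish Oils", "AFFECTS", "Platelet Aggregation"),
    ("Blood Viscosity", "ASSOCIATED_WITH", "Raynaud Disease"),
    ("Platelet Aggregation", "ASSOCIATED_WITH", "Raynaud Disease"),
]
hypotheses = abc_candidates(predications, "Fish Oils")
```

The intermediate B concepts returned with each candidate are what supply the context for the association; the predicates themselves are ignored here but could be kept to filter by relation type.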

Swanson's Hypotheses

Much of the early research aimed at rediscovering Swanson's Hypotheses focused almost entirely on Information Retrieval (IR) techniques, such as term and concept co-occurrence. Only recently has significant attention been devoted to semantics-based techniques that exploit the meaning of associations between concepts. While generally more intuitive, the feasibility of such semantics-based approaches has not been fully established. It is reasonable to expect that if semantics-based techniques are adequate for discovering new knowledge, they ought to be sufficient for recovering existing knowledge.

RS-DFO Hypothesis

In this work, we investigate the applicability of semantics-based techniques for recovering and decomposing Swanson's Raynaud Syndrome--Fish Oil hypothesis using semantic predications, background knowledge, and graph-based algorithms for path extraction and subgraph creation. A presentation, datasets, and experimental results are available below for download.


Datasets and Experimental Results
  1. Dataset
    1. Baseline (B1)
      1. Original PDFs of the 65 articles cited by Swanson's RS-DFO paper (30.5MB)
      2. ASCII text with end-of-line text wrapping fixed
      3. Text in Medline format for parsing by SemRep
      4. SemRep Relations Output
      5. SemRep Relations Output (vascular reactivity)
      6. SemRep Extracted Predications
      7. Manually Identified Predications (vascular reactivity)
    2. Baseline (B2)
      1. Titles and abstracts of the 65 articles cited by Swanson's RS-DFO paper in Medline format for parsing by SemRep
      2. SemRep Relations Output
      3. SemRep Relations Output (vascular reactivity)
      4. SemRep Extracted Predications
      5. Manually Identified Predications (vascular reactivity)
  2. Experimental Results
    1. Association-Subgraph Comparisons (Experiment I)
    2. Association-Subgraph Comparisons (Experiment II)
    3. All Generated Subgraphs (Experiments I & II)

Information Retrieval

Another application of semantic predications is to the field of Information Retrieval. By modeling each document as a collection of predications (i.e., a subgraph) and modeling a search query as a subgraph as well, documents can be ranked by their semantic similarity to the query using subgraph-to-subgraph query processing.
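To make the ranking idea concrete, the sketch below scores each document by the overlap between its predication set and the query's predication set. Jaccard overlap is used here only as a simple stand-in, since the page does not specify the similarity measure; the `jaccard` and `rank_documents` functions and the example data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of predications."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_documents(query, documents):
    """Return document ids ordered by decreasing similarity to the query subgraph."""
    scored = [(jaccard(query, preds), doc_id) for doc_id, preds in documents.items()]
    # Sort by score (highest first), breaking ties by document id.
    return [doc_id for _score, doc_id in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Illustrative predications and documents only.
t1 = ("Fish Oils", "AFFECTS", "Blood Viscosity")
t2 = ("Blood Viscosity", "ASSOCIATED_WITH", "Raynaud Disease")
t3 = ("Aspirin", "TREATS", "Headache")

documents = {"doc1": [t1, t2], "doc2": [t3], "doc3": [t1]}
query = [t1, t2]
ranking = rank_documents(query, documents)
```

Exact set overlap only rewards identical predications; a fuller approach would presumably also credit related concepts or predicates, for example through background knowledge.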

Publications

  1. D. Cameron, R. Kavuluru, O. Bodenreider, P. N. Mendes, A. P. Sheth, K. Thirunarayan, Semantic Predications for Complex Information Needs in Biomedical Literature, 5th International Conference on Bioinformatics and Biomedicine BIBM2011, Atlanta GA, November 12-15, 2011 (acceptance rate=19.4%)
  2. D. Cameron, O. Bodenreider, H. Yalamanchili, T. Danh, S. Vallabhaneni, K. Thirunarayan, A. P. Sheth, T. C. Rindflesch, A Graph-Based Decomposition of Swanson’s Hypothesis using Semantic Predications, Journal of Biomedical Informatics (Under preparation for submission)

See Also

Swanson's Hypotheses
Reachability

Contact: Delroy Cameron