Human Performance and Cognition Ontology

From Knoesis wiki
Revision as of 22:43, 7 February 2011

Introduction

Background knowledge bases are valuable resources in scientific disciplines. They serve not only as standalone references for the basic knowledge of a discipline, but also assist in tasks such as information extraction, literature search, and exploration. Because constructing such knowledge bases requires significant and time-consuming involvement from human experts, it is becoming increasingly important to explore computational techniques that automate the process and keep pace with the volume and dynamic nature of new information. Although manually built ontologies are of high quality (precision), automatically built counterparts that offer high recall can also suffice as backbones for focused browsing and knowledge discovery. More recently, semi-automatic approaches that combine human effort with automatic tools have gained prominence, because organizing knowledge bases and keeping them up to date for investigative research requires effective tools that match the needs of researchers. The Human Performance and Cognition Ontology (HPCO) project aims to fulfill two major objectives:

  1. Build a knowledge base using semi-automatic domain hierarchy construction and relationship extraction from PubMed citations;
  2. Build a tool to browse and explore scientific literature with the help of the knowledge base created in objective 1.

The project extends our work in focused knowledge (entity-relationship) extraction from scientific literature, automatic taxonomy extraction from selected community-authored content (e.g., Wikipedia), and semi-automatic ontology development with limited expert guidance. These are combined into a framework that allows domain experts and computer scientists to create knowledge bases semi-automatically through an iterative process. The final goal is to give life scientists better search and retrieval over scientific literature, in both quality and speed, so that they can elicit valuable information in the area of human performance and cognition. The project is funded by the Human Effectiveness Directorate of the Air Force Research Laboratory (AFRL) at Wright-Patterson Air Force Base.

Project Team


PI: Amit Sheth
Students: Christopher Thomas, Wenbo Wang, Delroy Cameron
Postdocs: Ramakanth Kavuluru, Priti Parikh

Project Architecture and Components


The two broad objectives are accomplished using the following steps:

  1. An initial hierarchy of concepts in the area of human performance and cognition is built using our prior work on domain model extraction from Wikipedia. The model is approved by the experts at AFRL and forms the basis of the ontology. This component provides the focus for the ontology and its usage.
  2. Natural language processing (NLP) based entity and relationship extraction is performed on PubMed abstracts to facilitate enhanced information extraction. Because the entities are complex and the writing styles of PubMed authors vary, this component is continually evolving, although the initial results are already promising. Relationships between concepts in the hierarchy formed in step 1 are also found through pattern-based approaches. These results complement those obtained using NLP-based techniques.
  3. Relationships (triples) extracted in step 2 are mapped onto the hierarchy built in step 1.
  4. While the first three components are the foundation, the search and query component is the one users interact with, and hence much of the current focus is on it. It provides efficient searching and browsing of the semantic trails made available by the results of steps 2 and 3.
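As a rough illustration of steps 3 and 4 above, the following minimal Python sketch maps extracted triples onto a concept hierarchy and then browses them by concept. It is not the project's implementation: the function names (`normalize`, `map_triples`, `descendants`, `browse`), the exact-match mapping rule, and all concept names and triples are hypothetical toy data chosen only for the example.

```python
from collections import defaultdict

# Toy hierarchy (child -> parent); concept names are illustrative only,
# not taken from the actual HPCO hierarchy.
HIERARCHY = {
    "cognition": None,
    "memory": "cognition",
    "working memory": "memory",
    "attention": "cognition",
    "caffeine": None,
}


def normalize(term):
    """Lowercase and collapse whitespace so surface forms can be compared."""
    return " ".join(term.lower().split())


def map_triples(triples, hierarchy):
    """Keep only triples whose subject and object both name a hierarchy concept.

    Assumption: mapping is done by exact match after normalization; a real
    system would likely need synonym handling as well.
    """
    concepts = {normalize(c) for c in hierarchy}
    mapped = []
    for subj, pred, obj in triples:
        s, o = normalize(subj), normalize(obj)
        if s in concepts and o in concepts:
            mapped.append((s, pred, o))
    return mapped


def descendants(concept, hierarchy):
    """Return the concept together with everything below it in the hierarchy."""
    children = defaultdict(list)
    for child, parent in hierarchy.items():
        if parent is not None:
            children[parent].append(child)
    found, stack = [], [concept]
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children[node])
    return found


def browse(concept, mapped, hierarchy):
    """List mapped triples that mention the concept or any of its descendants."""
    scope = set(descendants(normalize(concept), hierarchy))
    return [t for t in mapped if t[0] in scope or t[2] in scope]


# Made-up triples standing in for the output of step 2.
EXTRACTED = [
    ("Caffeine", "improves", "Working  Memory"),
    ("caffeine", "affects", "attention"),
    ("caffeine", "binds", "adenosine receptor"),  # object not in hierarchy, dropped
]

MAPPED = map_triples(EXTRACTED, HIERARCHY)
MEMORY_TRIPLES = browse("memory", MAPPED, HIERARCHY)
```

Browsing by a broad concept such as "memory" also returns triples attached to its narrower concepts, which is one simple way a hierarchy can focus exploration of extracted relationships.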

[Image: hpcoarch.png]


Publications