Difference between revisions of "Scooner"


Revision as of 02:30, 19 February 2011

Overview


The limitations of keyword-based search are well known in the information retrieval field. They are more evident in the life sciences, where most of the reliable scientific information is spread across biomedical literature in the form of textual journal articles. Unlike the Web, these journal articles are devoid of hyperlinks, so multiple keyword-based searches need to be performed while aggregating and organizing the search results that the user finds interesting. This makes literature search a tedious task in the life sciences.

Knowledge-based search systems have been proposed as an improvement over conventional search and have gained popularity, especially given the availability of many expert-curated vocabularies and taxonomies in the biomedical domains. The different classes in a given taxonomy are used to provide faceted search over articles that contain instances of these classes. These taxonomies and other forms of ontologies are mostly static blocks of well-accepted consensual knowledge. Also, most of these standard ontologies have a limited number of predicates (or relationship types) such as "part of" and "is a". We believe the search process can benefit from recently published results that are not yet well known in the research community, and also from relationship types that go beyond the taxonomic ones. Scooner is a knowledge-based literature search and exploration system built upon this intuition. We are working on providing more powerful knowledge-based search in which recently published results are computationally extracted and used as a background knowledge base (KB) to guide the search process. The key here is that the knowledge base that guides search is extracted from the same universe of literature that is being explored.

Search Process: In Scooner, search is modeled as an interactive process where, besides a search box for keyword input, the points of interaction are based on domain-specific assertions (or triples) of the form subject -> predicate -> object (e.g., muscarinic activation -> facilitates -> long-term potentiation). Raw text results are input to a spotter module that annotates them with entities found in the triples used as background knowledge. Clicking on an annotated entity displays all triples in which it participates as a subject or object. Clicking on the corresponding object or subject then brings up articles that potentially contain that triple; in most cases the original abstract from which the triple was extracted is listed among the top 2 or 3 articles. This way the triples can be browsed in the context of the abstracts in which they were found. New implicit knowledge can also be discovered by building trails from individual triples.
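The interaction described above can be illustrated with a minimal sketch. It is not Scooner's implementation: the names `Triple`, `TripleStore`, `spot`, `triples_for`, and `trails_from` are hypothetical, the spotter is assumed to be a plain case-insensitive phrase match, and a trail is assumed to be a chain in which one triple's object is the next triple's subject. The first triple is the example from the text; the second is a placeholder invented purely to show chaining.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """Toy background KB indexed by the entities that appear in its triples."""

    def __init__(self, triples):
        self.by_entity = defaultdict(list)   # entity -> triples it takes part in
        self.by_subject = defaultdict(list)  # entity -> triples it is the subject of
        for t in triples:
            self.by_subject[t.subject].append(t)
            self.by_entity[t.subject].append(t)
            if t.obj != t.subject:
                self.by_entity[t.obj].append(t)

    def spot(self, text):
        """Assumed spotter: return KB entities whose phrase occurs in the text."""
        lowered = text.lower()
        return sorted(e for e in self.by_entity if e.lower() in lowered)

    def triples_for(self, entity):
        """Triples in which the entity participates as subject or object."""
        return list(self.by_entity.get(entity, []))

    def trails_from(self, start, max_len=2):
        """Chains of 2..max_len triples linked object -> subject, starting at `start`."""
        found = []

        def extend(trail, entity):
            for t in self.by_subject.get(entity, []):
                if t in trail:  # avoid walking the same triple twice
                    continue
                new_trail = trail + [t]
                if len(new_trail) >= 2:
                    found.append(new_trail)
                if len(new_trail) < max_len:
                    extend(new_trail, t.obj)

        extend([], start)
        return found


# Example triple taken from the text above.
KNOWN = Triple("muscarinic activation", "facilitates", "long-term potentiation")
# Placeholder triple, invented for illustration only (not from Scooner's KB).
ILLUSTRATIVE = Triple("long-term potentiation", "hypothetical predicate", "hypothetical entity")

store = TripleStore([KNOWN, ILLUSTRATIVE])
abstract = "Muscarinic activation was studied alongside long-term potentiation."
spotted = store.spot(abstract)
clicked = store.triples_for("long-term potentiation")
trails = store.trails_from("muscarinic activation")
```

Here `spotted` holds the entities that would be highlighted in the abstract, `clicked` holds what a click on one of them would list, and `trails` holds the two-step chains reachable from the starting entity. A real spotter would need to handle synonyms and morphological variants, which this sketch ignores.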

Collaborative Extensions: Scooner combines these ideas of triple-based search and exploration with persistent search sessions. Users can create search projects, store their search history (including the abstracts they felt were important and the triples they found useful), and collaborate with colleagues. The workbench in Scooner provides a central place to aggregate important abstracts imported for further review. The workbench can be filtered to show only those abstracts that pertain to a selected set of triples or trails. Additionally, collaborative features let users create persistent search projects, write comments on abstracts they find relevant, and share the (sub)projects with other users on a public dashboard.
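The workbench filter can be sketched roughly as follows. This is an assumption-laden illustration, not Scooner's code: `WorkbenchAbstract` and `filter_workbench` are hypothetical names, triples are represented as plain (subject, predicate, object) tuples, the identifiers and titles are placeholders, and an abstract is assumed to "pertain" to the selection when it shares at least one triple with it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkbenchAbstract:
    identifier: str        # placeholder ID, not a real PubMed identifier
    title: str
    triples: frozenset     # (subject, predicate, object) tuples found in the abstract


def filter_workbench(abstracts, selected_triples):
    """Keep abstracts that share at least one triple with the selection (assumed rule)."""
    selected = set(selected_triples)
    return [a for a in abstracts if a.triples & selected]


# Triple from the text, plus a placeholder triple invented for illustration.
T1 = ("muscarinic activation", "facilitates", "long-term potentiation")
T2 = ("hypothetical entity A", "hypothetical predicate", "hypothetical entity B")

workbench = [
    WorkbenchAbstract("abs-1", "Placeholder title one", frozenset({T1})),
    WorkbenchAbstract("abs-2", "Placeholder title two", frozenset({T2})),
    WorkbenchAbstract("abs-3", "Placeholder title three", frozenset({T1, T2})),
]

shown = filter_workbench(workbench, [T1])
shown_ids = [a.identifier for a in shown]
```

Filtering by a trail could reuse the same function by passing the trail's triples as the selection. Whether an abstract should match any triple or all of them is a design choice the text does not settle.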

Currently, Scooner's KB comes from the human performance and cognition ontology project, and the literature explored is the set of all abstracts available via PubMed as of Oct 2010. The knowledge base was created for the domain of human performance and cognition and is extracted from articles on PubMed published by Aug 2008. Initial evaluations of Scooner by researchers at the AFRL indicate that Scooner performs better than NLM's PubMed search tool. For a screencast of Scooner in use, please visit: http://knoesis.wright.edu/library/demos/scooner-demo/

Project Team


Undergraduate Students: Alan Smith, Paul Fultz
Graduate Students: Delroy Cameron, Christopher Thomas, Wenbo Wang
Postdocs: Ramakanth Kavuluru
Faculty: Amit Sheth
Former students who contributed to previous incarnations of Scooner: Pablo Mendes, Cartic Ramakrishnan

Architecture and Components


The following picture shows the various components of Scooner:

Scooner-arch.jpg