DAO

From Knoesis wiki
Revision as of 01:35, 25 July 2017 by Farahnaz (Hierarchical structure of DAO)

DAO is the acronym for Drug Abuse Ontology. The PREDOSE research team at Knoesis has developed preliminary techniques that automatically extract semantic information from Web-based data. Such information includes entities, generic sentiment expressions, relationships, and triples. To perform entity identification, the research team relies on a combination of lexical and semantics-based techniques built on a manually curated Drug Abuse Ontology (DAO, pronounced "dow"), which is the first ontology for prescription drug abuse.

Automatic Qualitative Coding

This is the most challenging aspect of PREDOSE. The aim is to use various information extraction techniques to extract, from web forums, semantic information considered semantically equivalent to qualitative codes. The Drug Abuse Ontology (DAO) was manually created to model the prescription drug abuse domain and is the first ontology on drug abuse in the literature. The current DAO is available online. The DAO is used to facilitate search, and it also serves as the annotation scheme for entity, relationship, and triple extraction.
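As a rough illustration of how an ontology can facilitate search, the sketch below expands a query for a class into the surface labels of that class and all of its subclasses. The class names, labels, and function names are hypothetical stand-ins chosen for this example; they are not taken from the actual DAO.

```python
# Toy stand-in for a DAO fragment. Class names and labels are illustrative
# assumptions, not the real ontology content.
SUBCLASS_OF = {
    "Buprenorphine": "Opioid",
    "Oxycodone": "Opioid",
    "Opioid": "Drug",
}

LABELS = {
    "Buprenorphine": ["buprenorphine", "bupe", "suboxone"],
    "Oxycodone": ["oxycodone", "oxy"],
}


def descendants(cls):
    """Return cls plus every class below it in the hierarchy."""
    found = [cls]
    for child, parent in SUBCLASS_OF.items():
        if parent == cls:
            found.extend(descendants(child))
    return found


def expand_query(cls):
    """Collect the surface labels of cls and its subclasses, sorted."""
    terms = set()
    for c in descendants(cls):
        terms.update(LABELS.get(c, []))
    return sorted(terms)


if __name__ == "__main__":
    print(expand_query("Opioid"))
```

A search for the class "Opioid" would then also match posts that mention only a brand name or slang term attached to one of its subclasses.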

OntoGraf of DAO

Fig1: DAO First Version OntoGraf

Hierarchical structure of DAO

Fig2: Hierarchical structure of DAO

DAO schema in the PREDOSE Research Plan

Fig3: PREDOSE Research Plan

Method: Using custom web crawlers that scrape user-generated content (UGC) from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO and includes domain-specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, and routes of administration. The DAO is also used to help recognize three types of data: (1) entities, (2) relationships, and (3) triples. PREDOSE then uses a combination of lexical and semantics-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO.
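The extraction steps in the method can be sketched as follows. This is a minimal illustration that assumes plain dictionary lookup for entity spotting and sentence-level co-occurrence for instantiating triple patterns; it covers only the lexical side and is not the PREDOSE implementation. The lexicon entries, class names, predicate names, and function names are all hypothetical.

```python
import re

# Hypothetical lexicon: surface form -> (ontology class, canonical entity).
LEXICON = {
    "bupe": ("Drug", "Buprenorphine"),
    "suboxone": ("Drug", "Buprenorphine"),
    "loperamide": ("Drug", "Loperamide"),
    "nausea": ("SideEffect", "Nausea"),
    "snorted": ("RouteOfAdministration", "Intranasal"),
}

# Hypothetical top-down patterns: (subject class, predicate, object class).
PATTERNS = [
    ("Drug", "has_side_effect", "SideEffect"),
    ("Drug", "administered_via", "RouteOfAdministration"),
]


def find_entities(text):
    """Lexical spotting: unique (class, entity) pairs in order of appearance."""
    entities = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token in LEXICON and LEXICON[token] not in entities:
            entities.append(LEXICON[token])
    return entities


def extract_triples(text):
    """Fill each pattern with entities that co-occur in the same sentence."""
    triples = []
    for sentence in re.split(r"[.!?]+", text):
        entities = find_entities(sentence)
        for subj_cls, predicate, obj_cls in PATTERNS:
            for s_cls, subj in entities:
                for o_cls, obj in entities:
                    if s_cls == subj_cls and o_cls == obj_cls:
                        triples.append((subj, predicate, obj))
    return triples


if __name__ == "__main__":
    post = "I snorted bupe yesterday. Suboxone gave me nausea."
    print(find_entities(post))
    print(extract_triples(post))
```

The approach is top-down in the sense that the patterns come first and the text is searched for entities that can fill them, rather than triples being induced from the text.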

Information Extraction Layer

The information extraction layer of the PREDOSE platform (Fig. 3) utilizes the DAO to extract entities, relationships, and triples. For sentiments, the technique originally developed by Chen et al. [16] (Chen is a co-author of this work) is adapted for web forum texts. The Automatic Qualitative Coding Module of PREDOSE, therefore, consists of the following five components: (1) the Drug Abuse Ontology; (2) an entity identification component; (3) a relationship identification component; (4) a triple extraction component; and (5) a sentiment extraction component.
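One way to picture how the five components fit together is a small container that holds the ontology and the four extractors and runs them over a post. The sketch below is an assumption about structure only: the class name, field names, and call signatures are hypothetical, and the triple and sentiment extractors are empty placeholders rather than the techniques used in PREDOSE or in [16].

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QualitativeCodingModule:
    """Bundles the five components; all names here are hypothetical."""
    ontology: dict
    identify_entities: Callable
    identify_relationships: Callable
    extract_triples: Callable
    extract_sentiments: Callable

    def code(self, post):
        """Run each extractor on one post and gather the annotations."""
        entities = self.identify_entities(post, self.ontology)
        relationships = self.identify_relationships(post, self.ontology)
        triples = self.extract_triples(entities, relationships, self.ontology)
        sentiments = self.extract_sentiments(post)
        return {
            "entities": entities,
            "relationships": relationships,
            "triples": triples,
            "sentiments": sentiments,
        }


def spot_terms(post, ontology, key):
    """Placeholder extractor: keep words listed under ontology[key]."""
    return [w for w in post.lower().split() if w in ontology[key]]


# Toy wiring with placeholder components (assumed, not from PREDOSE).
module = QualitativeCodingModule(
    ontology={"entities": {"loperamide"}, "relationships": {"helps"}},
    identify_entities=lambda post, onto: spot_terms(post, onto, "entities"),
    identify_relationships=lambda post, onto: spot_terms(post, onto, "relationships"),
    extract_triples=lambda ents, rels, onto: [],  # placeholder, returns nothing
    extract_sentiments=lambda post: [],  # placeholder, returns nothing
)

if __name__ == "__main__":
    print(module.code("Loperamide helps withdrawal"))
```

Passing the extractors in as callables keeps each component replaceable, so a lexical entity spotter could be swapped for a semantics-based one without touching the rest of the module.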

Fig4: DAO Information Extraction Layer