ResQu

From Knoesis wiki

The acronym stands for Rich Knowledge Enabled Evaluation of Summary Quality. ResQu is an automatic evaluation framework for abstractive summary quality.

Motivation

The following presentation gives more details about the approach for knowledge rediscovery and decomposition. Various datasets and experimental results are also provided.

ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization

Here we describe the implementation of ResQu, a system for automatic evaluation of summaries. Automatic generation of summaries that capture the salient aspects of a search result set (i.e., automatic summarization) has become an important task in biomedical research. Automatic summarization offers an avenue for overcoming the information overload problem prevalent in large online digital libraries.

However, across many of the knowledge-driven approaches to automatic summarization, it is not always clear which features most strongly influence the quality of a summary. Instead, there has been considerable focus on utilizing schema knowledge to facilitate browsing and exploration of generated summaries a posteriori. Informative features should not be ignored, since they could be used to help optimize the models that generate these semantic summaries in the first place. In this research, we adopt a leave-one-out approach to assess the impact of various features on the quality of automatically generated summaries that contain structured background knowledge.
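
The leave-one-out idea can be illustrated with a minimal sketch: score a summary built from all features, then rebuild it with each feature removed in turn and record the change in score. Everything named here is an assumption made for illustration. The feature names, the weights, and the toy scoring function are hypothetical placeholders, not the features or scoring used by ResQu.

```python
def leave_one_out_impact(features, score_fn):
    """Return, for each feature, the drop in score when it is left out.

    score_fn is a caller-supplied function (hypothetical here) that takes a
    set of active features and returns a quality score for the resulting
    summary, where a higher score means better quality.
    """
    features = list(features)
    baseline = score_fn(frozenset(features))
    return {f: baseline - score_fn(frozenset(features) - {f}) for f in features}


# Toy stand-in for a real pipeline: each placeholder feature contributes a
# fixed amount to the score. Real scores would come from comparing a
# generated summary against the gold standard.
TOY_WEIGHTS = {"feature_a": 0.5, "feature_b": 0.25, "feature_c": 0.125}


def toy_score(active_features):
    return sum(TOY_WEIGHTS[f] for f in active_features)


if __name__ == "__main__":
    impacts = leave_one_out_impact(TOY_WEIGHTS, toy_score)
    # List features from most to least influential.
    for name, drop in sorted(impacts.items(), key=lambda kv: -kv[1]):
        print(name, drop)
```

A larger drop means the summary degrades more without that feature, which is one simple way to rank feature importance.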

We first create the gold standard summaries by extraction and validation, using information-theoretic methods. The semantic summaries are then transformed into an equivalent textual format. Finally, various similarity metrics, such as cosine similarity, Euclidean distance, and Jensen-Shannon divergence, are computed under different feature combinations to assess summary quality against the textual gold standard. We report on the relative importance of the various features used to automatically generate the semantic summaries in a biomedical application. Our evaluation suggests that the proposed approach is an effective automatic evaluation method for assessing feature importance in automatically generated semantic summaries.
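
As a rough illustration of the three metrics named above, the following sketch compares two texts using plain term-count vectors. The whitespace tokenization, the bag-of-words representation, and the function names are assumptions made for this example; the text representation actually used by ResQu is not specified here.

```python
import math
from collections import Counter


def term_vectors(text_a, text_b):
    """Build aligned term-count vectors over the union vocabulary."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[t] for t in vocab], [counts_b[t] for t in vocab]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def _normalize(u):
    """Scale counts so they sum to 1, giving a probability distribution."""
    total = sum(u)
    return [a / total for a in u] if total else [0.0] * len(u)


def _kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon_divergence(u, v):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    p, q = _normalize(u), _normalize(v)
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)
```

Cosine similarity is higher for more similar texts, while Euclidean distance and Jensen-Shannon divergence are lower, so the three scores need to be read in opposite directions when comparing a generated summary with the gold standard.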


Experiments

With help from domain experts at the National Institutes of Health (NIH), we compiled a carefully selected list of 20 diseases. This list consists of both well-known and rarely occurring diseases.

Gold standard dataset creation

  1. We selected 3 online resources that were authoritative sources of treatment information for diseases