Twarql Continuous Semantics

From Knoesis wiki
Revision as of 21:56, 13 April 2011 by Pavan

Introduction

Recent years have seen a significant change in the dissemination of news and information. Observations of unfolding events are increasingly shared in real time through ubiquitously accessible microblogging platforms. However, the volume of information being shared is growing exponentially: Twitter alone generates more than 100 million microposts a day. This avalanche of data makes it difficult to seek out specific information, especially in real time. Event-specific information is often only temporarily interesting and quickly becomes stale. To achieve the highest information gain, it is important that selected content reaches the user quickly. This kind of information tracking proved its importance during the recent Egypt protests, where Twitter and other social networking sites served as major platforms for protesters to organize gatherings and to stay updated on major developments. This paper presents a Semantic Web approach to supporting dynamic event tracking on Twitter.

In this work we offer a solution to the problem of event following based on the dynamic creation of semantic event models. The user needs to specify an area of interest only once, at which point an event model is automatically created. As the event unfolds, microposts are analyzed and, based on new developments, an updated model is created that subsequently filters microposts for the next iteration of the cycle. This work thus presents an early realization of Continuous Semantics.

Architecture and Approach

The following two applications form an integral part of the architecture:

Twarql

Doozer

Event information enters the cycle as streaming microposts from Twitter. Twarql filters microposts that match user-defined constraints (e.g., a SPARQL query), creating a corpus of relevant microposts. Keyphrase extraction techniques then select prominent relevant terms from this corpus in order to keep the focus on the unfolding event. The selected keyphrases are fed into Doozer for the automatic creation of a domain model that specifically describes the event in focus. The model is then translated into a Twarql filter. The last step in the cycle is to update the micropost filter in Twarql to reflect the model created by Doozer.

Figure: Continuous Semantics architecture (Continuous semantics architecture.png)
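
The filter, extract, model, update cycle described above can be illustrated with a small Python sketch. It is not the Twarql or Doozer implementation, and none of the names below come from either system: matches, extract_keyphrases, build_model and run_cycle are hypothetical stand-ins. The sketch also makes simplifying assumptions: the filter is a plain keyword match rather than a SPARQL query, keyphrase extraction is a term-frequency count, and the "model" is just a set of terms.

```python
from __future__ import annotations

import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the source).
STOPWORDS = {"a", "an", "the", "in", "at", "on", "of", "to", "is", "and", "i"}


def tokenize(post: str) -> list[str]:
    """Lowercase a micropost and split it into simple word/hashtag tokens."""
    return re.findall(r"[a-z0-9#]+", post.lower())


def matches(post: str, keywords: set[str]) -> bool:
    """Stand-in for the Twarql filter: keep posts mentioning any keyword."""
    return bool(set(tokenize(post)) & keywords)


def extract_keyphrases(corpus: list[str], top_n: int) -> list[str]:
    """Stand-in for keyphrase extraction: most frequent non-stopword terms."""
    counts = Counter(
        tok for post in corpus for tok in tokenize(post) if tok not in STOPWORDS
    )
    return [term for term, _ in counts.most_common(top_n)]


def build_model(keyphrases: list[str]) -> set[str]:
    """Stand-in for Doozer: here the 'domain model' is only a set of terms."""
    return set(keyphrases)


def run_cycle(batches, seed_keywords, top_n=5):
    """Run one filter/extract/model/update iteration per batch of microposts.

    Returns a list of (relevant_corpus, keywords_after_update) pairs.
    """
    seed = set(seed_keywords)
    keywords = set(seed)
    history = []
    for batch in batches:
        # 1. Filter the incoming microposts with the current filter.
        corpus = [post for post in batch if matches(post, keywords)]
        if corpus:
            # 2-3. Extract keyphrases and build the (toy) model from them.
            model = build_model(extract_keyphrases(corpus, top_n))
            # 4. Translate the model back into a filter; the seed terms are
            #    kept so the filter does not drift away from the user's topic.
            keywords = seed | model
        history.append((corpus, set(keywords)))
    return history
```

For example, calling run_cycle with the seed term "cairo" on a first batch that contains "Protest in Cairo today at Tahrir square" adds "tahrir" to the filter, so a later post such as "Crowds gather in Tahrir" is kept even though it never mentions the seed term. Keeping the seed terms in every updated filter is a design choice of this sketch, not something the page states about Twarql.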

People

Pavan Kapanipathi
Christopher Thomas
Pablo Mendes
Amit Sheth

References