Twitris

From Knoesis wiki
Revision as of 17:59, 1 March 2010 by Pablo (Talk | contribs)


TWITRIS is a platform for observing social signals from real-time social data, as exemplified by Twitter. Real-time social activity has moved to platforms such as Twitter, where people share their thoughts, views, opinions and information in microblogging formats. The network structure (e.g., follower subscriptions) and conversational practices (e.g., retweeting) shape an online discourse whose analysis can give us near real-time insight into the observations people make. It also allows us to extract semantic summaries of people's observations along the spatio-temporal-thematic (STT) dimensions of the data and to answer questions such as the following:

  • What are people talking about today? What are the main topics on their minds? For example, what themes of the "health care debate" are most discussed on date X in state Y in the US?
  • For any given theme/topic, how are key concerns (topics of discussion) changing over a period of time?
  • Are there regional differences in the opinions on a given theme/topic?
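The questions above amount to slicing the same observations by time, place and theme. A minimal sketch of that idea follows; it is an illustration only, not the Twitris implementation (see [3] for that). The `Observation` schema, the function names and the sample data are all hypothetical, and the sketch assumes each tweet has already been tagged with a day, a region and a set of themes.

```python
from collections import Counter, defaultdict
from datetime import date
from typing import NamedTuple


class Observation(NamedTuple):
    """One tweet reduced to STT coordinates (hypothetical schema)."""
    day: date       # temporal dimension
    region: str     # spatial dimension, e.g. a US state code
    themes: tuple   # thematic dimension: themes already extracted upstream


def summarize(observations):
    """Count how often each theme occurs in each (day, region) slice."""
    summary = defaultdict(Counter)
    for obs in observations:
        summary[(obs.day, obs.region)].update(obs.themes)
    return summary


def top_themes(summary, day, region, n=3):
    """Return the n most frequent themes for one day and region."""
    return summary.get((day, region), Counter()).most_common(n)


# Invented sample data, for illustration only.
sample = [
    Observation(date(2009, 8, 15), "OH", ("public option", "cost")),
    Observation(date(2009, 8, 15), "OH", ("public option",)),
    Observation(date(2009, 8, 15), "TX", ("town hall",)),
    Observation(date(2009, 8, 16), "OH", ("cost",)),
]

if __name__ == "__main__":
    print(top_themes(summarize(sample), date(2009, 8, 15), "OH"))
    # [('public option', 2), ('cost', 1)]
```

Comparing `top_themes` across days for a fixed region addresses the second question, and comparing it across regions for a fixed day addresses the third.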

Developing such a platform is challenging for several reasons.

  • One needs to understand user-generated content, which is often relaxed in terms of grammar and conventional writing practices.
  • Twitter in particular imposes a character limit (that makes for creative writing [5]) and has its own conversational lingo (use of # for topic categorization, @ for references to people, etc.).
  • The volume of information is massive (millions of tweets a day); not all of it is relevant to a topic under investigation, and not all of it can be stored in the long term.
  • Access to potentially available data is also limited by technical, privacy and business constraints. These limits affect the statistical processing of content needed to pull out meaningful social signals.
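As a small illustration of the Twitter lingo mentioned above, the following sketch pulls hashtags, mentions and a retweet flag out of raw tweet text. It is an assumption-laden example rather than the parser Twitris uses: the regular expressions, the `parse_tweet` name and the "RT @user" retweet convention are chosen here for illustration.

```python
import re

# Illustrative patterns (assumptions): a tag or mention must not be
# preceded by a word character, so "a@b.com" is not treated as a mention.
HASHTAG = re.compile(r"(?<!\w)#(\w+)")
MENTION = re.compile(r"(?<!\w)@(\w+)")
# Assumes the informal "RT @user" prefix convention for retweets.
RETWEET = re.compile(r"^\s*RT\s+@\w+", re.IGNORECASE)


def parse_tweet(text):
    """Extract lowercased hashtags and mentions, plus a retweet flag."""
    return {
        "hashtags": [tag.lower() for tag in HASHTAG.findall(text)],
        "mentions": [name.lower() for name in MENTION.findall(text)],
        "is_retweet": bool(RETWEET.match(text)),
    }


if __name__ == "__main__":
    print(parse_tweet("RT @alice: the #HealthCare debate heats up, cc @Bob #hcr"))
```

Lowercasing the extracted tokens is a simple way to merge spelling variants such as #HealthCare and #healthcare before any counting is done.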

Twitris, as you can see, is a work in progress but is rapidly maturing. Technical details can be found in [3]. Data aggregation and cleaning, text processing and analysis, and related tasks are highly compute-intensive. Following the "Health Care Reform" event, which we analyzed on a statewide basis for the US, we will next introduce "Iran Elections", which will provide a global, countrywide assessment of social signals. As of now, our system hosts analysis up to a week prior to the current date, and we intend to reduce this lag. Analysis of real-time data for more events will be added as time permits.

COMING SOON... Twitris currently performs spatial, temporal and thematic analysis of the currently popular content on Twitter, and so answers what is popular. Our current work focuses on analyzing why and how that content becomes popular. We address the challenge of finding those elements of content and network properties that contribute to the information diffusion, or virality, of the content.

TWITRIS is part of a larger research agenda on semantics-enriched social computing [1, 2, 4] at the Kno.e.sis Center at Wright State University, Dayton, Ohio (other key themes include semantics-enriched services computing and the sensor Web). For some of the related material, see:

[1] A. Sheth, Semantic Integration of Citizen Sensor Data and Multilevel Sensing: A comprehensive path towards event monitoring and situational awareness, February 17, 2009.
[2] A. Sheth, Citizen Sensing, Social Signals, and Enriching Human Experience, IEEE Internet Computing, July/August 2009.
[3] M. Nagarajan et al., Spatio-Temporal-Thematic Analysis of Citizen-Sensor Data - Challenges and Experiences, Tenth International Conference on Web Information Systems Engineering, Oct 5-7, 2009, Poland.
[4] What are people talking about, Why people write, How people write: Meena Nagarajan's research.
[5] Real Time Web - A primer, Part I and Part II, August 29, 2009.

How to Twitris