Difference between revisions of "Twitris"

Revision as of 16:33, 26 April 2013

Twitris is a Semantic Web application that facilitates understanding of social perceptions through semantics-based processing of massive amounts of event-centric data. Twitris 2.0 addresses challenges in large-scale processing of social data while preserving spatio-temporal-thematic properties. Twitris 2.0 also covers context-based semantic integration of multiple Web resources and exposes semantically enriched social data to the public domain. Semantic Web technologies enable the system's integration and analysis abilities.

Introduction

Well over a billion people have become 'citizens' of an Internet- or Web-enabled social community. Web 2.0 fostered the open environment and applications for tagging, blogging, wikis, and social networking sites that have made information consumption, production, and sharing incredibly easy. With over 5 billion mobile connections, over a billion of them with data connections (smartphones), and many more able to communicate using SMS, digital media can be shared with the rest of humanity instantly. As a result, humanity is interconnected as never before. This interconnected network of people, who actively observe, report, collect, analyze, and disseminate information via text, audio, or video messages, increasingly through pervasively connected mobile devices, has led to what we term citizen sensing (Sheth, 2009-a) (Sheth, 2009-b). This phenomenon differs from traditional centralized information dissemination and consumption environments, where citizens primarily act as consumers of information reported by several authoritative sources.

Figure 1: Twitris- three primary dimensions of analysis

This citizen sensing is complemented by the growing ability to access, integrate, dissect, and analyze individual and collective thinking of humanity, giving us a capability that is recognized as collective intelligence. Citizen sensing involves humans in the loop, and with it all the complexities associated with and intelligence captured in human communication. As citizen sensing has gained momentum, it’s generating millions of observations, creating significant information overload. In many cases it becomes nearly impossible to make sense of the information around a topic of interest. Given this data deluge, analyzing the numerous social signals can be extremely challenging. In response to this growing citizen sensing data deluge, Twitris has been developed with the vision of performing semantics-empowered analysis of a broad variety of social media exchanges.

Twitris, named by combining Twitter with Tetris, a tile-matching puzzle game, has incorporated increasingly sophisticated analysis of social data and associated metadata, combining it with background knowledge and, more recently (albeit not discussed here), machine sensor data or data captured from the sensors and devices that make up the Internet of Things (IoT). Twitris’ evolution can be characterized in three phases (and corresponding versions of the system). Figure 1 outlines the corresponding dimensions Twitris considers.

Twitris is a comprehensive platform for analyzing social content along multiple dimensions leading to in-depth insights into various aspects of an event or a situation. The central thesis behind this work is that citizen sensor observations are inherently multi-dimensional in nature and taking these dimensions into account while processing, aggregating, connecting and visualizing data will provide useful organization and consumption principles. Twitris evolved in three phases, characterized by the versions of the systems:

  • Twitris v1: Spatio-Temporal-Thematic (STT) processing of Twitter and associated news, multimedia and Wikipedia content (Sheth, 2009-b), (Nagarajan, 2009-a) (Jadhav, 2010)
  • Twitris v2: People-Content-Network Analysis (PCNA) (Purohit, 2011-a) with use of background knowledge and semantic metadata extraction and querying/exploration
  • Twitris v3: Sentiment-Emotion-Intent (SEI) extraction (Chen, 2012), (Wang 2012), (Nagarajan, 2009-b) along with personalization (Kapanipathi, 2011-a) and an emerging continuous semantics (Sheth, 2010) capability involving real-time semantic processing of social streams using dynamically generated and updated domain models for semantics and context

The versions, or phases, of Twitris development are not as sharply separated as this list suggests. The issues identified above are not explicitly segregated by version, because Twitris has been in continuous development, with senior students graduating and new students picking up the work. Four talks, including a tutorial, cover many of the issues addressed by Twitris (Sheth, 2009-a), (Nagarajan, 2010-a), (Nagarajan, 2011), (Sheth, 2011).

Key Points

The social media group at Kno.e.sis investigates the role and benefits of a semantic approach, especially metadata extraction and enrichment and the contextual application of relevant background knowledge, and demonstrates examples on real-world data using the system (Twitris) developed at Kno.e.sis. The work covers:

  1. Event-specific analysis of citizen sensing, including the opportunities and challenges in understanding temporal, spatial and thematic cues
  2. Facets of people-content-network analysis with a focus on user-community engagement analysis
  3. Real-time social media data analysis, and the concept of continuous semantics supported by dynamic model creation
  4. Sentiment and emotion identification from citizen sensing data
  5. Recent advances in developing semantic abstracts or semantic perception to convert massive amounts of raw observational data into nuggets of information and insights that can aid human decision making

Historical Background

The idea for the research and technology development leading to Twitris arose on November 26, 2008. Terrorists struck Mumbai, India, and over the next three days they caused mayhem in nine locations. Each of the nine sub-events of this overall event, separated by time and location (space), had distinct thematic elements or topical content. The importance of Twitter, especially in terms of citizen sensing (the ability of a regular person to use his or her mobile device to share personal observations, thoughts, and beliefs well before traditional news media have a chance to report and to shape opinions), was extensively discussed in the immediate aftermath of this momentous event. This event also gave us a clear case for the needs and benefits of analyzing social media content such as tweets and Flickr posts, and related news stories, along three dimensions: spatial (location of observation; where), temporal (time of observation; when), and thematic (the event in question; what) (Battle, 2009), (Impact Lab, 2008), (Keralaravind, 2008).

Twitris Platform and Three Stages of Its Evolution

Twitris v1: Spatio-Temporal-Thematic (STT) processing of Twitter and associated news, multimedia and Wikipedia content

Figure 2:  A snapshot of spatio-temporal-thematic slice of citizen sensing: showing content related to Mumbai terrorism (thematic) related to Taj hotel (spatial, thematic), during a period of interest (temporal)

Twitris v1 (Jadhav, 2010) was designed with the following three major steps:

  1. Data collection: collect user posted tweets pertaining to an event from Twitter, associated news, multimedia, and Wikipedia content
  2. Data analysis: a) process obtained tweets to extract strong event descriptors considering spatial, temporal, and thematic event attributes b) process event related news, multimedia, and Wikipedia content to get event context and gain a better understanding
  3. Visualization: present extracted summaries on Twitris v1 user interface
Figure 3: STT biased scoring mechanism of Twitris v1 for relevance and ranking of keyphrases compared to traditional TFIDF based ranking: “mumbai” ranked highest based on TFIDF is far less informative compared to “foreign relations perspectives”

Twitris v1 extracts strong event descriptors from tweets in two steps. First, it creates spatio-temporal clusters of the tweet corpus surrounding an event, since every event is different and we want to preserve the social perceptions that generated this data. A TFIDF computation is then performed to extract n-grams from these clusters. Second, spatial, temporal, and thematic biases are applied to these n-grams by enhancing their weights, while preserving the contextual relevance of these event descriptors to the event. Further details of the text-processing algorithm are available in (Nagarajan, 2009-a). The Twitris v1 user interface (Figure 3, a and b) facilitates effective browsing of the when (temporal/time), where (space/location), and what (thematic/context) slices of social perceptions behind an event.
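The two steps can be illustrated with a minimal Python sketch. It is not the algorithm of (Nagarajan, 2009-a): the function names, the whitespace tokenizer, the plain TFIDF formula, and the multiplicative bias are all assumptions made here for illustration, and the bias values are supplied by the caller rather than derived from spatial, temporal, or thematic evidence.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf_scores(cluster_docs, all_docs, n_max=2):
    """Score n-grams of one spatio-temporal cluster against the whole corpus.

    Assumes every document in cluster_docs also appears in all_docs,
    so each n-gram has a document frequency of at least 1.
    """
    tf = Counter(g for doc in cluster_docs for g in ngrams(doc, n_max))
    df = Counter(g for doc in all_docs for g in set(ngrams(doc, n_max)))
    return {g: count * math.log(len(all_docs) / df[g]) for g, count in tf.items()}

def apply_stt_bias(scores, bias):
    """Enhance weights: scale each score by (1 + bias); missing n-grams get no boost."""
    return {g: s * (1.0 + bias.get(g, 0.0)) for g, s in scores.items()}

def top_descriptors(scores, k=5):
    """Return the k highest-scoring n-grams, breaking ties alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]
```

A caller would build one cluster per location and time window, score it with `tfidf_scores`, and then pass a hypothetical bias dictionary to `apply_stt_bias` before ranking.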

Figure 4: Early version of Twitris v1 user interfaces for displaying thematic component (using STT biasing) on right (b) based on spatial and temporal selection on left (a)

The objective of the Twitris v1 user interface is to integrate the results of the data analysis (extracted descriptors and surrounding discussions) with emerging visualization paradigms to facilitate sensemaking. To start browsing, users are required to select an event. Once the user chooses a theme, the date is set to the earliest date of recorded observations for an event and the map is overlaid with markers indicating the spatial locations from which observations were made on that date. We call this the spatio-temporal slice.
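As a rough illustration of the spatio-temporal slice, the sketch below groups observations by date and location and computes the initial view described above: the earliest date with recorded observations and the locations to mark on the map for that date. The record layout (dictionaries with `date`, `location`, and `text` keys, dates as ISO strings) is an assumption for this example, not the Twitris data model.

```python
from collections import defaultdict

def spatio_temporal_slices(observations):
    """Group observation texts by (date, location); dates are ISO strings (YYYY-MM-DD)."""
    slices = defaultdict(list)
    for obs in observations:
        slices[(obs["date"], obs["location"])].append(obs["text"])
    return slices

def initial_view(observations):
    """Return the earliest observation date and the sorted locations seen on that date."""
    slices = spatio_temporal_slices(observations)
    first_day = min(day for day, _ in slices)  # ISO date strings sort chronologically
    markers = sorted(loc for day, loc in slices if day == first_day)
    return first_day, markers
```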

Figure 5: Twitris v1 user interface (a) with spatio-temporal slice and multimedia widgets
Figure 6: Twitris v1 user interface (b) with event descriptor cloud, related tweets, news and Wikipedia articles for the event “Austin plane attack”. Joe Stack, the man responsible for the Austin suicide plane attack on the IRS office, posted his suicide note about the attack online. He was a former bass player for the Billy Eli band. Here Twitris captures STT event descriptors summarizing the important facets.

Users can further explore activity in a particular space by clicking on the overlay marker. The event descriptors extracted from observations in this spatio-temporal setting are displayed as an event descriptor cloud. The spatio-temporal-thematic (STT) scores determine the size of each descriptor in the tag cloud. To provide event context and a better understanding of the event, we enhanced Twitris by integrating event-related news, multimedia (images and videos) and Wikipedia articles. We leveraged explicit semantic information from DBpedia to identify relevant news and Wikipedia articles. When a user clicks on a particular descriptor, we display tweets containing the event descriptor and the top current news items, as well as related Wikipedia articles.

Twitris v2: People Content Network analysis (PCNA) with use of background knowledge and semantic metadata extraction and querying/exploration

The Mumbai terrorism event of 2008 gave the impetus to study events along the STT dimensions and to connect them with relevant news content. Social media continues to grow and to revolutionize the way users interact with each other and with information. Social network users are not only creators and recipients of information, but also critical relays that propagate it. This powerful ability to share has played an important role in events of varied social significance, audience, and duration, such as political movements (e.g., the Jasmine Revolution in Tunisia), brand management and marketing, and perhaps most visibly, crisis and disaster management (e.g., the Haitian and Japanese earthquakes). The Twitris team started to look at issues such as the role of content nature in high vs. low attributed information diffusion (the propagation of messages via friendship/follower connections among users of a social network) (Nagarajan, 2010-b) and user engagement (given a discussion topic on social media, what motivates a user to engage in the discussion for his/her first interaction) (Purohit, 2011-a), (Ruan, 2012). Consequently, Twitris v2 embarked on a more comprehensive analysis along the three pillars of what makes anything social: who is engaging in the social activity, what is being communicated, and how this communication flows between those engaged in the social activity. The idea is to gain insights into how permanent and transient networks arise, and what information flows across such networks and why. Twitris v2 developed a significant capability to extract more types of metadata, and the infrastructure became more semantic with the use of the Semantic Web standard RDF, as well as relevant background knowledge. The latter enabled Twitris v2 to support deep exploration using DBpedia and SPARQL over metadata extracted from the tweets.
The Twitris v2 research focus on coordination during disasters also led to the integration of Twitris with Ushahidi’s SwiftRiver open source platform, and to support for ingesting SMS messages, which were used for events such as the Pakistan floods in 2010.

Let’s look at some examples of Twitris v2 capabilities:

  • Evolving ad-hoc nature of social media communities:

Event-centric communities of varied nature (Purohit, 2011-a) often bring together users from different parts of the social network, especially on Twitter, where we keep switching between discussions of interest and may not already be connected to the other participants of those communities. Therefore, in such ad-hoc communities, it is difficult to depend on follower graphs alone to understand the dynamics. Twitris v2 introduced analysis of user interaction networks so that human dynamics in the evolving communities can be understood at granular levels: influencer analysis, contextually important people with roles to engage with, community evolution, etc. Twitris v2 built this feature by extending our research on user interaction network analysis in brand-page communities (Purohit, 2012-a).
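A user interaction network of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: interactions are taken to be @-mentions (which also covers the "RT @user" retweet convention), and "influence" is approximated by the number of interactions a user receives. The source does not specify which interaction types or influence measures Twitris v2 uses.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")

def interaction_edges(tweets):
    """Build weighted directed edges (author -> mentioned user) from (author, text) pairs."""
    edges = Counter()
    for author, text in tweets:
        for target in MENTION.findall(text):
            if target.lower() != author.lower():  # ignore self-mentions
                edges[(author.lower(), target.lower())] += 1
    return edges

def top_influencers(edges, k=3):
    """Rank users by the total weight of interactions they receive."""
    received = Counter()
    for (_, target), weight in edges.items():
        received[target] += weight
    ranked = sorted(received.items(), key=lambda kv: (-kv[1], kv[0]))
    return [user for user, _ in ranked[:k]]
```

Because the edges come from interactions rather than follower links, the network can include users who were not connected before the event, which is the situation described for ad-hoc communities.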

Figure 7. Contrast in the community structure of influencers in user interaction networks, centered on two popular events #OccupyChicago and #OccupyLA
  • Contrast in the structure of interaction networks:

Figure 7 shows the networks of influencers in two topical communities during the Occupy Wall Street (OWS) movement, ‘OccupyChicago’ on the left and ‘OccupyLA’ on the right. Such an analysis provides insights into not only the real dynamics of the actors (e.g., what organizations supporters belong to, and to whom they are strongly connected) but also the potential of the influencers to drive actions in the communities (tightly connected influencers are likely to drive effective ‘call for action’ propagation in the communities). In this figure, the influencer network of OccupyLA is highly connected and self-organized compared to the sparsely connected one for OccupyChicago, and is therefore more likely to reach the masses effectively for any call for action. Even the Facebook page for OccupyLA reflected such activism.
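One simple way to quantify "highly connected" versus "sparsely connected" is graph density, the fraction of possible links that are present. The sketch below is an assumption made for illustration; the source does not state which structural measure underlies the comparison in Figure 7.

```python
def density(nodes, edges):
    """Fraction of possible undirected links between nodes that appear in edges.

    Edge direction and duplicates are ignored; self-loops are skipped.
    """
    n = len(nodes)
    if n < 2:
        return 0.0
    links = {frozenset(edge) for edge in edges if len(set(edge)) == 2}
    return len(links) / (n * (n - 1) / 2)
```

Under this measure, a fully interconnected influencer group scores 1.0, while a group whose members rarely interact scores close to 0.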

Figure 8. Sentiment of the influencers for the target candidate in the interaction network centered on that target: Romney (1st cluster) vs. Ron Paul (2nd cluster)
  • Slicing and dicing the networks by user features:

To glean insights about actionable information in ad-hoc communities, we need to understand the participants better. Therefore, Twitris v2 introduced slicing-and-dicing analysis of the interaction networks based on user/node-centric features. For example, the professional or organizational affiliation of users provides clues to the causes of the dynamics, e.g., who are the people behind the organized network of OccupyLA? Are such users from the same type of organizations, leading to coordinated actions? Similarly, Twitris v2 introduced content-centric analysis, thus realizing the full potential of PCNA. Users are clustered into sentiment segments for the target topic, answering questions such as: which candidate, Mitt Romney or Ron Paul, is stronger in the influencer network from a sentiment perspective (Figure 8), and on what issues?

Figure 9. Interaction network evolution for topical community surrounding Mitt Romney, US Presidential Election 2012
  • Understanding group dynamics by community evolution

Twitris v2 focused on the larger goal of predictive ability for group dynamics, and the People-Content-Network Analysis (PCNA) framework was the key to this untapped potential. Therefore, Twitris v2 created clusters in the ad-hoc communities based on the sentiment of the users toward a targeted topic over time, and associated events on the timeline for causal analytics. Figure 9 shows an example of community evolution centered on Republican presidential nominee Mitt Romney from March 1 to 31, 2012. It shows three snapshots over a 10-day period, and we observed an extremely modularized community at the end of the analysis, which was not the case for his closest competitor, Rick Santorum. And as we know, Santorum exited the race on April 9th. Thus, the analysis of community evolution made Twitris v2 capable of understanding the group dynamics of ad-hoc communities, going beyond individual users to group behavior.

Figure 10. Leveraging Semantic Web technologies to provide insights of Events


Twitris v2 leverages Semantic Web technologies by using background knowledge such as DBpedia to provide deeper insights about an event. Background knowledge changes the way you can look at information, because it puts the information in context. This is especially important for tweets because they are short, and therefore individually lack the volume of information that provides an informative context. For example, in Figure 10, questions such as “Who are the dead people that are mentioned in the context of OWS movement” can be answered using background knowledge, whereas a simple keyword search cannot put the information in tweets in context. Further, to answer the questions in the figure and generate answers such as Rosa Parks, the system must have the background knowledge that this named entity is a Person, and also that she is dead. Going deeper into the background knowledge provides the information that Rosa Parks was famous for the Montgomery Bus Boycott during the US civil rights movement in 1955-56.
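The role of background knowledge in answering such a question can be sketched with a tiny in-memory set of triples. This is an illustration only: the triples below are hand-written stand-ins for facts that would come from DBpedia, the predicate names `type` and `deathDate` are simplified labels chosen here, `Jane_Doe` is a made-up entity, and a real deployment would issue a SPARQL query over RDF rather than filter Python sets.

```python
# Hand-written facts standing in for background knowledge (hypothetical layout).
KNOWLEDGE = {
    ("Rosa_Parks", "type", "Person"),
    ("Rosa_Parks", "deathDate", "2005-10-24"),
    ("Jane_Doe", "type", "Person"),          # made-up living person
    ("Zuccotti_Park", "type", "Place"),
}

def deceased_people(mentions, triples):
    """Return mentioned entities that are typed as Person and have a death date."""
    people = {s for s, p, o in triples if p == "type" and o == "Person"}
    dead = {s for s, p, _ in triples if p == "deathDate"}
    return sorted(set(mentions) & people & dead)

# Entities assumed to have been extracted from event tweets beforehand.
mentioned = ["Rosa_Parks", "Jane_Doe", "Zuccotti_Park"]
answer = deceased_people(mentioned, KNOWLEDGE)
```

A keyword search over the tweets alone cannot produce this answer, because neither "is a person" nor "is dead" appears in the tweet text; both facts come from the joined background knowledge.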

Twitris v3: Emotion-Sentiment-Intent, Real-time view and other advancements

Behind every (well, most of the important) tweet, there is a human. And a human is complex. Through a tweet, a person expresses emotion, sentiment, and intent. Understanding this dimension is a key to unlock the true potential of social media. This is especially true for monetization of social media. Understanding an underlying intent can tell us if a user is expressing a transactional (potentially for buying a product) intent, seeking information, or just sharing information (Nagarajan, 2009-b). Sentiment is perhaps the most sought after type of analysis of social data. Currently, it is the primary basis of social media analysis to predict whether a product or a movie will succeed, who is more likely to win an election, or to attempt to identify consumer interest and hence use it for targeting the advertisement. Analysis of or identification of emotion is likely the dark horse of the three-- while techniques for its analysis are not yet as mature as sentiment analysis, it is likely to be combined with the other two to give far more signal than without it.

A key innovation in sentiment analysis, employed in Twitris v3, is topic-specific sentiment analysis, which associates sentiment with an entity (Chen, 2012). This enables us to identify different sentiments associated with different entities in a single tweet. For example, in the tweet “The King’s Speech was bloody brilliant. Colin Firth and Geoffrey Rush were fantastic!” we can identify both the sentiment (i.e., bloody brilliant) associated with the movie “The King’s Speech” and the sentiment (fantastic) associated with the actors Colin Firth and Geoffrey Rush. More recently, we have been associating sentiments with events: when there is a significant change in sentiment, we attempt to associate it with real-world events. For example, by tracking both event-specific and entity-specific sentiment, Twitris v3 is able to capture a substantial increase in positive sentiment towards President Obama on the immigration issue on June 15, 2012 (the day on which President Obama outlined a new immigration policy), and associate it with event descriptors such as “dream act”, “obama 's immigration move”, and “new immigration policy”. Figure 11 shows that Twitter users have opposite sentiments towards two candidates, Obama (green/positive) and Romney (red/negative), on the same topic “final debate”. The reason is that Obama received more positive feedback from Twitter users than Romney did, which is in line with the impression from news media. This example demonstrates Twitris’s power in identifying topic-specific sentiments.
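To make the idea of entity-level sentiment concrete, here is a deliberately naive Python sketch. It is not the method of (Chen, 2012): it simply credits each entity with the polarity words that occur in the same sentence, and the tiny lexicon and the sentence-splitting rule are assumptions for illustration.

```python
import re

# Toy polarity lexicon (assumed for this example).
LEXICON = {"brilliant": 1, "fantastic": 1, "terrible": -1, "boring": -1}

def entity_sentiment(text, entities):
    """Sum lexicon polarities per entity, using same-sentence co-occurrence only."""
    result = {}
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        polarity = sum(LEXICON.get(w, 0) for w in re.findall(r"[a-z']+", lowered))
        for entity in entities:
            if entity.lower() in lowered:
                result[entity] = result.get(entity, 0) + polarity
    return result

tweet = "The King's Speech was bloody brilliant. Colin Firth and Geoffrey Rush were fantastic!"
scores = entity_sentiment(tweet, ["The King's Speech", "Colin Firth", "Geoffrey Rush"])
```

The heuristic fails whenever two targets with different polarities share a sentence, which is one reason target-dependent approaches go beyond sentence-level co-occurrence.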

Figure 11. Twitter users show the opposite sentiments towards two candidates on the same topic “final debate” in 2012 presidential election
Figure 12. Peak patterns of the emotion joy due to excitement of Twitter users caused by three debates and one TV program (the Daily Show) in the 2012 presidential election

Compared with sentiment, emotion is more implicit. Consider, for example, “I will have a calculus test in two hours, but I’m not prepared at all.” We can infer that the person is nervous about the test, though there are no explicit emotion words such as “nervous” or “panic”. Because emotion is so implicit, labeling sentences with emotions by hand is difficult and time consuming. In Twitris v3, we are able to automatically create a large emotion-labeled dataset (of about 2.5 million tweets) covering joy, sadness, anger, love, fear, thankfulness, and surprise, by harnessing emotion-related hashtags available in the tweets (Wang 2012). Machine learning classifiers are trained on this large dataset to learn how to identify people’s emotions behind their tweets. As another key innovation, Twitris v3 can analyze people’s emotional responses to different events. For example, Figure 12 shows the volume of joyful tweets, reaching peaks on Oct. 3rd 2012 (1st debate), Oct. 16th 2012 (2nd debate), Oct. 22nd (3rd debate) and Oct. 19th (Obama went on the Daily Show). The reason is that Twitter users were very enthusiastic about all three presidential debates and Obama’s appearance on the Daily Show TV program. Beyond analyzing emotions in tweets, Twitris v3 is also able to identify emotions in blogs, news headlines, etc. This is possible because we adapt the classifiers trained on Twitter data to other domains using a relatively small amount of labeled emotion data from those domains.
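The hashtag-based labeling idea can be sketched as follows. The hashtag-to-emotion mapping and the rule that only a hashtag at the very end of the tweet counts as a label are assumptions for this example; the source only states that emotion-related hashtags were harnessed (Wang 2012).

```python
import re

# Hypothetical mapping from hashtags to the seven emotion categories.
EMOTION_HASHTAGS = {
    "#joy": "joy", "#happy": "joy", "#sad": "sadness", "#angry": "anger",
    "#love": "love", "#scared": "fear", "#thankful": "thankfulness",
    "#surprised": "surprise",
}

def label_by_hashtag(tweet):
    """Return (text_without_label_tag, emotion) if the tweet ends with a known
    emotion hashtag, otherwise None."""
    match = re.search(r"(#\w+)\s*$", tweet)
    if not match:
        return None
    emotion = EMOTION_HASHTAGS.get(match.group(1).lower())
    if emotion is None:
        return None
    # Strip the label hashtag so a classifier cannot simply memorize it.
    return tweet[:match.start()].strip(), emotion
```

Tweets that yield `None` are simply left out of the training set; the labeled pairs can then feed any standard text classifier.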

Besides detecting users’ emotional states, we also explore how to automatically identify users’ intents from posts so that monetization can be better targeted to users’ needs (Nagarajan, 2009-b). The highlight of our study is that we discover and differentiate three types of posts: (a) transactional posts, e.g., “I am looking for a 32 GB iTouch”; (b) information sharing posts, e.g., “I like my new 32 GB iTouch”; and (c) information seeking posts, e.g., “what do you think about 32 GB iTouch?” For monetization purposes, transactional posts and information seeking posts are more valuable than information sharing posts because users are looking for information that advertisers can exploit. By extracting intent/keywords/cues from transactional and information seeking posts, our system achieved an accuracy of 52% on ad impressions using MySpace and Facebook data, while the baseline, without using our system, only achieved an accuracy of 30%.
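The three post types can be illustrated with a small rule-based sketch. The cue phrases below are invented for this example and are far simpler than whatever features the study in (Nagarajan, 2009-b) used; the sketch only shows how the three categories differ on the example posts above.

```python
# Hypothetical cue phrases; a real system would learn or curate many more.
TRANSACTIONAL_CUES = ("looking for", "want to buy", "where can i buy")
SEEKING_CUES = ("what do you think", "any recommendations", "should i")

def classify_intent(post):
    """Classify a post as transactional, information seeking, or information sharing."""
    text = post.lower().strip()
    if any(cue in text for cue in TRANSACTIONAL_CUES):
        return "transactional"
    if text.endswith("?") or any(cue in text for cue in SEEKING_CUES):
        return "information seeking"
    return "information sharing"
```

Only posts in the first two categories would be passed on for keyword extraction and ad matching, reflecting the observation that information sharing posts are less valuable for monetization.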


All the above-mentioned precious assets of content (sentiment, emotion, and intent) exist because humans act with an actionable purpose. When such individual-level purposes start to bring higher engagement in groups, they become a source of group-level actions, leading to the evolution of human dynamics in the social network. Therefore, we are exploring the integral role of intent, together with sentiment and emotion, in purposeful group actions. Specifically, we are focusing on the intent and sentiments behind group coordination, because coordinated activity has the potential to make or break the system.

Figure 13 shows some of the capabilities of Twitris v3: (1) show popular topics, also called social signals (weighted n-grams), related to the chosen event for today and for any past day since the event began to be tracked, (2) search among the event-related tweets with autocomplete, popular event hashtags, and active users, and explore content for deep analysis (e.g., who are the dead people mentioned most often in the “occupy wall street event”) using background knowledge (default source is Wikipedia/DBpedia) and Semantic Web technologies (RDF/SPARQL), (3) show key topics of discussion by location/region, such as state or country (e.g., see the differences in social signals from Mississippi (a “red state”) vs Massachusetts (a “blue state”) related to President Obama’s Nobel Prize), (4) see event-relevant tweets in real time on a world map or any region, (5) analyze topic/people/region-specific sentiment (e.g., for the US election, sentiment on candidates by state and by election-specific topics), (6) see the networks with insights from static features (e.g., followers), dynamic features (e.g., retweets), and people/demographics (e.g., with knowledge of the profession of each person), (7) display tweets, recent news, and Wikipedia pages related to selected events and social signals, (8) show event-specific multimedia (images and video), (9) see tweet traffic, (10) change the date of video/analysis, (11) select a location of interest, where each pin shows a collection of social signals emanating from a location, and (13) select an event of interest (e.g., US Election, Occupy Wall Street, Japanese Tsunami).

Detailed research on social data analysis encompasses social intelligence in real-time (Gruhl, 2010) which involved a Kno.e.sis-IBM collaboration leading to the operationally deployed BBC SoundIndex system, prediction of topic volume on Twitter (Ruan, 2012), emotion identification using Twitter “big data” (Wang 2012), brand tracking (Purohit, 2012-a), psycholinguistic analysis during emerging coordination (Purohit, 2012-b), privacy aware content dissemination (Kapanipathi 2011-b), user-community engagement (Purohit, 2011-a), information diffusion (Nagarajan, 2010-b), trust in social media (Thirunarayan, 2011), monetization of social activities (Nagarajan, 2009-b), reported in over 30 publications, and summarized in a comprehensive tutorial (Nagarajan, 2011).

Key Applications

Twitris has been used in a research context for studying and analysing social sensing and perception of a broad variety of events: politics and elections, social movements and uprisings, crisis and disasters, entertainment, environment, etc. We are now investigating more commercial applications including brand tracking and advertisement campaign effectiveness, empowering professional users, and others.

Future Directions

The next evolution of Twitris will incorporate social media content alongside data from sensors and the Web of Things, and will target advanced applications in health informatics and in crisis and disaster support.

Recommended Reading

Amit Sheth and Krishnaprasad Thirunarayan, Semantics Empowered Web 3.0: Managing Enterprise, Social, Sensor, and Cloud-based Data and Services for Advanced Applications, Morgan & Claypool Publishers, December 09, 2012. ISBN: 1608457168

Acknowledgements

This work is partially supported by the NSF-funded grants “SoCS: Social Media Enhanced Organizational Sensemaking in Emergency Response” (IIS-1111182) and “EAGER: Expressive Scalable Queries over Linked Open Data” (IIS-1143717).

Citation Information

Amit Sheth, Ashutosh Jadhav, Pavan Kapanipathi, Chen Lu, Hemant Purohit, Gary Alan Smith, Wenbo Wang, 'Twitris- a System for Collective Social Intelligence', Encyclopedia of Social Network Analysis and Mining (ESNAM), 2013

References

  1. A. Sheth (2009-a), 'Semantic Integration of Citizen Sensor Data and Multilevel Sensing: A comprehensive path towards event monitoring and situational awareness,' From E-Gov to Connected Governance: the Role of Cloud Computing, Web 2.0 and Web 3.0 Semantic Technologies, Falls Church, VA, USA, February 17, 2009
  2. A. Sheth (2009-b), 'Citizen Sensing, Social Signals, and Enriching Human Experience', IEEE Internet Computing, pp. 80-85, July/August 2009
  3. M. Nagarajan, K. Gomadam, A. Sheth, A. Ranabahu, R. Mutharaju and A. Jadhav (2009-a), 'Spatio-Temporal-Thematic Analysis of Citizen-Sensor Data - Challenges and Experiences,' Tenth International Conference on Web Information Systems Engineering, Poznan, Poland, October 5-7, 2009
  4. H. Purohit, Y. Ruan, A. Joshi, S. Parthasarathy, A. Sheth (2011-a), Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter. SoME 2011 (Workshop on Social Media Engagement, in conjunction with WWW 2011), Hyderabad, India, March 28 - April 1, 2011
  5. A. Jadhav, H. Purohit, P. Kapanipathi, P. Anantharam, A. Ranabahu, V. Nguyen, P. Mendes, A. G. Smith, M. Cooney, A. Sheth (2010), Twitris 2.0: Semantically Empowered System for Understanding Perceptions From Social Data, Semantic Web Application Challenge at ISWC, Shanghai, China, November 7-11, 2010
  6. L. Chen, W. Wang, M. Nagarajan, S. Wang and A. Sheth (2012), Extracting Diverse Sentiment Expressions with Target-dependent Polarity from Twitter. In Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (ICWSM), Dublin, Ireland, June 5-7, 2012
  7. W. Wang, L. Chen, K. Thirunarayan and A. Sheth. Harnessing Twitter 'Big Data' for Automatic Emotion Identification (2012), In Proceedings of International Conference on Social Computing (SocialCom), 2012, Amsterdam, Netherlands, September 3-5, 2012
  8. M. Nagarajan, K. Baid, A. Sheth, and S. Wang, 'Monetizing User Activity on Social Networks - Challenges and Experiences' (2009-b), IEEE/WIC/ACM International Conference on Web Intelligence, Milan, Italy, September 15-18 2009
  9. P. Kapanipathi, F. Orlandi, A. Sheth, A. Passant, 'Personalized Filtering of the Twitter Stream' (2011-a), 2nd workshop on Semantic Personalized Information Management at ISWC 2011, Koblenz, Germany, October 23-27 2011
  10. A. Sheth, C. Thomas, P. Mehra, 'Continuous Semantics to Analyze Real-Time Data' (2010), IEEE Internet Computing, vol. 14, no. 6, pp. 84-89, Nov./Dec. 2010
  11. M. Nagarajan, Understanding User-Generated Content on Social Media (2010-a), Ph.D. Dissertation, Wright State University, 2010
  12. M. Nagarajan, A. Sheth, S. Velmurugan, Citizen Sensor Data Mining, Social Media Analytics and Development Centric Web Applications (2011), Proc. of the WWW 2011, Hyderabad, India, March 28 - April 1, 2011
  13. A. Sheth (2011), 'Citizen Sensing-Opportunities and Challenges in Mining Social Signals and Perceptions' Invited Talk at Microsoft Research Faculty Summit 2011, Redmond, WA, July 19, 2011
  14. C. Battle (2009), 'New Media’s Moment in Mumbai', Foreign Policy Journal, January 15, 2009. Accessed on March 31, 2013
  15. Impact Lab (2008), Twitter Provided a Vital Link in Mumbai Terrorist Attacks, November 28, 2008. Accessed on March 31, 2013
  16. Keralaravind (2008), ‘Hash Mumbai’, Time Line of Citizen Journalism & Social Media during Mumbai Terrorist Attacks, Youtube.com, uploaded on November 28, 2008. Accessed on March 31, 2013
  17. M. Nagarajan, H. Purohit, A. Sheth (2010-b). A Qualitative Examination of Topical Tweet and Retweet Practices. 4th Int'l AAAI Conference on Weblogs and Social Media (ICWSM), pp. 295-298, Washington, DC, USA, May 23-26, 2010
  18. Y. Ruan, H. Purohit, D. Fuhry, S. Parthasarathy, A. Sheth (2012). Prediction of Topic Volume on Twitter. 4th Int'l ACM Conference of Web Science (WebSci), Evanston, Illinois, USA, June 22–24, 2012
  19. H. Purohit, J. Ajmera, S. Joshi, A. Verma, A. Sheth (2012-a). Finding Influential Authors in Brand-Page Communities. 6th Int'l AAAI Conference on Weblogs and Social Media (ICWSM), Dublin, Ireland, June 5-7, 2012
  20. D. Gruhl, M. Nagarajan, J. Pieper, C. Robson, A. Sheth (2010), Multimodal Social Intelligence in a Real-Time Dashboard System, to appear in a special issue of the VLDB Journal on 'Data Management and Mining for Social Networks and Social Media'
  21. H. Purohit, A. Hampton, V. Shalin, A. Sheth, J. Flach (2012-b). What kind of communication is Twitter? A psycholinguistic perspective on communication in Twitter for the purpose of emergency coordination. NSF SoCS Symposium, 2012
  22. P. Kapanipathi, J. Anaya, A. Sheth, B. Slatkin, A. Passant (2011-b), Privacy-Aware and Scalable Content Dissemination in Distributed Social Networks, International Semantic Web Conference (ISWC), Koblenz, Germany, October 23-27 2011
  23. K. Thirunarayan and P. Anantharam, 'Trust Networks: Interpersonal, Sensor, and Social (2011),' In: Proceedings of 2011 International Conference on Collaborative Technologies and Systems (CTS 2011), Philadelphia, Pennsylvania, USA, May 23-27, 2011
