Difference between revisions of "Twarql"

From Knoesis wiki

Revision as of 08:22, 18 June 2010

Twarql: Twitter Feeds through SPARQL

Twarql Demonstration http://bit.ly/twarql

NOTICE: Twarql is in pre-alpha stage. We are currently refactoring some parts of the code and stabilizing our REST API. We plan to release a first version of the source code and API by June 28th, 2010.

Introduction

Our approach encompasses the following steps:

  • extract content (entity mentions, hashtags and URLs) from microposts;
  • encode content in a structured format (RDF) using shared vocabularies (FOAF, SIOC, MOAT, etc.);
  • enable structured querying of microposts (SPARQL);
  • enable subscription to a stream of microposts that match a given query (Concept Feeds);
  • enable scalable real-time delivery of streaming data (SparqlPuSH).

Demonstration

We have two demonstration videos. The first video demonstrates the user perspective, interacting with the system to formulate a query and obtain microblog posts that match that query. The second video focuses on the server side and demonstrates the modules of our architecture at work, distributing the microposts via PubSubHubbub.


Architecture

See the workflow between the components of the architecture:

Tweet Annotation

  • extract content from microposts;
    • entity mentions (e.g. from DBpedia)
    • hashtags
    • URLs
    • user mentions
  • encode content in a structured format (RDF) using shared vocabularies (FOAF, SIOC, MOAT, etc.);

We offer Twarql Annotation both as REST and Java APIs. You can download our source code and easily extend the annotation pipeline with your own extractors.
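
The extraction and encoding steps listed above can be illustrated with a minimal Python sketch. It is not the Twarql annotation pipeline: the regular expressions, the `annotate` function, the `example.org` and `twitter.com` URIs, and the choice of SIOC properties (`sioc:content`, `sioc:topic`, `sioc:links_to`, `sioc:addressed_to`) are assumptions made for illustration only, and entity-mention extraction against DBpedia is left out.

```python
import re

SIOC = "http://rdfs.org/sioc/ns#"

# Simplified patterns (assumption); real tweets need more careful tokenization.
URL_RE = re.compile(r"https?://[^\s]+")
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")


def escape_literal(text):
    """Escape a string for use inside an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def annotate(post_uri, text):
    """Return N-Triples lines for one micropost (hypothetical mapping)."""
    triples = [f'<{post_uri}> <{SIOC}content> "{escape_literal(text)}" .']
    urls = URL_RE.findall(text)
    # Remove URLs first so that URL fragments are not mistaken for hashtags.
    rest = URL_RE.sub(" ", text)
    for tag in HASHTAG_RE.findall(rest):
        triples.append(
            f"<{post_uri}> <{SIOC}topic> <http://example.org/tag/{tag.lower()}> ."
        )
    for url in urls:
        triples.append(f"<{post_uri}> <{SIOC}links_to> <{url}> .")
    for name in MENTION_RE.findall(rest):
        triples.append(
            f"<{post_uri}> <{SIOC}addressed_to> <http://twitter.com/{name}> ."
        )
    return triples


if __name__ == "__main__":
    sample = "hey @raffi tell @noradio to check out http://dev.twitter.com #hot"
    for line in annotate("http://example.org/post/1", sample):
        print(line)
```

Each extractor here is an independent loop that appends triples, which mirrors the idea of extending the pipeline with additional extractors.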

Concept Feeds

  • enable structured querying of microposts (SPARQL);
  • enable subscription to a stream of microposts that match a given query;
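
As a rough illustration of the kind of query behind a concept feed, the following Python sketch builds a SPARQL SELECT query for microposts tagged with a given concept. The helper name `concept_feed_query`, the use of `sioc:topic` and `sioc:content`, and the DBpedia URI in the usage line are assumptions for this example and may not match the graph patterns Twarql actually expects.

```python
SIOC = "http://rdfs.org/sioc/ns#"


def concept_feed_query(topic_uri, limit=20):
    """Build a SPARQL SELECT query for microposts tagged with topic_uri."""
    # Reject characters that are not allowed inside an IRI reference.
    if any(ch in topic_uri for ch in '<>"{}|^` \n'):
        raise ValueError("topic_uri contains characters not allowed in an IRI")
    return (
        f"PREFIX sioc: <{SIOC}>\n"
        "SELECT ?post ?content WHERE {\n"
        f"  ?post sioc:topic <{topic_uri}> ;\n"
        "        sioc:content ?content .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(concept_feed_query("http://dbpedia.org/resource/IPad"))
```

Registering such a query would then yield a stream of matching microposts, rather than a one-off result set.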

SparqlPuSH

Twarql API

REST Endpoints

Summary:

  • URL scheme: http://<base-url>/<operation>/?<parameter>=<value>&...&output=<output format>
  • Base URL: http://knoesis1.wright.edu/twarql
  • Operations: search, register, stream, query
  • Output formats: twitter-json, sparql-json, entities

Parameters:

  • http://knoesis1.wright.edu/twarql/search?keyword=k1,...,kn&output=<output type>
    • input: keywords, output type (tweets, entities, sparql)
    • output: tweets, entities, triples
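
A search request following the pattern above could be assembled as in this Python sketch. The base URL, the `search` operation, and the `keyword` and `output` parameters come from the summary above; the function name `build_search_url` is hypothetical, and the sketch only builds the URL without sending a request, since the API was still being stabilized.

```python
from urllib.parse import urlencode

BASE_URL = "http://knoesis1.wright.edu/twarql"


def build_search_url(keywords, output):
    """Build a search URL following the documented URL scheme."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    params = {"keyword": ",".join(keywords), "output": output}
    # safe="," keeps the keyword separator unescaped, as in the pattern above.
    return f"{BASE_URL}/search?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(build_search_url(["ipad", "apple"], "entities"))
```

The `output` value is passed through unchecked because the accepted values are listed in two slightly different forms on this page.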

Output Formats

We also provide output in the format presented in the twitter-api-announce message.

{
  "text" : "hey @raffi tell @noradio to check out http://dev.twitter.com #hot",
  ...
  "entities" : {
    "user_mentions" : [
      {
        "id" : 8285392,
        "screen_name" : "raffi",
        "indices" : [4, 9]
      },
      {
        "id" : 3191321,
        "screen_name" : "noradio",
        "indices" : [16, 23]
      }
    ],
    "urls" : [
      {
        "url" : "http://dev.twitter.com",
        "indices" : [38, 64]
      }
    ],
    "hashtags" : [
      {
        "text" : "#hot",
        "indices" : [66, 69],
        "url" : "http://search.twitter.com/search?q=%23hot"
      }
    ]
  }
  ...
}
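
A client could read the entities from such a response roughly as follows. This Python sketch is an illustration only: `SAMPLE` is a trimmed copy of the example above with the elided fields left out, and the function name `summarize_entities` is hypothetical.

```python
import json

# Trimmed copy of the example above; the elided ("...") fields are omitted.
SAMPLE = """
{
  "text": "hey @raffi tell @noradio to check out http://dev.twitter.com #hot",
  "entities": {
    "user_mentions": [
      {"id": 8285392, "screen_name": "raffi", "indices": [4, 9]},
      {"id": 3191321, "screen_name": "noradio", "indices": [16, 23]}
    ],
    "urls": [{"url": "http://dev.twitter.com", "indices": [38, 64]}],
    "hashtags": [{"text": "#hot", "indices": [66, 69]}]
  }
}
"""


def summarize_entities(payload):
    """Collect screen names, URLs and hashtags from one annotated tweet."""
    entities = json.loads(payload).get("entities", {})
    return {
        "mentions": [m["screen_name"] for m in entities.get("user_mentions", [])],
        "urls": [u["url"] for u in entities.get("urls", [])],
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
    }


if __name__ == "__main__":
    print(summarize_entities(SAMPLE))
```

Using `.get` with empty defaults lets the same function handle tweets that carry no entities of a given kind.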

Error/Warning Messages

ERROR

  • Unknown Stream: You are requesting a stream id that was not registered.
  • Invalid Query: You are trying to register an invalid SPARQL query.
  • Unsupported Content-type: The requested content type is not supported.

WARNING

  • No Results: There are no results for the query.

Sample Dataset Generated (triples)

A sample of the triples generated for the iPad tweets streamed from Thu Jun 03 to Tue Jun 08 is here.

Supported Clients

  • SPARQL Protocol-compliant Clients
    • Cuebee is a SPARQL query formulation and results exploration engine. We provide a TweetExplorer that can be directly plugged into Cuebee.


  • RSS/Atom clients
    • View SparqlPuSH

People

You may contact us if you have any questions about the implementation or API. We have listed our major contributions below our names so that you know to whom you should direct your question.

  • Pablo Mendes (@pablomendes)
    • Architecture, SPARQL client (Cuebee), Social Sensor (Extraction, Annotation), API, Documentation
  • Pavan Kapanipathi (@pavankaps)
    • Application Server, Semantic Publisher, Streaming SPARQL, API Content Negotiation
  • Alex Passant (@terraces)
    • SparqlPuSH, Annotation Vocabularies