Difference between revisions of "Analyzing URL Chatter on Twitter"

From Knoesis wiki

Revision as of 19:30, 18 November 2009

Project Description

This project analyzes the URLs found in tweets in relation to the themes of those tweets. It uses the tweet data crawled for the Twitris project.

Objectives

  • Classify the content of the URLs
  • Understand user perception of the websites

Motivation

  • Search Engine Perspective - How to choose a page that is interesting to the user, given the keywords
  • Publisher Perspective - What people think about the page (URL)

Status

Week 1

Programming Language: Java

  • Extracting the URLs from the tweets - Done
  • Recognizing short/tiny URLs and expanding them into their long URLs - Done
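The two Week 1 steps could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `TweetUrlExtractor`, the simple regular expression, and the list of shortener hosts are all assumptions made for the example.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the name and behavior are illustrative only.
class TweetUrlExtractor {
    // Deliberately simple pattern: "http://" or "https://" followed by non-whitespace.
    private static final Pattern URL_PATTERN = Pattern.compile("https?://\\S+");

    // Assumed, incomplete list of shortener hosts; a real list would be longer.
    private static final Set<String> SHORTENER_HOSTS =
            Set.of("bit.ly", "tinyurl.com", "is.gd");

    // Returns every URL found in the tweet text, in order of appearance.
    static List<String> extractUrls(String tweet) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_PATTERN.matcher(tweet);
        while (m.find()) {
            // Drop trailing punctuation that belongs to the sentence, not the URL.
            urls.add(m.group().replaceAll("[.,;:!?)\"']+$", ""));
        }
        return urls;
    }

    // Treats a URL as "short" when its host is a known shortener service.
    static boolean isShortUrl(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null
                    && SHORTENER_HOSTS.contains(host.toLowerCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return false; // not a parseable URI
        }
    }
}
```

Recognizing short URLs by host is only one possible heuristic; it misses any shortener that is not on the list.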

Week 2

Programming Language: Java

  • Creating a table to store the URLs and the tweets they appear in - Done
  • Analysing the URLs with the presently available themes - Done
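The page does not say how URLs are matched against themes. One possible reading, sketched below, assumes that each theme is a keyword phrase and that a tweet belongs to a theme when its text contains that phrase. The class `ThemeUrlIndex`, its methods, and the in-memory maps standing in for the database table are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical in-memory stand-in for the URL/tweet table described above.
class ThemeUrlIndex {
    // theme -> (url -> number of tweets in that theme that mention the url)
    private final Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();
    private final List<String> themes;

    ThemeUrlIndex(List<String> themes) {
        this.themes = themes;
    }

    // Records the tweet's URLs under every theme whose phrase occurs in the text.
    void addTweet(String tweetText, List<String> urls) {
        String lower = tweetText.toLowerCase(Locale.ROOT);
        for (String theme : themes) {
            if (!lower.contains(theme.toLowerCase(Locale.ROOT))) {
                continue; // assumed matching rule: case-insensitive substring
            }
            Map<String, Integer> perUrl =
                    counts.computeIfAbsent(theme, t -> new LinkedHashMap<>());
            for (String url : urls) {
                perUrl.merge(url, 1, Integer::sum);
            }
        }
    }

    // Returns url -> mention count for the theme, or an empty map if none.
    Map<String, Integer> urlsForTheme(String theme) {
        return counts.getOrDefault(theme, Map.of());
    }
}
```

Keeping a per-theme mention count is an addition made for illustration; the same shape could later back a popularity column in the database.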

Week 3

Languages: SQL, Java (Servlets), JavaScript (jQuery), HTML, XML

  • Queries for performing the related operations
  • Working with the Timeline JavaScript to integrate it with the project

Week 4

  • Integrating the code to show the desired results

Future work

  • 1. Make sure the expanded URL is not itself a short URL by checking it recursively
  • 2. Themes: extract entities from the tweets rather than using the presently available themes
  • 3. Provisions in the DB to record the popularity of a URL within a particular theme
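Item 1 could be approached along the following lines. This is only a sketch under assumptions: the class `UrlExpander`, the hop limit of 10, and the pluggable lookup function are invented for the example. The `httpLookup` method shows one way to read a redirect target with the standard `HttpURLConnection` class, but it is not exercised here and real shorteners may behave differently.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper that follows redirects until a non-redirecting URL is reached.
class UrlExpander {
    private static final int MAX_HOPS = 10; // assumed safety limit
    private final Function<String, Optional<String>> redirectLookup;

    UrlExpander(Function<String, Optional<String>> redirectLookup) {
        this.redirectLookup = redirectLookup;
    }

    String expand(String url) {
        return expand(url, new HashSet<>(), 0);
    }

    // Recursive step: stop on the hop limit, on a cycle, or when no redirect exists.
    private String expand(String url, Set<String> seen, int hops) {
        if (hops >= MAX_HOPS || !seen.add(url)) {
            return url;
        }
        Optional<String> next = redirectLookup.apply(url);
        if (!next.isPresent()) {
            return url;
        }
        return expand(next.get(), seen, hops + 1);
    }

    // One possible lookup: issue a HEAD request and read the Location header.
    static Optional<String> httpLookup(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("HEAD");
            int status = conn.getResponseCode();
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            boolean redirect = status >= 300 && status < 400 && location != null;
            return redirect
                    ? Optional.of(URI.create(url).resolve(location).toString())
                    : Optional.empty();
        } catch (IOException | IllegalArgumentException | ClassCastException e) {
            return Optional.empty(); // treat failures as "no further redirect"
        }
    }
}
```

With a live network connection the expander might be used as `new UrlExpander(UrlExpander::httpLookup).expand(shortUrl)`. Passing the lookup in as a function also lets the recursion be tested against a fixed map of redirects.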

Assumptions

  • URL max length in the DB is 300