BOALT Blog

Industry musings on what is or isn't relative to BOALT.

Gossip Girl Social Media Experiment

Between blog posts and tweets, internet users generate an enormous amount of content on any given topic daily. While there are many products and websites that will aggregate this data, relatively few actually analyze it. We at Boalt interactive thought it would be fun to put together a website which does just that with a topic that people are sure to talk about – Gossip Girl.

Gossip Girl Social Experiment

If you click on the image above you can visit the site. It basically pulls in tweets and blog posts about the actors and actresses in the show, analyzes them and sorts the people according to their popularity.

At the core of this page are two PHP scripts that are set up to run as cron jobs. The first script pulls posts and tweets from the internet and inserts them into a database table, while the second uses the Jane16 API to determine the sentiment of each post.

Pulling blog posts and tweets into the database is pretty straightforward. Since both Google Blog Search and Twitter can display search results as an RSS feed, we simply use the lastRSS PHP library to download and parse the results:

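require_once 'lastRSS.php';          // assumes the library file sits alongside this script

$rss = new lastRSS;
$rss->cache_dir   = './cache';       // directory for cached copies of the feed
$rss->cache_time  = 10;              // cache lifetime in seconds
$rss->date_format = 'Y-m-d H:i:s';   // hours, minutes, seconds (the original 'G:H:i' printed the hour twice)

// $search_string is the search term, set earlier in the script
$url_flux_rss = 'http://search.twitter.com/search.rss?q=' . urlencode($search_string);

if ($rs = $rss->get($url_flux_rss)) {
    foreach ($rs['items'] as $i) {
        // insert $i into database
    }
}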

To determine the sentiment of a post we used the Jane16 API. This online content analyzer has an open API and will return an XML description of text passed to it in a POST request. An example XML response is below:

<J16>
<SENTIMENT>
<ENTRY certainty="100.0" positiveHits="0" negativeHits="2" positivePolarity="0.0" negativePolarity="434.18">NEGATIVE</ENTRY>
</SENTIMENT>
</J16>

The second cron script pulls fresh articles from the database, sends the text to Jane16 and updates the row with the sentiment of the text. One limitation we had to deal with is that the API allows only one request per 10 seconds per IP address. Adding a sleep() call in the PHP script solves the problem but significantly increases the time it takes to go through all the posts: at one request every 10 seconds, a single IP address can process at most 360 posts per hour.

While these two PHP scripts do the majority of the work, the part of the project that took the longest to code was the JavaScript that displays and sorts the posts on the page. The page uses two unordered lists: one on the left to display the actors and actresses and one on the right to show new posts. We used the jQuery JavaScript library for this project, which simplified things a lot and made it easy to iterate through the unordered lists.

The first step is to update the list of posts. Posts are retrieved by making an AJAX call to http://www.boalt.com/gossipgirl/get_post.php, which returns a post encoded in JSON. A typical response will look like this:

{"id":"3759","topic_id":"3","title":"i'm going crazy!! where's my
dan humphrey?? xx","description":"i'm going crazy!!","image":...

The JSON response is parsed and put into an <li></li> element which is then added to the unordered list that holds the posts. Using some jQuery magic the whole process is animated and looks rather smooth.
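As a rough illustration of that parsing step, here is a minimal sketch that turns a response into the markup for one list item. It is not the code running on the site: the function names, the id and data-topic attributes, and the sample string are all made up for the example, and it only uses the fields visible in the response above. It escapes the text first, since tweets and blog titles are untrusted input.

```javascript
// Hypothetical helpers, not the site's actual code.

// Escape text before placing it in HTML, since post content is untrusted.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn the JSON text of one post into the markup for one <li> element.
// The id and data-topic attribute names are assumptions for this sketch.
function buildPostItem(jsonText) {
  const post = JSON.parse(jsonText);
  return '<li id="post-' + escapeHtml(post.id) +
    '" data-topic="' + escapeHtml(post.topic_id) + '">' +
    "<strong>" + escapeHtml(post.title) + "</strong>" +
    "<p>" + escapeHtml(post.description) + "</p>" +
    "</li>";
}

// Made-up sample in the same shape as the response shown above.
const sample = '{"id":"3759","topic_id":"3","title":"where is dan?","description":"a sample post"}';
console.log(buildPostItem(sample));
```

On the page itself, the returned string could be handed to jQuery, for example by prepending it to the posts list hidden and then calling slideDown() on it to get the animated effect.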

The next step is to update the statistics for the actor or actress that the post refers to. It’s not the most elegant way to do it, but if there were a positive post about Penn Badgley, the variable penn_good would be incremented by one. Likewise, a negative post would increase the penn_bad variable by one. Lastly, we need to check whether this new post will affect the person’s ranking. To do this we use jQuery to iterate through the unordered list that contains the actors and actresses and move the person if necessary.
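One possible tidier variant of the counting and ranking step is sketched below. It is an alternative, not what the site does: it keeps one record per person instead of separate penn_good and penn_bad variables. The keys, the "POSITIVE" label (only "NEGATIVE" appears in the sample Jane16 response) and the good-minus-bad score are all assumptions, since the post does not say how popularity is calculated.

```javascript
// Hypothetical structure: one record per cast member, keyed by a short name.
const stats = {
  penn: { name: "Penn Badgley", good: 0, bad: 0 },
  blake: { name: "Blake Lively", good: 0, bad: 0 },
  leighton: { name: "Leighton Meester", good: 0, bad: 0 }
};

// Count one post for a person. "POSITIVE" is an assumed label;
// any other sentiment value is ignored.
function recordPost(allStats, key, sentiment) {
  if (sentiment === "POSITIVE") {
    allStats[key].good += 1;
  } else if (sentiment === "NEGATIVE") {
    allStats[key].bad += 1;
  }
}

// Return the keys ordered from most to least popular.
// Score = good minus bad is an assumption made for this sketch.
function ranking(allStats) {
  return Object.keys(allStats).sort(function (a, b) {
    const scoreA = allStats[a].good - allStats[a].bad;
    const scoreB = allStats[b].good - allStats[b].bad;
    return scoreB - scoreA;
  });
}

recordPost(stats, "blake", "POSITIVE");
recordPost(stats, "penn", "NEGATIVE");
console.log(ranking(stats));
```

With an ordering like this in hand, the list on the page could be updated by appending each person's existing <li> to the unordered list in ranked order, since jQuery's append() moves an element that is already in the document rather than copying it.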

Concluding Thoughts

While a couple of things could be improved (the content analysis could be more accurate and the JavaScript could be streamlined), the results aren’t bad for a short project. Watching social media in near-real time is visually interesting and offers a unique spin on a popular subject.
