Cuttlepress

Blogging about Hacker School?

Project Writeup: Curated Twitter

“Curated Dannel” is a project I’ve been joking about implementing for at least a year. While I would love to follow Dannel on Twitter, his tweeting frequency is a bit higher than what I can keep up with. I needed some way of curating his tweets. Luckily, Twitter has a couple of ways to vouch for the attention-worthiness of tweets: favorites and retweets. So, in theory, it’s easy to crowdsource.

I decided to use Node again for this project, and using ntwitter allowed me to interface with the Twitter API without dealing with any of the details of authenticating via OAuth. The API provides access to individual entities (such as existing tweets or users) through queries, and separately, access to realtime events (such as incoming tweets or favorites) through streams. Information is not necessarily consistent across the two, so a hybrid approach seems to be recommended. (One convoluted behavior I dealt with did turn out to be a bug, not a design decision, at least.)

After trying out many different approaches, I ended up with this method: I open an authenticated User Stream as Dannel, which allows me to see his incoming “favorite” and “retweet” events. If the event is on a tweet written by Dannel (his stream also sees events on other relevant tweets, such as further retweets of tweets he previously retweeted), I then check the current total number of favorites and retweets on the included tweet, via the Search API. I store the text, tweet ID, favorite count, and retweet count in a database at MongoHQ, using mongodb. If I haven’t previously retweeted that tweet, I check its eligibility, potentially retweeting it and marking it in the database as retweeted.
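To make that flow concrete, here is a minimal sketch of the per-event decision logic. It is not the project's actual code: the event shape, the `deps` object, and the names `handleEvent`, `fetchCounts`, `db`, and `retweet` are all hypothetical stand-ins for the ntwitter stream, the count lookup, the MongoDB collection, and the retweet call.

```javascript
// Same eligibility rule as in the post, repeated so the sketch is self-contained.
function tweetQualifiesForRetweet(tweet) {
  return tweet.retweet_count * 2 + tweet.favorite_count >= 2;
}

// Hypothetical shapes, assumed for illustration only:
//   event = { type: 'favorite' | 'retweet', tweet: { id, text, user_id } }
//   deps  = {
//     ownerId,      // ID of the account being curated
//     fetchCounts,  // (tweetId) => { favorite_count, retweet_count }
//     db,           // a Map keyed by tweet ID, standing in for the database
//     retweet       // (tweetId) => void, standing in for the API call
//   }
function handleEvent(event, deps) {
  // Only favorite and retweet events matter.
  if (event.type !== 'favorite' && event.type !== 'retweet') return 'ignored';

  // Skip events on tweets written by someone else.
  const tweet = event.tweet;
  if (tweet.user_id !== deps.ownerId) return 'ignored';

  // Look up the current totals rather than trusting the event payload.
  const counts = deps.fetchCounts(tweet.id);
  const previous = deps.db.get(tweet.id);
  const record = {
    id: tweet.id,
    text: tweet.text,
    favorite_count: counts.favorite_count,
    retweet_count: counts.retweet_count,
    retweeted: previous ? previous.retweeted : false,
  };
  deps.db.set(tweet.id, record);

  // Retweet at most once per tweet.
  if (record.retweeted) return 'already retweeted';
  if (!tweetQualifiesForRetweet(record)) return 'stored';

  deps.retweet(tweet.id);
  record.retweeted = true;
  return 'retweeted';
}
```

Passing the outside world in through `deps` keeps the decision logic testable without a live stream or database; a real version would be asynchronous, since the lookup, the store, and the retweet are all network calls.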

Determining a tweet’s eligibility for retweeting is simple:

function tweetQualifiesForRetweet(tweet) {
  return tweet.retweet_count*2 + tweet.favorite_count >= 2;
}

I figure that a retweet should count for more than a favorite, because it’s a more public way of vouching for a tweet’s attention-worthiness. This cutoff seems to provide about as many tweets per day as I’d like; in an optimal universe, the code would adjust the cutoff over time to deliver a roughly constant number of tweets per day, but in this less-than-optimal universe, I am very ready to move on from this project.
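For the curious, the self-adjusting version could look something like the sketch below. This is one possible approach, not something the project implements: `score`, `chooseThreshold`, and all of their parameters are hypothetical names. It picks the cutoff from the scores of recently seen tweets so that roughly the desired number would have qualified.

```javascript
// Same weighting as the fixed rule: a retweet counts twice as much as a favorite.
function score(tweet) {
  return tweet.retweet_count * 2 + tweet.favorite_count;
}

// Pick a cutoff so that about `targetPerDay` tweets per day would have
// qualified over the last `days` days of `recentTweets`.
// Falls back to `fallback` (the fixed cutoff of 2) when there is no history.
function chooseThreshold(recentTweets, days, targetPerDay, fallback = 2) {
  const wanted = Math.round(days * targetPerDay);
  if (wanted <= 0 || recentTweets.length === 0) return fallback;

  // Highest scores first; the cutoff is the score of the last tweet we want.
  const scores = recentTweets.map(score).sort((a, b) => b - a);
  const index = Math.min(wanted, scores.length) - 1;
  return scores[index];
}

// A tweet would then qualify if it scores at least the chosen cutoff.
function qualifiesAt(tweet, threshold) {
  return score(tweet) >= threshold;
}
```

Because several tweets can share a score, ties at the cutoff let slightly more than the target through, which seems like the right direction to err in for a curation account.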

I grabbed a public domain image to make the user icon of the account. (I also used a gratuitous Photoshop “art” filter, which made me inordinately happy.) Another possible extension of this project would be to automatically update the account’s icon based on Dannel’s current icon, since he changes his monthly.

The last step was to ensure that, ideally, I never have to think about keeping the code running. It’s hosted on a small Amazon Web Services instance, and theoretically it should happily run for a pretty long time, but I wanted to not have to deal with crashes or server reboots. I used supervisord to restart the code after any crashes, and a startup script to start supervisord after any reboots.

So! @curateddemarko is live. The code is all on GitHub. Now we just need to monetize it: