Infochimps now offers a query API for two interesting datasets: a Twitter collection and US Census data.
The Twitter data covers 500M tweets from 35M users collected between March 2006 and November 2009. The API currently includes the following services:
- Trstrank – a trust metric for Twitter users based on network centrality (see trst.me):
http://api.infochimps.com/soc/net/tw/trstrank.json?screen_name=SarahPalinUSA
- Wordbag – returns the 100 tokens (i.e., words) that a particular Twitter user tweets more often than the average Twitter user.
http://api.infochimps.com/soc/net/tw/wordbag.json?screen_name=ladygaga
- Influencer metrics – replies in/out and retweets in/out for a given user
http://api.infochimps.com/soc/net/tw/influence.json?screen_name=algore
- Conversations – finds interactions between two users. Currently this yields only direct messages, but retweets and mentions will be included later. For example, check out conversations between Lady Gaga and Sarah Palin:
http://api.infochimps.com/soc/net/tw/conversation.json?user_a_id=14230524&user_b_id=65493023
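All four example URLs share the same base path and differ only in the service name and query parameters, so they can be built programmatically. The following is a minimal Python sketch under that assumption. The helper names `build_url` and `fetch` are hypothetical, the responses are assumed to be JSON because of the `.json` suffix, and any API key or other authentication parameter the service may require is not covered here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base path and service names are taken from the example URLs above.
BASE_URL = "http://api.infochimps.com/soc/net/tw"
SERVICES = {"trstrank", "wordbag", "influence", "conversation"}


def build_url(service, **params):
    """Build a query URL for one of the Twitter services (hypothetical helper)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    return f"{BASE_URL}/{service}.json?{urlencode(params)}"


def fetch(service, **params):
    """Request the URL and decode the body.

    Assumption: the body is JSON, as the .json suffix suggests.
    Authentication, if required, is not handled here.
    """
    with urlopen(build_url(service, **params), timeout=10) as response:
        return json.load(response)


# Building URLs needs no network access.
print(build_url("wordbag", screen_name="ladygaga"))
print(build_url("conversation", user_a_id=14230524, user_b_id=65493023))
```

Calling `fetch("trstrank", screen_name="SarahPalinUSA")` would then issue the first request in the list, assuming the endpoint is reachable.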
Pricing varies with use and ranges from “Baboon” (free for 100K calls/month) to “Golden Ape” ($4000/month for 15M calls/month).