SQL on Twitter

"If I had more time, I would have written a shorter letter." -- Blaise Pascal

There have been lengthy articles on analyzing Twitter data by Cloudera here, here and here. More from Hortonworks here and here. This article is going to be short, thanks to features in Couchbase 4.5!

Step 1: Install Couchbase 4.5. Use the Couchbase console to create a bucket called twitter, then CREATE PRIMARY INDEX on twitter using the query workbench or any other tool.

Step 2: Request your Twitter archive. Once you receive it, unzip it. (You can use larger Twitter archives as well.)

cd <to the unzipped location>/data/tweets

Step 3:

$ for i in `ls`; do grep -i ^Grailbird $i > $i.out ; ...
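The visible part of the Step 3 loop can be sketched as a small shell function. This is only an illustration of the portion shown above: for each file in the tweets directory, it writes the lines beginning with "Grailbird" (case-insensitive) to a sibling file with an .out suffix. The function name extract_grailbird_lines and its directory argument are hypothetical, and whatever the original loop does after this point is not reconstructed here.

```shell
# Sketch only: covers just the visible grep step of the Step 3 loop.
# extract_grailbird_lines is a hypothetical name, not from the article.
extract_grailbird_lines() {
  dir="$1"
  for f in "$dir"/*; do
    # Skip anything that is not a regular file.
    [ -f "$f" ] || continue
    # Skip outputs left over from a previous run.
    case "$f" in
      *.out) continue ;;
    esac
    # Copy lines starting with "Grailbird" (any case) to <file>.out.
    # grep exits non-zero when nothing matches; that is not an error here.
    grep -i '^Grailbird' "$f" > "$f.out" || true
  done
}
```

Calling extract_grailbird_lines . from inside the data/tweets directory would mirror the original one-liner, with quoting added so file names containing spaces are handled.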
A blog about all things data and data processing issues and interests. SQL, NoSQL, flexible schema, scale-up, scale-out, transactions, high availability.