

SQL on Twitter: Analyzing Twitter Data Made Simple

"If I had more time, I would have written a shorter letter." -- Blaise Pascal

There have been lengthy articles on analyzing Twitter data by Cloudera here, here and here. More from Hortonworks here and here. This article is going to be short, thanks to features in Couchbase 4.5!

Step 1: Install Couchbase 4.5. Use the Couchbase console to create a bucket called twitter, then run CREATE PRIMARY INDEX ON twitter using the query workbench or any other tool.

Step 2: Request your Twitter archive. Once you receive it, unzip it. (You can use larger Twitter archives as well.)

cd <to the unzipped location>/data/tweets

Step 3:

$ for i in `ls`;
    do
      grep -i ^Grailbird $i > $i.out ;
  ...
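To see what the grep in Step 3 selects, here is a minimal sketch run against a made-up sample file. The file name 2016_05.js, its two-line layout, and the tweet contents are assumptions for illustration only, not taken from a real archive; the inverted grep at the end is shown purely for comparison and is not claimed to be the next command in the original loop.

```shell
# Hypothetical sample file mimicking one monthly archive file (assumed layout).
workdir=$(mktemp -d)
cat > "$workdir/2016_05.js" <<'EOF'
Grailbird.data.tweets_2016_05 =
[ { "id": 1, "text": "hello" } ]
EOF

# Same pattern as Step 3: keep only lines that start with "Grailbird",
# ignoring case (-i).
header=$(grep -i '^Grailbird' "$workdir/2016_05.js")

# For comparison only (not from the article): -v inverts the match and
# returns every line that does NOT start with "Grailbird".
body=$(grep -v -i '^Grailbird' "$workdir/2016_05.js")

echo "header: $header"
echo "body:   $body"

rm -r "$workdir"
```

Quoting the pattern ('^Grailbird') and the file name ("$i") is a small safety habit: it keeps the shell from touching the caret and copes with file names that contain spaces.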