I was curious to see how valuable hashtags can be for performing some basic research. To answer the question, I did a little study over the past couple of weeks. Over 12 days (7/20 to 7/31/13), I captured all the Twitter posts that had either #BigData or #Tableau in the post. How I captured the data using the Twitter API is another interesting story that is beyond the scope of this article, although you can see a short video of a part of the process below.
Once I had the data, I processed all the hashtags that were included in these 2,450 posts. I sorted the data set, organized it by category and sent it to Tableau. The organizational step simply put all the like terms such as BigData into one bucket (i.e., #BigData, #bigdata, #Bigdata…) so that I could do some counting and visualization. A simple dashboard of the results is shown below.
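For readers who want to try the bucketing step themselves, here is a minimal sketch in Python. It is not the process used for this study; the sample posts, the `bucket` function and the rule of folding hashtags to lowercase are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical sample posts; the study's real data set is not reproduced here.
posts = [
    "Loving the new #Tableau dashboard for #BigData",
    "#bigdata is everywhere #Analytics",
    "Intro to #Bigdata with #tableau",
]

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG = re.compile(r"#(\w+)")


def bucket(tag: str) -> str:
    """Map a hashtag to its bucket by ignoring case (assumed rule)."""
    return tag.lower()


def count_hashtags(posts):
    """Count how often each hashtag bucket appears across all posts."""
    counts = Counter()
    for post in posts:
        for tag in HASHTAG.findall(post):
            counts[bucket(tag)] += 1
    return counts


counts = count_hashtags(posts)
for name, n in counts.most_common():
    print(name, n)
```

Case folding only merges variants that differ in capitalization. Misspellings or abbreviations would need an explicit mapping table inside `bucket`, built by hand after inspecting the data.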
This short study taught me a few things regarding hashtags. First, I was surprised to see how many spelling variations there can be for key terms used as hashtags. For the term #BigData, there were 21 variations captured in the Twitter feeds. Many people apparently don’t really understand what a hashtag is supposed to represent (and they probably don’t know about hashtags.org), so they just invent their own hashtag at the time they are writing the post! Secondly, the study allowed me to look at the amount of peripheral hashtag noise that is being generated for these two key terms. There are over 900 other hashtags used in the posts that are principally related to BigData (1,847 posts = 75%) and Tableau (698 posts = 28%). Lastly, I was very surprised to see that Google, one of the leaders and innovators in BigData, was mentioned in only 4 hashtags, with two of these being related to Google Glass. I guess I expected to see more hashtags referencing company names and/or emerging technologies, especially for companies that are promoting their BigData technologies. Maybe the 140-character limit has something to do with this. I was also surprised to see that only 9 posts combined the hashtags #BigData and #Tableau (0.4% of activity, although this count required exact hashtag spelling), especially considering that Tableau was a 2013 Codie finalist for best big data solution. Click here for a link to the Tableau Public Workbook for this analysis. For more information on how to use hashtags properly, have a look at this article.
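The percentages quoted above are simple shares of the 2,450 posts (for example, 9 / 2,450 is about 0.4%). As a rough sketch of how such shares could be computed without requiring exact hashtag spelling, the Python snippet below matches hashtags case-insensitively. The sample posts and the function names `tags_in` and `share` are made up for illustration and are not taken from the study.

```python
import re

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG = re.compile(r"#(\w+)")


def tags_in(post):
    """Return the set of lowercased hashtags found in one post."""
    return {t.lower() for t in HASHTAG.findall(post)}


def share(posts, *required):
    """Fraction of posts that contain every one of the required tags."""
    wanted = {r.lower() for r in required}
    hits = sum(1 for p in posts if wanted <= tags_in(p))
    return hits / len(posts) if posts else 0.0


# Hypothetical sample posts, not the study's data.
sample = [
    "#BigData trends",
    "#Tableau tips",
    "#bigdata meets #Tableau",
    "More #BIGDATA news",
]

big = share(sample, "bigdata")              # posts tagged with BigData
tab = share(sample, "tableau")              # posts tagged with Tableau
both = share(sample, "bigdata", "tableau")  # posts tagged with both
print(f"{big:.0%} {tab:.0%} {both:.0%}")
```

Because each post's hashtags are lowercased before comparison, #bigdata and #BIGDATA count toward the same share. Misspelled variants would still be missed unless they are mapped to a common form first.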
In the second part of my analysis, I’m going to investigate who wrote the posts and who is referenced in the posts using the @content available in these posts. I want to see who is doing the bulk of the publishing on these topics and what they are writing about in particular. More on this later.