How I Use #Tableau LODs To Process Asynchronous Time Series Data


Introduction

I have been incrementally making progress on the complete analysis of the 2.5-year blogging experiment I conducted between May 2013 and Dec 2015. After a 3-month break from the topic, it is time for me to get back to work and get the job done.

While thinking about how I wanted to analyze the experiment, I had a vision of a workbook that would help me quickly identify and comprehend various aspects of my blog article performance. I want to understand why certain articles performed well in terms of readership, while others did not perform as I expected they would.

To fulfill my vision, I needed to look at two forms of time series data. First, I needed to see how articles performed over their individual time series histories. These articles are independent and asynchronous, and each has its own characteristics of readership continuity, daily readership, and peak readership. The articles are asynchronous because they begin at different times and their daily readership varies differently over time. Second, I needed a simultaneous view of all the articles I had written so I could make quick comparisons between them.

To accomplish both of these goals, I built a special type of view that uses the power of Level of Detail (LOD) calculations to deliver the data I needed and the insights I wanted. For this capability, I send many thanks to the Tableau developers.

I conducted the 3danim8 blogging experiment to learn how to become a better blogger. For this desire to become a reality, I have to complete the quantitative analysis of the experiment, which ended about a year ago.

However, the amount of time needed to gather the data for the complete numerical analysis and regression modeling I plan to do is daunting, to say the least. For this reason, I am attacking the problem from different angles than I originally expected. This article represents one of the new approaches.

Previous Analyses

For new readers of the blog, these two articles tell you what I did in the experiment and why I did it.

  1. Click here to review the official ending of the experiment (The conclusion and an overview of why I did the experiment)
  2. Click here to review the epilogue of the experiment (An examination of some of my frustration in failing to achieve one of my objectives in the experiment)

To date, I have published three articles that quantified various aspects of the blogging experiment. Here are the links:

  1. Click here to review part 1 of the blogging experimental analysis (Geographical expansion of the blog)
  2. Click here to review part 2 of the blogging experimental analysis (Quantifying the importance of the terms “How, How To, How To Use” in blog titles)
  3. Click here to review part 3 of the blogging experimental analysis (Determining whether slow or fast-burner articles are more important to the long-term sustainability of a technical blog)

Understanding the Problem

When you conduct an experiment that takes over 2.5 years to complete, a lot of asynchronous time series data gets created. With over 220 articles to process, the time series histories each tell an individual tale: success for some articles and less-than-optimal performance for others. My goal is to understand why these articles performed the way they did.

In the following video, I explain the problem I faced and the solution I developed using LOD calculations. I think the application is interesting, so I decided to write this article. This type of article is great for people who are trying to understand why they would need LOD calculations, and it shows the versatility of the approach.


The Calendar Views

To process the time series histories, I developed LOD calculations that operate at the level of each article. For example, the LOD expression shown in Figure 1 calculates the average number of article views per day over the lifetime of the article. This number does not agree with a simple average reference line, as shown in the previous video, because there are many days on which an article does not get read. For this reason, the LOD-calculated average is a more accurate measure for data that is not complete across the time domain.

[Image: LOD_AVG.JPG]

Figure 1 – The LOD expression used to calculate the average number of article reads over the lifetime of the article.
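
Since Figure 1 contains the exact formula, the sketch below is only my approximation of this kind of lifetime-average LOD expression; the field names [Article], [Views], and [Date] are assumptions and may differ from the actual workbook.

    // Hypothetical sketch: average views per day over each article's lifetime.
    // Dividing by the day count between the first and last read (inclusive)
    // accounts for days on which the article was not read at all.
    { FIXED [Article] :
        SUM([Views]) / (DATEDIFF('day', MIN([Date]), MAX([Date])) + 1)
    }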


To simultaneously compare three performance metrics for this time series data, I developed a calendar style matrix that allows me to quickly identify high-performing articles.

Figures 2 – 4 show, respectively, the total views per article, the average views per day, and the maximum readership in one day for all the articles I wrote. Each of these views is created using Tableau level of detail calculations, and each has given me valuable insights into how readers responded to articles written during the experiment. The packaged Tableau workbook can be downloaded by clicking this link.
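
The workbook linked above contains the authoritative formulas. As a rough sketch, under the same assumed field names as before, the three per-article measures might be expressed as:

    // Figure 2: total views over the article's lifetime
    { FIXED [Article] : SUM([Views]) }

    // Figure 3: the lifetime average views per day, as sketched for Figure 1

    // Figure 4: maximum reads in any single day. The inner (nested) LOD
    // totals views per article per day; the outer LOD takes each
    // article's largest daily total.
    { FIXED [Article] : MAX({ FIXED [Article], [Date] : SUM([Views]) }) }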

[Image: 3danim8-calendar_total_views]

Figure 2 – Total article views.


[Image: 3danim8-calendar_avg_per_day]

Figure 3 – Average views per day.


[Image: 3danim8-calendar_max_per_day]

Figure 4 – Maximum article reads in any day.


A description of how the level of detail (LOD) calculations are formed, along with a discussion of how I used these views, is presented in this video.


The Purpose of These Views

I have understood for a long time that visual analytics can be so powerful that, at times, it can either replace regression modeling or at least act as a supplement to it. There have been instances in my past where I could determine the statistically significant factors of a process-improvement experiment before I conducted the numerical analysis of the test. This was possible because of the power of Tableau.

In this case, I wanted to quickly visualize the articles that have been able to sustain decent readership over many years. I also wanted to quickly identify which articles triggered a strong readership reaction when they were published, or at any time after publication. To do this, I needed to simultaneously display three distinct key measures for all the articles I published.

By creating the calendar-type view and coupling it with LOD calculations, I could see what I was looking for in a matter of a few minutes. These views give me powerful insights into how the experiment unfolded over time and across the different categories of topics I write about. This is a very pragmatic use of Tableau because it also allows me to launch any article just by clicking its dot in the matrix.

Special One-Day Advance Notice

Today is Tuesday and tomorrow is Wednesday. If you happen to work where I do, come join us tomorrow for another awesome “Tableau Event”. If you do, and you happen to be good at LOD expressions and problem solving, you will be given a chance to take this analysis even further than I have shown in this article. If you solve the problem I will be posing, you will win some awesome Tableau goods. I promise to make it worth your time. Once the event is over, I will add to this article to show additional ways that LODs can be used to gain insight into this data.

Final Thoughts

I took about four months off from blogging (Figure 5) after I finished the experiment in Dec 2015. This was a time of reflection, a time for thinking, and a time to rebuild energy. By giving myself a chance to evaluate what I did in the experiment, I have been able to improve the quality of my work, and I have thought of new ways of looking at the data collected during the experiment. I think these are both good things.

[Image: 2016_posting_view]

Figure 5 – I didn’t work much on the blog from Dec 2015 to Apr 2016 because I needed a break. The grey dots are days that I didn’t write, and the blue and black dots are days that I did. This is another form of a simplified calendar view.
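
Calendars like this are typically driven by a simple day-level flag placed on color. As a minimal sketch, and purely an assumption about how such a flag could be built (the [Date] and [Post ID] field names are hypothetical):

    // Hypothetical flag: did any writing happen on this calendar day?
    IF { FIXED DATETRUNC('day', [Date]) : COUNT([Post ID]) } > 0
    THEN 'Wrote'
    ELSE 'Did not write'
    END

Note that for the grey dots to appear at all, the non-writing days must exist as rows in the data, for example through a date scaffold.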


To understand several of the techniques I use to continue to improve my skills, I would strongly recommend that you read this article. Thanks for reading.
