Introduction
Whenever I have a recurring question asked to me about a Tableau Desktop feature, I realize that something is not quite intuitive in the software and needs to be fixed. Such is the case with the Tableau trend models as they are currently described in the software.
The Tableau trend models need to be clarified and I’m going to show why that is the case using linear trend model examples. I might write another post on other types of trend models (which have similar problems), but for today, I’m sticking with linear models. I am a fan of the great physicist Richard Feynman who once said something to the effect of: “If you can’t teach it to a 5 year-old, then you really don’t understand the topic.” Additionally, since I have worked for 30 years as a computational scientist/mathematical modeler, I have had to decipher many types of models in various scientific disciplines for reasons similar to this. I like to keep things simple, clear, and easy to explain.
If you like this article and would like to see more of what I write, please subscribe to my blog by taking 5 seconds to enter your email address below. It is free and it motivates me to continue writing, so thanks!
An Example Trend Line Model
Lets take a look at some data as shown in Figure 1 and the linear trend model Tableau Desktop created for that data.
Before I begin addressing this model, let’s take a minute to reminisce. Back in seventh grade or so, we learned about one mathematical model for lines having the form of y = mx + b. This is called the slope-intercept form, where y is the value on the y-axis for any value of x, with m representing the slope of the line and b being the y-intercept (at x=0). We learned about how to calculate the slope as rise over run, and slope can be positive or negative. We used this equation all through high school and into college for all kinds of things. For me, the slope-intercept form is easy to understand and is intuitive. I can teach this to a 5 year-old. Although not intuitively obvious (for reasons shown below), the Tableau linear trend models are simply slope-intercept equations of a line.
Now getting back to the example presented above in Figure 1, the Tableau trend model is presented and statistics are given for that model. The model formula is described like this:
Now I am a professional mathematical modeler, so I take this description to literally mean that the trend line model formula is:
y = (week of ordered date + intercept).
Looking at the graph, I can see that the x-axis is the Week of Ordered Date, so that is where my x value will come from. I can also see the intercept is shown in the table as being equal to -1.51996e+009. The problem is, if I calculate using this equation, I don’t get anything that follows the trend line because the values of x and b have radically different size. In fact, this formula looks like y = (x + b), not y = mx + b, which is what I expect the equation for the line to look like. The slope is missing from the stated model and the parenthesis lead me to believe that I add x + b! Where is the slope? Why isn’t it in the stated model?
These are the reasons my students get confused and ask me to explain the linear model in the Tableau training courses I teach. Now let’s get to the point where we can clarify this model using the details provided by Tableau as shown below in Figure 2.
To answer this question of where is the slope in this model, we need to look at the information above. There is a Column called Coefficients shown in the table which gives the the name of the term and the value of the term. It turns out that for this example, the slope is shown as a coefficient for the “Week of Ordered Date”, and the slope has a value of 36931.3. However, remember that the model formulation shown above doesn’t say anything about using a coefficient for “Week or Ordered Date”. It is assumed that we will know to use the “Value” as the slope.
The Correct Trend Line Model
Now back to calculations for the trend line. For this example, the correct model formulation is y = mx + b = (36931.3*Week of Ordered Date) + -1.51996e+009, where Week of Ordered Date is any value (date) along the x-axis. You don’t have to pick a date that is shown on the axis, any date can be chosen since date is continuous along the x-axis, even though the date is being aggregated by week. The date itself is just a number and this number represents the number of days after Jan 1, 1900 (or 1/1/1904 for Mac users!). For example, if we choose the date of 4/10/2013, the numerical representation of this day is 41,374. Plugging this date into our example model above, we see that the trend model predicts a value of y = (36931.3*41374) + -1.51996e+009 = 8,035,606.2 on the date of 4/10/2013. This is perfectly consistent of where the trend line is drawn. Therefore, we have a working model.
Tableau should be more clear in stating the linear model by including the slope in the model formulation. It would be easy for Tableau to simply state the model with the correct coefficients shown as y = (36931.3*Week of Ordered Date) + -1.51996e+009, making sure to keep the parentheses in the correct location!
Multiple Trend Lines
To add further confusion to the story, lets look at a case where you have three linear models in one visualization as shown in Figure 3 and Tableau gives you the details of each linear model as shown below. It is a nice summary of the graphs for each year, but the model formula is now more confusing than ever!
Now the mathematical model is shown in Figure 4 as:
Now this model looks like y = “Year of Ordered Date”*(“Week of Ordered Date” + Intercept). Once again, this doesn’t look like a y = mx+b formulation and nothing is said about the slope. The parentheses are in the wrong position and there is no way that this model will calculate anything like the trend lines shown for each year. The correct equations for each year follow the same pattern as described in the first example above. All you have to do is put the correct model together with the correct coefficients and parentheses and you will have working models.
The Correct Trend Models for Each Year
For the three cases shown in the example above, the models are: