Today I was creating a scatter plot in Tableau using about 1.5 years of monthly data. All the data points looked like they correlated nicely except for the March 2013 data point. With March 2013 included in the trend line calculation, the scatter plot looked like this and the r-squared was 59.4%:
With the outlier removed from the calculation, the r-squared jumped to 98.1% and the plot looked like this:
Notice that in the second graph, the March 2013 data point still shows up (but isn’t included in the trend line calculation). If you want to learn how that was done, simply watch this video to learn a couple of tricks.
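For anyone who wants to see the effect outside of Tableau, here is a rough sketch in Python of how a single outlier can drag down r-squared. The data is synthetic (18 made-up monthly points with one bumped value), not the data from my plots, and the helper name r_squared is just something I made up for illustration. It assumes NumPy is installed.

```python
import numpy as np

def r_squared(x, y):
    """R-squared of a simple least-squares line fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for about 1.5 years of monthly data (assumption:
# a clean linear relationship plus a little noise).
rng = np.random.default_rng(0)
x = np.arange(18, dtype=float)
y = 2.0 * x + 5.0 + rng.normal(0.0, 1.0, size=x.size)

# Push one point far off the line to play the role of the bad month.
outlier_index = 9
y[outlier_index] += 25.0

# Boolean mask that keeps every point except the outlier.
keep = np.ones(x.size, dtype=bool)
keep[outlier_index] = False

r2_all = r_squared(x, y)
r2_trimmed = r_squared(x[keep], y[keep])

print(f"r-squared with outlier:    {r2_all:.3f}")
print(f"r-squared without outlier: {r2_trimmed:.3f}")
```

The exact numbers depend on the made-up data, but the fit without the outlier should come out noticeably tighter, which is the same pattern as in the two plots above.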
This is a nice idea for a static visualization, but when you filter the data, the reference area will not move. Correct?
Sorry for the delay in responding! I was locked out of this blog for a few weeks but I’m glad to be active again.
This trick is really only intended for a static visualization, to show one or maybe two data points that were not included in the calculation of the linear trend model. Since it relies on an annotation placed at a fixed point on the screen, it won’t work if the axis limits change in a dynamic visualization. Sometimes we use this to show that there is a data quality issue with one month of data, for instance, and the best way to show that is to create the linear model with and without the bad data point.
Thanks!