A Descriptive and Predictive Study of North American Air Temperature Changes Using #Alteryx and #Tableau



Introduction

In this article, I include a video that attempts to explain what I have learned from processing a lot of air temperature data over the past few months. I have been interested in quantifying and visualizing the temporal and spatial changes in air temperatures that have happened over the course of my lifetime in North America.

For your viewing pleasure, I even included a couple of minutes of a dorky selfie-video to describe one of the reasons I have decided to do this work.  I really don’t like doing those, but sometimes I feel the need to explain why I have chosen to pursue this work with such dogged determination, even though nobody reads it! I predict that someday someone will look at this work and say: “This is pretty good stuff”.

For whatever reason(s), I still can’t stop my brain from thinking about this topic. As I explain late in the video, I have one final question to answer regarding temperature changes using multiple linear regression modeling in Alteryx. I hope to find the time to do that work soon and I’ll explain how it works in an upcoming article. I am interested in seeing if I can improve upon my ability to predict future temperature changes.

Hopefully none of those mean-spirited climate scientists will blast me for doing this work, or tell me that I don’t know what I’m doing, or that the data is bad, blah blah blah. I don’t want to have to get derailed from finishing the job, and I certainly don’t want to have to send them a can of whup-ass.

Finally, after publishing this article, I saw a CNN article tonight that caught my attention. Here is a link to that article, which is titled: “The geography of climate confusion”. That work discusses why we should be talking about global warming. I guess I’m doing my part by trying to elucidate the findings I’m discovering by processing all this weather data. According to another article, my hometown of Chicago just experienced its first snowless Jan/Feb, coupled with a record warm February. Could this be a sign of climate change?

Motivation

My motivation for doing this work is to educate myself on what is really happening to air temperatures across North America and to see if I can simulate the complex history of temperature changes. For me, looking at a single time series plot of global temperature change is a little less than satisfying (see the video).

I cannot believe that, with millions upon millions of temperature readings in existence, all these scientists want to do is argue about whether or not temperatures are rising across the planet. I have been perplexed about this for a long time because I think all we have to do is look at what the data is indicating to us. I have no other agenda than this. Well, that sentence is only partially true since I don’t want to stop learning how to use Alteryx and Tableau to do awesome work!

Let me definitively state that this work is strictly being driven by my curiosity. I am not trying to prove or disprove anything with respect to air temperature changes (or global warming) or the reasons why they occur.

I am simply doing a descriptive study on the historical air temperature data that has been collected over the past 50+ years because I think it is important and interesting to do so. Furthermore, I switched over to using the historical data to make predictions of the total temperature change over the 55-year period. When the temperature data is brought to life over time and space, the truth emerges about what is happening and the resulting story is different and way more interesting than I thought it was going to be.

Methodologies

I have attacked the problem of understanding air temperature changes from a few different points of view. To understand when and where the air temperatures have changed, I used a few different approaches as outlined below.

  1. Created visualizations at several temporal scales;
  2. Created visualizations at several spatial scales;
  3. Created over 38,000 predictive models to estimate the total temperature changes both spatially and temporally;
  4. Created custom data sets that were designed to support the quantitative and visual analysis of the data;
  5. Compared simulated to observed temperature changes using my highly simplistic modeling approach. Hey, with only a few hours to do the work over a couple of nights, I think I made great progress.

All of this work was done using Alteryx for data preparation, Tableau for visualization, and a combination of both tools for the predictive analytics as is discussed in the video.
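For readers who prefer to see the idea in code, here is a minimal Python sketch of fitting many small linear models. It is not the Alteryx/Tableau process used for this work. It assumes each model is a simple straight-line fit of temperature against year for one monitoring station and one month, and the record layout, station IDs, and readings are made up for illustration.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_trends(records):
    """Fit one trend line per (station, month) group.

    records: iterable of (station_id, month, year, tmax) tuples.
    This layout is an assumption, not the format of the original data.
    """
    groups = defaultdict(list)
    for station_id, month, year, tmax in records:
        groups[(station_id, month)].append((year, tmax))

    models = {}
    for key, points in groups.items():
        years = [p[0] for p in points]
        temps = [p[1] for p in points]
        models[key] = fit_line(years, temps)
    return models


# Made-up March readings for two hypothetical stations.
records = [
    ("STN001", 3, 1961, 10.0),
    ("STN001", 3, 1962, 10.5),
    ("STN001", 3, 1963, 11.0),
    ("STN002", 3, 1961, 5.0),
    ("STN002", 3, 1962, 4.0),
    ("STN002", 3, 1963, 3.0),
]
models = fit_trends(records)
for key, (slope, intercept) in sorted(models.items()):
    print(key, "slope per year:", round(slope, 3))
```

Grouping by station and month keeps each model tiny, which is why tens of thousands of them can be fit quickly.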

The Air Temperature Changes Video


Descriptive Results

No matter how I look at this data, I keep seeing the same results over and over. The consistency in what I have learned by using different approaches to quantify and view the data has astonished me.

The primary concept I have learned is that large-scale temperature changes are occurring, with many months showing increasing temperatures over time (55 years). The changes are not uniform from month to month and not all months have exhibited ubiquitous warming patterns. There are plenty of large-scale zones where cooling has occurred over the past 50+ years. It is not possible to understand these types of changes when all you are shown by the experts is a single time series plot of average temperature across the globe.

I think the dashboard animations I created have allowed me to comprehend and visualize the complexities present in our atmosphere. I have learned that for certain months such as March and September, larger changes in air temperatures have happened compared to other months. These changes will potentially have big effects on growing seasons, reproductive cycles, and other things that depend upon seasonality. The warmer months appear to be expanding while the colder months are contracting.

Figures 1 and 2 show the consistency between the state-aggregated data and the complete level of detail included in the monitoring station data. This indicates generally that the warming and cooling zones are very large (bigger than state outlines) and that aggregating by month is sufficient for me to achieve some of the answers to my questions.


Figure 1 – State-level aggregation of Tmax for the comparison of March 1970 to 1961.



Figure 2 – Monitoring station Tmax changes between March 1970 and 1961.
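
The state-level view can be thought of as a roll-up of the station-level changes. Here is a minimal Python sketch of that roll-up, assuming a simple average of the station values within each state. The averaging choice, the state codes, and the numbers are all assumptions for illustration, not the actual aggregation used to build the figures.

```python
from collections import defaultdict


def aggregate_by_state(station_changes):
    """Average station-level temperature changes within each state.

    station_changes: iterable of (state, change) pairs, where change is
    the Tmax difference between two periods at one station.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for state, change in station_changes:
        sums[state] += change
        counts[state] += 1
    return {state: sums[state] / counts[state] for state in sums}


# Made-up station changes for two states.
station_changes = [("IL", 2.0), ("IL", 4.0), ("WI", -1.0)]
state_changes = aggregate_by_state(station_changes)
print(state_changes)
```

If the warming and cooling zones really are larger than the state outlines, the state averages should tell the same story as the individual stations.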


Predictive Results

Once I developed the 38,000 predictive linear models, I used Alteryx to process the data to compute the total temperature change over the periods covered by the linear models. Figure 3 shows the workflow that did those calculations.


Figure 3 – The Alteryx workflow that computed the total temperature change over the years of record for each monitoring station.
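
The details of the workflow are not reproduced here, but one natural way to get a total change from a linear model is to multiply the fitted slope by the number of years the model covers. The Python sketch below shows that arithmetic. The function name, the slope value, and the year range are hypothetical, and this is an assumption about the calculation rather than a copy of the Alteryx workflow.

```python
def total_change(slope_per_year, first_year, last_year):
    """Total change implied by a straight-line trend over a period of record."""
    if last_year < first_year:
        raise ValueError("last_year must not be earlier than first_year")
    return slope_per_year * (last_year - first_year)


# Hypothetical station: trend of 0.04 degrees per year from 1961 to 2016.
change = total_change(0.04, 1961, 2016)
print(round(change, 2))
```

Because each station can have a different period of record, the year span is passed in per station rather than fixed at 55 years.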


The video below shows the predicted total temperature change calculated for every monitoring station. At first glance, the results of this work seem fairly consistent with the observed temperature changes over time.


How Good Are These Predictions?

Of course, these are just predictions. To understand if this modeling approach I developed is worth anything, I have to compare the predicted temperature changes to the measured temperature changes. The differences between the observed and predicted temperatures are known as residuals.
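
As a minimal Python sketch, a residual can be computed as the observed value minus the model's value, and the average residual gives a quick check for systematic bias. The function names and the numbers are made up for illustration, and the sign convention (observed minus predicted) is the common one, assumed here.

```python
def residuals(observed, predicted):
    """Residual = observed value minus predicted value, pair by pair."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [o - p for o, p in zip(observed, predicted)]


def mean_bias(res):
    """Average residual; a value far from zero hints at systematic bias."""
    if not res:
        raise ValueError("no residuals to average")
    return sum(res) / len(res)


# Made-up observed and model-predicted temperatures.
observed = [11.0, 12.0, 13.5]
predicted = [10.5, 11.5, 12.5]
res = residuals(observed, predicted)
print(res, round(mean_bias(res), 3))
```

A positive average here would mean the model tends to under-predict the observed temperatures.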

In the following video, I show how I can compare the observed and simulated data, while having a look at the residuals for this simple temperature-predicting model.  I think Tableau and Alteryx make such a great companion tool set that if I had these tools back when I was doing this type of work, my life would have been much easier!


The systematic biases I showed in the residuals indicate that I have more work to do if I want to be able to accurately predict temperatures in the years beyond 2016. In my future work, I’m going to keep experimenting to see if I can uncover some key elements that I can add to my model to make the temperature predictions more accurate. This is where multiple linear regression modeling will be needed.

Final Thoughts on Using Power BI For This Work

For those readers who have been following the Tableau vs Power BI comparison series, you might be happy to know that you will be seeing a comparison of Tableau to Power BI for portions of this work. It will be very interesting, to say the least. I want to know if I can produce over 38,000 linear models in Power BI in 20 minutes, like I did in Tableau. Let the games continue.
