Introduction
As with much of my global warming work, I never expected to discover the primary insight I discuss in this article. This work continues to fascinate me for many reasons, which is why I keep doing the research.
Background
Recently I had a light-bulb moment. After years of evaluating temperature changes across the globe, I began thinking about how I could do something new to add credibility to my global warming insights. I decided it would be worthwhile to try to predict future temperatures.
Instinctively I knew that predicting future temperatures would be tricky and likely to fail, but I decided to try nonetheless. I thought that if my predictions were good, maybe people would begin to see that I'm not so loony after all.
In July 2021, I created daily maximum temperature predictions for the upcoming six months (July–December 2021). I made predictions for thousands of monitoring stations around the world, then waited half a year for the actual temperature data to be created, recorded, and placed on a web server for download. By mid-December the waiting ended, I retrieved the data, and the fun began as I used Tableau to visualize the comparisons.
I gingerly started the comparisons because I was sure they were going to be ugly and inaccurate. I first looked at my home monitoring station in Knoxville, TN. Surprisingly, the results were good enough that I thought it had to be a coincidence. Next, I looked at a few other stations around the US, including my old hometown of Chicago, picking major metropolitan airports starting with O'Hare. The visual results from those sites were good too, so I decided to compute monthly error terms.
Figure 1 shows the measured vs. predicted daily maximum temperatures for the airports I chose, and Figure 2 shows three types of error analyses for these comparisons. The error estimates were much lower than I expected.
Once I finished that work, I put the power of Tableau to good use. I made graphs of every monitoring station around the world for which I had made predictions, and I created and studied charts for every country to see if my method was failing anywhere. Afterward, I built an interactive dashboard for scanning stations around the world to see how my prediction method performed. I was very happy with the results. If you want to see what those look like, you can watch the video shown below.
The accuracy of the temperature predictions really surprised and delighted me. I had no idea that my methodology would allow me to estimate future daily maximum temperatures so accurately. Considering that I was able to write code that made all of these predictions within 40 minutes using only one predictor, I was encouraged to take another step in my research.
The next step involved pulling out the big predictive guns. I wanted to know how my predictions would compare to the best time-series AI/ML techniques in existence. I believed that the AI/ML routines would perform better than my method. My natural curiosity had to be satisfied, so the next phase of the work began.
A Little Help From My Very Talented Friend
About the time I was beginning this work, I got a call from a good friend at Alteryx, Alan Jacobson. On that call, Alan gave me a tip that helped me conduct and expand the time-series analysis I was just starting. I told him that I wanted to evaluate the robustness and accuracy of the computational framework I had developed for predicting future temperatures, and that I wanted to compare my algorithm (the Ken Black Trender) to as many AI/ML time-series predictors as I could find.
Alan immediately told me about the Python pycaret package, which allowed me to evaluate my algorithm against multiple time-series prediction methods. Initially, I was a little concerned that this competition would be a blowout because I was sure the AI/ML methods would kick my a$$. I wasn't sure I was happy about Alan's advice, but I have never been known to back down from a new learning experience.
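To give a sense of how little code this kind of bake-off requires, here is a minimal sketch of a pycaret regression comparison. The file name, column names, and lag-feature table are hypothetical stand-ins for illustration, not my exact workflow.

```python
import pandas as pd
from pycaret.regression import setup, compare_models

# Hypothetical input: a supervised table where the target is the daily
# max temperature and the features are lagged values of that series.
data = pd.read_csv('sfo_tmax_lagged.csv')

# Register the data with pycaret and set up cross-validation.
exp = setup(data=data, target='tmax', session_id=42)

# Train all available regressors and rank them by mean absolute error.
best = compare_models(sort='MAE')
print(best)
```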
Figure 3 shows the time-series methods available in the pycaret package. The package contains very robust computational methods, and I tested my algorithm (#21) against each of them. With names like extreme gradient boosting, passive-aggressive regressor, and decision tree regressor, I knew I was in trouble. Once I saw this list of algorithms, Lindsey Buckingham popped into my mind singing: "I think I'm in trouble".
There were two reasons I wanted to do this work. First, I wanted to learn more about the characteristics of some newer-generation time-series predictors. Second, I wanted to see how the characteristics of my method compared to the leading AI/ML methods.
Simulation Method
In the following video, I discuss the techniques I used to perform the simulations. You can download the San Francisco Airport example I discuss by clicking this link.
Since the Ken Black Trender model uses only historical temperature data to make predictions, the 20 modeling methods were set up the same way. Each model was configured to predict future temperatures using only historical daily maximum temperature data, as sketched below. Additional simulations with more predictors were also conducted, as described later in this article.
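As an illustration of that setup, here is a hedged sketch of one common way to frame a univariate series so a regressor sees only the station's own past values; the function and column names are my own inventions, not necessarily what was used here.

```python
import pandas as pd

def make_lag_features(tmax: pd.Series, n_lags: int = 14) -> pd.DataFrame:
    """Turn a daily max-temperature series into a supervised table
    whose only predictors are the series' own lagged values."""
    df = pd.DataFrame({'tmax': tmax})
    for lag in range(1, n_lags + 1):
        df[f'tmax_lag_{lag}'] = tmax.shift(lag)
    # Drop the leading rows that have incomplete lag history.
    return df.dropna()

# Hypothetical usage: 'station' is a Series of daily TMAX values
# indexed by date, e.g. loaded from a station CSV.
# supervised = make_lag_features(station)
```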
Results
After conducting the model simulations and writing an error analysis Alteryx workflow, I tossed the predicted vs measured temperatures over to Tableau for visualization. I am a proponent of using data visualization to evaluate the accuracy of any modeling predictions.
For studies like this one, it is not sufficient to depend upon error terms to pick the best predictor. There are methods that can minimize errors but fail to capture the essence of what is being simulated. This is one lesson I learned during the 25 years I worked as a numerical modeler. This insight is why I spent two decades writing graphical post-processors for evaluating numerical model results.
Visual Analysis of Predicted Results
The best way to evaluate the predictive characteristics of each model type discussed above is to plot the actual vs. predicted daily maximum temperatures. In the slideshow below, you can review how each model performed when predicting daily maximum temperatures at the San Francisco Airport. The models exhibit several distinct behaviors, which is why I included all of them in the slideshow. This example really helped me learn more about the predictive characteristics of each method.
The lower part of each chart shows daily error bars, which indicate whether the predictions were above or below the actual temperatures. If the bars are red, the actual temperatures were hotter than the predictions; if the bars are blue, the actual temperatures were cooler than the predictions.
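My charts were built in Tableau, but for readers working in Python, a rough matplotlib analogue of those red/blue error bars might look like this (the inputs and color convention mirror the description above; the function name is hypothetical):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_daily_error_bars(dates, actual, predicted):
    """Daily residual bars: red when the actual temperature ran hotter
    than predicted, blue when it ran cooler."""
    residuals = np.asarray(actual) - np.asarray(predicted)
    colors = ['red' if r > 0 else 'blue' for r in residuals]
    fig, ax = plt.subplots(figsize=(10, 3))
    ax.bar(dates, residuals, color=colors, width=1.0)
    ax.axhline(0, color='gray', linewidth=0.5)
    ax.set_ylabel('Actual - Predicted (deg F)')
    plt.show()
```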
Quantitative Error Analysis
I chose three forms of error analysis for this exercise. Method 1 is the mean absolute error (MAE), which I calculated by month for each prediction method. Since August 2021 was the first month with complete data, I sorted the MAE values from low to high for that month. These results are shown in Figure 4, and the model order is the same in the error tables that follow.
The Bayesian Ridge, Linear Regression, Ridge Regression, and Huber Regressor models all had lower MAEs than the Ken Black Trender method.
Figure 5 contains the second method I used. The root mean squared errors (RMSE) were calculated by month. For the month of August 2021, the Ken Black Trender had the lowest RMSE.
Figure 6 contains the third method I used. The mean absolute percentage errors (MAPE) were calculated by month. For the month of August 2021, the first three models had the lowest MAPEs.
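For reference, here is a sketch of how all three monthly metrics can be computed for one model with pandas; the column names ('date', 'actual', 'predicted') are assumptions for illustration. One caveat worth noting: MAPE should be read with care when actual temperatures are near zero, since the percentage denominator blows up.

```python
import numpy as np
import pandas as pd

def monthly_errors(df: pd.DataFrame) -> pd.DataFrame:
    """Compute MAE, RMSE, and MAPE by month for one model.
    Expects columns 'date' (datetime64), 'actual', and 'predicted'."""
    err = df['actual'] - df['predicted']
    work = df.assign(
        abs_err=err.abs(),
        sq_err=err ** 2,
        abs_pct_err=(err / df['actual']).abs() * 100.0,
    )
    monthly = work.groupby(df['date'].dt.to_period('M'))
    return pd.DataFrame({
        'MAE': monthly['abs_err'].mean(),
        'RMSE': np.sqrt(monthly['sq_err'].mean()),
        'MAPE': monthly['abs_pct_err'].mean(),
    })
```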
The Two Best Models
Based on a visual comparison, the two best models for computing future daily maximum temperatures at the San Francisco airport using only one predictor are the XGBoost model (Figure 7) and the Ken Black Trender (Figure 8). The XGBoost method does a better job predicting the daily highs/lows (i.e., the overall daily fluctuations), but the Ken Black Trender has lower overall errors while remaining very stable (no indication of errors growing over time).
The XGBoost method has another advantage in that it can easily be extended to use other predictors, as shown below. The Ken Black Trender model has the advantage that thousands of predictions for monitoring stations all over the world can be completed in minutes within Alteryx. I'll have to do additional experimentation to see how long it would take XGBoost to make the same number of predictions.
Extending the XGBoost Method To Multiple Predictors
For the fun of it, I wanted to see how much better the XGBoost method could predict future temperatures if precipitation and daily minimum temperatures were added to the model. I also increased the number of cross-validation folds to 5.
Figure 9 shows the results (compare it to Figure 7). The model predictions definitely improved (see Figures 4–6), which suggests that my next step should be adding multiple predictors to the Ken Black Trender methodology.
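For anyone wanting to try something similar, a hedged pycaret sketch of a multi-predictor XGBoost run might look like the following. The file names and column names are hypothetical, and this is not my exact configuration.

```python
import pandas as pd
from pycaret.regression import setup, create_model, predict_model

# Hypothetical files: lagged TMAX features plus the two added
# predictors, daily precipitation (prcp) and daily min temp (tmin).
train_df = pd.read_csv('sfo_multivar_train.csv')
future_df = pd.read_csv('sfo_multivar_future.csv')

exp = setup(
    data=train_df,
    target='tmax',
    fold=5,          # cross-validation folds increased to 5
    session_id=42,
)

# Fit just the extreme gradient boosting model this time.
xgb = create_model('xgboost')

# Generate predictions for the held-out future period.
predictions = predict_model(xgb, data=future_df)
print(predictions.head())
```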
Final Thoughts
When I was a kid, there were a few Donovan songs that I liked. One was called “Mellow Yellow” and another one was called “Sunshine Superman”. Donovan claimed that Sunshine Superman was a love song.
At the 1:00 mark of the Sunshine Superman video shown below, he uses the line:
Superman or Green Lantern ain’t got a-nothin’ on me
Donovan, 1966 in Sunshine Superman
Fifty-six years later, my brain created the following line:
AI or ML ain’t got a-nothin’ on me
Ken Black, 2022 in Knoxville, TN
I’m happy with the results of my method, but as always there is more to learn. Thanks for reading this article!
Ken, thanks for sharing your work on this. I’ve been learning Python to prepare for a forecasting project I’m now working on, and this is a great example of how to apply it to a real-world case.
You are welcome, Gary. The download link should be useful for you. Let me know if I can help you in any way.
Hey Ken, I wanted to thank you for all the work you have done. I saw you in person at the Alteryx conference where you gave your presentation. I was wondering if you have a Tableau workbook that has climate projections by city? If not, have you looked into the long-term climate of Phoenix, AZ and the southern AZ Elgin/Sonoita area? Thanks!
Brian
Hi Brian,
First, I’d like to thank you for attending my talk! Alteryx Inspire has many good presentations, and I appreciate everyone who chose to attend mine.
I have a Tableau workbook that compares observed and predicted temperatures from July 2021 through April 2022 for locations worldwide. I made the predictions in July 2021 and then waited until April 2022 to collect the actual data. I took a look at a few of the monitoring stations in southern Arizona and will send them to you via email. I also have a file with multi-year predictions for these sites, but it is not in a Tableau workbook. Please let me know if you are interested in receiving the data in an Excel file or another format.
Thanks,
Ken