Going Behind The Scenes Of 3danim8’s Blog

Introduction

Writing a technical blog is challenging in many ways, and finding topics to write about is just one of those challenges. This story helps illustrate that challenge from my perspective.

If you want to write an interesting technical blog, one approach is to write about things you cannot find through a Google search. If you are able to develop solutions to these kinds of problems using your creativity and skills, then chances are good that your blog will help others in the future when they try to solve similar problems.

When you use this approach, however, you should keep in mind that people will find your work when the time is right for them. That time will usually be when they need to solve that type of problem. My blog experience has shown me that 3 out of 4 hits to my blog occur due to Google searches.

My advice for new technical bloggers is not to be deterred when you write a piece about something that interests you, only to find that no one seems to care. It is fine when this happens, because eventually that blog post will be found and spread around in ways that will boggle your mind.

In this post, I am going to discuss the processes that go on behind a technical blog. I want to explain how work, creativity, curiosity and determination all get mixed together to produce blog post content. For me, blog posts seem like they suddenly appear in an instant. In actuality, the creative process is much more involved than just suddenly seeing Barbara Eden emerge as vapor from her genie bottle on “I Dream of Jeannie“. If you don’t know what I mean, watch that video link for three minutes. You will get a laugh!




Generating Ideas For Blog Posts

My blog ideas are created by solving problems at work, at home, and at play. If I'm trying to solve a problem and it is taking me a bit of time to do it, I instinctively know that the technical approach I am developing is good fodder for a blog post. My reasoning goes something like this.

If I am trying to solve this type of problem, then it is likely that someone else will also try to do the same thing at some time in the future. If a solution to the problem already existed, I would have found it through my on-line searches. I would not be taking the time to develop a solution technique if one existed. Therefore, if no published solution exists, I decide that I might as well write about the topic to help that anonymous person in the future resolve their problem.

It is really that simple to identify blog post material. Once you have the idea, write it down in a list on your phone so that you remember what you want to do. Sometimes multiple ideas appear and if you do not write them down, you will forget them.

Explaining How I Go From An Idea to A Blog Post

To explain what I mean, I am going to give an example from today. It is currently 2:34 am, and I'm writing this post about 12 hours after initially having an idea for something I wanted to do using Alteryx and Tableau. I wrote the idea on my list and began researching how to do it shortly afterward.

This example is a good illustration for a few reasons. First, it demonstrates how I have to be creative in solving problems. Second, it shows how I have to use the work of others (bloggers and software developers) to expand my knowledge base. Lastly, it shows how I sometimes have to collaborate by asking for help when the solution is just beyond my grasp in the amount of time I have to work on it. In a perfect world, I would have unlimited time to solve problems. In the real world, I have to get things done in a set amount of time.

My Big Data Example Using Worldwide Climate Data

For the past few years, I have been doing a late-night dance with worldwide climate data and Tableau. I have previously written about the beginnings of this work, but I’m now getting ready to unveil a more comprehensive story.

In upcoming blog posts, I am going to explain how Alteryx and Tableau can be used to examine worldwide climate data. These posts will not try to assess the chaos and complexities of global climate and global warming like this brilliant post from John Baez does. However, my posts will show how we can attack a big data problem and use our software tools to see what has been happening to climatic variables measured across the earth over time. I will explain how Alteryx can be used to process the data and how Tableau can be used to visualize it. This work will attempt to address a statement I previously wrote:

There are limits to everything, of course, and I’m starting to experience a bit of a bumpy ride as I speed down the multi-billion line Big Data super highway with Tableau.

The Data Setup

I am not going to hit you with the full complement of details for this particular example at this time. That will occur soon enough. What I will do is explain the problem and show how I developed a solution.

The climate data could be assembled into a multi-billion line behemoth of a file, and I had done exactly that a couple of years ago. It wasn't fun, it took a long time, and if I wanted to add the newest data to my analysis, I would have to do it all over again! When I sent this giant file to Tableau, things didn't go as planned. Tableau balked, turned around, and ran as fast as it could, as though it had been asked to fight the behemoth shown in Figure 1. In a more recent version of Tableau, I suspect the outcome would be better due to a new data driver installed in Version 8.2, but I didn't want to repeat that experiment.

Figure 1 – The behemoth.

To have a chance to really work with the worldwide data from nearly 100,000 monitoring stations, I decided to employ my best new buddy – Alteryx. Using Alteryx, I wanted to find a way to pick and choose any number of monitoring stations for my analysis. I also didn't want to go down the path of a one-time assembly of files, only to realize later that I needed to do it again when the data got updated (which happens every month). I wanted to create a repeatable workflow for this job so that next year, or the year after that, I can continue to work with the latest data as it gets published.

Statement of the Problem

With that as background, my problem was that I had a lot of files to process to reach my goal of visualizing climate data from around the world. In fact, I have about 100,000 really ugly flat files that represent worldwide climate data from these monitoring stations, some of which go back to the mid-1700s. Working with flat files is not a whole lot of fun, unless you have Alteryx, that is.

Flat files are used to efficiently store data by creating compact data structures that have no extra content like field delimiters. The problem is that the compact data structures have to be parsed, transposed, and rejoined in a multi-step procedure to put the data into the format needed for efficient analysis in Tableau. That requires a workflow and Alteryx is the tool to make that happen.
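
To make the parse-transpose-rejoin idea a little more concrete, here is a rough Python sketch of that logic. My actual processing happens in an Alteryx workflow, not a script, and the fixed-width column positions and missing-value sentinel below are invented for illustration (the real station files have their own layout), but the pattern is the same: read the compact record, then reshape the per-day columns into one tidy row per day.

```python
# A minimal sketch of the parse-and-transpose step, NOT the actual Alteryx
# workflow. The column positions and the -9999 missing-value sentinel are
# assumptions for illustration only.
import pandas as pd

def parse_station_file(path):
    # Hypothetical layout: station id, year, month, element code,
    # followed by 31 daily values (one 5-character column per day).
    colspecs = [(0, 11), (11, 15), (15, 17), (17, 21)] + \
               [(21 + 5 * d, 26 + 5 * d) for d in range(31)]
    names = ["station", "year", "month", "element"] + \
            [f"day_{d + 1}" for d in range(31)]
    wide = pd.read_fwf(path, colspecs=colspecs, names=names)

    # Transpose: melt the 31 day columns into one row per station/date/element.
    tidy = wide.melt(
        id_vars=["station", "year", "month", "element"],
        var_name="day", value_name="value",
    )
    tidy["day"] = tidy["day"].str.replace("day_", "").astype(int)

    # Drop the missing-value sentinel commonly used in climate archives.
    return tidy[tidy["value"] != -9999]
```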

In this example, what I needed Alteryx to do was to process only a subset of those 100,000 flat files. My goal was to go directly from a user-defined series of ugly flat files to a series of beautiful Tableau (*.tde) data extract files for any monitoring stations I chose to examine. In fact, I wanted to send a list of files (i.e., monitoring stations) to Alteryx to process in rapid succession. The list could be anywhere from 1 to 100,000 files long.
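
For reference, the "user-defined series of files" part is nothing exotic: it is simply a list of file paths built from whatever station IDs I want to examine. Here is a toy sketch of that idea (the folder name, file extension, and station IDs below are made up and may not match the real naming convention):

```python
# Build the list of flat files to process from a user-defined list of stations.
# Folder name, file extension, and station IDs are illustrative assumptions only.
from pathlib import Path

DATA_DIR = Path("climate_flat_files")              # hypothetical folder of ~100,000 files
stations_to_run = ["USW00012839", "USW00014922"]   # could be 1 to 100,000 station IDs

selected_files = [DATA_DIR / f"{station}.txt" for station in stations_to_run]
selected_files = [path for path in selected_files if path.exists()]
```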

This sounds exactly like a job that Alteryx is designed to do, right? Well, not so fast. After assembling the principal components of the workflow as shown in Figure 2, I was stopped in my tracks. I simply could not determine how to tell Alteryx to process a selected list of files in succession.


Figure 2 – A part of the workflow for processing climate data. Details of this workflow will be explained in upcoming posts because there are some awesome concepts captured here.

I studied the Alteryx documentation. I thought I had a solution using a particular tool called the "Dynamic Input Tool". Unfortunately, not enough documentation existed for me to apply this tool in the way I needed to. So I continued to search. I searched the internet and I searched through blogs. I tried many combinations of searches, most of which returned no insight. I thought about the job from a programmer's point of view and realized how I would solve it by writing a program. I also realized how important examples are for using Alteryx tools that you haven't used before.

Since I wasn’t able to find a solution, I decided it was time to ask for help. I wrote an email to my Alteryx super-hero, Ned Harding (CTO of Alteryx), as shown in Figure 3. I described the problem in general terms.

Figure 3 – An unsent email to Ned.

Just before sending the email, however, a memory emerged of one of my Tableau super-heroes. I thought of a conversation I had with Joe Mako a few weeks back, and suddenly I realized that I could use a macro approach to solve the problem. I remembered that Joe had told me about a few macro training videos on the Alteryx website, so I took a look at them. Even after watching them, I was still unsure of how to solve the problem. I wondered if it was a batch macro that I needed.

So I started my internet search over, this time looking for macro applications of Alteryx. One of the first things I found was a blog post (Feb 12, 2014) from Ned on building Simple Batch Macros. By reading his blog post and watching the videos, I was able to create my own batch macro approach for the workflow shown in Figure 2.

Finding this solution took me a bit of effort, so the details of the approach will be documented in an upcoming blog post. I needed two forms of input to understand how to solve the problem: a blog post coupled with a training video. Neither information source by itself was sufficient. Ned’s blog post did not show me how to install the macro into the driver program (sometimes the obvious things are easily overlooked when you are the program developer!). The training video didn’t show me how to load a list of files into a batch macro, but it did show me how to install the macro into the driver program. There were missing details in each that I had to piece together to make the solution work. By taking the time to study both guides, I developed the solution.
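
If you think about what a batch macro does in plain programming terms, it is essentially a driver that feeds one record at a time (here, one file path) to a reusable workflow, and the workflow runs once per record. Below is a rough Python analogy of that idea, reusing the parse_station_file sketch from earlier. This is not Ned's macro and not my actual Alteryx solution, just the shape of the logic:

```python
# A plain-Python analogy of the batch-macro pattern: a driver loops over a
# control list and runs the same single-file workflow for each entry.
# (parse_station_file comes from the earlier sketch; the CSV output here is a
# stand-in, since the real goal was one Tableau extract per station.)
from pathlib import Path

def run_single_station_workflow(path, out_dir):
    """Stand-in for the per-file workflow in Figure 2: parse, reshape, write."""
    tidy = parse_station_file(path)
    out_dir.mkdir(parents=True, exist_ok=True)
    tidy.to_csv(out_dir / f"{path.stem}.csv", index=False)

def run_batch(selected_files, out_dir=Path("extracts")):
    """Plays the batch-macro role: iterate the list, run the workflow per file."""
    for path in selected_files:
        run_single_station_workflow(path, out_dir)

# Driver call, using the user-defined file list from the earlier sketch.
run_batch(selected_files)
```

The important part is the separation of duties: the per-station workflow knows nothing about the list, and the driver knows nothing about the parsing. That separation is exactly what keeps the workflow repeatable when new data arrives each month.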

Once I finished the program and experienced the satisfaction of having found a solution, I did something that I should do more often. I wrote Ned a thank-you note, as shown in Figure 4.


Figure 4 – A thank you note to Ned.

Because Ned spent his time writing that blog post, and Joe took the time to chat with me late one night, I was able to piece together an approach that allows me to continue my research. This is called collaboration, and it is fun to experience. It is a real-life example of the power of blogs and the value of sharing your knowledge.

This example illustrates how you can use existing information to solve problems. It also gives some insight into the work required behind the scenes to make a technical blog possible. The upcoming climate blog posts will document a lot of the techniques required to solve a big data problem like this. The genie does not always emerge from the bottle to instantly grant your wishes, but if you are patient and determined enough, the genie will usually help you out. Most of the time the genies don’t look like Barbara Eden did back then – they look like Joe or Ned! Hey, whoever said life was always fair?
