How to Use and Better Understand Alteryx Batch Macros

Preamble

I originally wrote this article 1.5 years ago, in March 2015. I believe it has sat languishing in my draft blog box, like a long-forgotten Grossery Ganger! I think that it is now time to get on with it.

Tonight I decided to make an instructional video to get this concept completed. The article is less than I wanted it to be, but somehow maybe more than I hoped it would be. The instructional video will help people in the future, so I decided it was time to unleash this beast as I continue to work through my blog backlog.

Was this a simple case of procrastination on my part? Or did I stop writing the article to learn the macro techniques better than when I started writing the paper? Only I know that answer, but I encourage you to watch Adam Grant in this TED talk (Figure 0) for more enlightenment on the subject of original thinkers and procrastination.

Original_thinkers
Figure 0 – Click the Figure to go to the excellent TED talk.

Introduction

When you are a serious mountain climber, you eventually try to reach the highest peaks. As you ascend towards the heavens, you try to gobble up air that isn’t there. You may subsequently need supplemental oxygen to make it to the top.

That might be true for most people unless you are the impressive, superhuman Wim Hof. Wim climbed nearly to the top of Mt Everest without oxygen while wearing only shorts!  Go ahead and watch the documentary and read the article if you don’t believe me. If I can find the time to write another essay, you will soon be reading about how I practiced a method similar to the Wim Hof method starting in 1979 or so, which was at least ten years before he discovered the ways he now practices. (Update – here is the article!).

Now getting back to the story, when you work with Alteryx, your skills can improve dramatically over time. You might find yourself becoming more capable of producing efficient, fast workflows that can generate a lot of helpful information.

Eventually, you might find yourself producing workflows that cannot execute because too much information is being created within the workflow or the workflow is taking too much memory or time to complete. These are the times when you need to reach for supplemental oxygen. In this case, the oxygen comes to you in the form of Alteryx macros.

Disclaimer

I am still learning a lot about Alteryx daily, that is for sure. I am just a guy that wants to understand how better to use one of the most fantastic software packages ever created.

By the time this article is complete, I will have authored several batch macros because I will produce examples that will help me (and hopefully you) to understand Alteryx macros better.

For these reasons, I might not produce perfect macro solutions to the problems I am attacking. However, I promise you that my approaches will be well-documented and will work. The macros will be for real-world problems developed by me as I strive to accomplish previously impossible tasks in Alteryx.

I read something that has driven me as a teacher and as a scientist a long time ago. The great physicist Richard Feynman once said that if he couldn’t explain a topic to a kindergarten-age child, he didn’t understand the topic.

At the time of the initial writing of this article, I do not entirely understand Alteryx macros. Eventually, I will have a better understanding of them because I will teach myself new concepts. If this article is a good one, you will have a better understanding of them, too.

What Are Alteryx Macros?

You can write Alteryx macros in various forms, including (1) standard macros, (2) batch macros, and (3) iterative macros. These techniques appear to me to generally increase in complexity from 1 to 2 to 3. Each type of macro allows a process (workflow) to be repeated until all records or groups of records (or a specified number of them) pass through the macro.

A simple macro is an approach that allows a user-entered selection to be included in the macro calculations. You might also ask the user a question that impacts how you are going to execute your workflow. For example, you might have a workflow that calculates monthly margin for your company, but that margin depends upon a multiplier that gets calculated outside of your workflow. This multiplier can be obtained in a simple macro from the user at the time the monthly margins are created.

A batch macro is a tool that allows you to send information into a workflow to run the workflow a set number of times. By controlling what you send to the workflow, you can control how much content is being produced, how much memory is being consumed, and how long each workflow will take. Batch macros allow you great flexibility in executing your workflows.

An iterative macro is a tool that allows you to run a workflow any number of times (up to 2 billion iterations) until a specified criterion is met. You can place a limit on the number of iterations that can occur in the macro. Writing an iterative macro is more advanced than either the simple or batch macros and requires more thought on how to design the workflow.

The Focus of This Article

In this initial article, I focus on the second type of Alteryx macros – batch macros. The reason for this is that I recently had to develop a couple of batch macros to get some computationally intensive work completed. Eventually, I will produce examples of each type of macro to demonstrate their flexibility and how I use them in many Alteryx workflows.

The reason I want to write this article this is simple. As I continue to improve my Alteryx skills, I am finding that I need to use macros. The existing documentation and examples I have seen have not been sufficient to learn to be a competent macro builder.

For some of my more advanced workflows, I have to have some fine-scale control over the workflows to avoid excessive run times and memory consumption. When you work with massive data sets, you may have to be smart with designing and executing the workflows.

Background

I have only needed to use a batch macro once in the time that I have been using Alteryx. I wrote an article that described the usage of this macro.

In that example, I used a simple batch macro to send a list of climate monitoring stations into a moderately complex workflow to process time-series climate data. I needed that macro to avoid producing too many output files at too many climate monitoring stations.

I wanted to process data from a select list of monitoring stations I had chosen, which required using a batch macro. So if you find yourself in a situation where you have a lot of input files to send to a workflow, and you only want to send a subset of them, read that article to find out how to create that type of batch macro.

Developing that macro took me way too long, and even now, when I revisit it, I still have to think about how I created it to understand how it works.

Conceptually, it is straightforward. I send a list of files into a workflow to process the data at these monitoring stations. The list of files is used in the batch macro to process the data one file at a time, which is the control that I desired for this workflow.

The review of this work taught me one thing. Alteryx batch macros may not be intuitively obvious when you first start using them. My inability to immediately comprehend what my macro was doing (after a few months of elapsed time) surprised me a bit since I have created hundreds of macros for all kinds of computer applications for over 30 years.

To improve my comprehension and command of Alteryx macros, I decided to research, test, and develop additional examples to help others use these techniques.

Existing Macro Documentation

When I researched using Alteryx macros, I primarily studied three training videos available as online resources. I would strongly recommend viewing these three macros to anyone that wants to learn how to write Alteryx macros. The hyperlinks to these training videos are below:

  1. Standard macros
  2. Batch macros
  3. Iterative macros

Each example gives a complete description of the sequences needed to complete the macros. However, the problems are insufficiently explained in many of these videos. The workflows and macros are quickly developed without adequately preparing the user with a problem statement. Additionally, the examples are not necessarily real-world-based, and the problems are ad-hoc in nature and are intended only for demonstrating the macro techniques.  For these reasons, I had to watch these videos repeatedly to understand the full complexity of each type of macro.

Additionally, Alteryx provides several macro examples in sample files that are supplied with the software. As shown in Figure 1, there are many samples to study. I didn’t realize that these examples existed until I developed the examples shown later in this article. Now I have more studying to do to learn additional techniques!

Figure 1 - Aleryx sample macros.
Figure 1 – Aleryx sample macros.

 

My Example Batch Macros

I recently developed a workflow that extracted demographic data from a series of over 4200 locations across the US. An Excel data source contained the locations that formed the beginning of the workflow.

If I tried to run the workflow in its entirety, the workflow eventually ground to a halt because the setup I used consumed all of the system resources (RAM and hard drive space). This approach did not allow me to complete all of the needed processing. To solve this problem, I thought of a few ways that I could complete the work.

Method 0 – The Brute Force Method (Not Recommended, but It Can Work!)

Although I am ashamed to admit it, I went old-school and went toe-to-toe with Alteryx in a weekend-long battle of wills. I reverted to my roots and used the brute-force method to complete the job while whittling away the time down in Austin, TX.

The brute force method is not elegant, and it is not macro-based. I did this for three reasons: (1) I didn’t know how to write a macro to do this work; (2) it was a Friday night, and I was all alone with nothing better to do all weekend; and (3) I wanted to get these results for a big demonstration project. Sometimes you have got to get down and get dirty, and that is what I decided to do that weekend. I don’t regret it at all!

I started the workflow and used the record-limiting feature of Alteryx (Figure 2) to control how many locations would be processed at one time. Through trial and error, I found out that if I chose about 200 locations, the workflow took about 1.5 hours to complete.

With 4200 locations to process, this meant that I had to do this 21 times!. After each workflow run, I had to delete the top 200 rows of the Excel database to get to the next set of 200 records to process. It took me over two days of nearly around-the-clock attention to complete this work. I wouldn’t recommend this to a friend, however.

Brute_force
Figure 2 – Using a 200 record limit to execute the workflow 200 times. After completing each 200 record block, I had to have to remove the top 200 rows from the database to get to the next set of 200 records.

 

Moving Beyond the Brute Force Method into Macro Land

After completing the work, I thought about setting up a macro to complete the work. There were two ways I could visualize for doing this type of job.

First, I wanted to run the workflow (n) times based on a field in the database. In this case, I tried to process locations by state name, which was a field in the database.

Second, I wanted to control the group size (or record count) pushed through the batch macro. If I chose, I could process 1, 5, 10, or 100 records at a time. I wanted to create a macro that had manageable workflow run times.

You might wonder how this is different than the brute-force method. Well, the macro can execute the entire job n-records at a time. I had to do Excel record deletions in the brute force method above and start the workflow 21 times. With this proposed batch approach, I only had to run the workflow once, with the batch processing doing the job 200 records at a time for 21 times!

To accomplish the second objective, I summoned some Alteryx help, which came back to me in the form of an excellent Alteryx sales engineer named Anthony Johnson. We built the group-size-based batch macro in a relatively short working session that did what I wanted to do. Sometimes there is no substitute for experience, and Anthony had the right experience for the job.

Method 1 – Create a batch Macro to Process a List of States

Today, I decided to complete this article. Rather than writing more jargon, I demonstrate how I converted a standard workflow into a batch macro. The video shown below discusses all aspects of the standard workflow and how it gets converted to a batch macro.

I also show how to build the driver program, including how to insert the macro. I think this video will be well worth watching for anyone wanting to learn how to make an Alteryx batch macro. The twelve minutes it takes you to watch it will save you hours in the long run.



 

Method 2 – A Batch Macro To Process a Set Group Size

Anthony and I built a batch macro to execute my workflow N-times, in which (N) represents the number of locations to be sent through the workflow. Before I show the macro for this case, let me explain something this is important to know.

To build batch macros, you start by converting the standard workflow into a batch macro. Then you install some controls in your batch macro to enable the batch behavior. Once the controls are in place,  you design a driver workflow to send the information into the batch macro. The video in the previous section shows these steps.

Figure 3 shows the batch macro we developed for this case of running a macro N-times based on a group or block size. You should choose the group size to maintain control on memory usage and/or execution time.

Groupsize_workflow
Figure 3 – The batch macro developed for running a workflow N-times, based on a group size that is set in the driver program.

Figures 4 through 7 show workflow details from Figure 3. These tool configurations were developed to run the workflow in the group sizes defined in Figure 8.

Group_Numbers_Step1
Figure 4 – Details of the beginning of the batch macro.

Group_Numbers_Step2
Figure 5 – The mathematics of the GroupName.

Group_Numbers_Step3
Figure 6 – Configuration of the Update Value action.

Group_Numbers_Step4
Figure 7 – When the GroupName = 0, the macro is executed with the block of data you predetermined in your driver program.

The constructed driver program (Figure 8) was used to drive this program. With over 4,200 locations used and thousands of demographic variables extracted using each drive time polygon, the batch macro approach was perfect for executing this massive volume of work. The results of this work were mind-blowing, by the way. Long live Alteryx, the king of data prep, processing, and full-spectrum analysis!

Group_Numbers_Driver_Program
Figure 8 – The GroupName block size is set to 420 rows of data. This means that with about 4200 locations, this macro would execute 10 times.

 

Final Thoughts

According to Ned Harding, the chief mastermind of Alteryx, the dynamic input tool is better for doing work like this. He once explained why using the dynamic input tool is superior, but that was more than a year ago, and I cannot remember what he said!

As I explained to Ned at the time, I could not find any examples of how to use the dynamic input tool. Since that time, however, there are now a few examples of dynamic input tool usage thanks to superstars like Chris Love and a few others. I’ll have to indulge in this greatness at a later date.

However, this batch macro technique makes sense for an old-school guy like me and is easy to implement once you understand the fundamentals. One day I’ll write the second article on iterative macros, especially since I spent about two decades studying the topics of convergence, iterations, and numerical methods.

Additional Resources

The information lab published some Alteryx macro articles in 2017. Here are some references:

Iterative macro: Creating an Iterative Macro

A general search of their website: The complete list of macro articles


Update in October, 2021

After writing many batch macros in the five years since I wrote this article, I realized that I should explain a few techniques I developed for processing routine big data sets. I recently made a couple of YouTube videos that include:

  • an example of how I run monthly workflows (Part 1)
  • an example for simultaneously downloading multiple monthly data files (Part 2)


6 thoughts on “How to Use and Better Understand Alteryx Batch Macros

  1. A very helpful post, Ken…your videos are what I used in my early Alteryx days because they really helped me understand the thought process behind building workflows. And as you know, it took me a little while to get the hang of macros as well. 🙂 But once you understand their basic logic, new worlds of opportunity open up, similar to parameters and table calculations in Tableau.

    • Hi Mike,

      This article was a way for me to organize my thoughts and to put reference material in one place. I still think there is more work for me to do on this topic because there is so much untapped potential in this technique. I used to long for a similar capability in Tableau: //datablends.us/2013/10/08/i-wish-tableau-had-these-two-capabilities/ , which I understand will likely never happen.

      I appreciate your comments about how my videos have helped you. You certainly have an immense amount of talent based on your recent Turducken creation!: http://mixpixviz.blogspot.com/2016/07/creating-twb-audit-workflow-in-alteryx.html. You have rocketed through the Alteryx learning curve like you are riding a SpaceX rocket to the moon.

      Here is a funny story for you. In my first 50 years of life, I ate enough meat to fuel several people. Now in my second 50 years of life, there will be no meat in my future because I am now a plant-based eater thanks to the profound influence and teachings of Rich Roll: http://www.richroll.com/.

      For this reason, I didn’t understand what a Turducken was in your article, even with the awesome graphic included. Yesterday at the Tableau 10 roadshow, Nelson Davis and I were talking about you. I was bragging about your ascent into the Alteryx stratosphere and I told him about your article, of which he was already familiar. Next, he had to explain to me what a Turducken represented! He said: “I think its a chicken stuffed inside a duck, that gets stuffed inside a turkey, or something like that!” I was cracking up at the prospect of seeing such a thing. The light finally turned on for me!

      Thanks for your friendship, Mike,

      Ken

      • That is pretty funny…I think I may have actually had a Turducken before but it’s quite possible that I’ve only been in their mythic midst once or twice but never partaken. I suspect you won’t leave the earth with any regret if you don’t eat one. 🙂

        Thanks for the compliments on my Alteryx pursuits…they mean a lot coming from you since you were definitely one of the key people out there who made me really want to give that software a spin (the other was Joe Mako). What you’ve been able to do with it has been inspiring, to say the least.

        I do hope we get to share a beer or two at some point – assuming you haven’t given up alcohol in your latter 50 as well. I won’t be able to make it to TCC this year (our company has drastically cut their conference/travel budget this year), but maybe I’ll get to the next Alteryx one. But if you ever find yourself near the Dagobah System (aka Seattle, Tableau’s HQ), let me know.

      • Yes, Vegas it will be. When I told my wife that Inspire will be in Vegas next year, she said, pack your bags and start saving some serious cash$$$! She and my daughter Sarah had a great time in Vegas while we were at TC15. Me, you and Joe need to find some time to hang out and do a little brain stormin’ or Turducken, as the case may be!

  2. Pingback: I Wish #Tableau Had These Two Capabilities | 3danim8's Blog

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.