We Need To Reframe How We Learn Recursion

It's about time we reframed the education of recursion around real life scenarios instead of elegant mathematical equations

Algorithms
Comp Sci
JavaScript
December 3rd, 2020

Two circles recursively regenerating inside of themselves

For programmers - especially self-taught programmers - our first introduction to the world of 'recursion' is often in the form of something math-related. It's not uncommon for programmers to intrinsically call upon some of our favourite F words when we think of recursion - no not that F word, these ones:

Fibonacci

function fibonacci(position) {
  if (position < 3) return 1;
  return fibonacci(position - 1) + fibonacci(position - 2);
}

Factorial

function factorial(num) {
  if (num === 1) return num;
  return num * factorial(num - 1);
}

Recursive versions of the fibonacci and factorial functions are some of the more beautiful pieces of code you can expect to see. These concise slices of code rely on little more than themselves to get the job done. They embody the very definition of recursion - a function that calls itself (the calling of itself is the part that is recursive). It's really no wonder that just like when a tutorial incorporates a new topic by demonstrating how it interacts with something trivial, like a counter or a 'to do' list, fibonacci and factorial functions encapsulate the topic of recursion with little to zero interference from the external complexities you might find as soon as you start introducing other bits of code to interact with. Just like the counter and the 'to do' list, fibonacci and factorial are trivial.

You may have heard a saying that "all iterative algorithms can be expressed recursively". In other words, a function that uses a loop can be rewritten to use itself. Now if any iterative algorithm can be written recursively, the same must be true for the inverse.

Note: An iterative algorithm, or function, is one that makes use of a loop to get the job done.

Let's go back to our recursive Fibonacci function and write it as an iterative Fibonacci function instead:

function fibonacci(index = 1) {
  let sequence = [0, 1, 1];
  if (index < 3) return 1;
  for (let i = 2; i < index; i++) {
    sequence.push(sequence[sequence.length - 1] + sequence[sequence.length - 2]);
  }
  return sequence[index];
}

Let's also take our recursive Factorial function and write that as an iterative Factorial function:

function factorial(num) {
  if (num === 1) return 1;

  for (let i = num - 1; i >= 1; i--) {
    num = num * i;
  }
  return num;
}

Both of our recursive and iterative approaches reach the same conclusion. If you give them the same input, they will produce the same output. Even the path the take is, arguably, the same. The only difference is the metaphorical mode of transportation used on the path to reach said output.

So if we're able to write recursive functions iteratively, why do we even need to bother learning about recursion in the first place, and what use does it have in programming?

The main reason we use recursion is to simplify an algorithm into terms easily understood by most people. It's important to note here that the purpose of recursion (or rather, its benefit of recursion) is to make our code easier to read, and easier to reason with. However, it is important to be aware that recursion is not a mechanism we can use to optimize our code for performance - if anything, it can have an adverse effect on performance compared to an equivalent function written iteratively.

If you want a short soundbite to remember, I'd say this: Recursive functions optimise legibility for developers. Iterative functions optimise performance for computers.

The frustrating thing about a lot of articles and tutorials on recursion I've read before is that they veer on the side of caution and fixate themselves on only talking about functions such as a recursive fibonacci or factorial. There's no real advantage a developer can gain from these being written recursively. This is like giving someone the punchline to a joke, without any of the steps leading up to it.

In the case of recursive fibonacci and factorial functions, we saw earlier that if we convert them into iterative functions the code itself isn't actually that bad. It's arguably just as easy to read and legible as the recursive equivalents. Sure, it might not have the same level of elegance, but if it's still readable, I'm going to go ahead and favour optimising for perfornance.

It therefore seems plausible that a lot of people learning about recursion for the first time may struggle to see any real use from using it. Maybe they just see it as some sort of overengineering abstraction. So how is recursion useful? Or better yet, how can we make the study of recursion useful?

Useful recursion can be found when we actually try to write code that resembles a real life scenario.

I can probably take a punt and say that close to zero developers anywhere (outside of a whiteboard interview) have been expected to write a recursive fibonacci or factorial function for their job - and if they were, they could have just googled it because there's a million articles that already explain it.

There aren't, however, enough articles demonstrating how recursion can be used in real life scenarios. We need fewer 'Introduction to Recursion' articles, and more articles about how recursion can help you solve that coding problem you're currently facing at work.

So now that we're on the topic, let's imagine the following scenario: your boss has given you a data structure full of departments that each contain the email addresses of everyone that works in that department. However, each department sometimes is made up varying levels of subdepartments, objects and arrays. Your boss wants you to write a function that can send an email to each of those email addresses.

Here's the data structure:

const companyEmailAddresses = {
  finance: ["jill@companyx.com", "frank@companyx.com"],
  engineering: {
    qa: ["ahmed@companyx.com", "taylor@companyx.com"],
    development: ["cletus@companyx.com", "bojangles@companyx.com", "bibi@companyx.com"],
  },
  management: {
    directors: ["tanya@companyx.com", "odell@companyx.com", "amin@companyx.com"],
    projectLeaders: [
      "bobby@companyx.com",
      "jack@companyx.com",
      "harry@companyx.com",
      "oscar@companyx.com",
    ],
    hr: ["mo@companyx.com", "jag@companyx.com", "ilaria@companyx.com"],
  },
  sales: {
    business: {
      senior: ["larry@companyx.com", "sally@companyx.com"],
    },
    client: {
      senior: ["jola@companyx.com", "amit@companyx.com"],
      exec: ["asha@companyx.com", "megan@companyx.com"],
    },
  },
};

So, how do you tackle this?

Well, from what we can see, a subdepartment appears to take the form of an object, while arrays have been used to store email addresses. So I could try and write some sort of iterative function that loops through each department, and checks if it is a subdepartment (object) or a list of email addresses (array). If it's an array, we can then loop through the array and send an email to each email address. If it's an object, we could create another loop to tackle that subdepartment, using the same 'check if it's an object or an array' tactic we used earlier. From what we can tell, our data structure doesn't go any deeper than two sub levels. So one more iteration should satisfy all levels and do what we want it to.

Our final code might look something like this:

function sendEmail(emailAddress) {
  console.log(`sending email to ${emailAddress}`);
}

function gatherEmailAddresses(departments) {
  let departmentKeys = Object.keys(departments);
  for (let i = 0; i < departmentKeys.length; i++) {
    if (Array.isArray(departments[departmentKeys[i]])) {
      departments[departmentKeys[i]].forEach((email) => sendEmail(email));
    } else {
      for (let dept in departments[departmentKeys[i]]) {
        if (Array.isArray(departments[departmentKeys[i]][dept])) {
          departments[departmentKeys[i]][dept].forEach((email) => sendEmail(email));
        } else {
          for (let subDept in departments[departmentKeys[i]][dept])
            if (Array.isArray(departments[departmentKeys[i]][dept][subDept])) {
              departments[departmentKeys[i]][dept][subDept].forEach((email) => sendEmail(email));
            }
        }
      }
    }
  }
}

I've checked it out, maybe I've even written a little unit test for it. It looks like a mess, but it works. With the amount of nested loops here you could argue that it's highly inefficient. Someone in the background hollers "Big O? More like Big OMG, am I right?"

Now of course, there are other ways we could have tackled this, but what we have at the moment works. Anywho, it's just a little function and there's not too many email addresses in there and we probably won't use it often, so it doesn't matter. Let's get back to working on more important projects before the boss finds another side task for me to work on!

Five minutes later and your boss returns and mentions that they also want the function to still work even if new departments, subdepartments, and email addresses are added to this data structure at a later point. Okay, this changes things because now I need to factor in the possibility of subsubsubdepartments, subsubsubsubdepartments and so on.

Suddenly my iterative function no longer satisifes the criteria.

But lo and behold, a recursive solution enters!

function sendEmail(emailAddress) {
  console.log(`sending email to ${emailAddress}`);
}

function gatherEmailAddresses(departments) {
  let departmentKeys = Object.keys(departments);
  departmentKeys.forEach((dept) => {
    if (Array.isArray(departments[dept])) {
      return departments[dept].forEach((email) => sendEmail(email));
    }
    return gatherEmailAddresses(departments[dept]);
  });
}

Okay, we finally have a function that makes use of recursion in something more fitting of a real-life scenario. Let's explain how it all works.

Our function loops through the keys from the companyEmailAddresses data structure, checks if the value of that key is an array, if it is, it sends an email to each value in that array. However, if the value of the aforementioned key is not an array, it'll call itself - gatherEmailAddresses again (it recurses). However, instead of passing in the entire companyEmailAddresses object like it did the first time, it will just pass in the object node for the subdirectory it was initially looping through.

This function provides us with two benefits on our previous iterative counterpart:

  1. It caters to the additional criteria imposed by our boss. As long as it continues to follow the same pattern, the function can still handle any new objects or arrays added to it without the need to change a single line of code;
  2. It is easier to read. We don't have a bunch of nested loops that our brains have to try and keep track of.

You'll also notice that our function actually contains iterative and recursive elements. There's no reason why an algorithm has to be exclusively one or the other. It's perfectly fine to have something iterative such as a forEach function that contains recursive calls to itself.

Let's just pick up on my second point for a moment. It's only easier to read if you understand how recursion works outside of the confines of fibonacci, factorial, and any other clinical coding trials you might find in a 'How To Crack The Coding Interview' book/course. So let's spend some time explaining exactly what is happening inside of our recursive function.

Our function takes one value - an object - as its only parameter. The object we pass in is the entire companyEmailAddresses variable, which is this grotesque monster:

const companyEmailAddresses = {
  finance: ["jill@companyx.com", "frank@companyx.com"],
  engineering: {
    qa: ["ahmed@companyx.com", "taylor@companyx.com"],
    development: ["cletus@companyx.com", "bojangles@companyx.com", "bibi@companyx.com"],
  },
  management: {
    directors: ["tanya@companyx.com", "odell@companyx.com", "amin@companyx.com"],
    projectLeaders: [
      "bobby@companyx.com",
      "jack@companyx.com",
      "harry@companyx.com",
      "oscar@companyx.com",
    ],
    hr: ["mo@companyx.com", "jag@companyx.com", "ilaria@companyx.com"],
  },
  sales: {
    business: {
      senior: ["larry@companyx.com", "sally@companyx.com"],
    },
    client: {
      senior: ["jola@companyx.com", "amit@companyx.com"],
      exec: ["asha@companyx.com", "megan@companyx.com"],
    },
  },
};

The first thing we do is run Object.keys() on it which returns an array with each department. Something that would look like this: ["finance", "engineering", "management", "sales"]. We then proceed to forEach loop through our companyEmailAddresses, using our array of departments as a way to check each department for certain things. In our case, we use it to check if the structure of each node is an array or not, which we do with Array.isArray(departments[dept]. If it returns true, we just proceed to loop through that array, applying out sendEmail() function on each value. Simple enough, and so far we've not used any recursion. Maybe you didn't even need to read this paragraph, but at least this explanation was here if you did need it. Anyway, let's get to the good bit - the recursive bit.

If our Array.isArray(departments[dept] returns false, that means we have an object, which in our case means we have a subdepartment. In our iterative function, we simply repeated the process - ie, we made another loop - but did it for the subdepartment. But instead, for our recursive function we call the gatherEmailAddresses() function again, seemingly passing in the same companyEmailAddresses object as before. The key difference here is that instead of passing the object in from its root (the entire object), we pass it in from the position of the subdirectory - the subdirectory becomes the new root. We know that our companyEmailAddresses object is just made up of lots of objects that either contain another object or an array. Therefore, our function has been written in such a way that if it hits an array, it knows that it is the end of a node (a lead), so it will try and 'send emails'. But if it hits an object, it needs to continue to traverse deeper.

Make sense? Here are some diagram-ish code blocks I cobbled together to try to help illustrate further:

Our entire object has four departments.

// The first department is an array.
// It requires no additional traversal as we have immediately hit the leaf node.
// This department would return true to Array.isArray()

  finance: [
            "jill@companyx.com",
            "frank@companyx.com"
           ],
// The second department is an object.
// This department would return false to Array.isArray()
// It requires additional traversal
// so we call gatherEmailAddresses() function, passing in department[dept]
// which is equivalent to companyEmailAddresses["engineering"]
// or the code you see below.

  engineering: {
    qa: [
         "ahmed@companyx.com",
         "taylor@companyx.com"
        ],
    development: [
                  "cletus@companyx.com",
                  "bojangles@companyx.com",
                  "bibi@companyx.com"
                 ],
  },

As we have now recursively called our function again, we do not pass onto the third department yet as our recursive function needs to be handled first - ie, we are still handling the second department. The technical way to explain this is that our recursive call is added to our call stack.

// If you recall, the first thing our function does
// is Object.keys() on the object passed in.
// This gives us ["qa", "development"].
// We then loop through each department (or subdepartment in this case).
// We check if the "qa" department/subdepartment is an array with Array.isArray()
// It is, so would return true, so we can then use the sendEmail() function.
// The same would also occur with "development", as that is also an array.

  engineering: {
    qa: [
         "ahmed@companyx.com",
         "taylor@companyx.com"
        ],
    development: [
                  "cletus@companyx.com",
                  "bojangles@companyx.com",
                  "bibi@companyx.com"
                 ],
  },

The third department (three subdepartments) has a similar structure to the second department (two subdepartments), while the fourth department has two subdepartments, but those two subdepartments also contain two subdepartments - so our recursive function takes place again inside of each. I've skipped showing code block breakdowns for these as you hopefully understand how the recursive part is working.

Okay so now we have seen a practical example of recursion in practice. Hopefully this has provided you with a deeper understanding of recursion, along with another way of being able to approach a problems that require some sort of loop.

However, I'd like to highlight again that performance may suffer if recursion is used when it shouldn't be. Recursion shouldn't all of a sudden become your go-to tool instead of iteration, the gains you make in readability may be lost in performance. Ultimately it depends on the scenario presented to you. You will face some problems in programming that may lend themselves well to recursion, while others may lend themselves better to iteration. In some cases, like with the problem we faced earlier, a little bit of both approaches might be best.