forEach is a code smell

09 May 2023JavaScript codesmell

This is a bit of a hot-take, I suppose. And it definitely falls under the category of "silly micro optimizations" that people shouldn't trouble themselves with, but it's something I see a lot and I want to call it out: forEach, to me, is a code smell.

"Has Ben lost his mind?" Maybe.

"Has Ben gone downhill so far that this is sort of nonsense content he's producing?" Yes. Absolutely.

In truth, I've had this discussion a few times over the years. It's a strong opinion I hold weakly, and I wanted to just have all of my thoughts on this written down and published somewhere so I can share them rather than type them up over and over.

What is a "code smell"? #

What I mean when I say "code smell" is simply something that, when I see it, it makes me stop and look closely at what's going on in that area. It doesn't mean the usage is "always bad" or "not good" it just means it's something that gets my attention when I'm reviewing code. According to Wikipedia:

A code smell is any characteristic in the source code of a program that possibly indicates a deeper problem.

It goes on to be much more descriptive than my definition of a "code smell". But whatever, I'm saying when I see it, I think it "smell something" and I investigate.

What's wrong with forEach? #

Let me start off by saying: forEach isn't always bad, forEach can be really good. It's just suboptimal in many cases, so it warrants some care. Every use of forEach is essentially to loop over every element of a set of things. Almost always an Array, but sometimes other types, like a NodeList or Set for example. The difference between forEach and other methods of looping over sets of things — like for(;;), for..of, while, etc — is the fact that forEach requires a function instance be passed to it. And there's where its weakness (and strength) lies.

As soon as forEach is used inside of another function (which is almost always), particularly with an arrow function (again, almost always), that means every time we reach that forEach we have to allocate for a new function, including any scoped closures that function might have (very, very common). In fact, the only way we don't have to allocate a new function for that forEach would be if we didn't have any closed over variables, and therefor we could move the forEach callback up to a shared/module level.

If you have nested forEach calls, that only exacerbates things. Because now you're allocating O(n^nest) functions per call to the parent function, which perhaps is also called in a loop.

Generally, in-lined loops using for or while are going to perform a bit better in the nested case because they don't need to allocate an inner function per outer iteration.

Consider the following, overly simplified example functions:

ts
function processWithForEach(users: User[]) {
  let count = 0;
  users.forEach((user) => {
    user.orders.forEach((order) => {
      order.items.forEach((item) => {
        count++;
      });
    });
  });
  return count;
}
 
function processWithForOf(users: User[]) {
  let count = 0;
  for (const user of users) {
    for (const order of user.orders) {
      for (const item of order.items) {
        count++;
      }
    }
  }
  return count;
}
 
function processWithForIndex(users: User[]) {
  let count = 0;
  for (let u = 0; u < users.length; u++) {
    const user = users[u];
    const orders = user.orders;
    for (let o = 0; o < orders.length; o++) {
      const order = orders[o];
      const items = order.items;
      for (let i = 0; i < items.length; i++) {
        const item = items[i];
        count++;
      }
    }
  }
  return count;
}

When benchmarked the results are as follows:

processWithForEach x 299 ops/sec ±1.09% (88 runs sampled)
processWithForOf x 593 ops/sec ±0.95% (89 runs sampled)
processWithForIndex x 1,453 ops/sec ±0.89% (96 runs sampled)

Now, it's worth noting that what you do in each layer of these nested loops can cause the performance of each of these methods of looping over items to start to converge. The less efficient the code is inside of the loops, the more that becomes a determining factor of speed. But again, that's generally going to cause the performance of these loop types to converge, it's unlikely make forEach faster on its own.

Should you remove all `forEach` from your code? #

NO! STOP IT!! This is one person's silly blog entry, not gospel. Do whatever you find most readable/maintainable. That's the most important thing. I'm just saying that when I personally see a forEach, it causes me to look at the code. Particularly if it's code that is performance sensitive.

Does that mean you should use indexed for loops all the time if you want to move away from forEach? Nah. While for(;;) loops are indeed faster when dealing with indexed collections like Array, they're verbose and arguably harder to read. They also don't work with types that aren't ArrayLike. Instead, I tend to opt for for..of loops in almost every case unless I need the index, or performance needs dictate otherwise.

In the case of performance, there's likely something else that is your real performance bottleneck.. Like, for example, too many loops at all. If perf is your problem, start there. But while you're in there, maybe consider removing usages of forEach, just to make me happy. ;)

When is `forEach` good? #

Where forEach really shines is when you have to apply the exact same function over more than one array/set of things. In which case, you can create a function, then pass it to forEach of each array:

ts
const friends = ['Jay', 'Tracy'];
const neighbors = ['John', 'Joe', 'Allison'];
const coworkers = ['Steve', 'Jason'];
 
function sayHi(name: string) {
  console.log(`Hi, ${name}!`);
}
 
friends.forEach(sayHi);
neighbors.forEach(sayHi);
coworkers.forEach(sayHi);

It's clean, concise, and definitely more readable than if we were to do a for..of loop for each of those arrays just to call sayHi with each name.

When is `for(;;)` good? #

When you need an index for each item, or when you need to iterate over an indexed ArrayLike type very quickly. Otherwise, it's a bit too verbose in my opinion.

I prefer `for..of` in every other case #

for..of iteration loops have a lot of advantages.

Clean and readable
Work with ANY iterable: Array, Set, Map, NodeList, et al
Generallly good performance

The only real gotcha with for..of is it's going to allocate and use an Iterator under the hood, and there's a little more overhead to that than simple index incrementing like for(;;) does.

Other thoughts #

`for..of` perf is tied to the iterable implementation #

If you're doing a for..of loop over a type with a poorly implemented Symbol.iterator method. For example one that tries to use a generator (function*) function to create the returned iterator, then the performance of your for..of loop could be pretty darn slow. In these cases, if you control the type, I recommend you implement your Symbol.iterator method using a plain object-based iterator return, rather than a generator function. It will perform a lot better. This is something that's top of mind, because I literally submitted a PR to an OSS project related to this yesterday

What about reduce? #

A few people might look at those overly simplifed examples above and think "That could be done with nested reducers", and you're absolutely right. Nested reduce can even out perform the for..of implementation, but this is largely because they don't require any closed over variables. Making allocations and optimizations a little easier for the runtime. That said, and I know this is hard to take from the "RxJS Guy", but the nested reduce version is substantially less readable, and overall I find a plain for..of loop to be more readable and maintainable for most people than reduce. This is a view I've changed my mind on over the years. But perhaps I'll change it back some day.

Iterables as arguments #

Another advantage of using for..of loops is that, if you type your functions properly, they can be used with more types. Which is super interesting and sort of fun.

If we have a function that takes an array and is implemented with for..of like so:

ts
function countSteves(names: string[]) {
  let count = 0;
  for (const name of names) {
    if (name === 'Steve') {
      count++;
    }
  }
  return count;
}

We can simply change the argument type to Iterable<string> and all of a sudden the function can accept arrays, sets, and more, with no actual change to the runtime code. The only thing you'll have to watch out for is now you're supporting that API. So make that change mindfully.

ts
function countSteves(names: Iterable<string>) {
  let count = 0;
  for (const name of names) {
    if (name === 'Steve') {
      count++;
    }
  }
  return count;
}

Previous: RxJS Operators In-Depth - Part 2 - Subscription Chains

forEach is a code smell

What is a "code smell"? #

What's wrong with forEach? #

Should you remove all forEach from your code? #

When is forEach good? #

When is for(;;) good? #

I prefer for..of in every other case #