webdevjeff.us

Web Developer Jeff George

Big-O Basics

Explaining Big-O notation in words your mom can understand

Posted Nov. 2, 2015

Now that every computer on the planet is connected to every other computer on the planet, and each computer user has user accounts with dozens of online services, and each account has many files with megabytes of data, it’s hard not to be overwhelmed by the magnitude of the information we have to deal with as developers. When the amount of data we have to process gets this huge, we have to have a way to think and talk about it that our brains can comprehend. That way, believe it or not, is big-O notation.

I’m not going to go deep into the weeds to explain the math underlying big-O notation here—that’s the subject of a career, not a blog post. What I am going to do is try to explain the general concept that big-O is trying to encapsulate in a way that people without graduate degrees in mathematics—like me—can grasp without exploding our brains. In other words, I’m going to tell you how I’d explain big-O to my mom.

What’s it for, anyway?

We know from our earliest programming experiences that there are a potentially infinite number of ways to write a program to accomplish a given task, but that some ways are more efficient than others. Of course, if you’re just processing a handful of records in a prototype app, modern computers are so fast that it really doesn’t matter if the program is efficient or not. If your solution is ten times faster than mine, and your version runs in 50 milliseconds, mine still finishes in half a second. I’m betting that venture capitalists won’t notice the extra 0.45 seconds my shady MVP took to execute, so I’ll still get funded. But when my app goes live, and it’s suddenly hit with hundred thousand or a million users, 0.45 seconds here and 0.45 seconds there starts to add up really fast.

In computer science, big-O is a way of expressing how your program performs as the amount of data it needs to process gets really, really big. In other words, does it scale? Basically, big-O notation expresses the worst-case scenario for how long your program takes to do its work. In other words, big-O is about how bad it can possibly get, because in real-world computing, that’s how bad it’s usually going to be.

From an arithmophobe’s point of view, the great thing about big-O is that it doesn’t have to be precise to be useful. We’re not concerned with details and digits—big-O is concerned with the difference between 10 and 100,000, not the difference between 10 and 11. That means we get to hand-wave the details away, and just pay attention to the exponents. A big-O of O(n) is much better than O(n3), but not nearly as good as O(1). But what do any of those things mean? Let’s go to Big Oscar’s Sandwich Shop, and find out.

Big-O for your mother

You’re Big Oscar, and you make sandwiches. Really good ones. Everyone tells you that you should open a sandwich shop, and since sandwiches are so easy to make, you decide to go into business. Your mom is very proud. (Remember, this is for your mother.) I mean, how hard can it be to make sandwiches for a few people?

Let’s think about how hard it can be. Because your sandwiches are so good, you only have to make one kind of sandwich each day, and people will buy it. So the one thing you have to do at the start of the lunch rush every day is write the name of the sandwich of the day on the blackboard over the register. That takes about a minute, and you only have to do that once a day. In big-O terms, that’s a constant-time task, since it’s the same no matter what else happens: that’s O(1). So far, so good.

Now, on your first day, you have one person who comes in and orders your sandwich. So that day, you spent a minute writing the sandwich on the blackboard, and another minute to make one sandwich. On the second day, you have five customers, so you spend one minute writing on the blackboard, plus one minute each on five sandwiches, for a total of six minutes of work for the day. No big deal. In big-O terms, your sandwich-making operations are now experiencing a linear progression: O(n), where n is the number of sandwiches you make, plus 0(1) for the blackboard task, or O(n + 1). By day three, when you’ve got 15 customers, it’s becoming clear that the minute you spend each day writing on the blackboard is trivial, so we can just forget about it going forward (that’s what I meant about waving off the details). Your sandwich operation has a big-O of O(n).

By the end of the first week, you’re closing in on 100 customers a day (congratulations!), and its clear you can’t keep up by yourself. So you hire a second sandwich artist, and now you can make sandwiches at a rate of O(2n). But that’s still a linear progression, so the rules of big-O let us drop that two-times multiplier as well—your sandwich operation still counts as O(n). As I said, big-O doesn’t care about the difference between 10 and 11, or even between 50 and 100. Big-O is concerned about the difference between 100 and 1,000, or 10,000. But what does that have to do with sandwiches? Let’s find out…

By the end of week two, you and your sandwich artist are consistently moving 100 sandwiches a day, spending 101 man-minutes writing on the blackboard and assembling sandwiches. But the next Monday, something wonderful and terrifying happens: over the weekend, your sandwiches have blown up on the internet, and now every one of your 100 daily customers wants to take about ten sandwiches back to the people at their office! In one weekend, demand shot up from 100 sandwiches a day to 1,000. You just shifted from a linear progression—sandwiches ordered = number of customers, or O(n)—to an exponential one—sandwiches ordered = number of customers times the number of people in their office, or O(n2)!

Big-O doesn’t care if some people order 8 sandwiches and others order 12—that’s one of those details we are allowed to dismiss with a hand-wave. What matters is that demand has just increased by an order of magnitude: instead of ordering for themselves, each one of your customers is now ordering for a large group. You were able to hire a second person to increase capacity to meet the linear growth in demand during your second week, but in your third week, you’re going to need 10 times as many people to keep up with the new, exponential growth in demand. Can you find, hire, and train 18 more sandwich makers in a single weekend? In other words, will your process scale? That’s the question big-O is asking.

What do sandwiches have to do with software?

Silly reader, sandwiches are software. Or at the very least, the sandwich-making process is an algorithm. Let’s walk it through in pseudocode, and see how the big-O of our sandwich program changes. We start with the first step in sandwich-shop operation: writing the sandwich du jour on the blackboard:

No matter how many sandwiches you make on a given day, that blackboard-writing step takes the same amount of time. That’s a big-O of O(1), or constant time. If a few customers walk in, and each one orders a sandwich for him or herself, that’s a loop:

When you add a loop to your program, execution time grows proportionally with the number of iterations of the loop. 5 customers means 5 iterations; 25 customers, 25 iterations. Your processing time is growing as a linear progression, so it’s now O(n), where n is the number of items in the data set (customers, in this case). Noticeable, particularly as the numbers move into the hundreds or thousands, but still manageable by modern computers.

But eventually, your customers start buying sandwiches for more than just themselves. To handle this, we need to add a loop inside our previous loop:

Nested loops are something that makes big-O start to sweat. Now we’re making sandwiches at a rate of customers times co-workers, which is O(n2). In this scenario, 10 customers means about 100 sandwiches, but 100 customers means 1,000 sandwiches. This is now an exponential progression, and it’s getting serious. But it can get worse. What if all of a sudden, every co-worker of every customer started wanting to take ten sandwiches home to their family for dinner? You’d be nesting another loop:

All of a sudden, instead of wanting one sandwich, each customer is coming in the door needing 100 sandwiches! (1 customer, times 10 co-workers, times 10 family members = 100 sandwiches!) You’re now at a big-O of O(n3). How can you possibly scale your sandwich-making operations to meet that kind of explosive demand? By now, you’re probably wishing you’d kept your job at Staples, and just made sandwiches in your spare time…

Back off, math nerds

I’m sure that everyone who’s taken a college-level math course is gnashing their teeth and looking for satellite pictures of my home on Google Earth about now, but please, bear with me. This is an explanation aimed at your mom, not your graduate advisor. The takeaway is that while constant-time operations (writing the name of the sandwich on the blackboard) don’t matter when predicting or assessing software efficiency, things like iterating over large data sets do matter, and iteration over complex data structures matters a lot. You don’t need a PhD to learn big-O kinds of things to watch out for when planning and programming software. Planning your app or service to scale gracefully may not make an obvious difference while you’re testing it during development, but it will really, profoundly matter when you’ve been live for a weeks, and hundreds of thousands of users are clicking away after waiting 400 milliseconds for your site to load.

And that’s what big-O means, in words your mom can understand.