The Five Vital Signs of a Scalable Idea and How to Avoid a Voltage Drop

As an academic, I give a lot of lectures around the world, from Rome to Beijing to Sydney to L.A. and back home to Chicago. Even though the venues change, one recurring question I receive, especially from younger audience members, is “We have studied poverty for decades. Why don’t we have viable solutions yet?” Substitute “public education,” “discrimination,” “climate change,” and many other social issues in place of “poverty” and it is the same old song.

I believe a big part of the reason for our failures is that we haven’t understood how to scale our ideas from the petri dish to the large. That is, we don’t backward induct from what our idea will look like when fully scaled—with all the necessary warts and constraints we face in the large—when conducting our original research in the small. Instead of creating policy- or market-based evidence, we are driven to explore evidence-based policy, a flawed approach if we want to change the world at scale. 

Despite the undeniable importance of scalability, our understanding of the mechanisms that lead to either success or failure at scale has lagged the urgent issues that need addressing. So for the last several years, I’ve been part of a movement known as implementation science, or the science of scaling. My colleagues and I study why some ideas thrive and grow, while others have limited impact or even peter out altogether. 

One of the first steps to reaching scale is not losing steam as your idea grows. When a seemingly promising idea loses efficacy or profitability as it expands, we call this a “voltage drop.” In my new book, The Voltage Effect, I have identified five specific and universal causes of voltage drops and how to avoid them. If you can overcome the potential voltage drops I outline below, then your idea has the signatures of something that has the potential to scale. 

1. False positives: The inference problem

A false positive is an erroneous sign that success will continue (or even increase) as your enterprise grows. Imagine you’re an organization that tests an incentive program to combat absenteeism. You test your idea with a behavioral field experiment on a small sample and the feedback suggests it works. You take these results to the CEO and she is excited, thinking how much money will be saved when it is rolled out company-wide. 

Sounds great, right? Except the reality is that this initial feedback might simply be a false positive. Think of a false fire alarm or a false signal that you have COVID after a personal test. In the case of the incentive program, you might be compelled to immediately run with the new idea and put money, time, and resources toward something that won’t end up scaling. 

If you’re an organization, you want to try and weed out these false positives. There are several effective approaches for doing so. One simple way is to replicate ideas that show early promise. I recommend having at least three independent replications of the idea. Don’t just do one pilot study of a program or one soft launch of a product and call it a day if it seems successful. Do three trial tests. This will reduce the likelihood of a statistical error and weaken arguments driven by confirmation bias. 
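The arithmetic behind replication can be sketched in a few lines. This is a minimal illustration, not a calculation from the book. It assumes every trial has the same false-positive rate and that the trials are fully independent; the 5 percent rate is a conventional significance threshold chosen only for illustration, and the function name is hypothetical.

```python
def chance_all_false_positives(false_positive_rate: float, num_trials: int) -> float:
    """Probability that an idea with no real effect 'succeeds' in every trial.

    Assumes independent trials that share the same false-positive rate.
    """
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    if num_trials < 1:
        raise ValueError("num_trials must be at least 1")
    return false_positive_rate ** num_trials


if __name__ == "__main__":
    # Hypothetical 5 percent false-positive rate per trial.
    for trials in (1, 2, 3):
        print(trials, chance_all_false_positives(0.05, trials))
```

Under these assumptions, each additional successful trial multiplies the chance of being fooled by the per-trial false-positive rate. Real replications are rarely perfectly independent, so the actual reduction will usually be smaller.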

2. Representativeness of the population

How far can your idea scale? Part of the answer depends on whom you are scaling to. No product is designed for everyone, everywhere, at all times. Knowing who your idea will work for, where, and when will help you figure out how far it can go. 

For example, I was part of a team that designed a preschool curriculum in a suburb of Chicago, which was meant to increase test scores for kids ages 3–5. Parental support was a key part of the curriculum. We found that it worked better for Hispanic families than for white and Black ones. Why? Because Hispanic homes in the population we were working with were more likely to be multigenerational, with grandparents who could help kids learn when parents weren’t available. Knowing whom the program will scale for is critical before investing significant time and resources. If we were to invest resources to scale the approach, it would pay off for Hispanic families but not for white and Black families. 

It’s important to have a clear-eyed view of whom your idea can scale for and how far it can scale, because it gives you either the confidence to press ahead or the knowledge you need to adjust your expectations or approach. The only way to do this is by making sure your test groups at smaller scales reflect the larger population you’re aiming to reach. The key to rooting out sampling biases is to make sure your early adopters are a random sample of that population. If your idea doesn’t scale to everyone, that’s fine, but you’ll know early what your scaling ceiling is and won’t overshoot with unnecessary investments. In those cases, your next chore is to find solutions for those the original program didn’t work for (Black and white families, in our early education program). 

3. Representativeness of the situation

Will the core driver of your success scale? If the driver is something unique or specific to a time and place, it may not be easily scalable. For our early childhood education program, hiring 30 excellent teachers might be easy, but hiring 30,000 excellent teachers is a whole different kettle of fish. Since people don’t scale well (i.e., they can’t be cloned), talent-centric ventures often don’t either; you can’t afford all the talent you need as you grow, so you hire fewer high performers and quality suffers at scale, a cruel voltage drop. Or take the case of Early Head Start, which had trouble scaling its program supporting early childhood development because its main ingredient, parents with extra time to devote to their children, was unavailable in many homes. 

This is also why truly great restaurants rarely become chains. You can’t scale singular culinary genius and teaching it is fraught with difficulties too. If your core drivers can remain constant as you grow, then you can scale. Automated digital technologies are a good example of this. The basic mechanism of Instagram—take a picture, post it, others view it—works as well for its one billion users today as it did for its first 100 users when it launched.

4. Unintended consequences and negative spillovers

When designing your idea early on, you must anticipate unintended consequences and negative spillovers and look for ways to engineer positive spillovers. For instance, in our preschool, we discovered that children who weren’t in our program were nonetheless performing significantly better. It turned out that just playing with the students who were in the program aided their development, a wonderfully scalable spillover and one that will lead to even higher voltage when the program scales. 

Market-wide impacts can also influence your idea at scale. General equilibrium effects, or natural readjustments of the market itself, are a main cause of voltage drops in business. I observed this when I was the chief economist at Uber. When we tried to raise driver pay through tipping or changing the driver rate card, existing drivers drove more hours and new drivers came into the market. These supply changes yielded a new equilibrium in which driver wages did not change (even though wages do change when only a handful of drivers are given higher rate cards). As you begin to scale, you must keep a close eye on unexpected dangers as well as unforeseen opportunities that result from spillovers. 
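A toy model can make the driver-pay mechanism concrete. It is a deliberately simplified sketch, not a description of Uber’s actual market or data. It assumes a fixed number of rider trips per hour, trips split evenly among active drivers, and drivers who keep entering until hourly earnings fall back to what they could earn elsewhere. All names and numbers are hypothetical.

```python
def hourly_earnings(rate_per_trip: float, trips_per_hour: float, drivers: float) -> float:
    """Earnings per driver-hour when trips are split evenly among active drivers."""
    return rate_per_trip * trips_per_hour / drivers


def equilibrium_drivers(rate_per_trip: float, trips_per_hour: float, outside_wage: float) -> float:
    """Number of active drivers at which hourly earnings equal the outside wage."""
    return rate_per_trip * trips_per_hour / outside_wage


# Hypothetical inputs, chosen only to keep the arithmetic easy to follow.
TRIPS_PER_HOUR = 1000.0   # rider demand, assumed fixed
OUTSIDE_WAGE = 20.0       # what drivers could earn elsewhere per hour
OLD_RATE, NEW_RATE = 10.0, 11.0

drivers_before = equilibrium_drivers(OLD_RATE, TRIPS_PER_HOUR, OUTSIDE_WAGE)
wage_before = hourly_earnings(OLD_RATE, TRIPS_PER_HOUR, drivers_before)

# Small pilot: a handful of drivers get the new rate, so the driver pool barely moves.
pilot_wage = hourly_earnings(NEW_RATE, TRIPS_PER_HOUR, drivers_before)

# Market-wide rollout: higher pay attracts supply until earnings match the outside wage.
drivers_after = equilibrium_drivers(NEW_RATE, TRIPS_PER_HOUR, OUTSIDE_WAGE)
wage_after = hourly_earnings(NEW_RATE, TRIPS_PER_HOUR, drivers_after)

if __name__ == "__main__":
    print("wage before:", wage_before)
    print("wage in small pilot:", pilot_wage)
    print("wage after full rollout:", wage_after)
```

In this sketch the pilot shows a raise, while the full rollout returns hourly earnings to where they started, with more drivers on the road. The pilot result is a real effect at small scale that does not survive the market’s adjustment.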

5. The supply side of scaling

It is imperative to determine how the trajectory of your costs changes as you scale your idea, and whether you or the market can bear this change. To ensure you don’t fall into the voltage drop of running out of money, you must account for two types of costs: 1) upfront fixed costs, like the one-time investment for the research and development to create a new product; and 2) your ongoing operating expenses. 
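The interaction of the two cost types can be sketched with simple arithmetic. The example below is a hedged illustration with hypothetical numbers and names. It assumes the fixed cost is paid once and that the operating cost per unit may either stay flat or rise with scale, for instance when scarce talent becomes more expensive.

```python
from typing import Callable


def average_cost_per_unit(
    units: int,
    fixed_cost: float,
    operating_cost_at: Callable[[int], float],
) -> float:
    """Fixed cost spread over all units, plus the per-unit operating cost at that scale."""
    if units < 1:
        raise ValueError("units must be at least 1")
    return fixed_cost / units + operating_cost_at(units)


# Hypothetical cost profiles.
FIXED_COST = 100_000.0


def flat_operating_cost(units: int) -> float:
    """Operating cost per unit that does not change with scale."""
    return 50.0


def rising_operating_cost(units: int) -> float:
    """Operating cost per unit that climbs as scale grows (e.g., pricier talent)."""
    return 50.0 + 0.01 * units


if __name__ == "__main__":
    for units in (1_000, 100_000):
        print(
            units,
            average_cost_per_unit(units, FIXED_COST, flat_operating_cost),
            average_cost_per_unit(units, FIXED_COST, rising_operating_cost),
        )
```

With a flat operating cost, the average cost per unit falls as the fixed cost is spread over more units. With a rising operating cost, the average eventually climbs, which is the pattern that can exhaust a budget at scale.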

One interesting distinction between government and private scaling efforts is that the government usually focuses on the benefit profile of its programs and what happens to that flow of benefits at scale (the first four vital signs), whereas firms usually focus on whether the idea has economies or diseconomies of scale. This means that the ideas that tend to be scaled by government might have different types of strengths and weaknesses compared to those scaled by private enterprises. 

Another key strategy is to create scaling models that don’t rely on top-tier talent; as you scale, finding and paying high performers will unduly eat into your budget. The solution is to create products that can give their full value to customers and other groups even with average performers delivering them. For example, our Chicago Heights curriculum was designed to scale with teachers that an ordinary community could hire, rather than exclusively superstar performers. This way it can still succeed in communities that have limited talent pools or that can’t pay competitive salaries. That may not sound sexy, but it is scalable.

The dominant approach to creating evidence-based social policy today is for the researcher to use the best of all inputs to give their idea “its best chance.” The results are then examined, the study is written up and published in a peer-reviewed journal, and no one is made aware that the researcher has conducted an efficacy test. Policymakers then, unwittingly, choose a certain number of published programs to scale, and the results are usually disappointing. The policymaker subsequently views the next generation of research with a more skeptical eye, and scientists lament how their work is being ignored. 

In my new book, given the ubiquity of such voltage drops and their five causes, I propose a different approach to generating evidence. Beyond exploring whether an idea works in the petri dish in the best case scenario with a select group of people, I advocate augmenting traditional experimental designs by bringing the warts and constraints that the idea faces at scale into the petri dish in the original testing. This is meant to move the question to: For whom, when, where, and why does my idea work? When such an approach is taken, we are better able to use science to advance truly scalable ideas, rather than use guesswork and gut feelings to choose which ideas should be scaled.