Reading List

The most recent articles from a list of feeds I subscribe to.

Conferences, Clarity, and Smokescreens

Before saying anything else, I'd like to thank the organisers of JSNation for inviting me to speak in Amsterdam. I particularly appreciate the folks who were brave enough to disagree at the Q&A sessions afterwards. Engaged debate about problems we can see and evidence we can measure makes our work better.

The conference venue was lovely, and speakers were more than well looked after. Many of the JSNation talks were of exactly the sort I'd hope to see as our discipline belatedly confronts a lost decade, particularly Jeremias Menichelli's lightning talk. It masterfully outlined how many of the hacks we have become accustomed to are no longer needed, even in the worst contemporary engines. Try view-source: on the demo site he made to see what I mean.

Vinicius Dallacqua's talk on LoAF was on-point, and the full JSNation line-up included knowledgeable and wise folks, including Jo, Charlie, Thomas, Barry, Nico, and Eva. There was also a strong set of accessibility talks from presenters I'm less familiar with, but whose topics were timely and went deeper than the surface. They even let me present a spicier topic than I think they might have been initially comfortable with.

All-in-all, JSNation was a lovely time, in good company, with a strong bent toward doing a great job for users. Recommended.

Day 2 — React Summit 2025[1] — could not have been more different. While I was in a parallel framework authors meeting for much of the morning,[2] I did attend talks in the afternoon, studied the schedule, and went back through many more after the fact on the stream. Aside from Xuan Huang's talk on Lynx and Luca Mezzalira's talk on architecture, there was little in the program that challenged frameworkist dogma, and much that played to it.

This matters because conferences succeed by foregrounding the hot topics within a community. Agendas are curated to reflect the tides of debate in the zeitgeist, and can be read as a map of the narratives a community's leaders wish to see debated. My day-to-day consulting work, along with high-visibility industry data, shows that the React community is mired in a deep, measurable quality crisis. But attendees of React Summit who didn't already know wouldn't hear about it.

Near as I can tell, the schedule of React Summit mirrors the content of other recent and pending React conferences (1, 2, 3, 4, 5, 6) in that these are not engineering conferences; they are marketing events.

How can we tell the difference? The short answer is also a question: "who are we building for?"

The longer form requires distinguishing between occupations and professions.

In a 1912 commencement address, the great American jurist and antitrust reformer Louis Brandeis hoped that a different occupation would aspire to service:

The peculiar characteristics of a profession as distinguished from other occupations, I take to be these:

First. A profession is an occupation for which the necessary preliminary training is intellectual in character, involving knowledge and to some extent learning, as distinguished from mere skill.

Second. It is an occupation which is pursued largely for others and not merely for one's self.

Third. It is an occupation in which the amount of financial return is not the accepted measure of success.

In the same talk, Brandeis named Engineering a discipline already worthy of professional distinction. Most software development can't share the benefit of the doubt, no matter how often "engineer" appears on CVs and business cards. If React Summit and co. are anything to go by, frontend is mired in the same ethical tar pit that causes Wharton, Kellogg, and Stanford grads to reliably experience midlife crises.[3]

It may seem slanderous to compare React conference agendas to MBA curricula, but if anything it's letting the lemon vendors off too easily. Conferences crystallise consensus about which problems matter, and React Summit succeeded in projecting a clear perspective, namely that it's time to party like it's 2013.

A patient waking from a decade-long coma would find the themes instantly legible. In no particular order: React is good because it is popular. There is no other way to evaluate framework choice, and it's daft to try because "everyone knows React".[4] Investments in React are simultaneously solid long-term choices, but also fragile machines in need of constant maintenance lest they wash away under the annual tax of breaking changes, toolchain instability, and changing solutions to problems React itself introduces. Form validation is not a solved problem, and in our glorious future, the transpilers (sorry, compilers) will save us.

Above all else, the consensus remains that SPAs are unquestionably a good idea, and that React makes sense because you need complex data and state management abstractions to make transitions between app sections seem fluid. And if you're worried about the serially terrible performance of React on mobile, don't worry; for the low, low price of capitulating to App Store gatekeepers, React Native has you covered.[5]

At no point would our theoretical patient risk learning that rephrasing everything in JSX is now optional thanks to React 19 finally unblocking interoperability via Web Components.[6] Nor would they become aware that new platform APIs like cross-document View Transitions and the Navigation API invalidate foundational premises of the architectures that React itself is justified on. They wouldn't even learn that React hasn't solved the one problem it was pitched to address.
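To make the stakes concrete, here is a minimal sketch (mine, not from any talk) of the platform features in question. Cross-document View Transitions need nothing more than the CSS at-rule @view-transition { navigation: auto; }, and the Navigation API, shipping in Chromium-based browsers at the time of writing, covers the interception work that SPA routers were built for:

// Sketch: intercepting same-origin navigations with the Navigation API.
// The `navigation` global is not yet in every TypeScript lib, so a
// minimal shape is declared here; the '/app/' route prefix is hypothetical.
declare const navigation: {
  addEventListener(type: 'navigate', listener: (event: any) => void): void;
};

navigation.addEventListener('navigate', (event) => {
  if (!event.canIntercept) return; // let the browser handle everything else
  const url = new URL(event.destination.url);
  if (url.pathname.startsWith('/app/')) {
    event.intercept({
      async handler() {
        // Swap in server-rendered HTML; the browser keeps history, scroll
        // restoration, and back/forward working with no router library.
        const html = await (await fetch(url)).text();
        document.querySelector('main')!.innerHTML = html;
      },
    });
  }
});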

Conspicuously missing from the various "State Of" talks was discussion of the pressing and pervasive UX quality issues that are rampant in the React ecosystem.

Per the 2024 Web Almanac, less than half of sites earn passing grades on mobile, where most users are.

We don't need to get distracted looking inside these results. Treating them as black boxes is enough. And at that level we can see that, in aggregate, JS-centric stacks aren't positively correlated with delivering good user-experiences.

2024's switch from FID to INP caused React (particularly Next and Gatsby) sites which already had low pass-rates to drop more than sites constructed on many other stacks.

This implies that organisations adopting React do not contain the requisite variety needed to manage the new complexity that comes from React ecosystem tools, practices, and community habits. Whatever the source, it is clearly a package deal. The result is systems that are out of control and behave in dynamically unstable ways relative to business goals.
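None of this is hard to check from the field, either. As a sketch, Google's open-source web-vitals library reports a page's INP in a few lines (the /metrics endpoint below is hypothetical):

// Report real-user INP using the `web-vitals` npm package.
import { onINP } from 'web-vitals';

onINP(({ value, rating }) => {
  // `rating` is 'good' | 'needs-improvement' | 'poor'; 200ms at the
  // 75th percentile is the threshold CrUX counts as passing.
  navigator.sendBeacon(
    '/metrics', // hypothetical collection endpoint
    JSON.stringify({ metric: 'INP', value, rating })
  );
});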

The evidence is everywhere. React-based stacks frequently fail to deliver good experiences. And weren't "fluid user experiences" the point of the JS/SPA/React boondoggle?[7]

That this ambushes businesses time and again is a scandal.

But good luck finding solutions to, or even acknowledgement of, that scandal on React conference agendas. The reality is that the more React spreads, the worse the results get despite the eye-watering sums spent on conversions away from functional "legacy" HTML-first approaches. Many at React Summit were happy to make these points to me in private, but not on the main stage. The JS-industrial-complex omertà is intense.

No speaker I heard connected the dots between this crisis and the moves of the React team in response to the emergence of comparative quality metrics. React Fiber (née "Concurrent"), React Server Components, the switch away from Create React App, and the React Compiler were discussed as logical next steps, rather than what they are: attempts to stay one step ahead of the law. Everyone in the room was implicitly expected to use their employer's money to adopt all of these technologies, rather than reflect on why all of this has been uniquely necessary in the land of the Over Reactors.[8]

The treadmill is real, but even at this late date, developers are expected to take promises of quality and productivity at face value, even as they wade through another swamp of configuration cruft, bugs, and upgrade toil.

React cannot fail, it can only be failed.

And then there was the privilege bubble. Speaker after speaker stressed development speed, including the ability to ship to mobile and desktop from the same React code. The implications for complexity-management and user-experience were somewhat less of a focus. The most egregious example of the day came from Evan Bacon in his talk about Expo, in which he presented Burger King's website as an example of a brand successfully shipping. Here it is under WebPageTest.org's desktop setup:[9]


As you might expect, putting 75% of the 3.5MB JS payload (15MB unzipped) in the critical path does unpleasant things to the user experience, but none of the dizzying array of tools involved in constructing bk.com steered this team away from failure.[10]

The fact that Expo enables Burger King to ship a native app from the same codebase seems not to have prevented a statistically significant fraction of users from visiting the site on their mobile devices, where their slower CPUs struggle mightily:


The CrUX data alone is damning:


This sort of omnishambles is what I've come to expect from React-ecosystem experiences of every sort, and it's what folks mean when they say that "JavaScript broke the web and called it progress".

Is waiting 30 seconds for a loading spinner bad?
Asking for an industry.

The other poster child for Expo is Bluesky, a site that also serves web and React Native from the same codebase. It's so bewilderingly laden with React-ish excesses that their engineering choices qualify as gifts-in-kind to Elon Musk and Mark Zuckerberg:


Why is Bluesky so slow? A huge, steaming pile of critical-path JS, same as Burger King:


Again, we don't need to look deeply into the black box to understand that there's something rotten about the compromises involved in choosing React Native + Expo + React Web. This combination clearly prevents teams from effectively managing performance or even adding offline resilience via Service Workers. Pinafore and Elk manage to get both right, providing great PWA experiences while being built on a comparative shoestring. It's possible to build a great social SPA experience, but maybe just not with React:
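The resilience gap isn't a mystery, either; the offline layer these apps lack is small. Here is a sketch of the core of a cache-first Service Worker, with illustrative file and cache names, compiled from TypeScript to sw.js and registered with navigator.serviceWorker.register('/sw.js'):

// sw.ts: a minimal cache-first app shell.
const CACHE = 'shell-v1'; // illustrative cache name
const SHELL = ['/', '/app.css', '/app.js']; // illustrative shell URLs

self.addEventListener('install', (event: any) => {
  // Pre-cache the shell so the app can load with no network at all.
  event.waitUntil(caches.open(CACHE).then((cache) => cache.addAll(SHELL)));
});

self.addEventListener('fetch', (event: any) => {
  // Serve from cache first, falling back to the network.
  event.respondWith(
    caches.match(event.request).then((hit: any) => hit ?? fetch(event.request))
  );
});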

If we're going to get out of this mess, we need to stop conflating failure with success. The entire point of this tech was to deliver better user experiences, and so the essential job of management is to ask: does it?

The unflattering comparisons are everywhere when you start looking. Tanner Linsley's talk on TanStack (not yet online) was, in essence, a victory lap. It promised high quality web software and better time-to-market, leaning on popularity contest results and unverifiable, untested claims about productivity to pitch the assembled. To say that this mode of argument is methodologically unsound is an understatement. Rejecting it is necessary if we're going to do engineering rather than marketing.

Popularity is not an accepted unit of engineering quality measurement.

The TanStack website cites this social proof as an argument for why their software is great, but the proof of the pudding is in the eating:


The contrast grows stark as we push further outside the privilege bubble. Here are the same sites, using the same network configuration as before, but with the CPU emulation modelling a cheap Android instead:

An absolute rout. The main difference? The amount of JS each site sends, which is a direct reflection of values and philosophy.
Site           Wire JS       Decoded JS    TBT (ms)
astro.build    11.1 kB       28.9 kB       23
hotwired.dev   1.8 kB        3.6 kB        0
11ty.dev       13.1 kB       42.2 kB       0
expo.dev       1,526.1 kB    5,037.6 kB    578
tanstack.com   1,143.8 kB    3,754.8 kB    366

Yes, these websites are targeted at developers on fast machines. But so what? They speak to the values of their creators. And those values shine through the fog of marketing when we use objective quality measures. The sorts of engineers who care to shave a few bytes of JS on a site for users on fast machines are the same folks who will care about the lived UX quality of their approach all the way down.

It is my long experience that cultures that claim "it's fine" to pay for a lot of JS up-front to gain (unquantified) benefits in another dimension almost never check to see if either side of the trade comes up good.

Programming-as-pop-culture is oppositional to the rigour required of engineering. When the folks talking loudest about "scale" and "high quality" and "delivery speed" without metrics or measurement keep delivering crap results for both businesses and users, we need to collectively recalibrate.

There were some bright spots at React Summit, though. A few brave souls tried to sneak perspective in through the side door, and I applaud their efforts:

If the Q&A sessions after my talk are any indication, Luca faced serious risk of being ignored as a heretic for putting this on a slide.

If frontend aspires to be a profession[11] — something we do for others, not just ourselves — then we need a culture that can learn to use statistical methods for measuring quality and reject the sort of marketing that still dominates the React discourse.

And if that means we have to jettison React on the way, so be it.


  1. For attendees, JSNation and React Summit were separate events, although one could buy passes that provided access to both. My impression is that many did. As they were in the same venue, this may have simplified some logistics for the organisers, and it was a good way to structure content for adjacent, but not strictly overlapping, communities of interest.

  2. Again, my thanks to the organisers for letting me sit in on this meeting. As with much of my work, my goal was to learn about what's top of mind to the folks solving problems for developers in order to prioritise work on the Web Platform.

    Without giving away confidences from a closed-door meeting, I'll just say that it was refreshing to hear framework authors tell us that they need better HTML elements and that JSX's default implementations are scaling exactly as poorly ecosystem-wide as theory and my own experience suggest. This is down to React's devil-may-care attitude to memory.

    It's not unusual to see heavy GC stalls on the client as a result of Facebook's always-wrong assertion that browsers are magical and that CPU costs don't matter. But memory is a tricksy issue, and it's a limiting factor on the server too.

    Lots to chew on from those hours, and I thank the folks who participated for their candor, which was likely made easier since nobody from the React team deigned to join.

  3. Or worse, don't.

    Luckily, some who experience the tug of conscience punch out and write about it. Any post-McKinsey tell-all will do, but Anand Giridharadas is good value for money in this genre.

  4. Circular logic is a constant in discussions with frameworkists. A few classics of the genre that got dusted off in conversations over the conference:

    • "The framework makes us more productive."

      Oh? And what's the objective evidence for that productivity gain?

      Surely, if it's as large as frameworkists claim, economists would have noted the effects in aggregate statistics. But we have not seen that. Indeed, there's no credible evidence that we are seeing anything more than the bog-standard gains from learning in any technical field, except that the combinatorial complexity of JS frameworks may, in itself, reduce those gains.

      But nobody's running real studies that compare proficient jQuery developers to React developers under objective criteria, and so personal progression is frequently cited as evidence for collective gains, which is obviously nonsensical. It's just gossip.
    • "But we can hire for the framework."

      😮 sigh 😮‍💨
    • "The problem isn't React, it's the developers."

      Hearing this self-accusation offered at a React conference was truly surreal. In a room free of discussions about real engineering constraints, victim-blaming casts a shadow of next-level cluelessness. But frameworkists soldier on, no matter the damage it does to their argument. Volume and repetition seem key to pressing this line with a straight face.
  5. A frequently elided consequence of regulators scrutinising Apple's shocking "oversight" of its app store has been that Apple has relaxed restrictions around PWAs on iOS that were previously enforced often enough in the breach to warn businesses away. But that's over now. To reach app stores on Windows, iOS, and Android, all you need is a cromulent website and PWABuilder.

    For most developers, the entire raison d'être for React Native is now kaput; entirely overcome by events. Not that you'd hear about it at an assemblage of Over Reactors.

  6. Instead of describing React's exclusive ownership of subtrees of the DOM, along with the introduction of a proprietary, brittle, and hard-to-integrate parallel lifecycle, as what it is — a totalising framework that demands bespoke integration effort — the marketing term "composability" was substituted to describe the feeling of giving everything over to JSX-flavoured angle brackets every time a utility is needed.

  7. It has been nearly a decade since the failure of React to reliably deliver better user experiences gave rise to the "Developer Experience" bait-and-switch.

  8. Mark Erikson's talk was ground-zero for this sort of obfuscation. At the time of writing, the recording isn't up yet, but I'll update this post with analysis when it is. I don't want to heavily critique from my fallible memory.

  9. WPT continues to default desktop tests to a configuration that throttles to 5Mbps down, 1Mbps up, with 28ms of RTT latency added to each packet. All tests in this post use a somewhat faster configuration (9Mbps up and down) but with 170ms RTT to better emulate usage from marginal network locations and the effects of full pipes.

  10. I read the bundles so you don't have to.

    So what's in the main, 2.7MB (12.5MB unzipped) bk.com bundle? What follows is a stream-of-consciousness rundown as I read the pretty-printed text top-to-bottom. At the time of writing, it appears to include:

    • A sea of JS objects allocated by the output of a truly cursed "CSS-in-JS" system. As a reminder, "CSS-in-JS" systems with so-called "runtimes" are the slowest possible way to provide styling to web UI. An ominous start.
    • React Native Reanimated (no, I'm not linking to this garbage), which generates rAF-based animations on the web in The Year of Our Lord 2025, a full five years after Safari finally dragged its ass into the 2010s and implemented the Web Animations API. As a result, Reanimated is Jank City. Jank Town. George Clinton and the Parliament Jankadelic. DJ Janky Jeff. Janky Jank and the Janky Bunch. Ole Jankypants. The Undefeated Heavyweight Champion of Jank. You get the idea; it drops frames.
    • Redefinitions of the built-in CSS colour names, because at no point traversing the towering inferno of build tools was it possible to know that this web-targeted artefact would be deployed to, you know, browsers.
    • But this makes some sense, because the build includes React Native Web, which is exactly what it sounds like: a set of React components that emulate the web to provide a build target for React Native's re-creation of a subset of the layout that browsers are natively capable of, which really tells you everything you need to know about how teams get into this sort of mess.
    • Huge amounts of code duplication via inline strings that include the text of functions right next to the functions themselves. Yes, you're reading that right: some part of this toolchain is doubling up the code in the bundle, presumably for the benefit of a native debugger. Bro, do you even sourcemap? At this point it feels like I'm repeating myself, but none of this is necessary on the web, and none of the (many, many) compiler passes saw fit to eliminate this waste in a web-targeted build artefact.
    • Another redefinition of the built-in CSS colour names and values. In browsers that support them natively. I feel like I'm taking crazy pills.
    • A full copy of React, which is almost 10x larger than it needs to be in order to support legacy browsers and React Native.
    • Tens (no, hundreds) of thousands of lines of auto-generated schema validation structures and repeated, useless getter functions for data that will never be validated on the client. How did this ungodly cruft get into the bundle? One guess, and it rhymes with "schmopallo".
    • Of course, no bundle this disastrous would be complete without multiple copies of polyfills for widely supported JS features like Object.assign(), class private fields, decorators, generators, spread, async iterators, and much more.
    • Inline'd WASM code, appearing as a gigantic JS array. No, this is not a joke.
    • A copy of Lottie. Obviously.
    • What looks to be the entire AWS Amplify SDK. So much for tree-shaking.
    • A userland implementation of elliptic curve cryptography primitives that are natively supported in every modern browser via Web Crypto (see the sketch after this list).
    • Inline'd SVGs, but not as strings. No, that would be too efficient. They're inlined as React components.
    • A copy of the app's Web App Manifest, inline, as a string. You cannot make this up.
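    For contrast, here is a sketch of what the platform already provides: ECDSA key generation and signing via Web Crypto, available as crypto.subtle in every modern browser, at a cost of zero bundle bytes (assumes an ES module context for top-level await):

    // Web Crypto's built-in elliptic curve support.
    const keys = await crypto.subtle.generateKey(
      { name: 'ECDSA', namedCurve: 'P-256' },
      false, // private key is not extractable
      ['sign', 'verify']
    );
    const payload = new TextEncoder().encode('hello');
    const signature = await crypto.subtle.sign(
      { name: 'ECDSA', hash: 'SHA-256' },
      keys.privateKey,
      payload
    );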

    Given all of this high-cost, low-quality output, it might not surprise you to learn that the browser's coverage tool reports that more than 75% of functions are totally unused after loading and clicking around a bit.

  11. I'll be the first to point out that what Brandeis is appealing to is distinct from credentialing. As a state-school dropout, that difference matters to me very personally, and it has not been edifying to see credentialism (in the form of dubious bootcamps) erode both the content and form of learning in "tech" over the past few years.

    That doesn't mean I have all the answers. I'm still uncomfortable with the term "professional" for all the connotations it now carries. But I do believe we should all aspire to do our work in a way that is compatible with Brandeis' description of a profession. To do otherwise is to endanger any hope of self-respect and even our social licence to operate.

New zine: The Secret Rules of the Terminal

Hello! After many months of writing deep dive blog posts about the terminal, on Tuesday I released a new zine called “The Secret Rules of the Terminal”!

You can get it for $12 here: https://wizardzines.com/zines/terminal, or get a 15-pack of all my zines here.

Here’s the cover:

Here’s the table of contents:

why the terminal?

I’ve been using the terminal every day for 20 years but even though I’m very confident in the terminal, I’ve always had a bit of an uneasy feeling about it. Usually things work fine, but sometimes something goes wrong and it just feels like investigating it is impossible, or at least like it would open up a huge can of worms.

So I started trying to write down a list of weird problems I've run into in the terminal, and I realized that the terminal has a lot of tiny inconsistencies like:

  • sometimes you can use the arrow keys to move around, but sometimes pressing the arrow keys just prints ^[[D
  • sometimes you can use the mouse to select text, but sometimes you can’t
  • sometimes your commands get saved to a history when you run them, and sometimes they don’t
  • some shells let you use the up arrow to see the previous command, and some don’t

If you use the terminal daily for 10 or 20 years, even if you don’t understand exactly why these things happen, you’ll probably build an intuition for them.

But having an intuition for them isn’t the same as understanding why they happen. When writing this zine I actually had to do a lot of work to figure out exactly what was happening in the terminal to be able to talk about how to reason about it.
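For example, here is a tiny Node.js experiment (my sketch, not from the zine) that makes the arrow-key inconsistency visible. Put stdin in raw mode and press the left arrow: the terminal sends the escape sequence \x1b[D, which is exactly the ^[[D that gets printed when a program doesn't interpret it:

// Print the raw bytes each keypress sends to the program.
process.stdin.setRawMode(true); // raw mode: no line editing, no echo
process.stdin.on('data', (buf: Buffer) => {
  console.log(JSON.stringify(buf.toString('latin1')));
  // In raw mode Ctrl-C arrives as the byte 0x03, so exit manually.
  if (buf[0] === 3) process.exit(0);
});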

the rules aren’t written down anywhere

It turns out that the “rules” for how the terminal works (how do you edit a command you type in? how do you quit a program? how do you fix your colours?) are extremely hard to fully understand, because “the terminal” is actually made of many different pieces of software (your terminal emulator, your operating system, your shell, the core utilities like grep, and every other random terminal program you’ve installed) which are written by different people with different ideas about how things should work.

So I wanted to write something that would explain:

  • how the 4 pieces of the terminal (your shell, terminal emulator, programs, and TTY driver) fit together to make everything work
  • some of the core conventions for how you can expect things in your terminal to work
  • lots of tips and tricks for how to use terminal programs

this zine explains the most useful parts of terminal internals

Terminal internals are a mess. A lot of it is just the way it is because someone made a decision in the 80s and now it’s impossible to change, and honestly I don’t think learning everything about terminal internals is worth it.

But some parts are not that hard to understand and can really make your experience in the terminal better, like:

  • if you understand what your shell is responsible for, you can configure your shell (or use a different one!) to access your history more easily, get great tab completion, and so much more
  • if you understand escape codes, it’s much less scary when cating a binary to stdout messes up your terminal, you can just type reset and move on
  • if you understand how colour works, you can get rid of bad colour contrast in your terminal so you can actually read the text

I learned a surprising amount writing this zine

When I wrote How Git Works, I thought I knew how Git worked, and I was right. But the terminal is different. Even though I feel totally confident in the terminal and even though I’ve used it every day for 20 years, I had a lot of misunderstandings about how the terminal works and (unless you’re the author of tmux or something) I think there’s a good chance you do too.

A few things I learned that are actually useful to me:

  • I understand the structure of the terminal better and so I feel more confident debugging weird terminal stuff that happens to me (I was even able to suggest a small improvement to fish!). Identifying exactly which piece of software is causing a weird thing to happen in my terminal still isn’t easy but I’m a lot better at it now.
  • you can write a shell script to copy to your clipboard over SSH
  • how reset works under the hood (it does the equivalent of stty sane; sleep 1; tput reset) – basically I learned that I don’t ever need to worry about remembering stty sane or tput reset and I can just run reset instead
  • how to look at the invisible escape codes that a program is printing out (run unbuffer program > out; less out)
  • why the builtin REPLs on my Mac like sqlite3 are so annoying to use (they use libedit instead of readline)

blog posts I wrote along the way

As usual these days I wrote a bunch of blog posts about various side quests:

people who helped with this zine

A long time ago I used to write zines mostly by myself but with every project I get more and more help. I met with Marie Claire LeBlanc Flanagan every weekday from September to June to work on this one.

The cover is by Vladimir Kašiković, Lesley Trites did copy editing, Simon Tatham (who wrote PuTTY) did technical review, our Operations Manager Lee did the transcription as well as a million other things, and Jesse Luehrs (who is one of the very few people I know who actually understands the terminal’s cursed inner workings) had so many incredibly helpful conversations with me about what is going on in the terminal.

get the zine

Here are some links to get the zine again:

As always, you can get either a PDF version to print at home or a print version shipped to your house. The only caveat is print orders will ship in August – I need to wait for orders to come in to get an idea of how many I should print before sending it to the printer.

The Hovercar Framework for Deliberate Product Design

You may be familiar with this wonderful illustration and accompanying blog post by Henrik Kniberg about good MVPs:

Henrik Kniberg's MVP illustration

It’s a very visual way to illustrate the age-old concept that that a good MVP is not the one developed in isolation over months or years, grounded on assumptions about user needs and goals, but one that delivers value to users as early as possible, so that future iterations can take advantage of the lessons learned from real users.

From Hovercar to Skateboard

I love Henrik’s metaphor so much, I have been using a similar system to flesh out product requirements and shipping goals, especially early on. It can be immediately understood by anyone who has seen Henrik’s illustration, and I find it can be a lot more pragmatic and flexible than the usual simple two tiered system (core requirements and stretch goals). Additionally, I find this fits nicely into a fixed time, variable scope development process, such as Shape Up.

🛹 The Skateboard aka the Pessimist’s MVP
What is the absolute minimum we can ship, if need be? Utilitarian, bare-bones, and somewhat embarrassing, but shippable — barely. Anything that can be flintstoned gets flintstoned.
🛴 The Scooter aka the Realist’s MVP
The minimum product that delivers value. Usable, but no frills. This is the target.
🚲 The Bicycle aka the Optimist’s MVP
Stretch goals — UX polish, “sprinkles of delight”, nonessential but high I/E features. Great if we get here, fine if we don’t.
🏍️ The Motorcycle
Post-launch highest priority items.
🚗 The Car
Our ultimate vision, taking current constraints into account.
🏎️ The Hovercar aka the North Star UI
The ideal experience — unconstrained by time, resources, or backwards compatibility. Unlikely to ship, but a guiding light for all of the above.

Please note that the concept of a North Star UI has no relation to the North Star Metric. While both serve as a guiding light for product decisions, and both are important, the North Star UI guides you in designing the product, whereas the North Star Metric is about evaluating success. To avoid confusion, I’ll refer to it as “North Star UI”, although it’s not about the UI per se, but the product vision on a deeper level.

The first three stages are much more concrete and pragmatic, as they directly affect what is being worked on. The more we go down the list, the less fleshed out specs are, as they need to allow room for customer input. This also allows us to outline future vision, without having to invest in it prematurely.

The most controversial of these is the last one: the hovercar, i.e. the North Star UI. It is the very antithesis of the MVP. The MVP describes what we can ship ASAP, whereas the North Star describes the most idealized goal, one we may never be able to ship.

It is easy to dismiss that as a waste of time, a purely academic exercise. “We’re all about shipping. Why would we spend time on something that may not even be feasible?” I hear you cry in Agile.

Stay with me for a moment, and please try to keep an open mind. Paradoxical as it may sound, fleshing out your North Star can actually save you time. How? Start counting.

Core Idea

At its core, this framework is about breaking down tough product design problems into three more manageable components:

  1. North Star: What is the ideal solution?
  2. Constraints: What prevents us from getting there right now?
  3. Compromises: How close can we reasonably get given these constraints?

One way to frame it is that 2 & 3 are the product version of tech debt.[1]

It’s important to understand what constraints are fair game to ignore for 1 and which are not. I often call these ephemeral or situational constraints. They are constraints that are not fundamental to the product problem at hand, but relate to the environment in which the product is being built and could be lifted or change over time. Things like:

  • Engineering resources
  • Time
  • Technical limitations (within reason)
  • Performance
  • Backwards compatibility
  • Regulatory requirements

Unlike ephemeral constraints, certain requirements are part of the problem description and cannot be ignored. Some examples from the case studies below:

While these may be addressed differently in different solutions, it would be an oxymoron to have a North Star that did not take them into account.

Benefits

1. It makes hard product problems tractable

Nearly every domain of human endeavor has a version of divide and conquer: instead of solving a complex problem all at once, break it down into smaller, manageable components and solve them separately. Product design is no different.

This process really shines when you’re dealing with the kinds of tough product problems where at least two of these questions are hard, so breaking it down can do wonders for reducing complexity.

2. It makes the product design process robust and adaptable

By solving these components separately, our product design process can more easily adapt to changes.

I have often seen “unimplementable” solutions become implementable down the line, due to changes in internal or external factors, or simply because someone had a lightbulb moment.

By addressing these components separately, when constraints get lifted all we need to reevaluate is our compromises. But without this modularization, our only solution is to go back to the drawing board. Unsurprisingly, companies often choose to simply miss out on the opportunity, because it’s cheaper (or seems cheaper) to do so.

3. It facilitates team alignment by making the implicit, explicit

Whether you realize it or not, every shipping goal is always derived from the North Star, like peeling layers off an onion. In some contexts the process of breaking down a bigger shipping goal into milestones that can ship independently is even called layering.

The process is so ingrained, so automatic, that most product designers don’t realize they are doing it. They go from hovercar to car so quickly they barely realize the hovercar was there to begin with. Thinking about the North Star is taboo — who has time for daydreaming? We must ship, yesterday!

But the hovercar is fundamental. Without it, there is no skateboard — you can’t reduce the unknown. When designing it is not an explicit part of the process, the result is that the main driver of all product design decisions is something that can never be explicitly discussed and debated like any other design decision. In what universe is that efficient?

A skateboard might be a good MVP if your ultimate vision is a hovercar, but it would be a terrible minimum viable cruise ship — you might want to try a wooden raft for that.

A wooden raft, then a simple sailboat, then a speedboat, then a yacht, and finally a ship.

A skateboard may be a great MVP for a car, but a terrible MVP for a cruise ship.

Making the North Star taboo doesn’t make it disappear (when did that ever work?).

It just means that everyone is following a different version of it. And since MVPs are products of the North Star, this will manifest as difficulty reaching consensus at every step of the way.

The product team will disagree on whether to ship a skateboard or a wooden raft, then on whether to build a scooter or a simple sailboat, then on whether to work on a speedboat or a yacht, and so on. It will seem like there is so much disconnect that every decision is hard, but there is actually only one root disconnect that manifests as multiple because it is never addressed head on.

Two people arguing. One has a speech bubble with a skateboard, the other a speech bubble with a wooden raft. The first also has a thought bubble with a car, the second a thought bubble with a ship.

When the North Star is not clearly articulated, everyone has their own.

Here is a story that will sound familiar to many readers:

A product team is trying to design a feature to address a specific user pain point. Alice has designed an elegant solution that addresses not just the problem at hand, but several prevalent longstanding user pain points at once — an eigensolution. She is aware it would be a little trickier to implement than other potential solutions, but the increase in implementation effort is very modest, and easily offset by the tremendous improvement in user experience. She has even outlined a staged deployment strategy that allows it to ship incrementally, adding value and getting customer feedback earlier.

Excited, she presents her idea to the product team, only to hear engineering manager Bob dismiss it with “this is scope creep and way too much work, it’s not worth doing”. However, what Bob is actually thinking is “this is a bad idea; any amount of work towards it is a waste”. The design session is now derailed; instead of debating Alice’s idea on its merits, the discussion has shifted towards costing and/or reducing effort. But this is a dead end because the amount of work was never the real problem. In the end, Alice wants to be seen as a team player, so she backs off and concedes to Bob’s “simpler” idea, despite her worries that it is overfit to the very specific use case being discussed, and the product is now worse.

Arguing over effort feels safer and less confrontational than debating vision — but is often a proxy war. Additionally, it is not productive. If the idea is poor, effort is irrelevant. And once we know an idea is good and believe it to our core, we have more incentive to figure out implementation, which often proves to be easier than expected once properly investigated. Explicitly fleshing out the Hovercar strips away the noise and brings clarity.

When we answer the questions above in order and reach consensus on the North Star before moving on to the compromises, we know what is an actual design decision and what is a compromise driven by practical constraints. Articulating these separately allows us to discuss them separately. It is very hard to evaluate tradeoffs collaboratively if you are not on the same page about what we are trading off and how much it’s worth. You need both the cost and the benefit to do a cost-benefit analysis!

Additionally, fleshing the North Star out separately ensures that everyone is on the same page about what is being discussed. All too often have I seen early design sessions where one person is discussing the skateboard, another the bicycle, and a third one the hovercar, no-one realizing that the reason they can’t reach consensus is that they are designing different things.

4. It can improve the MVP via user testing

Conventional wisdom is that we strip down the North Star to an MVP, ship that, then iterate based on user input. With that process, our actual vision never really gets evaluated and by the time we get to it, it has already changed tremendously.

But did you know you can actually get input from real users without writing a single line of code?

Believe it or not, you don’t need to wait until a UI is prototyped to user test it. You can user test even a low-fi paper prototype or a wireframe. This is widely known in usability circles, yet somehow entirely unheard of outside the field. The user tells you where they would click or tap at every step, and you mock the UI’s response by physically manipulating the prototype or showing them a wireframe of the next stage.

Obviously, this works better for some types of products than others. It is notably hard to mock rich interactions or UIs with too many possible responses. But when it does work, its Impact/Effort ratio is very high; you get to see whether your core vision is on the right track, and adjust your MVP accordingly.

It can be especially useful when there are different perspectives within a team about what the North Star might be, or when the problem is so novel that every potential solution is low-confidence. No-one’s product intuition is always right, and there is no point in evaluating compromises if it turns out that even the “perfect” solution was not actually all that great.

5. It paves the way for future evolution

So far, we have discussed the merits of designing our North Star, assuming we will never be able to ship it. However, in many cases, simply articulating what the North Star is can bring it within reach. It’s not magic, just human psychology.

Once we have a North Star, we can use it to evaluate proposed solutions: How do they relate to it? Are they a milestone along a path that ends at the North Star? Do they actively prevent us from ever getting there? Prioritizing solutions that get us closer to the North Star can be a powerful momentum building tool.

Humans find it a lot easier to take one more step along a path they are already on than to take the first step on an entirely new path. This is well-established in psychology and often used as a technique for managing depression or executive dysfunction. However, it applies to anything that involves humans — and that includes product design.

Once we’re partway there, it naturally begs the question: can we get closer? How much closer? Even if we can’t get all the way there, maybe we can close enough that the remaining distance won’t matter. And often, the closer you get, the more achievable the finish line gets.

In fact, sometimes simply reframing the North Star as a sequence of milestones rather than a binary goal can be all that is needed to make it feasible. For an example of this, check out the CSS Nesting case study below.

Case studies

In my 20 years of product design, I have seen ephemeral constraints melt away so many times that I have learned to interpret “unimplementable” as “kinda hard; right now”. Below are two examples from my own experience that I find particularly relevant: one around survey UI, and one around a CSS language feature.

Context Chips and the Power of Engineering Momentum

Illustration of context chips

The case study is described at length in Context Chips in Survey Design: “Okay, but how does it feel?”. In a nutshell, the relevant bits are:

  • Originally, I needed to aggressively prioritize due to minimal engineering resources, which led me to design an extremely low-effort solution which still satisfied requirements.
  • The engineer hated the low-effort idea so much, he prototyped a much higher-effort solution in a day, backend and all. Previously, this would have been entirely out of the question.
  • Once I took the ephemeral constraints out of the question, I was able to design a much better, novel solution, but it got pushback on the basis of effort.
  • Prototyping it allowed us to user test it, which revealed it performed way better than alternatives.
  • Once user testing built engineering momentum and the implementation was more deeply looked into, it turned out it did not actually require as much effort as initially thought.

Here is a dirty little secret about software engineering (and possibly any creative pursuit): neither feasibility nor effort are fixed for a given task. Engineers are not automatons that will implement everything with the same energy and enthusiasm. They may implement product vision they disagree with, but you will be getting very poor ROI out of their time.

Investing the time and energy to get engineers excited can really pay dividends. When good engineers are excited, they become miracle workers.

In fact, engineering momentum is often all that is needed to make the infeasible feasible. It may seem hard to fit this into the crunch of OKRs and KPIs, but it’s worth it; the difference is not small, it is orders of magnitude. Things that were impossible or insurmountable become feasible, and things that would normally take weeks or months get done in days.

One way to build engineering momentum is to demonstrate the value and utility of what is being built. All too often, product decisions are made in a vacuum, based on gut feelings and assumptions about user needs. Backing them up with data, such as usability testing sessions, is an excellent way to demonstrate (and test!) their basis. When possible, having engineers observe user testing sessions firsthand can be much more powerful than secondhand reports.

Relaxed CSS Nesting and the Power of Evolution

Sometimes high effort things just take a lot of hard work and there is no way around it. Other times, feasibility is just one good idea away.

One of my favorite examples, and something I’m proud to have helped drive is the relaxed CSS Nesting syntax, now shipped in every browser. It is such an amazing case study on the importance of having an explicit and consensus-backed North Star UI [2].

In a nutshell, CSS nesting was a (then new) CSS syntax that let developers better organize their code through reducing repetition.

table.browser-support {
	border-collapse: collapse;
}
table.browser-support th,
table.browser-support td {
	border: 1px solid silver;
}
@media (width < 600px) {
	table.browser-support,
	table.browser-support tr,
	table.browser-support th,
	table.browser-support td {
		display: block;
	}
}
table.browser-support th {
	border: 0;
}
table.browser-support td {
	background: yellowgreen;
}
table.browser-support td:empty {
	background: red;
}
table.browser-support td > a {
	color: inherit;
}

table.browser-support {
	border-collapse: collapse;

	@media (width < 600px) {
		&, tr, th, td {
			display: block;
		}
	}

	th, td {
		border: 1px solid silver;
	}
	th {
		border: 0;
	}
	td {
		background: yellowgreen;

		&:empty {
			background: red;
		}

		> a {
			color: inherit;
		}
	}
}
Example of CSS code without nesting (top) and with nesting (bottom). Which one is easier to read?

This is one of the rare cases where the North Star was well known in advance, since the syntax was already well established in developer tooling (CSS preprocessors). Instead, the big challenge was navigating the practical constraints, since CSS implemented in browsers has different performance characteristics, so a syntax that is feasible for tooling may be out of reach for a browser. In this case, the North Star syntax had been ruled out by browser engineers due to prohibitive parsing performance [3], so we had to design a different, more explicit syntax that could be parsed more efficiently.

At this point, it is important to note that CSS Nesting is a feature that is very heavily used once available. Conciseness and readability are paramount, especially when conciseness is the sole purpose of the feature in the first place!

Initial attempts for a syntax that satisfied these technical requirements introduced a lot of noise, making the syntax tedious to write and noisy to read. Even worse, these attempts were actively incompatible with the North Star syntax, as well as other parts of the language (namely, the @scope rule). This meant that even if the North Star syntax became feasible later, CSS would need to forever support syntax that would then have no purpose, and would only exist as a wart from the past, just like HTML doctypes.

Once Google became very keen to ship Nesting (driven by State of CSS 2022, which showed it as the top missing CSS feature), a small subset of the CSS Working Group, led by Elika Etemad and myself, met to explore alternatives, and produced four competing proposals. The one that the group voted to adopt [4] was the one I designed explicitly to answer the question: if the North Star syntax is out of the question right now, what is the largest subset of it that is feasible?

Once we got consensus on this intermediate syntax, I started exploring whether we could get any closer to the 🌟, even proposing an algorithm that would reduce the number of cases that required the slower parsing to essentially an edge case. A few other WG members joined me, with my co-TAG member Peter Linss being most vocal.

This is a big advantage of North Star compatible designs: it is much easier to convince people to move a little further along on the path they are already on, than to move to a completely different path. With a bit of luck, you may even find yourself implementing an “infeasible” North Star without even realizing it, one little step at a time.

We initially faced a lot of resistance from browser engineers, until eventually a brilliant Google engineer, Anders Ruud, and his team experimented with variations of my proposed algorithm and actually closed in on a way to implement the North Star syntax in Chrome. The rest, as they say, is history. 🌟

Conclusion

Hopefully by now you’re convinced about the value of investing time in reaching alignment on an explicit North Star that has buy-in from the entire product team.

A common misconception is that the North Star is a static goal that prevents you from adapting to new data, such as customer feedback. But often, your North Star will change a lot over time, and that’s okay. Having an initial destination does not take away your ability to course correct. That’s not giving up, it’s adapting.

And yes, it’s true that many product teams do use a vision-led approach — they just start from the car, not the hovercar. While that confers some of the benefits above, there is still an implicit reduction happening, because the hovercar is still there in the back of their mind.

Note that for this framework to be beneficial, it is important that everyone is on the same page and understands the steps, benefits, and goals of this approach. Co-designing a North Star with a team that sees the process as a pointless thought experiment will only add friction and will not confer any of these benefits. Also, this is a mindset that can only work when applied top-down. If you are not a decision-maker at your place of work and leadership is not on board, you will have a very hard time if you try to push this ad hoc, without first getting leadership buy-in. You can try sending them a link to this blog post!

If this post resonated, please share your own case studies in the comments. Or, if you decide to give this framework a try, I’d love to hear how it went!


  1. Indeed, looks like I’m not the first to draw a parallel between the two! ↩︎

  2. I even did an entire talk about it at Web Unleashed, with a lot more technical detail than what I have included here. ↩︎

  3. For any compilers geeks out there who want all the deets: it required potentially unbounded lookahead, since there is no fixed number of tokens after which a parser can tell the difference between a selector and a declaration. ↩︎

  4. Originally dubbed “Lea’s proposal”, and later “Non-letter start proposal”, but became known as Option 3 from its position among the five options considered (including the original syntax). ↩︎

Link bug.

I added a links section to my website. Here’s how I made it, and why.

My First Impressions of Gleam

I’m looking for a new programming language to learn this year, and Gleam looks like the most fun. It’s an Elixir-like language that supports static typing.

I read the language tour, and it made sense to me, but I need to build something before I can judge a programming language well.

I’m sharing some notes on my first few hours using Gleam in case they’re helpful to others learning Gleam or to the team developing the language.