Construction Lines
I recently stumbled across The Oatmeal’s series on Creativity. While all of it is spot on, the part on erasers hit especially hard.
“There is a lot of shame associated with backpedaling; things like quitting your job, getting a divorce, or simply starting over are considered shameful.
But forward isn’t always progress.
And backward isn’t always regress.
Sometimes going down the wrong path isn’t a mistake — it’s a construction line.”
It was exactly what I needed to hear. You see, only a few days prior, Font Awesome and I had parted ways — the end of a short, but transformative chapter. I’m proud of what we built together, and grateful for what I learned along the way. But it was time to move on.
Jobs are a lot like relationships. They often start with infatuation — and end with the realization that you’re simply not compatible, and that’s no-one’s fault. Letting go always stings, even when it’s the right call. There’s always grief: when you’re not ready to move on, you grieve the bond; when you are, you grieve your expectations. But every ending leaves behind clarity — about who you are, and what makes you happy.
The pursuit of happiness
Today is my 39th birthday — and this summer marks 20 years since I first dipped my toes into this industry. Naturally, I’ve been doing a lot of reflection.
As is typical for ADHDers, I have done a ton of different things, and built a diverse skillset as a result. But what made me happiest? The list of highs went a bit like this:
- Entrepreneurship: Co-founding a startup and driving it to become a household name (in Greece — this was 2008!)
- Consulting: Being a full-time consultant, speaker, and author, traveling the world and jumping from one exciting gig to another
- Academia:[1] Pushing the boundaries of Human-Computer Interaction at MIT and teaching MIT CS students to care about people.
All had three things in common: autonomy, breadth, and impact.
These three things have been the biggest predictors of happiness for me — far more than income [2] or work-life balance, which are the usual suspects.
I used to aspire to work-life balance in the same way I aspired to frequent exercise — because it was good for me, not because it gave me joy. Eventually I realized that what makes me happy isn’t working less, it’s loving what I do, and feeling it matters. Working less cannot transform how your work makes you feel; it can only dampen the effects. But dilution doesn’t turn misery into joy — at best it just makes it tolerable. Don’t get me wrong, poor WLB can absolutely make you miserable: when long hours are an externally imposed expectation, not an internal drive fueled by passion. As with many things in life, it’s enthusiastic consent that makes all the difference.
Then there’s breadth. Most jobs try to box you in: PM or engineer? Scientist or practitioner? UX Researcher or designer? DevRel or standards? And I’m like…
Aurora is my spirit animal 🫶🏼 Source
It’s silly that people are forced to choose, and present themselves as less than to seem more attractive to hiring managers. To use myself as an example:
- Web architecture: I have designed several web technologies that have shipped across all browsers. I’ve spent 4 years in the TAG, reviewing new web technologies across the web platform, and eventually leading the Web Platform Design Principles effort.
- Product/usability/HCI: Prior to working as Product Lead at Font Awesome, I earned a PhD at MIT in Human-Computer Interaction with a minor in Entrepreneurship & Innovation. I published peer-reviewed papers in top-tier HCI conferences, and co-created/taught a course on usability & web technologies that is now a permanent subject. I have run user research for scientific and industry projects. I have started several open source projects, some used by millions. In the more distant past, I co-founded a (then) well-known social startup in my home country, Greece, and ran product, engineering, and design for three years (six if you count pre-incorporation).
- DevRel: I’ve given over 100 conference talks, published a bestselling book on CSS, and was the first devrel hire at W3C back in 2012. I have built dozens of apps and polyfills that stimulated developer interest in new Web features and drove browser adoption.
Should I present myself as a web architecture expert, a usability/product person, an HCI researcher, or a Developer Advocate?
What a pointless dilemma if I ever saw one!
Combining skills across different areas is a strength to be celebrated, not a weakness to be swept under the rug. The crossover between skills is where the magic happens. Off the top of my head:
- Understanding usability principles has made me a far better web standards designer.
- Web standards work is product design on hard mode. After the impossibly hard constraints and tradeoffs you deal with when designing APIs for the Web platform, regular product problems seem like a cakewalk (we have versions? And we can actually change things?? And we have reliable metrics?!? And the stakeholders all work at the same company?!? 🤯).
- They often feed into each other: DevRel work made me a better communicator in everything I do. Usability made me a better speaker and educator[3], so a better developer advocate too.
- Leading the Web Platform Design Principles convinced me that explicit design principles are immensely useful for all product work.
- Web standards taught me that contrary to popular belief, you do not need a benevolent dictator to ship. But you do need a good process: consensus does not magically grow on trees; building it is its own art.
Lastly, impact does not have to be about solving world hunger or curing cancer. Making people’s lives a little better is meaningful impact too. It all boils down to:
Total Impact = Individual Impact × Reach
You can achieve the same total impact by improving the lives of a few people a lot, or the lives of many people a little. For example, my work on web standards has been some of the most fulfilling work I’ve ever done. Its Individual Impact is small, but the Reach is millions, since all front-end developers out there use the same web platform.
What’s next?
Since consulting and entrepreneurship have been my happiness peaks, I figured I’d try them again. Yes, both at once, because after all, we’ve already established that WLB is a foreign concept 🤣
My apprentice Dmitry and I have been in high gear building some exciting things, which I hope to be able to share soon, and I feel more exhilarated than I have in years. I had missed drawing my own lines.
In parallel, I’m taking on select consulting work, so if you need help with certain challenges, or to level up your team around web architecture, CSS, or usability, get in touch.
Don’t get me wrong, I’m not closing the door to full-time roles. I know there are roles out there that value passion and offer the kind of autonomy, breadth, and impact that would let me thrive. It’s the ROI of digging through rubble to find them that gives me pause — as a product person at heart, I/E tradeoffs are top of mind. But if you have such a unicorn, I’m all ears.
I also finally took a few small steps to make my pro bono work financially sustainable, a long overdue todo item. Both pages still need work, but you can now support my writing via ko-fi[4], and my open source work via GitHub Sponsors. I made separate pages for my two most popular projects, Prism (nearing 1.8 billion total npm installs! 🤯) and Color.js. This is as much about prioritization as it is about sustainability: money is an excellent signal about what truly matters to people.
I don’t have a polished “next” to announce yet.
But I’m exactly where I need to be.
Sometimes the clearest lines are the ones drawn after you erase.
Many things wrong with academia, but the intellectual freedom is unparalleled, and it makes up for a lot. ↩︎
See also Alan Watts’ “What if money was no object?” — a classic, but still relevant. ↩︎
Teaching is absolutely a form of UI design — a UI that exposes your knowledge to students — the users. There are many similarities between how good educators design their material and how good UI designers design interfaces. ↩︎
Thanks Dan Abramov for the wording inspiration (with permission). These things are so hard. ↩︎
Style-observer: JS to observe CSS property changes, for reals

I cannot count the number of times in my career I wished I could run JS in response to CSS property changes, regardless of what triggered them: media queries, user actions, or even other JS.
Use cases abound. Here are some of mine:
- Implement higher level custom properties in components, where one custom property changes multiple others in nontrivial ways (e.g. a `--variant: danger` that sets 10 color tokens).
- Polyfill missing CSS features
- Change certain HTML attributes via CSS (hello `--aria-expanded`!)
- Set CSS properties based on other CSS properties without having to mirror them as custom properties
The most recent time I needed this was to prototype an idea I had for Web Awesome, and I decided this was it: I’d either find a good, bulletproof solution, or I would build it myself.
Spoiler alert: Oops, I did it again
A Brief History of Style Observers
The quest for a JS style observer has been long and torturous. Many have tried to slay this particular dragon, each getting us a little bit closer.
The earliest attempts relied on polling, and thus were also prohibitively slow. Notable examples were `ComputedStyleObserver` by Keith Clark in 2018 and `StyleObserver` by PixelsCommander in 2019.
Jane Ori first asked “Can we do better than polling?” with her css-var-listener in 2019. It parsed the selectors of relevant CSS rules, and used a combination of observers and event listeners to detect changes to the matched elements.
Artem Godin was the first to try using transition events such as `transitionstart` to detect changes, with his css-variable-observer in 2020. In fact, for CSS properties that are animatable, such as `color` or `font-size`, using transition events is already enough.
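For illustration, here is a minimal sketch of that idea for an animatable property (not Artem’s actual code, and assuming a hypothetical `.watched` element): give the property a near-instant transition, then listen for transition events to learn when its computed value changed, no matter what caused the change.

```js
// Minimal sketch: detect changes to an animatable property (color) via transition events.
const el = document.querySelector(".watched");

// A near-zero duration is enough for the browser to fire transition events on every change.
el.style.transition = "color 0.001s";

el.addEventListener("transitionend", (e) => {
  if (e.propertyName === "color") {
    console.log("color changed to", getComputedStyle(el).color);
  }
});
```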
But what about the rest, especially custom properties which are probably the top use case?
In addition to pioneering transition events for this purpose, Artem also concocted a brilliant hack to detect changes to custom properties: he stuffed them into `font-variation-settings`, which is animatable regardless of whether the axes specified corresponded to any real axes in any actual variable font, and then listened to transitions on that property. It was brilliant, but also quite limited: it only supported observing changes to custom properties whose values were numbers (otherwise they would make `font-variation-settings` invalid).
The next breakthrough came four years later, when Bramus Van Damme pioneered a way to do it “properly”, using the (then) newly Baseline `transition-behavior: allow-discrete`, after an idea by Jake Archibald. His @bramus/style-observer was the closest we’ve ever gotten to a “proper” general solution.
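To make the core idea concrete, here is a minimal sketch of that technique in isolation (not Bramus’s actual code, again using a hypothetical `.watched` element, and glossing over the browser bugs discussed below): give the custom property a discrete transition, and use transition events to find out when it changed.

```js
// Minimal sketch of the allow-discrete technique for observing a custom property.
const el = document.querySelector(".watched");

// A (near-)zero-duration discrete transition: changes to --variant now fire transition events.
el.style.transition = "--variant 0.001s allow-discrete";

el.addEventListener("transitionend", (e) => {
  if (e.propertyName === "--variant") {
    console.log("--variant is now", getComputedStyle(el).getPropertyValue("--variant").trim());
  }
});

// A change from anywhere (CSS, media queries, or JS) now triggers the listener:
el.style.setProperty("--variant", "danger");
```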
Releasing his work as open source was already a great service to the community, but he didn’t stop there. He stumbled on a ton of browser bugs, which he did an incredible job of documenting and then filing. His conclusion was:
“Right now, the only cross-browser way to observe Custom Properties with @bramus/style-observer is to register the property with a syntax of `<custom-ident>`. Note that `<custom-ident>` values can not start with a number, so you can’t use this type to store numeric values.”
Wait, what? That was still quite the limitation!
My brain started racing with ideas for how to improve on this. What if, instead of trying to work around all of these bugs at once, we detect them so we only have to work around the ones that are actually present?
World, meet style-observer
At first I considered just sending a bunch of PRs, but I wanted to iterate fast, and change too many things.
I took the fact that the domain observe.style was available as a sign from the universe, and decided the time had come for me to take my own crack at this age-old problem, armed with the knowledge of those who came before me and with the help of my trusty apprentice Dmitry Sharabin (hiring him to work full-time on our open source projects is a whole separate blog post).
One of the core ways `style-observer` achieves better browser support is that it performs feature detection for many of the bugs Bramus identified. This way, code can work around them in a targeted way, rather than the same code having to tiptoe around all possible bugs. As a result, it basically works in every browser that supports `transition-behavior: allow-discrete`, i.e. 90% globally.
- Safari transition loop bug (#279012): `StyleObserver` detects this and works around it by debouncing.
- Chrome unregistered transition bug (#360159391): `StyleObserver` detects this bug and works around it by registering the property, if unregistered.
- Firefox no initial `transitionstart` bug (#1916214): By design, `StyleObserver` does not fire its callback immediately (i.e. it works more like `MutationObserver` than like `ResizeObserver`). In browsers that do fire an initial `transitionstart` event, it is ignored.
- In addition, while working on this, we found a couple more bugs.
Beyond better browser support, it also supports throttling and aggregation, and plays more nicely with existing transitions.
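To give a sense of what using it looks like, here is a rough usage sketch; the exact API, options, and record shape are documented at observe.style, so treat this as illustrative rather than authoritative.

```js
// Rough usage sketch (see observe.style for the actual API and options).
import StyleObserver from "style-observer";

const observer = new StyleObserver((records) => {
  for (const record of records) {
    // Each record describes which property changed on which element (field names may differ).
    console.log(record.property, "changed on", record.target);
  }
});

// Observe one or more properties on an element, regardless of what causes them to change.
observer.observe(document.querySelector(".card"), ["--variant", "color"]);
```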
Since this came out of a real need, to (potentially) ship in a real product, it has been exhaustively tested, and comes with a testsuite of > 150 unit tests (thanks to Dmitry’s hard work).
If you want to contribute, one area we could use help with is benchmarking.
That’s all for now! Try it out and let us know what you think!
Gotta end with a call to action, amirite?
- Docs: observe.style
- Repo: leaverou/style-observer
- NPM: style-observer
Context Chips in Survey Design: “Okay, but how does it *feel*?”
One would think that we’ve more or less figured survey UI out by now. Multiple choice questions, checkbox questions, matrix questions, dropdown questions, freeform textfields, numerical scales, what more could one possibly need?!
And yet, every time Google sponsored me to lead one of the State Of … surveys, and especially the inaugural State of HTML 2023 Survey, I kept hitting the same wall; I kept feeling that the established options for answering UIs were woefully inadequate for balancing the collection of good insights with minimal friction for end-users.
The State Of surveys used a completely custom survey infrastructure, so I could often (but not always) convince engineering to implement new question UIs. After joining Font Awesome, I somehow found myself leading yet another survey, despite swearing never to do this again. 🥲 Alas, building a custom survey UI was simply not an option in this case; I had to make do with the existing options out there [1], so I felt this kind of pain to my core once again.
So what are these cases where the existing answering UIs are inadequate, and how could better ones help? I’m hoping this case study will be Part 1 of a series on how survey UI innovations can help balance tradeoffs between user experience and data quality. This one is definitely the one I’m most proud of: it was such a bumpy ride, but it was all worth it in the end.
The Problem
For context, the body of State Of surveys is a series of “Feature questions”, which present the respondent with a certain web platform feature and ask if they had heard of it or used it. Feature questions look like this:
Respondents get a score in the end, based on how many of these they had heard of or used. Each survey had dozens of these questions. Based on initial estimates, State of HTML was going to have at least fifty.
Respondents love these questions. They learn about new things they may not have heard of, and get to test their knowledge. But also, from the survey designer’s perspective, they gamify a (very long) survey, increasing completion rates, and provide users incentive to share their score on social media, spreading the word.
One would expect that they also provide valuable data, yet browser vendors had repeatedly mentioned that this data was largely useless to them. Surveys were all about what people felt, not what they knew or had used — they had better ways to gauge those. Instead, the reason they funneled thousands into funding these surveys every year was the 1-2 pain point questions towards the end. That was it. Survey data on experience and awareness could be useful, but only if it was accompanied by subjective sentiment data: if they hadn’t used it or heard about it, were they interested? If they had used it, how did it feel?
As an attempt to address this feedback, a button that opened a freeform comment field had been introduced the year prior, but response rates were abysmally low, starting from 0.9% for the first question [2] and dropping further along the way. This was no surprise to me: freeform questions have a dramatically lower response rate than structured questions, and hidden controls get less interaction (“out of sight, out of mind”). But even if they had a high response rate, freeform comments are notoriously hard to analyze, especially when they are so domain specific.
Ideation
Essentially, the data we needed to collect was a combination of two variables: experience and sentiment. Collecting data on two variables is common in survey design, and typically implemented as a matrix question.
|  | 🤷 | 👍 | 👎 |
|---|---|---|---|
| Never heard of it | | | |
| Heard of it | | | |
| Used it | | | |
Indeed, if there were only a couple such questions, a matrix could have been acceptable. But …could you imagine filling out 50 of these?
An acceptable solution needed to add minimal friction for end-users: there were at least 50 such questions, so any increase in friction would quickly add up — even one extra click was pushing it. And we needed a sufficiently high response rate to have a good CI. But it also needed to facilitate quantitative data analysis. Oh, and all of that should involve minimal engineering effort, as the (tiny) engineering team was already stretched thin.
Did I hear anyone say overconstrained? 😅
Idea 1: Quick context
Initially, I took these constraints to heart. Misguided as it may have been, the comment field and the infrastructure around it already existed, so I designed a UI that revealed relevant positive/negative sentiment options using contextual progressive disclosure. These inserted predefined responses into the comment field with a single click.
Being a purely client-side interaction meant it could be implemented in a day, and it still kept end-user friction at bay: providing sentiment was optional and only required a single click.
In theory, quantitative data analysis was not optimally covered, as freeform responses are notoriously hard to analyze. However, based on the psychology of user behavior, I hypothesized that the vast majority of users would not edit these at all, a minority would append context, and an even tinier minority would actually edit the responses. This meant we could analyze them via simple string matching and only lose a few false negatives.
I was very proud of myself: I had managed to design a solution that satisfied all constraints, a feat that initially seemed impossible! Not to mention this design gently guided users towards using the comment field, which could motivate them to add even more context.
Yet, when I presented my mocks to the team, engineering hated it with a passion. The lead engineer (who was also the project founder) found the idea of turning a structured interaction into unstructured data deeply unsettling. So much so that it motivated him to implement a whole backend to store these followups properly, something I had initially thought was out of the question.
So now what? Back to the drawing board, but with one constraint lifted!
Ideas 2 & 3: Followups and sentiment radios


This new backend came with a UI proposal that raised red flags for both me and the Google PM I was collaborating with (one of the survey’s core stakeholders, but not the main one). Even seeing the followup UI required an extra click, so it was guaranteed to have a low response rate. It would have been better than the 0.9% of the comment field (clicking is easier than typing!), but still pretty low (I would estimate < 15%). And even when users were intrinsically motivated to leave feedback, two clicks and a popover was a steep price to pay.
Another idea came from the Google PM: sentiment radios. It was an attempt to simplify the interaction by framing it as a two step process: first experience, then sentiment, through radio buttons that slid down once a main answer was selected. However, I was very concerned that such a major UI shift after every single answer would quickly become overwhelming over the course of the survey.
Idea 4: Context chips
Back to the drawing board, I asked myself: if I had infinite engineering resources, what UI would I design? The biggest challenge was reducing friction. All ideas so far had required at least one extra (optional) click to select sentiment. Could we do better? What if users could select both experience and sentiment with a single click?
Guided by this, I designed a UI where selecting sentiment is done via “context chips” which are actually part of the answer, so clicking them also selects the answer they are accompanying, allowing users to express an answer across both variables with a single click, or just select the answer itself to express no sentiment. To reduce visual clutter, these only faded in on hover. Additionally, clicking on the selected chip a second time would deselect it, fixing a longstanding UX issue with radio buttons [3].
Over the course of designing this, I became so convinced it was the right solution, that I implemented a high fidelity prototype myself, complete with code that could be easily adapted to the infrastructure used by the survey app.
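To give a rough idea of the interaction model (a simplified sketch with hypothetical markup, not the actual prototype code): each chip lives inside an answer row, clicking a chip also selects that answer’s radio button, and clicking the selected chip again clears the sentiment.

```html
<!-- Simplified sketch of one answer row with context chips (hypothetical markup). -->
<div class="answer">
  <label>
    <input type="radio" name="experience" value="used-it">
    Used it
  </label>
  <button type="button" class="chip" data-sentiment="positive">👍 Positive experience</button>
  <button type="button" class="chip" data-sentiment="negative">👎 Negative experience</button>
</div>

<script>
document.addEventListener("click", (e) => {
  const chip = e.target.closest(".chip");
  if (!chip) return;

  const answer = chip.closest(".answer");

  // Clicking a chip also selects the answer it belongs to: one click answers both variables.
  answer.querySelector("input[type=radio]").checked = true;

  // Clicking the already-selected chip deselects it, since sentiment is optional.
  const wasSelected = chip.classList.contains("selected");
  answer.querySelectorAll(".chip").forEach((c) => c.classList.remove("selected"));
  chip.classList.toggle("selected", !wasSelected);
});
</script>
```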
The context chips prototype on desktop and mobile.
There were so many things I loved about this design, even beyond the core idea of answering both variables with a single click. There were no layout shifts, the followups were in close proximity to the main answer, and the styling of the chips helped build a visual association to reduce friction even more as you go. I was not a huge fan of the mobile version, but I couldn’t think of a much better way to adapt this UI to mobile.
Early alternative concept that supported followups. This was deemed too complicated and was abandoned early on.
Reception of context chips was not what I had hoped at first. I had expected pushback due to the engineering effort needed, but folks also had other concerns: that users would find things appearing on hover distracting and feel “ambushed”, that the UI was too “weird”, that users would not discover the 1-click interaction and use it as a two-step process anyway, and that response rate would be low because these chips were not visible upfront.
Mini-feature questions: Context Chips + Checkboxes?
Around the same time as designing context chips, I had a relevant realization: we don’t actually need to know both awareness and usage for all features.
For old, widely supported features, awareness doesn’t matter, because even when it’s low, it has plateaued. And for features that are so new they have not yet been implemented in browsers, usage is largely meaningless. For these cases, each feature only has two states, and thus experience can be expressed with a checkbox! This would allow us to combine questions about multiple features in one, and we could still use context chips, albeit a little differently:
The mini features prototype on desktop and mobile.
While these could be used for questions that either discern usage or awareness, we decided to stick to the former, as there was a (valid) concern that having mini-feature questions whose checked state meant different things could be confusing and lead to errors. That way, only old, lower-priority features would be relegated to this template, and new features which tend to be higher priority for browser vendors would still get the full UI, comments and all. Instead, to improve the experience for cutting edge features, we introduced a “Not implemented” tag next to the “Used it” option.
One disadvantage of the mini feature UI is that due to the way context chips work, it is not possible to select sentiment for features you have not used: once you click on a chip, it also selects the feature, as if you had clicked on its label. I guess it could be possible to click on a chip and then uncheck the feature, but that would be a very weird interaction.
Idea 5: Existing 5-point question template
At this point, the lead engineer dredged up a question template that had been used in other surveys to ask about the respondent’s experience with various types of tooling. Instead of separating experience and sentiment, it used a 5-point scale where each answer except the first answered both questions.

The existing 5-point question template.
The eng lead was sold: zero engineering effort! The Google PM was also sold: 100% response rate! (since it was not possible to avoid expressing sentiment for features you had heard or used).
I had serious reservations.
- There are arguments for even numbered Likert scales (no neutral option), but these always involve scales of at least 4 points. If you force people to select between two states, positive or negative, you’re simply going to get garbage data. Neutral votes get pushed into positive votes, and the data around positive sentiment becomes useless.
- These did not allow users to express sentiment for features they had not heard of, despite these questions often including enough info for users to know whether they were interested.
- I was worried that increasing the number of upfront answers to 5 would increase cognitive load too much — and even scrolling distance, by 66%!
A UX researcher we were working with even did a heuristic evaluation that somewhat favored the 5-point template mainly on the basis of being a more familiar UI. The odds seemed stacked against context chips, but the upcoming usability testing could still tip the scales in their favor.
Usability Testing to the Rescue!
Despite the lead engineer being unconvinced about the merits of context chips and being adamant that even adapting my fully functional prototype was too much work, since the prototype existed, we decided to user test it against the 5-point question and see how it compared.
We ran a within-subjects usability study with 6 participants (no, they are not too few) recruited via social media. Half of the survey used the 5-point template, the other half context chips. The order of the conditions was randomized to avoid order effects.
In addition to their actual experience, we also collected subjective feedback at the end of the survey, showing them a screenshot of each answering UI and asking how each felt.
What worked well: Context Chips
I have run many usability studies in the last ten years, and I have never seen results as resounding as this one. So much so that we unanimously agreed to switch to context chips after the 5th participant, because the scales were so tipped in favor of context chips that nothing that happened in the last session could have tipped them the other way.
The lead engineer observed some of the sessions, and this was instrumental in changing his mind. This was not a coincidence: when engineering is unconvinced that a certain UI is worth the implementation complexity, it can be a good strategy to have them observe usability testing sessions. Not only does it help prove the value to them, it also builds long-term user empathy, which makes future consensus easier. Given the unfortunate lack of HCI prioritization in Computer Science curricula, this may even be their first exposure to usability testing.
All of my concerns about the 5-point template were brought up by participants of their own accord, repeatedly:
- All participants really liked being able to express sentiment, and were vocal about their frustration when they could not express it.
- All but one participant (4/5) complained about being forced into selecting a sentiment when they had no opinion.
- Some participants even mentioned that the 5-point template felt overwhelming.
Furthermore, none of the concerns about context chips were validated:
- No-one found the chips appearing on hover distracting or felt “ambushed”.
- No-one struggled to understand how to use them.
- Everyone discovered the 1-click interaction pretty fast (typically within the first 2-3 questions). But interestingly, they still chose to use it as a two-step process for some of the questions, presumably to reduce cognitive load for difficult to answer questions by breaking down the decision into two smaller ones. The fact that this UI allowed users to make their own efficiency vs cognitive load tradeoffs was an advantage I had not even considered when designing it!
- Response rate was generally high — when people did not select sentiment, it was because they genuinely had no opinion, not because they couldn’t be bothered.
What worked okay: Mini-feature Questions
Mini-feature questions did successfully help cut down response time per feature by 75%, though this came at a cost: Once more, we saw that participants really wanted to express sentiment, and were frustrated when they couldn’t, which was the case for features they had not used. Regardless, we agreed that the tradeoff was worth it for the low-priority questions we were planning to use mini-features for.
What did not work: Context Chips on Mobile
A blind spot in our testing was that we did not test the UI on mobile. Usability tests were conducted remotely via video call, so it was a lot easier to get participants to use their regular computers. Additionally, stats for previous surveys showed that mobile use was a much smaller percentage in these surveys than for the web in general (~25%), so we did not prioritize it.
This was a mistake in itself: using current usage stats to inform prioritization may seem like a great idea, but it can be prone to reverse causality bias. Meaning, are people not using these surveys on mobile very much because they genuinely don’t need to, or because the experience on mobile is subpar? We often see this with accessibility too: people claim that it doesn’t matter because they don’t have users with disabilities, but often they don’t have users with disabilities because their site is not accessible!
But even if 75% of users genuinely preferred to take these surveys on desktop (which is plausible, since they are long and people often do them in increments), we should at least have done a few sessions on mobile — 25% is not negligible!
Once responses started coming in, we realized that participants had trouble understanding what the up and down arrows meant, since on mobile these are shown without labels until selected. This would have been an easy fix had it been caught early, e.g. thumbs up/down icons could have been used instead. This was not a huge issue, as their purpose becomes clear when selected, but it definitely adds some friction.
Aftermath: Context Chips in the Wild
In our usability testing, we had seen a high response rate for sentiment (% of question respondents who selected sentiment), but that is no guarantee things will play out that way post-launch as well. When participants know they are being watched they are always more willing to engage and pay a lot more attention, no matter how much you emphasize that they should act naturally when briefing them. That’s not their failing; it’s simply human nature.
Indeed, sentiment response rates in the real-world were lower than those observed in the usability study, but still high — ranging from 24% to 59% and averaging 38% (with the same median) per question, meaning that out of every ten participants that answered each question, approximately four also provided a sentiment. This was more than enough to draw conclusions. In fact, context chips were deemed such a success they were later adopted by all other State Of surveys, even at the cost of continuity with previous years.
Against expectations, participants were just as likely to express sentiment for features they had never heard of, and in fact marginally more likely than for features they had simply heard of. In general response rates were pretty uniform across all experiences:
| Experience | Sentiment response rate (average) | Sentiment response rate (median) |
|---|---|---|
| Never heard of it | 37.3% | 37.4% |
| Heard of it | 37.3% | 36.9% |
| Used it | 39.0% | 40.1% |
| Overall | 37.6% | 37.9% |
Sentiment response rates overall and by experience (per question).
As we had observed in the user study as well, participants were far more likely to express positive rather than negative sentiment (possibly a case of acquiescence bias). Here are some interesting stats:
- The feature with the most negative sentiment overall across all experiences (`<model>`) still only had 10% of respondents expressing negative sentiment for it, and still had 2.4x more positive sentiment than negative (24.7% vs 10.4%)!
- In contrast, the feature with the most positive sentiment (`<datalist>`) got a whopping 55% of respondents expressing positive sentiment (and only 4% negative).
- In fact, even the feature with the least positive sentiment (Imperative Slot Assignment 🤔) still had way more positive sentiment (17.24%) than `<model>` had negative sentiment (10%)!
- If we look at the ratio of positive over negative sentiment, it ranged from 64x (!) more positive than negative sentiment (45% vs 0.7% for landmark elements) to a mere 2.4x more positive sentiment (24.7% vs 10.4% for `<model>`), and was 11x on average (8x median).
The analysis presented in this section includes data from 20K respondents, which was most of the whole dataset (around 90%) but not all, as it was done before the survey closed.
Results Visualization
Presenting all this data was another challenge. When you have two variables, ideally you want to be able to group results by either. E.g. you may be more interested in the total negative sentiment for a feature, or how many people had used it, and the results visualization should support both. How do we design a results display that facilitates this? Thankfully the visualizations for State Of surveys were already very interactive so interactivity was not out of the question.
I was no longer involved by then but consulted in a volunteer capacity. My main advice was to use proximity for clear visual grouping, and to use a consistent visual association for bars that represented the same bit of data, both of which they followed. This was the rendering they settled on for the results:
I think in terms of functionality this works really well. The visual design could be improved to communicate IA better and appear less busy at first glance, but given there is no dedicated designer in the team, I think they did a fantastic job.
Generalizability
While this UI was originally designed to collect sentiment about the selected option in a multiple choice question, I think it could be generalized to improving UX for other types of two variable questions. Generally, it can be a good fit when we have questions that collect data across two variables and:
- The second variable is optional and lower priority than the first
- The first variable is exclusive (single) choice, i.e. not checkbox questions.
The core benefit of this approach is the reduction in cognitive load. It is well established that matrix questions are more overwhelming. This design allows questions to initially appear like a simple multiple choice question, and only reveal the UI for the second variable upon interaction. Additionally, while matrix questions force participants to decide on both variables at once, this design allows them to make their own tradeoff of cognitive load vs efficiency, treating the UI as single step or two step as they see fit.
Another benefit of this design is that it allows for option labels to be context-dependent for the second variable, whereas a matrix limits you to a single column header. In the sentiment case, the labels varied depending on whether the feature had been used (“Positive experience” / “Negative experience”) or not (“Interested” / “Not interested”).
The more such questions a survey has, the bigger the benefits — if it’s only about a couple questions, it may not be worth the implementation complexity, though I’m hoping that survey software may eventually provide this out of the box so that this is no longer a tradeoff.
That said, matrix questions do have their benefits, when used appropriately, i.e. when the two requirements listed above are not met.
Matrix questions have a big edge when users need to make a single selection per row, especially when this may be the same answer for multiple rows, which means they can just tick down a whole column. Context chips do not allow users to build this positional association, as they are not aligned vertically. They are placed right after the answer text, and thus their horizontal position varies per answer. To mitigate this, it can be useful to color-code them and maintain the same color coding consistently throughout the survey so that participants can build a visual association [4].
Context chips also enforce a clear prioritization across the two variables: the question is presented as a multiple choice question across the first variable, with the chips allowing the respondent to provide optional additional context across a second variable. Matrix questions allow presenting the two variables with the same visual weight, which could be desirable in certain cases. For example, there are cases when we don’t want to allow the respondent to provide an answer to the first variable without also providing an answer to the second.
Lessons Learned
In addition to any generalizable knowledge around survey design, I think this is also an interesting product management case study, and teaches us several lessons.
Never skimp on articulating the north star UI
Start any product design task by ignoring ephemeral constraints (e.g. engineering resources) and first reach consensus on what the optimal UI is, before you start applying constraints. Yes, you read that right. I want to write a whole post about the importance of north star UIs, because this is one of many cases over the course of my career where tight implementation constraints were magically lifted, either due to a change of mind, a change in the environment, or simply someone’s brilliant idea. Without consensus on what the north star UI is (or even a clear idea about it) you then have to go back to the drawing board when this happens.
User testing is also a consensus-building tool
You probably already know that usability testing is a great tool for improving user experience, but there is a second, more strategic hidden utility to it: consensus building.
I’ve been in way, way too many teams where UI decisions were made by hypothesizing about user behavior, which not only can miss the mark, but also leaves no way forward when people disagree. What do you do, hypothesize harder? When there is user testing data, it is much harder to argue against it.
This is especially useful in convincing engineering that a certain UI is worth the implementation complexity, and having engineers observe usability testing sessions can be an educational experience for many.
Heuristic evaluations are not a substitute for usability testing
There are many things to like about heuristic evaluations and design reviews. They can be done by a usability expert alone and can uncover numerous issues that would otherwise have taken multiple rounds of usability testing, especially if they are also a domain expert. Fixing the low hanging fruit issues means user testing time can be spent more efficiently, uncovering less obvious problems. There are also certain types of issues that can only be uncovered by a heuristic evaluation, such as “death by a thousand paper cuts” type issues: small issues that are not a big deal on their own and are too small to observe in a user study, but add up to more friction.
However, a big downside of these is that they are inherently prone to bias. They can be excellent for finding problems that may have been overlooked and are often obvious once pointed out. However (unless the number of evaluators is large), they are not a good way to decide between alternatives.
Unlike Devographics, surveys are not FA’s core business, so the Impact/Effort tradeoff simply wasn’t there for a custom UI, at least at this point in time. I ended up going with Tally, mainly due to the flexibility of its conditional logic and its support for code injection (which among other things, allowed me to use FA icons — a whopping 120 different ones!). ↩︎
Meaning out of the people who responded to that question about their experience with a feature, only 0.9% left a comment. ↩︎
A radio group with all buttons off cannot be returned to that state by user interaction. ↩︎
The ~9% of people with atypical color vision won’t benefit, but that’s okay in this case, as color is used to add an extra cue, and not an essential part of the interface. ↩︎
Web Components are not Framework Components — and That’s Okay
Disclaimer: This post expresses my opinions, which do not necessarily reflect consensus by the whole Web Components community.
A blog post by Ryan Carniato titled “Web Components Are Not the Future” has recently stirred a lot of controversy. A few other JS framework authors pitched in, expressing frustration and disillusionment around Web Components. Some Web Components folks wrote rebuttals, while others repeatedly tried to get to the bottom of the issues, so they could be addressed in the future.
When you are on the receiving end of such an onslaught, the initial reaction is to feel threatened and become defensive. However, these kinds of posts can often end up shaking things up and pushing a technology forwards in the end. I have some personal experience: after I published my 2020 post titled “The failed promise of Web Components” which also made the rounds at the time, I was approached by a bunch of folks (Justin Fagnani, Gray Norton, Kevin Schaaf) about teaming up to fix the issues I described. The result of these brainstorming sessions was the Web Components CG which now has a life of its own and has become a vibrant Web Components community that has helped move several specs of strategic importance forwards.
As someone who deeply cares about Web Components, my initial response was also to push back. I was reminded of how many times I have seen this pattern before. It is common for new web platform features to face pushback and resistance for many years; we tend to compare them to current userland practices, and their ergonomics often fare poorly at the start. Especially when there is no immediately apparent 80/20 solution, making things possible tends to precede making them easy.
Web platform features operate under a whole different set of requirements and constraints:
- They need to last decades, not just until the next major release.
- They need to not only cater to the current version of the web platform, but anticipate its future evolution and be compatible with it.
- They need to be backwards compatible with the web as it was 20 years ago.
- They need to be compatible with a slew of accessibility and internationalization needs that userland libraries often ignore at first.
- They are developed in a distributed way, by people across many different organizations, with different needs and priorities.
Usually, the result is more robust, but takes a lot longer. That’s why I’ve often said that web standards are “product work on hard mode” — they include most components of regular product work (collecting user needs, designing ergonomic solutions, balancing impact over effort, leading without authority, etc.), but with the added constraints of a distributed, long-term, and compatibility-focused development process that would make most PMs pull their hair out in frustration and run screaming.
I’m old enough to remember this pattern playing out with CSS itself: huge pushback when it was introduced in the mid 90s. It was clunky for layout and had terrible browser support — “why fix something that wasn’t broken?” folks cried. Embarrassingly, I was one of the last holdouts: I liked CSS for styling, but was among the last to switch to floats for layout — tables were just so much more ergonomic! The majority resistance lasted until the mid '00s when it went from “this will never work” to “this was clearly the solution all along” almost overnight. And the rest, as they say, is history. 🙂
But the more I thought about this, the more I realized that — as often happens in these kinds of heated debates — the truth lies somewhere in the middle. Having used both several frameworks and several web components, and having authored both web components (most of them experimental and unreleased) and even one framework over the course of my PhD, I think both sides have some valid points.
Frankly, if framework authors were sold the idea that web components would be a compile target for their frameworks, and then got today’s WC APIs, I understand their frustration. Worse yet, if every time they tried to explain that this sucks as a compile target they were told “no you just don’t get it”, heck I’d feel gaslit too! Web Components are still far from being a good compile target for all framework components, but that is not a prerequisite for them being useful. They simply solve different problems.
Let me explain.
Not all component use cases are the same.
I think the crux of this debate is that the community has mixed two very different categories of use cases, largely because frameworks do not differentiate between them; “component” has become the hammer with which to hammer every nail. Conceptually, there are two core categories of components:
- Generalizable elements that extend what HTML can do and can be used in the same way as native HTML elements across a wide range of diverse projects. Things like tabs, rating widgets, comboboxes, dialogs, menus, charts, etc. Another way to think about these is “if resources were infinite, elements that would make sense as native HTML elements”.
- Reactive templating: UI modules that have project-specific purposes and are not required to make sense in a different project. For example, a font foundry may have a component to demo a font family with child components to demo a font family style, but the uses of such components outside the very niche font foundry use case are very limited.
Of course, it’s a spectrum; few things in life fit neatly in completely distinct categories.
For example, an `<html-demo>` component may be somewhat niche, but would be useful across any site that wants to demo HTML snippets (e.g. a web components library, a documentation site around web technologies, a book teaching how to implement UI patterns, etc.). But the fact that it’s a spectrum does not mean the distinction does not exist.
WCs primarily benefit the use case of generalizable elements that extend HTML, and are still painful to use for reactive templating. Fundamentally, it’s about the ratio of potential consumers to authors.
The huge benefit of Web Components is interoperability: you write it once, it works forever, and you can use it with any framework (or none at all). It makes no sense to fragment efforts to reimplement e.g. tabs or a rating widget separately for each framework-specific silo; it is simply duplicated busywork.
As a personal anecdote, a few weeks ago I found this amazing JSON viewer component, but I couldn’t use it because I don’t use React (I prefer Vue and Svelte). To this day, I have not found anything comparable for Vue, Svelte, or vanilla JS. This kind of fragmentation is sadly an everyday occurrence for most devs.
But when it comes to project-specific components, the importance of interop decreases: you typically pick a framework and stick to it across your entire project. Reusing project-specific components across different projects is not a major need, so the value proposition of interop is smaller.
Additionally, the ergonomics of consuming vs authoring web components are vastly different. Consuming WCs is already pretty smooth, and the APIs are largely there to demystify most of the magic of built-in elements and expose it to web components (with a few small gaps being actively plugged as we speak). However, authoring web components is a whole different story. Especially without a library like Lit, authoring WCs is still painful, tedious, and riddled with footguns. For generalizable elements, this is an acceptable tradeoff, as their potential consumers are a much larger group than their authors. As an extreme example of this, nobody complains about the ergonomics of implementing native elements in browsers using C++ or Rust. But when using components as a templating mechanism, authoring ergonomics are crucial, since the overlap between consumers and authors is nearly 100%.
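To illustrate the authoring ergonomics gap, here is a sketch of what even a trivial element with a single reactive attribute looks like in vanilla JS, without Lit or any other helper (a hypothetical example, not from any real library):

```js
// Sketch: a trivial element with one "reactive" attribute, written with no library at all.
class MyCounter extends HTMLElement {
  static observedAttributes = ["count"];

  constructor() {
    super();
    this.attachShadow({ mode: "open" });
  }

  connectedCallback() {
    this.#render();
  }

  attributeChangedCallback(name, oldValue, newValue) {
    if (oldValue !== newValue) {
      this.#render();
    }
  }

  // Manual re-rendering: no templating, diffing, or attribute/property reflection for free.
  #render() {
    this.shadowRoot.textContent = `Count: ${this.getAttribute("count") ?? 0}`;
  }
}

customElements.define("my-counter", MyCounter);
```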
This was the motivation behind this Twitter poll I posted a while back. I asked if people mostly made web components, mostly used WCs that others have made, or both. Note that many people who use WCs are not aware of it, so the motivation was not to gauge adoption, but to see if the community has caught on to this distinction between use cases. The fact that > 80% of people who knowingly use web components are also web components authors is indicative of the problem. WCs are meant to empower folks to do more, not to be consumed by expert web developers who can also write them. Until this number becomes a lot smaller, Web Components will not have reached their full potential. This was one of the reasons I joined the Web Awesome project; I think that is the right direction for WCs: encapsulating complexity into beautiful, generalizable, customizable elements that give people superpowers by extending what HTML can do: they can be used by developers to author gorgeous UIs, designers to do more without having to learn JS, or even hobbyists that struggle with both (since HTML is the most approachable web platform language).
So IMO making it about frameworks vs web components is a false dichotomy. Frameworks already use native HTML elements in their components. Web components extend what native elements can do, and thus make crafting project-specific components easier across all frameworks (as well as no frameworks). I wonder if this narrative could resonate across both sides and reconcile them. Basically “yes, we may still need frameworks for nontrivial apps, but web components make their job easier” rather than pitting them against each other in a pointless comparison where everyone loses.
We will certainly eventually get to the point where web components are more ergonomic to author, but we first need to get the low-level foundations right. At this point the focus is still on making things possible rather than making them easy. The last remaining pieces of the puzzle are things like Reference Target for cross-root ARIA or `ElementInternals.type` to allow custom elements to become popover targets or submit buttons, both of which saw a lot of progress at W3C TPAC last week.
After that, perhaps eventually web components will even become viable for reactive templating use cases; things like the open-stylable shadow roots proposal, declarative elements, or DOM Parts are some early beginnings in that direction, and declarative shadow DOM paved the way for SSR (among other things). Then, and only then, they may make sense as a compile target for frameworks.
However, that is quite far off. And even if we get there, frameworks would still be useful for complex use cases, as they do a lot more than let you use and define components. Components are not even the best reuse mechanism for every project-specific use case — e.g. for list rendering, components are overkill compared to something like `v-for`. And by then frameworks will be doing even more. Frameworks are by definition always a step ahead of the web platform; that is not a failing of the web platform. As Cory said, “Frameworks are a testbed for new ideas that may or may not work out.”
The bottom line is, web components reduce the number of use cases where we need to reach for a framework, but complex large applications will likely still benefit from one. So how about we conclude that frameworks are useful, web components are also useful, stop fighting and go make awesome sh!t using whatever tools we find most productive?
Thanks to Michael Warren, Nolan Lawson, Cory LaViska, Steve Orvell, and others for their feedback on earlier drafts of this post.
Making the Web more Awesome — for everyone
Folks, I have some exciting news to share. 🤩
Today I start a new chapter in my career. After a decade at MIT, teaching and doing research at the intersection of usability and programming language design, I wrapped up my PhD two weeks ago (yes, I’m a Dr now! And damn right I will — once it actually sinks in) and today I start my new role as Product Lead at Font Awesome.

I will be evaluating user needs and improving product design and usability across all company products, with an emphasis on Web Awesome, the product we are launching early next year to revolutionize how Web UIs are built by using web components and CSS in ways you’ve never seen before. Beyond improving the products themselves (all of which include extensive free & open source versions), part of my role will be to utilize my web standards experience to collect web platform pain points from across the company and translate them into new and existing web standards proposals.
Yes, I know, it’s a match made in heaven. 😍
There is even a small chance I may have been the first to create an icon font for use in a web UI via `@font-face`, which would make it even more wonderfully poetic that I’m joining the company that has become synonymous with icon fonts on the Web.
However, it was not my MIT PhD that led me to this role, but an email from Dave Gandy (creator & CEO of Font Awesome) about Color.js, that turned into hours of chats, and eventually a job offer for a role I could not refuse, one that was literally molded around my skills and interests.
The role is not the only reason I’m excited to join Font Awesome, though. The company itself is a breath of fresh air: open source friendly (as Dave says, “literally the only reason we have Pro versions is that we need to sustain this somehow” 😅), already profitable (= no scrambling to meet VC demands by cramming AI features nobody wants into our products), fully remote, huge emphasis on work-life balance, and an interview process that did not feel like an interview — or even a process. In fact, they did not even want to look at my resume (despite my efforts 🤣). It is telling that in their 10 years of existence, not a single person has left the company, and they have never had to let anyone go. Moreover, it bridges the best of both worlds: despite having existed for a decade, branching out to new products[1] and markets gives it a startup-like energy and excitement.
I had been extremely selective in the job opportunities I pursued, so it took a while to find the perfect role. Having ADHD (diagnosed only last year — I want to write a blog post about that too at some point), I knew it was crucial to find a job I could be passionate about: ADHD folks are unstoppable machines in jobs they love (I have literally built my career by directing my hyperfocus to things that are actually productive), but struggle way more than neurotypicals in jobs they hate. It took a while, but when I started talking with Dave, I knew Font Awesome was it.
I’m still reeling from the mad rush of spending the past couple of months averaging 100-hour weeks to wrap up my PhD before starting, but I couldn’t be more excited about this new chapter.
I’m hoping to write a series of blog posts in the coming weeks about my journey to this point. Things like:
- How I decided that academia was not for me — but persisted to the finish line anyway because I’m stubborn AF 😅
- How I realized that product work is my real calling, not software engineering per se (as much as I love both)
- How I used web technologies instead of LaTeX to write my PhD thesis (and print it to PDF for submission), with 11ty plus several open source plugins, many of which I wrote; an ecosystem I hope will one day free more people from the tyranny of LaTeX (which was amazing in the 70s, but its ergonomics are now showing their age).
But for now, I just wanted to share the news, and go off to make the web more awesome — for everyone. 🚀
My excitement grew even stronger when a week before my start date, I learned that 11ty (and its creator, Zach Leatherman) had also joined Font Awesome — I think at this point every tool I use regularly is officially Awesome 😅. Yes, this site is built on 11ty as well. And even my PhD thesis! ↩︎