Reading List
The most recent articles from a list of feeds I subscribe to.
Joining Sanity
Mid-February I will start a new job: I am excited to join the developer relations team at Sanity!
I had my last day at W3C in October (look, my cool URI didn’t change) and set out to be mostly away from work for a couple of months. I guess I’m not great at this: besides another school lockdown here, I got fairly busy with existing projects. I did some WCAG audits, workshops, conference talks and… interviewing. I wanted to make sure I found the right place and took the time to do it. Having been a contractor for over a decade, it was my first time doing interviews, eh, ever.
My work at WAI taught me to consider the web in terms of systems like authoring/developer tools and user agents, as they impact the web in some very specific ways. It became clear to me I wanted my next role to be at a company that is heavily involved in somehow making the web better through tools like that. Whatever a better web looks like… more accessible, more secure, more privacy-aware… I got to talk to a number of different companies working on browsers, design system tooling, browser add-ons, web standards and content management tools. I could write about the job hunting process, but Mu-An already did that brilliantly.
The Sanity developer relations team made a great impression. The company has a cool product that solves some important problems, a friendly community (plus a focus on keeping it healthy), a large ecosystem, aaaand… there could be lots of new ecosystem possibilities. Sanity’s Year in 2021 blog post has a lot more detail on what Sanity is up to.
For the next few weeks, I will finish my current accessibility consulting projects with the Dutch government and Mozilla, go on holiday and then get this new chapter started!
Originally posted as Joining Sanity on Hidde's blog.
The web doesn’t have version numbers
Like ‘Web 2.0’, ‘web3’ is a marketing term. There is no versioning system for the web. There are also lots of exciting uses for the web and problems with it outside the realm of ‘web3’.

It’s the latter half of the 2000s. People had been building websites and online services for a while. Suddenly, everyone started using the phrase ‘Web 2.0’. It didn’t have one clear definition, like ‘carrot’ or ‘dentist’. It referred to a bunch of things at once. Websites with ‘user generated content’, users tagging their data, software as a service, async JavaScript, widgets and open (!) APIs that allowed for mashups: sites that displayed content from other sites. New companies had names with fewer vowels, like Tumblr and Flickr, and there was RSS on everything. ‘You’ was made TIME Person of the Year. I’ve been returning to some stuff from that time, and it’s been interesting.
Many of the things that fit the above description of ‘Web 2.0’ were useful; they often still are on today’s web. We lost some, we kept some. But if we’re fair, ‘Web 2.0’ wasn’t some new iteration, a new version of something that was different before. It was largely reuse of existing web tech, like HTTP and XML. Exciting reuse, for sure. But a lot of it already existed in non-commercial forms before the phrase ‘Web 2.0’. Not everyone knows, but the first web browser, WorldWideWeb, was meant to be both viewer and editor. That’s quite ‘user generated’, I would say. So, what did Web 2.0 mean? ‘Web 2.0 is, of course, a piece of jargon, nobody even knows what it means’, Sir Tim Berners-Lee commented in an interview with IBM at the time.
Like ‘Web 2.0’, ‘web3’ (and I’m not sure what’s with the removed space and dot, or the lowercase ‘w’) is just a marketing phrase. Definitions of ‘web3’ seem to be all over the place. From what I gathered, it is a vision of a web that runs largely on the blockchain, in order to make ‘owning’ assets better for people who create them and people who purchase them, by cutting out middlemen. This vision is not to be confused with the Semantic Web, which was also called Web 3.0 and discussed years before (see an article from 2006).
Here’s the thing: there is no institution that regularly releases new versions of the web and recently, happily, announced this one. Instead, the phrase ‘web3’ was coined in 2014 by a co-inventor of a blockchain technology and has since been used by crypto asset enthusiasts and certain venture capital firms, for what is, some argue, close to Ponzi schemes and, in its current form, very environmentally unfriendly. It also puts vulnerable people at risk (see Molly White’s web3 is going great for examples of those claims).
I’ll keep my issues with ‘web3’ for a later post; for now I just wanted to make the point that it’s unfair to claim a version number for the web for a specific set of innovations you happen to like. There are many ways the web evolves. Sometimes they involve the kinds of technology that ‘web3’ adherents use, but usually they don’t. These are some web innovations I like:
- New standards are written to harmonise how web tech works across user agents and to invent new tech responsibly, like CSS Grid Layout, WebAssembly and map/filter/reduce in JavaScript
- New companies start useful services, like payment integration services and ‘content as data’ headless CMSes
- Individuals start blogging on their personal sites, on which they own their content
- Governments and organisations roll out reliable, useful and accessible authentication services (like DigiD)
- Video conferencing companies bring their software to the browser
- Adobe brought Photoshop to the browser
- A couple of Dutch museums put their entire catalogues of art online (e.g. Rijksmuseum, Stedelijk and Van Gogh Museum)
Maybe that list is a bit random; you probably have a list of your own. Many of these things are working just fine. I could personally go on and on about some very useful plans for the web. There are also lots of unsolved problems, like lack of web accessibility or Facebook’s business model. Cool things are planned for the web all the time, and lots of problems aren’t yet addressed, but most of the web is fine. Frankly, I don’t think we should use version numbers just to market a specific subset of plans and problems for the web. Especially not if that’s such a controversial subset.
Originally posted as The web doesn’t have version numbers on Hidde's blog.
Twitter needs manual language selection
Lots of Twitterers speak languages that are not English. For people who read tweets that are not in English, it is important that these tweets are marked as such. I feel Twitter needs a feature for this.
It would be nice if, when writing a tweet, we could manually select which language the tweet is in, so that Twitter could use that information to set the appropriate lang attribute on our content:
Sharing a controversial opinion on CSS frameworks in the Dutch language
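In markup, a manually selected language could end up as something like this minimal sketch (the structure and class name are illustrative, not Twitter’s actual markup):

<article class="tweet">
  <!-- The author picked Dutch ('nl') when composing this tweet -->
  <!-- "CSS-frameworks heb je eigenlijk niet nodig." = "You don't really need CSS frameworks." -->
  <p lang="nl">CSS-frameworks heb je eigenlijk niet nodig.</p>
</article>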
Twitter is an authoring tool, for which the Authoring Tool Accessibility Guidelines recommend that “accessible content production is possible” (Guideline B.1.2).
The lang attribute
Language attributes identify which language some web content is in. They are usually set at the page level, added to the <html> element:
<html lang="en">
Most developers don’t write these attributes often: the code usually lives somewhere in a template that we don’t touch every day, or ever. But it’s an important attribute. Setting it correctly gets your page to pass one whole WCAG criterion (3.1.1 Language of Page).
In some cases, we have to set language attributes on individual elements, too, like if some of our content is not in the page’s main language. On the website I built for the British-Taiwanese band Transition, we combine content in Mandarin with content in English on one page:
The Transition “Music” page
We picked en as the main language and set it on the <html> element. This meant we had to mark all Chinese content as zh, in this case zh-TW, as it is specifically Mandarin as spoken in Taiwan. Of course, we could have written this the other way around, too. Usually we want to pick the language that’s most common on the page as the page’s language.
Setting a lang attribute on parts of a page is its own WCAG criterion, too (3.1.2 Language of Parts), by the way.
The user need
Setting the language is important for end users, like:
- people who use a screenreader to read out content on a page
- people who use a braille display
- people who end up seeing a default font (browsers can select these based on language)
- people who use software to translate content
- people who want to right click a word in our content to look it up in a dictionary
- people who use user stylesheets
The author need
There is also an author need, both for people who write content and for web developers.
Content editors
People who write content may get browser-provided spellcheckers. They will work better if they know what the content’s language is. I think Twitter.com has somehow turned browser spellcheck off, but there may be Twitter clients or indeed other authoring tools where this is relevant.
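For illustration, an authoring tool could pass that hint along like this (a sketch; lang and spellcheck are standard HTML attributes, the field itself is made up):

<!-- A lang attribute helps the browser pick the right spellcheck dictionary -->
<!-- "Wat gebeurt er?" = "What's happening?" -->
<textarea lang="nl" spellcheck="true" placeholder="Wat gebeurt er?"></textarea>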
Web developers
Language attributes are important for web developers, too, as they allow them to use the :lang() pseudo-class in CSS more effectively.
Some CSS will behave differently based on language. When you use hyphens: auto, the browser needs to look up words in a dictionary to apply hyphenation correctly. It has to know the language for this.
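A short sketch of both ideas (the selectors and font stack are illustrative):

/* Hyphenation only works well when the browser knows the content's language */
p {
  hyphens: auto;
}

/* Style content differently depending on its declared language */
:lang(zh-TW) {
  font-family: "PingFang TC", "Noto Sans TC", sans-serif;
}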
With appropriate language attributes, you can also use CSS features like writing modes and typographic properties more effectively. See Hui Jing Chen’s deep dive into CSS for internationalisation for more details.
Automating and lang-maybe
Identifying languages can be automated. In fact, Twitter does this: when they recognise a tweet’s language, they add the relevant lang attribute proactively. See for instance the European Commission chair’s multilingual tweets:
Twitter’s auto-added lang attributes in action
Yay! I think this is very cool (thanks ThainBBdl for pointing this out). The advances in natural language processing are really impressive.
Having said that, any automated system makes mistakes. Vadim Makeev shared:
Yes, sometimes they take my Russian tweets and render them as Bulgarian. It’s not just the lang, they also use some Cyrillic font variation that makes them harder to read.
It is safe to assume such mistakes will skew towards minority languages and miss subtleties that matter a lot to individual people, especially in areas where language is political.
On the one hand, I think it makes sense to deploy automated language identification. As there are a lot of users, Twitter can safely assume not everyone would set a language for all of their tweets. People might not know or care (insert sad face here); a fallback helps with that. On the other hand, if this tech exists, might it make more sense for a browser to deploy it rather than an individual website? Why not have the browser guess the content’s language, for every website and not just Twitter?
If browsers did this, Twitter’s lang attributes may get in the way. They kind of give the impression that this information is author-provided. This makes me wonder: should there be a way for Twitter to say their declaration is a guess? lang-maybe?
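Purely as a thought experiment (no such attribute exists; this is hypothetical markup), that could look like:

<!-- Hypothetical attribute: the language is a machine guess, not an author statement -->
<!-- "Здравей, свят" is Bulgarian for "Hello, world" -->
<p lang-maybe="bg">Здравей, свят</p>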
Manual selection
Automated language detection probably works best if it complements manual selection. It could help provide a default choice or suggestion for manual selection, and work as a fallback. So, I’m still going to make the case for a method for users to specify a language manually.
A per-tweet manual language picker would be great as it can:
- give willing authors more control to avoid issues
- ensure that the benefits of language identification aren’t limited to users of the majority languages that AI models are best trained on
- let authors express their specific intent
Summing up
For non-English tweets to meet WCAG, they need to have their language declared with a lang attribute. Twitter currently guesses languages, which is a great step in the right direction, but is likely of little help to speakers of minority languages. A manual selector would be a great way to complement the automation.
Originally posted as Twitter needs manual language selection on Hidde's blog.