Adding Webmention Support from Scratch

Saturday March 20, 4:31 PM

Dwayne Harris contact@dwayne.xyz · About 2,689 words

I added support for Webmentions to the site!

Webmention is a protocol that lets websites (mostly blog-type sites) automatically notify each other when one responds to or mentions the other. It’s not a service (though there are services that work with it); it’s something that can be implemented in blogging software. Implementing Webmentions gets you two things:

A list of responses to a blog post (from anywhere else on the internet) shown on their blog post page.
Notifications automatically sent to the websites mentioned in their blog post when it’s published or edited.

It’s one of the parts of the IndieWeb idea.

If your website is built on WordPress/Craft/Drupal/etc., you can install a plugin for Webmentions and start sending and receiving them automatically. But I’m not using any blog software, so I had to write this from scratch, like everything else on the website.

Here’s how I did it.

The Webmentions Endpoint

One of the things that needs to be done to get everything working is to set up an endpoint that other sites can send Webmentions to. That endpoint just accepts a POST request with two application/x-www-form-urlencoded params:

target: The URL to the post on your website that’s being mentioned.
source: The URL to the post that’s doing the mentioning.

Here’s some of the handler code I wrote for it:

func webmentionHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		render.WrongMethod(w, r)
		return
	}

	var (
		ctx    = r.Context()
		from   = r.FormValue("from")
		source = r.FormValue("source")
		target = r.FormValue("target")
	)

	pSource, err := url.ParseRequestURI(source)
	if err != nil {
		w.WriteHeader(http.StatusBadRequest)
		render.Error(w, r, "Invalid source URL")
		return
	}

	pTarget, err := url.ParseRequestURI(target)
	if err != nil {
		w.WriteHeader(http.StatusBadRequest)
		render.Error(w, r, "Invalid target URL")
		return
	}

	...
}

I accept an extra from param for my own usage that I’ll talk about later. This code shows some validation on the URLs, but there are a few more checks afterwards to make sure the mention is valid, like making sure the target URL leads to an existing post in the database.
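Here’s a rough sketch of what a couple of those extra request checks could look like. It is not the site’s actual code: the validateMention and isHTTP names and the error messages are made up for illustration. It covers two checks the Webmention spec asks of receivers (both URLs must be http or https, and source must differ from target) and rejects path-only values such as /foo, which url.ParseRequestURI accepts on its own.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateMention applies basic request checks: both values must be absolute
// http(s) URLs, and they must not be the same URL.
// The name and error messages are illustrative assumptions.
func validateMention(source, target string) error {
	s, err := url.Parse(source)
	if err != nil || !isHTTP(s) {
		return errors.New("invalid source URL")
	}
	t, err := url.Parse(target)
	if err != nil || !isHTTP(t) {
		return errors.New("invalid target URL")
	}
	if s.String() == t.String() {
		return errors.New("source and target must differ")
	}
	return nil
}

// isHTTP reports whether u has an http or https scheme and a host.
func isHTTP(u *url.URL) bool {
	return (u.Scheme == "http" || u.Scheme == "https") && u.Host != ""
}

func main() {
	// A well-formed pair passes; a path-only source is rejected.
	fmt.Println(validateMention("https://example.com/reply", "https://example.org/post"))
	fmt.Println(validateMention("/relative/path", "https://example.org/post"))
}
```

Checking that the target actually maps to an existing post would still need a database lookup, which is left out here.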

After reading the specs, I got the impression that this is the kind of endpoint that gets hit a lot (mostly because of spam/abuse), so it needs to be pretty resource efficient. The spec states that the endpoint should probably be async and store requests for processing later instead of trying to handle them immediately.

This handler stores each request in a webmentions table in the database. If the entry exists already, its checked date is adjusted so it gets checked again later (in case the title/contents of one of the pages was updated at some point and the Webmention was resent because of it). I’ll talk about how these are stored in a minute.

Exposing the Endpoint

It doesn’t matter where the actual endpoint is since part of the protocol is having a standard way for other software to find the endpoint. There are a few ways to set that up:

Add a webmention link tag to your HTML page.
Send a Link header along with the page.

I did both:

HTML:

<link rel="webmention" href="/webmention">

Go (on the blog post view handler):

func viewHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Link", fmt.Sprintf(`<%s>; rel="webmention"`, utils.GetWebmentionURL()))
	...
}

Now if someone mentions one of my articles and their software supports Webmentions, it’ll find the /webmention endpoint in either the link tag or the Link header and know to make the POST request there.

Processing Webmentions

After the webmentions are accepted from the endpoint, they have to be checked and verified.

Back when I wrote the RSS Reader feature, I created a basic Linux cron job to have it pull the list of RSS feeds every 10 minutes. I did something similar for this. I won’t explain cron here, but I’ll show you the two cron entries on my server right now:

# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command

*/10 * * * * /usr/local/bin/reader.sh
*/5 * * * *  /usr/local/bin/webmentions.sh

Each task does a basic HTTP request to the specific endpoint that handles that queue (either from the feeds table or webmentions one).

The Reader task runs every 10 minutes, and the webmentions one every 5 minutes. The webmentions task pulls the first 10 that need to be checked, makes a request to the source post for information, and updates the webmention entry in the database if necessary.

Here’s the database table for webmentions:

CREATE TABLE webmentions (
    id SERIAL PRIMARY KEY,
    post_id TEXT NOT NULL REFERENCES posts(id),
    link TEXT NOT NULL,
    permalink TEXT,
    title TEXT,
    summary TEXT,
    author_name TEXT,
    author_url TEXT,
    author_email TEXT,
    author_image TEXT,
    is_like BOOLEAN DEFAULT FALSE,
    is_repost BOOLEAN DEFAULT FALSE,
    valid BOOLEAN DEFAULT FALSE,
    junk BOOLEAN DEFAULT FALSE,
    published TIMESTAMPTZ,
    checked TIMESTAMPTZ,
    updated TIMESTAMPTZ,
    created TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

I’m not gonna go into too much detail about all of the columns, but I’ll talk about adding and validating the webmentions.

When the Webmention is added in the first place (through the /webmention endpoint), the only columns used are post_id, link, and checked. Everything else is NULL or the default.

post_id: The target param is the URL of one of my posts; the matching post is stored here.
link: The source param URL, which is the post that mentioned mine.
checked: The date the cron job last ran.

When the job runs, if any of the webmentions need to be checked (or rechecked), an HTTP request is made to link and the response is scanned for microformats so the rest of the data in the database row can be filled out.

Microformats

Microformats is a standard for marking up your HTML with specific tags (CSS classes) that make it easier for programs/clients to pull data from web pages. Some examples of the classes you can add to your page:

p-name: The entry name or title
p-summary: The entry summary
e-content: The entry content
dt-updated: An HTML5 time tag to specify when the entry was updated
p-author: The entry author’s name
u-url: The permalink URL

If you want to use Webmentions on your blog, the blog software will add these tags/classes to each one of your posts. Then when you send a Webmention out to another blog, it can read these tags and display the response in a way that makes sense.

Since I wrote all the HTML for the website, I just added all these to my pages so that when one is checked after I send a Webmention, I can provide all the info the other blog needs to display it the way I want.

Here’s some of the Go template page for a post:

<div class="article spaced h-entry">
    <h1 class="article-title p-name">{{.post.Title}}</h1>

    {{with .post.Created}}
        <i class="fad fa-clock"></i>
        <time class="dt-published" datetime="{{formatAsDatetime .}}">
            {{formatAsDate .}}
        </time>
    {{end}}

    <p class="article-word-count animation-trigger">
        {{template "_author"}}
        &nbsp;&nbsp;&middot;&nbsp;&nbsp;

        About
        <span
            class="custom-animation accent"
            data-animation-name="animated-text"
            data-animation-value="{{.post.GetWordCount}}">
            {{formatAsNumber .post.GetWordCount}}
        </span>
        {{pluralizeWord .post.GetWordCount "word"}}
    </p>

    {{with .post.GetMainAttachment}}
        <figure class="article-main">
            <div role="img" aria-label="Article main image." class="image" style="background-image: url({{.ImageURL}}); padding-top: {{cssPercent .GetAspectRatio}}"></div>
            {{if .URL}}
                <figcaption>Image from <a href="{{.URL}}">{{.GetAttribution}}</a></figcaption>
            {{end}}
        </figure>
    {{end}}

    <div class="article-content e-content">{{.post.GetContent}}</div>
</div>

So when the website scans the webmentions in the database, it expects the page at link to have some Microformat markup so it can save all that other info (author name/url, title, etc). If it doesn’t, it falls back to using an internal Go package I wrote back in November (called links) that reads and caches the meta tags of webpages I reference on the website.

One thing specified in the protocol is that Webmentions should be “verified” by making sure the text of the source page actually contains the target URL. In other words, when a website sends a webmention, we need to make sure it actually mentioned my post and that someone isn’t just sending URLs they want displayed on my page.

So I made sure it does that. The result is stored in the valid column in the table.

Webmention Discovery and Posting

I already wrote about how the website exposes the /webmention endpoint for other blogs to find, but it also has to find the endpoints of each website that I mention too. The links package I mentioned earlier takes a URL and returns a struct that looks like this:

type LinkInfo struct {
	ID              string    `json:"id"`
	Title           string    `json:"title"`
	Description     string    `json:"description"`
	ImageURL        string    `json:"imageUrl"`
	FaviconURL      string    `json:"faviconUrl"`
	FaviconVerified bool      `json:"faviconVerified"`
	FeedURL         string    `json:"feedUrl"`
	Published       time.Time `json:"published"`
	Updated         time.Time `json:"updated"`
}

(FaviconVerified is not the best name for that field. It’s false if the favicon meta tag wasn’t found and the base URL plus /favicon.ico was used for FaviconURL without checking if it actually exists on the server.)

It either:

Makes the HTTP request, uses the golang.org/x/net/html HTML tokenizer to scan meta tags, and stores the results in a links table in the database.
If the link is already in the database (and isn’t expired) then it just returns that data.

Some of the code for checking the meta tags:

for key, value := range meta {
	switch key {
	case "og:title", "twitter:title", "title":
		title = value
	case "og:description", "twitter:description", "description":
		description = value
	case "og:image", "twitter:image":
		imageURL = value
	}
}

To add support for Webmentions, I added a new field WebmentionURL to the struct and a new column in the database. Now it looks like this:

type LinkInfo struct {
	ID              string    `json:"id"`
	Title           string    `json:"title"`
	Description     string    `json:"description"`
	ImageURL        string    `json:"imageUrl"`
	FaviconURL      string    `json:"faviconUrl"`
	FaviconVerified bool      `json:"faviconVerified"`
	FeedURL         string    `json:"feedUrl"`
	WebmentionURL   string    `json:"webmentionUrl"`
	Published       time.Time `json:"published"`
	Updated         time.Time `json:"updated"`
}

Now the package looks for either the Link header or the webmention link tag to populate the WebmentionURL field.

This package is used a lot throughout the website. One of those times is when I’m writing a post and I click a Fetch Attachments button on the page. It will scan the draft for links and display the title and main page image (using this package) so I can choose which ones to add to the list of Attachments on the post page.

The attachment management part of my "New Post" page.

That means by the time I actually publish the post, I already have the webmention endpoints stored for anything I mentioned in it. At that point, sending the Webmention itself is trivial (it’s just sending a POST request to the endpoint with source and target params):

data := url.Values{}
data.Set("source", postURL)   // my post, which does the mentioning
data.Set("target", sourceURL) // the URL being mentioned

req, err := http.NewRequestWithContext(ctx, http.MethodPost, webmentionURL, strings.NewReader(data.Encode()))
if err != nil {
	logger.Logf("error building webmention request: %v", err)
	return
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

res, err := http.DefaultClient.Do(req)
if err != nil {
	logger.Logf("error sending webmention: %v", err)
	return // res is nil here, so don't fall through to the status check
}
defer res.Body.Close()

if res.StatusCode != http.StatusAccepted && res.StatusCode != http.StatusOK {
	logger.Logf("webmention not accepted: status code: %d", res.StatusCode)
}

Displaying the Webmentions

So now Webmentions can be received, stored, checked, and sent, but they still need to be displayed on the post pages.

In this architecture, it’s an easy database query that queries webmentions on the post_id and valid columns, and joins it to the links table for all the “backup” information from the page meta tags that was previously stored.

I updated the “view post” pages to make that query and display any results in a new “Latest Webmentions” section on the page.

I also added a new Webmentions page for each post that displays a paginated list of all Webmentions for that post AND a section that lets you post a Webmention directly to the endpoint from a form.

The webmentions page for one of my articles.

Earlier in the article, I mentioned a from param that I accept in the /webmention endpoint. This new form sends a from value, and the endpoint redirects to a success page if that value is present. The Webmention spec states that the endpoint should just return a 202 status code on success, so that’s what happens if from is empty.

Third Party Services

There are a few services you can use to help build Webmention support or send Webmentions to your site from other services.

webmention.app automates the outgoing part of Webmentions. It checks all of the URLs on a given webpage (the blog post you wrote) and sends the Webmentions to each mentioned site for you.

Webmention.io takes care of most of the Webmention stuff: you use one of their links as your Webmention endpoint, then use the JavaScript they provide to display the mentions on your page.

Bridgy is a service that connects to your social media accounts and then will send Webmentions to whatever Webmention endpoint you specify. For example, if someone responds to one of your tweets that has an article link, Bridgy will send that response (or retweet/like) as a Webmention for you so your site can automatically display it.

I’ve come across some really good posts from Monica Powell, Mark Groves, Max Böck, Sia Karamalegos, and others about using these services.

I don’t need to use any of the ones that help with implementation since I did all that myself. I’ve been thinking about using Bridgy to bring in Twitter and Mastodon replies, but:

1. I don’t get many social network replies or likes
2. I don’t like social networks very much in the first place (which might be either a cause or effect of 1, I’m not sure)
3. I like the idea of this being about individual websites communicating with each other without companies like Twitter in the middle of it (which follows the general theme of the things I do with this website: increase independence from other companies/services)

So that’s it. 🙌🏾 The summary of the 10 commits I made for this (including some other small changes and fixes I did at the same time) is 105 changed files with 1,514 additions and 320 deletions. I started on March 12th and finished up on the 17th.

I don’t know what percentage of blog posts I mention supports Webmentions, and I have no idea how many other bloggers will reference my posts and have Webmention support on their websites. Maybe this won’t be used much at all. But it was definitely an interesting feature to work on.


Last Updated

Monday April 12, 1:30 AM