Webmention Comments

Posted on January 26, 2024

After maybe 8 hours of distracted effort, my website now builds its comments from webmentions instead of using Cusdis. This was a bit of a feat, as I didn't even know what webmentions were until a couple of weeks ago. For the similarly uninitiated, a webmention is a simple notification protocol: when one page links to another, the linking site sends the linked site a standardized notification saying so. It's one of several protocols that link the various components of the fediverse together. I decided to go this route for two reasons. The first is that I'm not the biggest fan of Cusdis's decision to use iframes to display comments; it just seems unnecessary. The second is that Cusdis requires JavaScript, and it was disappointing that comments were the only thing on my site still using it. With what I've managed to get working, it can take up to a day for a comment to show up on a post, but as far as I'm concerned that's a small price to pay for completely removing JavaScript from my website.
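To make that concrete, here's a minimal sketch in Elisp of what a single webmention amounts to on the wire: an HTTP POST to the receiver's endpoint with form-encoded source and target parameters. This isn't part of my setup (webmention.io and Bridgy handle all of it for me), and every URL below is a placeholder:

(require 'url)

;; A webmention is a form-encoded POST saying "source links to target".
;; The receiver advertises its endpoint with a rel="webmention" link,
;; which comes up again later in this post.
(let ((url-request-method "POST")
      (url-request-extra-headers
       '(("Content-Type" . "application/x-www-form-urlencoded")))
      (url-request-data
       (concat "source=" (url-hexify-string "https://commenter.example/their-reply")
               "&target=" (url-hexify-string "https://blog.example/my-post"))))
  (url-retrieve-synchronously "https://webmention.example/endpoint"))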

Other folks have written posts similar to this one, but starting from different points. Here's my starting point: a static site generated by an ox-publish script on top of a customized Ox-Slimhtml backend, hosted on Gitlab Pages, with comments provided by Cusdis and no other JavaScript anywhere.

There are a few steps to get where we're going, so I'm actually going to use headings to delineate them for once! So you have an idea of what's in store, this is what we're looking at:

  1. IndieAuth
  2. Webmentions
  3. Bridgy
  4. Pulling Down Webmentions
  5. Site Generator Updates
  6. Update Gitlab-CI

IndieAuth

IndieAuth serves as an identity provider (like Github, Google, Microsoft, etc.) that uses a domain you own as your identity. When you log in to a website or service via IndieAuth, you provide the domain you want to authenticate as, and IndieAuth pulls down the HTML from that domain and combs through it for links with rel attributes set to me and href attributes that match authentication providers known to the website or service. Typically Github, Twitter, an email address, and a GPG key all work as authentication providers, but sometimes only a subset is supported. If a match is found, you're prompted to log in with the authentication provider, and your profile is read to find a website property that matches the domain you're trying to authenticate as. If this matches as well, then you are who you say you are, and your domain serves as your identity. I'm using Github as my authentication provider for now, but I'll probably switch to email pretty soon. Of course, an email address doesn't provide a profile with a website property, so a challenge is sent to that email in the form of a short code that needs to be verified with the service you want access to, sort of like a multi-factor authentication code.

Setting up IndieAuth was easy since I already have links to all of my profiles on my website; I just needed to add the rel attributes:

<a rel="me" href="https://github.com/AblatedSprocket" class="icon-link icon-dark" title="Github link" aria-label="follow on Github">

As you can see, you can put any other attributes you want in the anchor tag; it just needs the rel and a proper href. On the Github side, I also set the website field in my profile to https://nothingissimple.ablatedsprocket.com. Note that IndieAuth has no problem with subdomains, but adding a path to the end won't work.
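Out of curiosity more than necessity, here's a rough sketch of the discovery step described above, assuming an Emacs built with libxml support. The my/rel-me-links helper is hypothetical, and real discovery also handles link tags and rel attributes with multiple values:

(require 'url)
(require 'dom)

(defun my/rel-me-links (url)
  "Collect the href of every rel=\"me\" anchor on the page at URL."
  (with-temp-buffer
    (url-insert-file-contents url)
    (let ((dom (libxml-parse-html-region (point-min) (point-max))))
      (delq nil
            (mapcar (lambda (node)
                      (when (equal (dom-attr node 'rel) "me")
                        (dom-attr node 'href)))
                    (dom-by-tag dom 'a))))))

;; (my/rel-me-links "https://nothingissimple.ablatedsprocket.com")
;; => ("https://github.com/AblatedSprocket" ...)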

Webmentions

After a fair bit of reading about webmentions, I decided the easiest way to support them would be through webmention.io, which uses IndieAuth. Once logged in, webmention.io gives you a URL to put in your website's HTML. This is mine:

<link rel="webmention" href="https://webmention.io/nothingissimple.ablatedsprocket.com/webmention" />

I just put this in my head template, which is included on all of my web pages, and that was presumably that. I couldn't verify that it had worked at the time since I didn't have a source of webmentions readily available, so I took it on blind faith that all was well and moved on to Bridgy.
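In hindsight, one quick sanity check would have been to do what a webmention sender does and look for the advertised endpoint myself. A hypothetical helper along the same lines as before, again assuming libxml support:

(require 'url)
(require 'seq)
(require 'dom)

(defun my/webmention-endpoint (url)
  "Return the webmention endpoint advertised on the page at URL, or nil."
  (with-temp-buffer
    (url-insert-file-contents url)
    (let ((dom (libxml-parse-html-region (point-min) (point-max))))
      (seq-some (lambda (node)
                  (when (equal (dom-attr node 'rel) "webmention")
                    (dom-attr node 'href)))
                (dom-by-tag dom 'link)))))

;; (my/webmention-endpoint "https://nothingissimple.ablatedsprocket.com")
;; => "https://webmention.io/nothingissimple.ablatedsprocket.com/webmention"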

Bridgy

Mastodon doesn't speak webmentions, so another service needs to relay Mastodon toots to webmention.io. Enter Bridgy, which polls the accounts you configure for things to turn into webmentions (including but not limited to toots). With Mastodon specifically, you provide your instance and log in, and Bridgy asks for the website to associate with your Mastodon account. With that information, Bridgy looks for toots created by your account that contain links to pages within your website's domain. When it finds a match, it looks for any favorites, boosts, links, or replies to said toot and creates webmentions from them. Usually this is the type of thing I would have to spend some time debugging, but once I set it up, it started churning through my toots straight away. Note that when Bridgy is first set up, it only looks at the 50 most recent toots, so older posts may have to be re-tooted.

Pulling Down Webmentions

With webmentions getting created, I needed to fetch and save them so the script that publishes my website could parse them and build comments. Normally, I'd try to do this in Bash, but since I know Emacs will be available whenever I want to do this, I wrote the script in Elisp instead:

;;; comments.el --- Fetch webmentions from webmention.io and save them

;;; Commentary:
;;
;; Webmentions with the 'reply' type are saved to a 'webmentions' directory
;; adjacent to this file.

;;; Code:
(let ((page 0)
      (json-plist
       (with-temp-buffer
         ;; Pull down all webmentions associated with the domain and parse them
         ;; as a plist.  We'll sort them out later.
         (url-insert-file-contents (concat "https://webmention.io/api/mentions?token="
                                           (getenv "WMTOKEN")
                                           "&domain=nothingissimple.ablatedsprocket.com"))
         (json-parse-buffer :object-type 'plist)))
      (comment-alist (list))
      ;; Store webmentions in a webmentions directory
      (base-dir (expand-file-name "webmentions" (file-name-directory load-file-name))))
  ;; Create webmentions directory if it doesn't exist
  (when (not (file-directory-p base-dir))
    (make-directory base-dir))
  ;; Tie each webmention to its owning post file path in an alist
  (while (not (seq-empty-p (plist-get json-plist :links)))
    (mapc (lambda (comment)
            ;; Only interested in saving replies for now.
            (let ((save-comment-p (string= "reply"
                                           (plist-get (plist-get comment
                                                                 :activity)
                                                      :type)))
                  (fname (expand-file-name (concat (file-name-base (plist-get comment :target))
                                                   ".json")
                                           base-dir)))
              (if (assoc fname comment-alist)
                  ;; Push onto the entry's comment list (the cdr of the
                  ;; assoc cell), not its first element.
                  (push (when save-comment-p
                          comment)
                        (cdr (assoc fname
                                    comment-alist)))
                (push (cons fname `(,(when save-comment-p
                                       comment)))
                      comment-alist))))
          (plist-get json-plist :links))
    ;; Attempt to get the next page of data
    (setq page (1+ page))
    (setq json-plist (with-temp-buffer
                       (url-insert-file-contents (concat "https://webmention.io/api/mentions?token="
                                                         (getenv "WMTOKEN")
                                                         "&domain=nothingissimple.ablatedsprocket.com&page="
                                                         (number-to-string page)))
                       (json-parse-buffer :object-type 'plist))))
  ;; Write comments to respective files
  (dolist (item comment-alist)
    (let ((fname (car item))
          (comment-list (cdr item)))
      (write-region (json-serialize
                     (vconcat
                      (seq-filter 'identity
                                  comment-list)))
                    nil
                    fname))))
;;; comments.el ends here

I could put this in my publish script, but I want to keep comment tracking decoupled from publishing to facilitate building a scheduled job around this in Gitlab. I'll cover this more in the final section.

One key thing to point out here is that a JSON file is always saved if a post has any webmentions, even if I'm not interested in publishing any of them to the webpage. This provides me a mechanism to determine which posts I have tooted about. This will come into play when I update the site generator in the next section.
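For reference, this is roughly the shape of one element of a saved file once it's parsed back into a plist. The keys are the ones the scripts here actually read; the values are made up:

(:target "https://nothingissimple.ablatedsprocket.com/posts/some-post.html"
 :activity (:type "reply")
 :data (:author (:name "Someone"
                 :photo "https://mastodon.example/avatars/someone.png")
        :content "<p>Nice post!</p>"))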

The second script is just a git routine that commits and pushes changes to the repo when there are any:

#!/usr/bin/env bash
#
# Commit and push changes to GitLab
#
# Fail on error
set -e
# Accept self signed certificate in certificate chain
git config http.sslVerify false
# Set user identity
git config user.email "${GIT_USER_EMAIL:-$GITLAB_USER_EMAIL}"
git config user.name "${GIT_USER_NAME:-$GITLAB_USER_NAME}"
# GitLab works on a detached head
git config push.autoSetupRemote true
# Add remote to enable push
git remote add upstream "https://${GITLAB_USER_LOGIN}:${GITLAB_TOKEN}@${CI_SERVER_HOST}/${CI_PROJECT_PATH}.git" || true
CHANGES=$(git status --porcelain | wc -l)
if [ "$CHANGES" -gt "0" ]; then
    # Stage the changes
    git add .
    # Show the status of files that are about to be created, updated or deleted
    git status
    # Commit all changes
    git commit -m "Committing comments retrieved on $(date -Iminutes)"
    git push upstream HEAD:${CI_COMMIT_BRANCH}
fi

Most of this code is configuration; Gitlab runs all of this in a fresh container, so git needs to be configured before changes can be pushed to the repository.

Site Generator Updates

In case mentioning it half a dozen times in other posts hasn't been enough, I'll reiterate that my website is generated by an ox-publish script on top of a customized Ox-Slimhtml backend. I've already written a solution for Cusdis comments in the form of a post-processing filter applied to my post web pages, so I'm leveraging that here. It's just a matter of setting a variable:

(setq org-export-filter-final-output-functions '(ab-site-html-add-comment-filter))

I want my comments section to be essentially two pieces, so I've broken the filter out into two functions. The first piece is an announcement that the post can be commented on, which gets tacked onto the web page after the closing main tag. It gets generated if the webmentions folder contains a JSON file whose name matches the post name, regardless of what's in it. It skips index pages; I'm not interested in putting comments on those for now:

(defun ab-site-html-add-comment-filter (output backend info)
  "Add comments to pages that have a corresponding file in 'webmentions'
directory.
OUTPUT is the text to be filtered.
BACKEND is the backend used for export, as a symbol.
INFO is a plist used as a communication channel."
  (when-let* ((output-file (plist-get info :output-file))
              (comments-file (expand-file-name (concat "webmentions/"
                                                       (file-name-base output-file)
                                                       ".json")
                                               ab-site-root-dir))
              ;; The first criterion might not be necessary!
              (_ (and (not (string= "index.html" (file-name-nondirectory output-file)))
                      (file-exists-p comments-file))))
    (replace-regexp-in-string "</main>"
                              (concat "</main><aside class=\"container\"><h2>Comments</h2><p>You can use your fediverse account to reply to this post!</p>"
                                      (ab-site-html-generate-comments comments-file)
                                      "</aside>")
                              output)))

Ideally, there would be a link to the Mastodon post in the announcement, but I couldn't figure out a way to get that from the webmention.io API. The second piece of my comments section is the list of comments. The function parses the contents of the JSON file as a list of comments, and if that list isn't empty, it generates the HTML. There are myriad ways to generate a comment section; I'm just using an ordered list:

(defun ab-site-html-generate-comments (filename)
  "Generate HTML comments from a JSON file located at FILENAME.  Only 'reply'
webmentions are included in the output."
  (let ((comments (with-temp-buffer
                    (insert-file-contents filename)
                    (json-parse-buffer :object-type 'plist))))
    (when (not (seq-empty-p comments))
      (concat "<ol class=\"comments-list\">"
              (mapconcat (lambda (comment)
                           (let* ((data (plist-get comment :data))
                                  (author (plist-get data :author)))
                             (concat "<li class=\"comment\"><div class=\"comment-header\"><img class=\"comment-pic\" src=\""
                                     (plist-get author :photo)
                                     "\" /><span class=\"comment-author\">"
                                     (plist-get author :name)
                                     "</span></div>"
                                     (plist-get data :content)
                                     "</li>")))
                         comments)
              "</ol>"))))

In practice, this process is made up of four steps:

  1. Publish a post to my website
  2. Publish a toot on Mastodon announcing the post
  3. Boost my own toot so brid.gy will make a webmention
  4. Wait for the schedule to pick up the webmention and add the comment section announcement

Ideally, step 3 wouldn't be necessary; it feels kind of taboo to boost my own post. I've spent enough time on this project for now, though. I'm tired.

Update Gitlab-CI

The scripts are finally done; it's time to incorporate them into my Gitlab job. Getting this figured out required a bit of mental gymnastics because I'm essentially trying to use one continuous integration file for two processes: managing comments and deploying the site. Let's look at the comments job first:

comments:
  stage: build
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
  script:
    - emacs --script comments.el
    - bash commit_comments.sh

What's not shown here is that in Gitlab, I built a schedule to kick off this CI every day. When Gitlab does this, it sets $CI_PIPELINE_SOURCE to schedule. If I manually push changes to main, this variable is set to push instead. With the rule I've set, this job will only run when kicked off by the schedule, so if I'm pushing a post, updating the publishing script, or updating a post, this job won't trigger.

The next job publishes the site, and this is where my experience with git forges was helpful:

pages:
  stage: deploy
  rules:
    - if: $CI_PIPELINE_SOURCE != "schedule"
  script:
    - emacs --script publish.el
  artifacts:
    paths:
      - public
    when: always

I was pretty proud of myself because I recognized ahead of time that if the comments job makes a commit, it will kick off a second build, just as if I had pushed commits to Gitlab myself. Since the pages job runs on any non-schedule pipeline, that second build is what republishes the site with new comments, but I don't want it happening for nothing. The good news is that if there are no new comments, no commit gets made and no second build is triggered.

Gitlab provides a surprising number of environment variables and rule options for .gitlab-ci files that can yield expressive builds where it's easy to determine when and why a build runs. I toyed with some of the rule options, but at this point I was running out of steam for this project. While the intention behind my rules could be clearer, they're simple, which is good enough for now. I did have to make a few mental diagrams to make sure I had all of my committing scenarios covered, but it's live and seems to work.

The .gitlab-ci rules aside, I still have the couple of grievances with this workflow that I've mentioned: I don't want to have to boost my own toots to generate comments sections, and I would prefer to be able to provide a link to the toot announcing a post. I would love to hear if anyone has ideas on how to achieve either of these goals.