Emacs, scripting and anything text oriented.

Zero HTML Validation Errors!

Kaushal Modi

How I fixed my site content and went from 46 HTML validation errors down to 0!

This is a post in the “HTML5 Validator” series.

2022-06-14 Saving Python pip dependencies
2022-06-05 Zero HTML Validation Errors!
2022-06-01 Offline HTML5 Validator

In my previous HTML5 Validator post, I mentioned:

I was a bit disappointed to see validation errors on my site, but then it wasn’t too bad .. 46 errors.

But they truly say ..

Ignorance is bliss.

So .. once I realized that my site had 46 validation errors, I lost that bliss .. and I couldn’t rest easy — I had to fix them all 😁.

This post summarizes the categories of those errors and how I fixed them all. (The fixes mentioned in this post refer to changes in Org mode content, but you should be able to derive equivalent fixes for Markdown too.)

1 Avoid duplicate heading id attributes #

"file:/public/notes/string-fns-nim-vs-python/index.html":636.34-636.46: error: Duplicate ID "notes".
"file:/public/notes/nim-fmt/index.html":300.34-300.52: error: Duplicate ID "older-issue".
"file:/public/notes/nim-fmt/index.html":306.20-306.33: error: Duplicate ID "floats".
"file:/public/page/6/index.html":29.39-29.54: error: Duplicate ID "fnref:1".

Errors with the above kind of signature were fixed by:

  1. Converting headings to description lists

    I had a bunch of generic headings like “Notes” and “Older Issue” in some of my posts. After taking a second look at those, it made more sense to convert those to description lists. So in Org mode, I converted headings like * Notes to description lists - Notes ::.

  2. Setting CUSTOM_ID heading property

    For the cases where the headings needed to stay as headings, their IDs were made unique by setting their CUSTOM_ID property. For example, the change below fixed the Duplicate ID “floats” errors.

     ** Precision
     ..
     *** Floats
    +:PROPERTIES:
    +:CUSTOM_ID: precision-floats
    +:END:
     ..
     ** Type (only for numbers)
     ..
     *** Floats
    +:PROPERTIES:
    +:CUSTOM_ID: type-floats
    +:END:
    
    Code Snippet 1: Using CUSTOM_ID property to uniquify heading IDs

  3. Prevent footnote links in post summaries

    This issue came from my not thinking about how footnote references behave in a post versus on a page outside that post’s context. Footnote references were getting into the post summaries parsed by Hugo, and those summaries then show up on the list pages. When several summaries on one list page each carry a footnote reference, the same ID (like "fnref:1") appears more than once on that page.

    The fix was simple — Edit the post summaries so that they don’t contain any footnote references.
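To check a generated page for this class of error without running the full validator, a minimal sketch along these lines could help. It is only an illustration using Python's standard `html.parser`. The names `IdCollector` and `duplicate_ids` and the sample fragment are hypothetical, not part of html5validator or of this site's tooling.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect every non-empty id attribute value seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html_text):
    """Return the sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html_text)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(id_ for id_, n in counts.items() if n > 1)


# Hypothetical fragment: two headings share an id, a third has a unique one.
sample = """
<h3 id="floats">Floats</h3>
<h3 id="floats">Floats</h3>
<h3 id="precision-floats">Floats</h3>
"""
print(duplicate_ids(sample))
```

This only reports which IDs repeat. The validator output is still the place to look for exact line and column positions.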

2 Remove inline <style> elements #

"file:/public/grep-po/index.html":51.139-51.145: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)
"file:/public/how-do-i-write-org-mode/index.html":23.194-23.200: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)

Errors with the above kind of signature were fixed by:

  1. Avoiding export of raw <style> elements in the Markdown content

    I figured out which functions were responsible for injecting <style> elements into the Markdown content and then advised them to stop doing that. After applying these advices, I lost the in-content rules for the CSS classes .org-center and .csl-entry, so I put those rules directly in this website’s CSS.

    (defun modi/org-blackfriday-center-block (_center-block contents info)
      (let* ((class "org-center"))
        (format "<div class=\"%s\">%s\n\n%s\n</div>"
                class (org-blackfriday--extra-div-hack info) contents)))
    (advice-add 'org-blackfriday-center-block :override #'modi/org-blackfriday-center-block)
    
    (defun modi/org-cite-csl-render-bibliography (bib-str)
      (replace-regexp-in-string "<style>\\.csl-entry[^<]+</style>" "" bib-str))
    (advice-add 'org-cite-csl-render-bibliography :filter-return #'modi/org-cite-csl-render-bibliography)
    
    Code Snippet 2: Emacs-Lisp advices to prevent <style> elements in exports

  2. Removing unnecessary micro-styling

    I found a single case where an inline CSS rule for the CSS class .repr-type was defined in the content for a table. I just removed it, and the rendered table still looks much the same.
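For readers who don't use Emacs, the `:filter-return` advice in Code Snippet 2 comes down to one regular-expression substitution on the rendered bibliography string. The sketch below mirrors that substitution in Python, purely as an illustration. `strip_csl_style` and the sample string are hypothetical, and the real bibliography markup may look different.

```python
import re

# Same pattern as in the Emacs Lisp advice: a <style> element whose
# content starts with a ".csl-entry" rule and contains no "<" character.
CSL_STYLE_RE = re.compile(r"<style>\.csl-entry[^<]+</style>")


def strip_csl_style(bib_html):
    """Remove the inline .csl-entry <style> element from a bibliography string."""
    return CSL_STYLE_RE.sub("", bib_html)


# Hypothetical rendered bibliography with an inline style element in front.
sample = (
    "<style>.csl-entry{text-indent: -1.5em; margin-left: 1.5em;}</style>"
    '<div class="csl-bib-body">Entry</div>'
)
print(strip_csl_style(sample))
```

Because the pattern is tied to `.csl-entry`, any other `<style>` element in the string is left alone.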

3 Ensure that all images have captions or alt attributes #

"file:/public/hugo-use-goat-code-blocks-for-ascii-diagrams/index.html":24.130-24.255: error: An "img" element must have an "alt" attribute, except under certain conditions. For details, consult guidance on providing text alternatives for images.

Errors with the above kind of signature were easily fixed by ensuring that all images had captions. (Thankfully, there were only two images that were missing captions.) The Hugo figure shortcode adds the caption to the alt attribute if the alt is not specified separately.

As an example, here’s how I fixed the above error:

+ #+name: fig__disproportionate_box_drawing
+ #+caption: Disproportionate box drawing characters
[[file:images/hugo-use-goat-code-blocks-for-ascii-diagrams/ascii-diagram-rendered-in-plain-text-code-block.png]]
Code Snippet 3: A git diff showing addition of caption to an image

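To list images that lack an alt attribute on a page, a small check like the following could be used. This is a sketch with Python's standard `html.parser`. `MissingAltFinder`, `images_missing_alt` and the sample markup are made up for illustration. It only tests whether the attribute is present, which is a simplification of the validator's "except under certain conditions" rule.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Record the src of every <img> element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img" and "alt" not in attr_map:
            self.missing.append(attr_map.get("src"))


def images_missing_alt(html_text):
    """Return the src values of <img> elements without an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing


# Hypothetical markup: the first image has no alt, the second one does.
sample = '<img src="a.png"><img src="b.png" alt="Box drawing characters">'
print(images_missing_alt(sample))
```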
"file:/public/using-emacs-advice-to-silence-messages-from-functions/index.html":151.687-151.732: error: Start tag "a" seen but an element of the same type was already open.
"file:/public/using-emacs-advice-to-silence-messages-from-functions/index.html":151.748-151.751: error: Stray end tag "a".
"file:/public/auto-count-100daystooffload-posts/index.html":114.446-114.475: error: Start tag "a" seen but an element of the same type was already open.
"file:/public/auto-count-100daystooffload-posts/index.html":114.446-114.475: error: End tag "a" violates nesting rules.
"file:/public/auto-count-100daystooffload-posts/index.html":114.526-114.529: error: Stray end tag "a".

While the hyperlinks in headings work well in the post body, they created invalid HTML in the Hugo-generated TOC, where each heading is itself wrapped in a link. So while these errors were technically caused by a bug in Hugo, I wanted to fix them on my end as soon as I could. (It’s really cool when you end up finding a bug in an upstream project while trying to fix the errors in your own thing 😎.)

I reviewed the errors, and this is all it took to get rid of them all:

  1. Stop manually inserting hyperlinks in headings

     show two methods of finding sources of any printed messages.
    -***** Using plain-old /grep/ or [[https://github.com/BurntSushi/ripgrep][/rg/]]
    +***** Using plain-old /grep/ or /ripgrep (rg)/
     This method is pretty easy (but not robust) to use if the search
     ..
     Org source directory and search for the /"org-babel-exp process .."/
    -string ..
    +string using [[https://github.com/BurntSushi/ripgrep][~rg~]] ..
    
    Code Snippet 4: Removing hyperlink from a heading

  2. Remove Org Radio links that created links in headings

    Here, an Org heading happened to contain the string “Day count”, which was also an Org Radio link in that post. While ideally that shouldn’t have mattered, I removed that radio link to get around this Hugo bug.

     .. /Just may be/. But regardless, I am already enjoying writing once
    -again, and it's great to see the <<<Day count>>> (counting up to 100)
    +again, and it's great to see the Day count (counting up to 100)
     increase with each new post!
    
    Code Snippet 5: Removing an Org Radio link
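The nesting errors above all come down to an `<a>` start tag appearing while another `<a>` is still open. A rough way to spot that pattern in a generated TOC is sketched below with Python's standard `html.parser`. `NestedAnchorFinder`, `nested_anchor_hrefs` and the sample TOC entry are hypothetical, and the parser is more lenient than a real HTML5 validator.

```python
from html.parser import HTMLParser


class NestedAnchorFinder(HTMLParser):
    """Record the href of every <a> that starts while another <a> is open."""

    def __init__(self):
        super().__init__()
        self.open_anchors = 0
        self.nested = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            if self.open_anchors > 0:
                self.nested.append(dict(attrs).get("href"))
            self.open_anchors += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.open_anchors > 0:
            self.open_anchors -= 1


def nested_anchor_hrefs(html_text):
    """Return the href values of anchors nested inside other anchors."""
    finder = NestedAnchorFinder()
    finder.feed(html_text)
    finder.close()
    return finder.nested


# Hypothetical TOC entry: the heading text already contained a hyperlink.
sample = '<li><a href="#using-grep">Using grep or <a href="https://example.com">rg</a></a></li>'
print(nested_anchor_hrefs(sample))
```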

Validation Ignores #

The above fixes took care of 43 of the 46 errors, but the remaining 3 were unfixable.

"file:/public/notes/plantuml/index.html":114.474-114.678: error: Attribute "title" not allowed on element "a" at this point.

This error was caused by hyperlinks in inline SVG elements. These SVG elements are created by PlantUML. The hyperlinks-in-SVG feature works great, and since this markup is generated by PlantUML, I chose to just ignore these errors.

I ignored this error by adding the --ignore-re 'notes/plantuml.*Attribute.*title.*not allowed' switch to the html5validator command.
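Before adding an ignore pattern like this, it can be worth checking that it matches only the intended message. The sketch below does that with Python's `re` module. It assumes the pattern is applied as an unanchored search against each message line, and it says nothing about html5validator's internals. The helper `remaining_errors` is hypothetical, and the two messages are copied from the validator output shown earlier in this post.

```python
import re

# The pattern passed to --ignore-re in this post.
IGNORE_RE = re.compile(r"notes/plantuml.*Attribute.*title.*not allowed")

messages = [
    '"file:/public/notes/plantuml/index.html":114.474-114.678: error: '
    'Attribute "title" not allowed on element "a" at this point.',
    '"file:/public/notes/nim-fmt/index.html":306.20-306.33: error: '
    'Duplicate ID "floats".',
]


def remaining_errors(lines, pattern):
    """Return the messages that the pattern does NOT match (assumed search semantics)."""
    return [line for line in lines if not pattern.search(line)]


for line in remaining_errors(messages, IGNORE_RE):
    print(line)
```

Keeping the `notes/plantuml` prefix in the pattern limits the ignore to that one page, so the same error on any other page would still be reported.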

6 Ignore files not expected to serve HTML content #

"file:/public/googleFOO.html":1.1-1.52: error: Non-space characters found without seeing a doctype first. Expected "<!DOCTYPE html>".

The googleFOO.html file here is not a valid HTML file. It’s just a cookie file that was used by Google to verify that I own this domain.

This error was masked by adding the --ignore 'googleFOO' switch to the html5validator command.
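Put together, the two ignores could be passed to the validator roughly as sketched below. The snippet only builds and prints the command line and does not run it. The `--root public/` argument is an assumption based on the paths in the error messages, and `validator_command` is a hypothetical helper, so the actual invocation used for this site may differ.

```python
import shlex


def validator_command(root="public/"):
    """Build an html5validator argument list with both ignores (root is assumed)."""
    return [
        "html5validator",
        "--root", root,
        # Skip the Google site-verification file, which is not real HTML.
        "--ignore", "googleFOO",
        # Skip the "title" attribute errors from PlantUML-generated SVG links.
        "--ignore-re", "notes/plantuml.*Attribute.*title.*not allowed",
    ]


# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(validator_command()))
```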

Summary #

Once I fixed the 43 errors by tweaking the Org mode content, and added those two ignores, I had zero validation errors! 🎉

If you are interested in the fix details, here are the commits.