Emacs, scripting and anything text oriented.

Offline HTML5 Validator

Kaushal Modi

Validate your website offline. It’s just one curl command away.

This is a post in the “HTML5 Validator” series.

2022-06-14 Saving Python pip dependencies
2022-06-05 Zero HTML Validation Errors!
2022-06-01 Offline HTML5 Validator

I have been using the online HTML5 Validator https://html5.validator.nu/ for a few years now. I have a link at the bottom of each post to validate that page’s HTML. For an example, see the html5 validator link at the bottom of this post.

But it didn’t occur to me until now to look for a way to do the same validation offline! Offline validation would be useful so that I can look at any HTML generation problem before I deploy the website. So I looked for a solution online, and of course it’s answered on StackOverflow 😄.

It turns out that the same Nu HTML5 Validator project that provides the online validation service also provides a Java application, as well as pre-compiled binaries for Linux, Windows and macOS, for offline use!

To use the Java .jar file, you need to have at least Java 8 installed on your system. But you don’t need to have any version of Java installed if you use the pre-compiled binary instead. See its documentation for more details.

Using the .jar

Requires at least Java 8

Download
Download the latest vnu.jar from the project’s GitHub Releases section.
curl -ROLs https://github.com/validator/validator/releases/download/latest/vnu.jar
Run
The command below runs the validator only on the HTML files in the public/ directory. (If you are using Hugo, the hugo command publishes the HTML files to the public/ directory by default.) See its Usage documentation for more details.
java -jar vnu.jar --skip-non-html --errors-only public/

For my use case, if I don’t provide the --skip-non-html --errors-only switches, the output is too noisy.
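The download and run steps above can be combined into a small wrapper so that the jar is fetched only once. This is a minimal sketch: validate_site is a hypothetical helper name, and defaulting to public/ is an assumption. The URL and switches are the ones used above.

```shell
# Sketch: fetch vnu.jar once, then validate a directory of HTML files.
# validate_site is a hypothetical helper name, not part of vnu.
validate_site() {
    local dir="${1:-public/}"

    # Download the jar only if it is not already in the current directory.
    if [ ! -f vnu.jar ]; then
        curl -fRLs -o vnu.jar \
            https://github.com/validator/validator/releases/download/latest/vnu.jar \
            || return 1
    fi

    # Same switches as above: HTML files only, errors only.
    java -jar vnu.jar --skip-non-html --errors-only "$dir"
}
```

Calling validate_site with no argument checks public/; passing a path such as validate_site docs/ checks that directory instead.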

Using pip install

If you do not want to manually download the .jar, there’s a Python wrapper available to do the same for you: html5validator.

Install
  pip install --user html5validator
Run
  html5validator --root public/
It seems like this Python wrapper implicitly passes --skip-non-html --errors-only to the Java app. So those are not needed when running html5validator. But on the flip side, it needs the --root switch when specifying the directory to run the script on.

Note that you still need to have at least Java 8 installed when running this Python app too, because it downloads and runs the same .jar behind the scenes.

Using pre-compiled binary

If your system doesn’t have the required Java version, you can use the pre-compiled binary instead.

Download & Extract
Download and extract the vnu.<OS>.zip for your OS from the same Releases section. Here, I am showing how to do that on Linux:
curl -ROLs https://github.com/validator/validator/releases/download/latest/vnu.linux.zip
unzip vnu.linux.zip

The extracted binary path will be vnu-runtime-image/bin/vnu.

Run
The run options are exactly the same; the only difference is that you run the binary directly instead of going through java.
vnu-runtime-image/bin/vnu --skip-non-html --errors-only public/
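If the same script has to work on machines with and without Java, the choice between the .jar and the binary can be automated. The sketch below assumes that vnu.jar and the extracted vnu-runtime-image/ directory, when present, sit in the current directory. run_vnu is a hypothetical helper name, and the function does not check that the installed Java is version 8 or newer.

```shell
# Sketch: prefer the .jar when java is available, else fall back to the binary.
# run_vnu is a hypothetical helper name; it only checks that a java command
# exists, not which version it is.
run_vnu() {
    if command -v java >/dev/null 2>&1 && [ -f vnu.jar ]; then
        java -jar vnu.jar "$@"
    elif [ -x vnu-runtime-image/bin/vnu ]; then
        vnu-runtime-image/bin/vnu "$@"
    else
        echo "run_vnu: neither java + vnu.jar nor vnu-runtime-image/bin/vnu found" >&2
        return 1
    fi
}
```

With this in place, run_vnu --skip-non-html --errors-only public/ behaves the same regardless of which of the two is installed.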

Results

<2022-06-05>
This website now has zero validation errors! 🎉 All the errors listed in the collapsed log below are now resolved. See my Zero HTML Validation Errors! post on how I did that.

I was a bit disappointed to see validation errors on my site, but it wasn’t too bad: 52 errors initially, now down to 46. They fall into a few groups:

Some I already fixed
These 6 errors were fixed in this commit.
"file:/public/​getting-started-with-texlive/index.html":6.2198-6.2206: error: Element "package" not allowed as child of element "li" in this context. (Suppressing further errors from this subtree.)
"file:/public/​getting-started-with-texlive/index.html":6.2256-6.2259: error: End tag "li" implied, but there were open elements.
"file:/public/​getting-started-with-texlive/index.html":6.2198-6.2206: error: Unclosed element "package".
"file:/public/​getting-started-with-texlive/index.html":6.2307-6.2315: error: Element "package" not allowed as child of element "li" in this context. (Suppressing further errors from this subtree.)
"file:/public/​getting-started-with-texlive/index.html":6.2316-6.2319: error: End tag "li" implied, but there were open elements.
"file:/public/​getting-started-with-texlive/index.html":6.2307-6.2315: error: Unclosed element "package".
Some I can probably fix
  "file:/public/notes/​nim/index.html":472.480-472.486: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)
  "file:/public/notes/​nim/index.html":1359.221-1359.231: error: Duplicate ID "log".
  ..
Some need to be ignored
  "file:/public/​google4a938eaf9bbacbcd.html":1.1-1.52: error: Non-space characters found without seeing a doctype first. Expected "<!DOCTYPE html>".
  "file:/public/​google4a938eaf9bbacbcd.html":1.1-1.52: error: Element "head" is missing a required instance of child element "title".
  ..
And the rest would be out of my scope to fix
  "file:/public/​auto-count-100daystooffload-posts/index.html":114.446-114.475: error: Start tag "a" seen but an element of the same type was already open.
  "file:/public/​auto-count-100daystooffload-posts/index.html":114.446-114.475: error: End tag "a" violates nesting rules.
  "file:/public/​auto-count-100daystooffload-posts/index.html":114.526-114.529: error: Stray end tag "a".
  ..

Here is the full log with 46 errors:

Output of running html5validator --root public/
"file:/public/​google4a938eaf9bbacbcd.html":1.1-1.52: error: Non-space characters found without seeing a doctype first. Expected "<!DOCTYPE html>".
"file:/public/​google4a938eaf9bbacbcd.html":1.1-1.52: error: Element "head" is missing a required instance of child element "title".
"file:/public/notes/​plantuml/index.html":114.474-114.678: error: Attribute "title" not allowed on element "a" at this point.
"file:/public/notes/​nim-fmt/index.html":300.34-300.52: error: Duplicate ID "older-issue".
"file:/public/notes/​nim-fmt/index.html":306.20-306.33: error: Duplicate ID "floats".
"file:/public/notes/​nim-fmt/index.html":339.34-339.52: error: Duplicate ID "older-issue".
"file:/public/notes/​nim-fmt/index.html":341.320-341.335: error: Duplicate ID "integers".
"file:/public/notes/​nim-fmt/index.html":341.494-341.507: error: Duplicate ID "floats".
"file:/public/notes/​nim-fmt/index.html":385.34-385.48: error: Duplicate ID "strings".
"file:/public/notes/​string-fns-nim-vs-python/index.html":134.34-134.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":239.34-239.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":269.34-269.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":336.34-336.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":513.106-513.118: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":588.106-588.118: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":636.34-636.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":731.34-731.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":797.34-797.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":846.34-846.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":894.34-894.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":920.34-920.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":942.34-942.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":1012.34-1012.46: error: Duplicate ID "notes".
"file:/public/notes/​string-fns-nim-vs-python/index.html":1074.34-1074.46: error: Duplicate ID "notes".
"file:/public/notes/​nim/index.html":472.480-472.486: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)
"file:/public/notes/​nim/index.html":1359.221-1359.231: error: Duplicate ID "log".
"file:/public/notes/​nim/index.html":2364.130-2364.149: error: Duplicate ID "named-tuples".
"file:/public/notes/​nim/index.html":2403.149-2403.172: error: Duplicate ID "anonymous-tuples".
"file:/public/notes/​nim/index.html":3370.284-3370.303: error: Duplicate ID "installation".
"file:/public/notes/​nim/index.html":3410.354-3410.373: error: Duplicate ID "installation".
"file:/public/notes/​nim/index.html":5033.34-5033.52: error: Duplicate ID "older-issue".
"file:/public/notes/​nim/index.html":5376.34-5376.45: error: Duplicate ID "json".
"file:/public/notes/​nim/index.html":6066.64-6066.81: error: Duplicate ID "references".
"file:/public/bits/​plantuml-version/index.html":7.37-7.94: error: An "img" element must have an "alt" attribute, except under certain conditions. For details, consult guidance on providing text alternatives for images.
"file:/public/​auto-count-100daystooffload-posts/index.html":114.446-114.475: error: Start tag "a" seen but an element of the same type was already open.
"file:/public/​auto-count-100daystooffload-posts/index.html":114.446-114.475: error: End tag "a" violates nesting rules.
"file:/public/​auto-count-100daystooffload-posts/index.html":114.526-114.529: error: Stray end tag "a".
"file:/public/​generics-not-exactly-in-systemverilog/index.html":118.232-118.238: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)
"file:/public/​grep-po/index.html":51.139-51.145: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)
"file:/public/​how-do-i-write-org-mode/index.html":23.194-23.200: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)
"file:/public/​hugo-use-goat-code-blocks-for-ascii-diagrams/index.html":24.130-24.255: error: An "img" element must have an "alt" attribute, except under certain conditions. For details, consult guidance on providing text alternatives for images.
"file:/public/​hugo-modules-getting-started/index.html":6.865-6.871: error: Element "style" not allowed as child of element "div" in this context. (Suppressing further errors from this subtree.)
"file:/public/​using-emacs-advice-to-silence-messages-from-functions/index.html":151.687-151.732: error: Start tag "a" seen but an element of the same type was already open.
"file:/public/​using-emacs-advice-to-silence-messages-from-functions/index.html":151.748-151.751: error: Stray end tag "a".
"file:/public/page/6/index.html":29.39-29.54: error: Duplicate ID "fnref:1".
"file:/public/page/6/index.html":49.169-49.184: error: Duplicate ID "fnref:1".

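A log of this size is easier to triage once it is grouped. The sketch below counts errors per file and per message type, assuming the log was saved to a file such as validate.log and that every line has the "file":position: error: message shape shown above. The helper names errors_per_file and errors_per_type are hypothetical. Quoted names inside a message (IDs, element names) are collapsed to "_" so that, for example, all Duplicate ID errors land in one bucket.

```shell
# Sketch: group a saved validator log. Both helper names are hypothetical.

# Count errors per file; the file name is the first double-quoted field.
errors_per_file() {
    grep ': error: ' "$1" | cut -d'"' -f2 | sort | uniq -c | sort -rn
}

# Count errors per message type, with quoted names collapsed to "_".
errors_per_type() {
    sed -n 's/^.*: error: //p' "$1" \
        | sed 's/"[^"]*"/"_"/g' \
        | sort | uniq -c | sort -rn
}
```

Running errors_per_type validate.log lists the most frequent kind of error first, which makes it easy to decide what to fix first.
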
Conclusion

It was really easy to download and run the vnu application using Java, the standalone Linux binary, and the html5validator Python wrapper.

After my quick trials, I think I will use the html5validator approach more because:

  1. It works as I expect with the fewest switches.
  2. I am able to redirect the output using html5validator --root public/ > validate.log. I tried the same with vnu.jar and the pre-compiled Linux vnu binary, but the error log redirection didn’t work with those.
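One likely explanation for the redirection difference: vnu reports errors and warnings on stderr by default, so a plain > (which captures only stdout) leaves the log file empty. Two ways around that are sketched below. The --stdout switch is listed in vnu's usage documentation; the helper names and the validate.log file name are only illustrative, and I have not verified this against the exact setup described above.

```shell
# Sketch: two ways to capture vnu's report in a file.
# Helper names are hypothetical; vnu.jar is assumed to be in the current directory.

# Option 1: redirect stderr (file descriptor 2) instead of stdout.
vnu_log_stderr() {
    java -jar vnu.jar --skip-non-html --errors-only "$1" 2> validate.log
}

# Option 2: ask vnu to report on stdout, then redirect as usual.
vnu_log_stdout() {
    java -jar vnu.jar --stdout --skip-non-html --errors-only "$1" > validate.log
}
```

The same redirections should apply to the pre-compiled binary, with vnu-runtime-image/bin/vnu in place of java -jar vnu.jar.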