Emacs, scripting and anything text oriented.

Golang Quirk: Number-strings starting with "0" are Octals

Kaushal Modi

Someone in the Golang team thought that it would be a good idea to consider all numbers (represented as strings) starting with “0” as Octals.. so “010” is actually 8.. Really?

This is a post in the “Golang Octals” series.

2018-04-23 Follow-up: Golang Quirk: Number-strings starting with "0" are Octals
2018-04-18 Golang Quirk: Number-strings starting with "0" are Octals

The aim of this post is to make a Golang quirk more widely known, with an ulterior motive to eventually get it fixed upstream, somehow..

Disclaimer: I don’t code in Go. So I could very well be wrong in calling this a problem with Golang rather than a problem with the strconv package specifically.

From my perspective, strconv looks like a built-in (standard library) Go package that most (all?) Go coders would use for string → int conversions. If so, I don’t grasp the rationale behind why the strconv developers would make this strange decision.. strange because in normal languages like Python, int("010") returns 10.

Problem #

I learned about this issue for the first time from this Hugo Discourse thread. The synopsis is that someone is retrieving US city zip-codes from a Hugo front-matter variable, and then using some conditional logic based on the last 2 digits.

So the code was:

<!-- Value of .Params.cityZipCode is "75009" -->
{{ if (int (last 2 .Params.cityZipCode)) eq 1 }}er{{ else }}e{{ end }}

The logic is simple.. Get the last two characters of .Params.cityZipCode, which would be "09", convert that string to a number (int), and check if it is 1.

But of course that didn’t work:

unable to cast "09" of type string to int

Cause #

As I later learned, that’s because of the ParseInt function from the strconv package. Its documentation says (emphasis mine):

func ParseInt(s string, base int, bitSize int) (i int64, err error)

ParseInt interprets a string s in the given base (0, 2 to 36) and bit size (0 to 64) and returns the corresponding value i.

If base == 0, the base is implied by the string’s prefix: base 16 for "0x", base 8 for "0", and base 10 otherwise. For bases 1, below 0 or above 36 an error is returned.

    Again.. What was the Golang team thinking?!
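A minimal Go sketch of the rule quoted above, for anyone who wants to try it. The helper names parseAuto and parseDecimal are made up for this illustration, and the link to the Hugo error is an assumption: it presumes that int hands the string to ParseInt with base 0.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAuto (hypothetical helper) lets strconv infer the base from the
// string's prefix by passing base == 0.
func parseAuto(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

// parseDecimal (hypothetical helper) always reads the string as base 10,
// so leading zeros carry no special meaning.
func parseDecimal(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	for _, s := range []string{"10", "010", "09"} {
		auto, autoErr := parseAuto(s)
		dec, decErr := parseDecimal(s)
		fmt.Printf("%q: base 0 -> %d (err: %v); base 10 -> %d (err: %v)\n",
			s, auto, autoErr, dec, decErr)
	}
}
```

With base 0, "010" is read as octal and comes out as 8, while "09" fails outright because 9 is not an octal digit, which would match the "unable to cast" error above. With base 10, the same strings give 10 and 9.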

Workaround #

This led me to update the int function documentation for Hugo with an ugly workaround:

{{ int ("00987" | strings.TrimLeft "0") }}

Resurgence #

This problem of number-strings beginning with “0” being treated as octals resurfaced recently in Hugo issue #4628.

The reported error, though, did not make it evident that this was the problem:

INFO 2018/04/15 18:49:36 found taxonomies: map[string]string{"category":"categories", "manufacturerletter":"manufacturerletters", "manufacturer":"manufacturers", "featured":"featured", "tag":"tags"}
panic: interface conversion: interface {} is float64, not int

goroutine 50 [running]:

github.com/gohugoio/hugo/hugolib.(*Site).assembleTaxonomies(0xc4204ce2c0)
	/go/src/github.com/gohugoio/hugo/hugolib/site.go:1545 +0xee2

The issue reporter had 1500+ content files, and one or more of those files caused this uncaught panic (which is a separate issue, and is planned to be fixed in Hugo). So I had to spend quite some time doing “forensic debug”[1] to understand what caused that “interface conversion: interface {} is float64, not int”.

Debug #

  • The panic was raised at the last line of this excerpt:

    for _, p := range s.Pages {
        vals := p.getParam(plural, !s.Info.preserveTaxonomyNames)
        weight := p.getParamToLower(plural + "_weight")
        if weight == nil {
            weight = 0
        }
        if vals != nil {
            if v, ok := vals.([]string); ok {
                for _, idx := range v {
                    x := WeightedPage{weight.(int), p}
    
  • So it was evident that one of the taxonomy weight values (manufacturers_weight in this case) wasn’t getting cast to int.

  • So I grepped for anything non-int in those values, like ".", ",", "e" or "E", but found nothing.

  • Then doing rg ':\s[0-9]{7,}(\.[0-9]+)*$' in the content files, I saw that there were 4 files that had oddly high weight values like 4611000, and wondered if that was somehow the problem. But that wasn’t it either.

  • When I deleted all the manufacturers_weight lines in those 1500+ files, the error went away.

    find . -name "*.md" -print0 | xargs -0 sed -i '/manufacturers_weight:.*/d'
    
  • So then I restored all of those deleted lines, and started deleting them again, this time in progression ..

    • First deleting all the lines with values with 7 or more digits.. Error still present.
    • Then deleting all lines with values with 6 digits.. Error still present.
    • .. Error still present.
    • Finally when I deleted all lines with values with 4 digits, the error went away!
    • But by now, I had modified about 700 files in this process!
  • I had almost given up on debugging this further, when I decided to give the git diff one last glance.. and I found the pattern..

        .. the freaking leading 0’s in some of those manufacturers_weight values!

I had a strong gut feeling that those zeros were the problem. So I once again restored the deleted lines in all the content files, typed out the command below[2] with confidence ..

find . -name "*.md" -exec grep -P 'manufacturers_weight: 0[0-9]+' -l {} \; -exec sed -r -i 's/(manufacturers_weight: )0([0-9]+)/\1\2/' {} \;

    .. and that error was of course gone! 🎉

This ended up with just 16 modified files with a diff like this:

...
modified   content/movements/b/buren/buren-04.en.md
@@ -12,7 +12,7 @@ image: "Buren_04.jpg"
 movementlistkey: "buren"
 caliberkey: "04"
 manufacturers: ["buren"]
-manufacturers_weight: 04
+manufacturers_weight: 4
 categories: ["movements","movements_b","movements_b_buren_en"]
 widgets:
   relatedmovements: true
modified   content/movements/c/citizen/citizen-0153.de.md
@@ -12,7 +12,7 @@ image: "Citizen_0153.jpg"
 movementlistkey: "citizen"
 caliberkey: "0153"
 manufacturers: ["citizen"]
-manufacturers_weight: 0153
+manufacturers_weight: 153
 categories: ["movements","movements_c","movements_c_citizen"]
 widgets:
   relatedmovements: true
...

That gave the issue reporter a workaround, so that they could at least get their site built.
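The same clean-up can be sketched in Go, for anyone who would rather not chain find, grep and sed. This is only an illustration under assumptions: stripWeightZeros and leadingZeroWeight are hypothetical names, the weight is assumed to sit alone on its own line exactly as in the diff above, and reading and writing the files is left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// leadingZeroWeight matches a whole manufacturers_weight line whose value
// starts with one or more zeros followed by at least one more digit.
var leadingZeroWeight = regexp.MustCompile(`(?m)^(manufacturers_weight: )0+([0-9]+)$`)

// stripWeightZeros (hypothetical helper) drops the leading zeros from every
// matching line and leaves all other lines untouched.
func stripWeightZeros(frontMatter string) string {
	return leadingZeroWeight.ReplaceAllString(frontMatter, "${1}${2}")
}

func main() {
	in := "caliberkey: \"0153\"\nmanufacturers: [\"citizen\"]\nmanufacturers_weight: 0153\n"
	fmt.Print(stripWeightZeros(in))
}
```

The pattern uses 0+ so that a value with several leading zeros is cleaned in one pass, and a lone 0 is left alone because at least one digit must remain after the zeros. Quoted strings such as caliberkey: "0153" are not matched, so they keep their zeros.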

But I hope that this 0-leading octal absurdity gets fixed at the root level — People should once again say with confidence, as they learned as kids, that “010” is the same thing as “10”.

Next Steps? #

  • Hugo fixes this issue (4628) on its end by not letting this panic go uncaught, and instead letting the user know that they unknowingly added a non-int-castable octal value to their content in file X on line Y.

  • The Golang team gives some serious thought to this stupid (sorry about that) annoying decision:

    If base == 0, the base is implied by the string’s prefix: base 16 for "0x", base 8 for "0" ..
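For the first item, here is a rough idea of what "not letting the panic go uncaught" could look like in Go. It is a sketch only, not Hugo's actual fix: toWeight and its path parameter are hypothetical, and the input values in main are made up. The point is that a checked type switch can return a readable error where the unchecked weight.(int) assertion panics.

```go
package main

import "fmt"

// toWeight (hypothetical helper) converts a front-matter value to an int
// weight. Instead of panicking on an unexpected type, it returns an error
// that names the offending file, value and type.
func toWeight(path string, v interface{}) (int, error) {
	switch w := v.(type) {
	case nil:
		// A missing weight defaults to 0, as in the excerpt from site.go.
		return 0, nil
	case int:
		return w, nil
	default:
		return 0, fmt.Errorf("%s: weight %v is of type %T, not int", path, v, v)
	}
}

func main() {
	// Made-up inputs: a missing weight, a plain int, and a float64 like the
	// one named in the panic message.
	for _, v := range []interface{}{nil, 4, 9.0} {
		w, err := toWeight("content/example.md", v)
		fmt.Printf("weight=%d err=%v\n", w, err)
	}
}
```

Reporting the exact line number would need help from the front-matter parser, so this sketch stops at the file name.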

§

  1. I call this “forensic debug” because I don’t know Go, or how and where to add debug statements within the Hugo source code. So my approach was to figure out which content file/line caused that error. ↩︎

  2. That command finds all the .md files under the current directory, uses grep to list the files in which the manufacturers_weight value begins with 0, and then uses sed to surgically remove the leading zero in just those short-listed files. ↩︎