
Golang Quirk: Number-strings starting with "0" are Octals

 
Kaushal Modi

Someone on the Golang team thought that it would be a good idea to treat all numbers (represented as strings) starting with “0” as octals, so “010” is actually 8. Really?

This is a post in the “Golang Octals” series.

2018-04-23: Follow-up: Golang Quirk: Number-strings starting with "0" are Octals
2018-04-18: Golang Quirk: Number-strings starting with "0" are Octals

The aim of this post is to make a Golang quirk more widely known, with the ulterior motive of eventually getting it fixed upstream, somehow..

Disclaimer: I don’t code in Go, so I could very well be wrong in calling this a problem with Golang rather than a problem specific to the strconv package.

From my perspective, strconv looks like a standard Go package that any (or at least most) Go coders would use for string → int conversions. If so, I don’t grasp why the strconv developers made this strange decision. It is strange because in languages like Python, int("010") returns 10.

Problem

I learned about this issue for the first time from this Hugo Discourse thread. The synopsis is that someone is retrieving US city zip-codes from a Hugo front-matter variable, and then using some conditional logic based on the last 2 digits.

So the code was:

<!-- Value of .Params.cityZipCode is "75009" -->
{{ if eq (int (last 2 .Params.cityZipCode)) 1 }}er{{ else }}e{{ end }}

The logic is simple.. Get the last two characters of .Params.cityZipCode, which would be "09", convert that string to a number (int), and check if it is 1.

But of course that didn’t work:

unable to cast “09” of type string to int

Cause

As I learned later, that is because of the ParseInt function from the strconv package. Its documentation says (emphasis mine):

func ParseInt(s string, base int, bitSize int) (i int64, err error)

ParseInt interprets a string s in the given base (0, 2 to 36) and bit size (0 to 64) and returns the corresponding value i.

If base == 0, the base is implied by the string’s prefix: base 16 for "0x", base 8 for "0", and base 10 otherwise. For bases 1, below 0 or above 36 an error is returned.

    Again.. What was the Golang team thinking?!
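
The rule quoted above can be checked with a few lines of Go. This is only a sketch: the helper names parseAuto and parseDecimal are made up for this example, and strconv.ParseInt is the only real API involved.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAuto lets ParseInt infer the base from the string's prefix (base == 0).
// Hypothetical helper, named only for this example.
func parseAuto(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

// parseDecimal always reads the digits as base 10, leading zeros or not.
// Hypothetical helper, named only for this example.
func parseDecimal(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	for _, s := range []string{"10", "010", "09"} {
		auto, autoErr := parseAuto(s)
		dec, decErr := parseDecimal(s)
		fmt.Printf("%q  base 0: %d (err: %v)  base 10: %d (err: %v)\n",
			s, auto, autoErr, dec, decErr)
	}
}
```

With base 0, "010" comes back as 8 and "09" fails to parse because 9 is not an octal digit. With an explicit base 10, the same strings parse as 10 and 9.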

Workaround

This led me to update the int function documentation for Hugo with an ugly workaround:

{{ int ("00987" | strings.TrimLeft "0") }}
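
For readers on the Go side, the same trimming idea could look roughly like the sketch below. The helper name atoiNoOctal is hypothetical, and it only handles unsigned digit strings. One thing the sketch makes visible: a string made up only of zeros trims down to an empty string, so that case needs special handling.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// atoiNoOctal is a hypothetical helper that mirrors the template workaround:
// strip the leading zeros first, then convert what is left.
// It does not handle signs or surrounding whitespace.
func atoiNoOctal(s string) (int64, error) {
	trimmed := strings.TrimLeft(s, "0")
	if trimmed == "" && s != "" {
		// The input was all zeros ("0", "000", ...).
		return 0, nil
	}
	// Base 0 is kept on purpose, to show that the trim alone
	// is enough to sidestep the octal rule.
	return strconv.ParseInt(trimmed, 0, 64)
}

func main() {
	for _, s := range []string{"00987", "09", "000"} {
		n, err := atoiNoOctal(s)
		fmt.Printf("%q -> %d (err: %v)\n", s, n, err)
	}
}
```

If the input is known to be decimal, calling strconv.ParseInt with an explicit base of 10 avoids the need for trimming altogether.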

Resurgence

This problem of number-strings beginning with “0” considered as octals resurfaced recently in Hugo issue #4628.

Though, the reported error did not make it evident that that was the problem:

INFO 2018/04/15 18:49:36 found taxonomies: map[string]string{"category":"categories", "manufacturerletter":"manufacturerletters", "manufacturer":"manufacturers", "featured":"featured", "tag":"tags"}
panic: interface conversion: interface {} is float64, not int

goroutine 50 [running]:

github.com/gohugoio/hugo/hugolib.(*Site).assembleTaxonomies(0xc4204ce2c0)
	/go/src/github.com/gohugoio/hugo/hugolib/site.go:1545 +0xee2

The issue reporter had 1500+ content files, and one or more of those files triggered this unrecovered panic (the missing error handling is a separate issue, and is planned to be fixed in Hugo). So I had to spend quite some time doing “forensic debug” [1] to understand what caused that “interface conversion: interface {} is float64, not int”.
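
A panic with this wording is what Go produces when an unchecked type assertion such as v.(int) meets a value of a different dynamic type. The sketch below reproduces it in isolation. The helpers weightAsInt and uncheckedPanicMessage and the sample value are assumptions for illustration, not Hugo code.

```go
package main

import "fmt"

// weightAsInt is a hypothetical helper using the comma-ok form of a type
// assertion, which reports a failure instead of panicking.
func weightAsInt(v interface{}) (int, error) {
	w, ok := v.(int)
	if !ok {
		return 0, fmt.Errorf("weight %v has type %T, want int", v, v)
	}
	return w, nil
}

// uncheckedPanicMessage runs the unchecked assertion and returns the
// recovered panic message, or "" if the assertion succeeded.
func uncheckedPanicMessage(v interface{}) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	_ = v.(int) // panics when v does not hold an int
	return ""
}

func main() {
	var weight interface{} = float64(153) // sample value, not from the report

	fmt.Println(uncheckedPanicMessage(weight))

	_, err := weightAsInt(weight)
	fmt.Println(err)
}
```

The comma-ok form is one way the caller could turn the crash into an ordinary error that names the offending value.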

Debug

  • The panic was raised at the type assertion weight.(int) on the last line of this excerpt:

        for _, p := range s.Pages {
            vals := p.getParam(plural, !s.Info.preserveTaxonomyNames)
            weight := p.getParamToLower(plural + "_weight")
            if weight == nil {
                weight = 0
            }
            if vals != nil {
                if v, ok := vals.([]string); ok {
                    for _, idx := range v {
                        x := WeightedPage{weight.(int), p}
        
  • So it was evident that one of the taxonomy weight values (manufacturers_weight in this case) was not an int by the time it reached that type assertion.

  • So I grepped for anything non-int in those values, like ., ,, e or E, but found nothing.

  • Then doing rg ':\s[0-9]{7,}(\.[0-9]+)*$' in the content files, I saw that there were 4 files that had oddly high weight values like 4611000, and wondered if that was somehow the problem. But that wasn’t it either.

  • When I deleted all the manufacturers_weight lines in those 1500+ files, the error went away.

    find . -name "*.md" -print0 | xargs -0 sed -i '/manufacturers_weight:.*/d'
  • So then I restored all of those deleted lines, and started deleting them again, this time in progression ..

    • First deleting all the lines with values with 7 or more digits.. Error still present.
    • Then deleting all lines with values with 6 digits.. Error still present.
    • .. Error still present.
    • Finally when I deleted all lines with values with 4 digits, the error went away!
    • But by now, I had modified about 700 files in this process!
  • I had almost given up on debugging this further, when I decided to give the git diff one last glance.. and I found the pattern..

        .. the freaking leading 0’s in some of those manufacturers_weight values!

I had a strong gut feeling that those zeros were the problem. So I once again restored the deleted lines in all the content files, typed out the command below [2] with confidence ..

find . -name "*.md" -exec grep -P 'manufacturers_weight: 0[0-9]+' -l {} \; -exec sed -r -i 's/(manufacturers_weight: )0+([0-9]+)/\1\2/' {} \;

    .. and that error was of course gone! 🎉
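
The same clean-up could also be written in Go. The sketch below works on front-matter text in memory rather than on files. The function name stripLeadingZeros is an assumption for this example, and the pattern is anchored to whole lines, which is slightly stricter than the sed expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// weightWithZeros matches a manufacturers_weight line whose value has one or
// more leading zeros followed by at least one more digit.
var weightWithZeros = regexp.MustCompile(`(?m)^(manufacturers_weight: )0+([0-9]+)$`)

// stripLeadingZeros is a hypothetical helper that removes the leading zeros
// from manufacturers_weight values and leaves every other line untouched.
func stripLeadingZeros(frontMatter string) string {
	return weightWithZeros.ReplaceAllString(frontMatter, "${1}${2}")
}

func main() {
	in := "caliberkey: \"0153\"\nmanufacturers_weight: 0153\n"
	fmt.Print(stripLeadingZeros(in))
}
```

Quoted string values such as caliberkey are left alone because the pattern only matches the manufacturers_weight key.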

This ended up with just 16 modified files with a diff like this:

...
modified   content/movements/b/buren/buren-04.en.md
@@ -12,7 +12,7 @@ image: "Buren_04.jpg"
 movementlistkey: "buren"
 caliberkey: "04"
 manufacturers: ["buren"]
-manufacturers_weight: 04
+manufacturers_weight: 4
 categories: ["movements","movements_b","movements_b_buren_en"]
 widgets:
   relatedmovements: true
modified   content/movements/c/citizen/citizen-0153.de.md
@@ -12,7 +12,7 @@ image: "Citizen_0153.jpg"
 movementlistkey: "citizen"
 caliberkey: "0153"
 manufacturers: ["citizen"]
-manufacturers_weight: 0153
+manufacturers_weight: 153
 categories: ["movements","movements_c","movements_c_citizen"]
 widgets:
   relatedmovements: true
...

That gave the issue reporter a workaround, so that they could at least get their site built.

But I hope that this 0-leading octal absurdity gets fixed at the root level — People should once again say with confidence, as they learned as kids, that “010” is the same thing as “10”.

Next Steps?

  • Hugo fixes this issue (4628) on its end by handling this error instead of crashing, and tells the user that they unknowingly added a non-int-castable octal value to their content in file X on line Y.
  • The Golang team gives some serious thought to this stupid (sorry about that) annoying decision:

    If base == 0, the base is implied by the string’s prefix: base 16 for "0x", base 8 for "0" ..

§

  1. I call this “forensic debug” because I don’t know Go, or how and where to add debug statements within the Hugo source code. So my approach was to figure out which content file/line caused that error.
  2. That command finds all the .md files under the current directory, uses grep to list the files in which the manufacturers_weight value begins with 0, and then uses sed to surgically remove the leading zeros in just those short-listed files.

Versions used: go 1.10.1, hugo 0.39


Webmentions

Mentioned by Kaushal Modi on Tue Apr 24, 2018 00:10 EDT
Follow-up: Golang Quirk: Number-strings starting with "0" are Octals
—Published on Mon Apr 23, 2018

First of all—I get it. Golang is not the only language that has this odd behavior related to octals. But following the foot-steps of ancestor languages in this particular aspect does not mean that Golang is doing the Right Thing™.

I got many …

Mentioned by Intruder on Wed Apr 18, 2018 23:32 EDT
Golang quirk? I've been using this "quirk" in my pentest courses for 2 decades now. It isn't new, and suggests that particular lesson is still just as valid today... scripter.co/golang-quirk-n…
Comment by Kaushal Modi on Wed Apr 18, 2018 19:33 EDT

“You missed my point, I just mentioned Python because you gave it as an example.”

Ah, OK. I understand. I have done a lot of parsing and int/string conversions in Perl, Python, Emacs-Lisp, Matlab, and never faced this issue. The Go templates probably just happened to introduce me to that common trait then :)

Mentioned by Cool Go on Wed Apr 18, 2018 19:23 EDT
Golang Quirk: Number-strings starting with “0” are Octals scripter.co/golang-quirk-n…
Mentioned by Go News on Wed Apr 18, 2018 19:22 EDT
Golang Quirk: Number-strings starting with "0" are Octals scripter.co/golang-quirk-n… #reddit
Comment by Anonymous on Wed Apr 18, 2018 19:11 EDT
You missed my point, I just mentioned Python because you gave it as an example. What I was trying to say is that it’s not something uncommon, octal numbers behave that way in C, C++, Java, JavaScript, Ruby, PHP, etc.
Comment by Kaushal Modi on Wed Apr 18, 2018 18:47 EDT

@Anonymous

“You may want to try something like this with Python .. print 017”

You are correct, though that was true only in Python 2.x.

Thankfully you now need to use the 0o or 0O prefix for octal literals in Python 3.x. Someone wrote PEP 3127 for exactly the same confusion I went through in this post. The integer casting works as expected (int("017") -> 17) in Python 2.x and 3.x though.

Comment by Anonymous on Wed Apr 18, 2018 18:27 EDT
You may want to try something like this with Python before calling it a Golang Quirk….: print 017