Golang Quirk: Number-strings starting with "0" are Octals
By Kaushal Modi
Someone in the Golang team thought that it would be a good idea to consider all numbers (represented as strings) starting with “0” as octals.. so “010” is actually 8.. Really?
This is a post in the “Golang Octals” series.
2018-04-23 | Follow-up: Golang Quirk: Number-strings starting with "0" are Octals |
2018-04-18 | Golang Quirk: Number-strings starting with "0" are Octals |
The aim of this post is to make a Golang quirk more widely known, with the ulterior motive of eventually getting it fixed upstream, somehow..
Disclaimer: I don’t code in Go. So I could very well be wrong in calling this a problem with Golang rather than a problem specifically with the strconv package.
From my perspective, strconv is a built-in (standard library) Go package that any (most?) Go coder would use to do string → int conversions. If so, I don’t grasp the rationale behind why the strconv developers would make this strange decision.. strange because in normal languages like Python, int("010") returns 10.
Problem #
I learned about this issue for the first time from this Hugo Discourse thread. The synopsis is that someone is retrieving US city zip-codes from a Hugo front-matter variable, and then using some conditional logic based on the last 2 digits.
So the code was:
<!-- Value of .Params.cityZipCode is "75009" -->
{{ if (int (last 2 .Params.cityZipCode)) eq 1 }}er{{ else }}e{{ end }}
The logic is simple.. Get the last two characters of .Params.cityZipCode, which would be "09", convert that string to a number (int), and check if it is 1.
But of course that didn’t work:
unable to cast “09” of type string to int
Cause #
Later I learned that this is because of the ParseInt function from the strconv package. Its documentation says (emphasis mine):
func ParseInt(s string, base int, bitSize int) (i int64, err error)
ParseInt interprets a string s in the given base (0, 2 to 36) and bit size (0 to 64) and returns the corresponding value i.
If base == 0, the base is implied by the string’s prefix: base 16 for "0x", base 8 for "0", and base 10 otherwise. For bases 1, below 0 or above 36 an error is returned.
Again.. What was the Golang team thinking?!
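The behavior quoted above can be reproduced with a few lines of Go. This is only a minimal sketch: strconv.ParseInt is the real function, but parseAuto and parseDecimal are hypothetical wrapper names made up for illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAuto (hypothetical name) passes base 0, so strconv infers the base
// from the string's prefix: a leading "0" selects base 8.
func parseAuto(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

// parseDecimal (hypothetical name) always parses the string as base 10.
func parseDecimal(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	for _, s := range []string{"10", "010", "09"} {
		a, aErr := parseAuto(s)
		d, dErr := parseDecimal(s)
		fmt.Printf("%q base 0: %d (err: %v) | base 10: %d (err: %v)\n", s, a, aErr, d, dErr)
	}
}
```

With base 0, "010" comes back as 8 and "09" is rejected with an error (9 is not an octal digit), while base 10 gives 10 and 9 respectively.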
Workaround #
This led me to update the int function documentation for Hugo with an ugly workaround:
{{ int ("00987" | strings.TrimLeft "0") }}
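The same trimming idea can be sketched in plain Go. This is an illustration under assumptions, not how Hugo implements int: decimalFromPadded is a hypothetical helper, and it only targets unsigned strings made of decimal digits.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// decimalFromPadded (hypothetical helper) strips leading zeros before the
// prefix-sensitive conversion, mirroring the template workaround above.
func decimalFromPadded(s string) (int64, error) {
	trimmed := strings.TrimLeft(s, "0")
	// A string made only of zeros would otherwise become empty and fail to parse.
	if trimmed == "" && s != "" {
		trimmed = "0"
	}
	// Base 0 is kept on purpose, to show that the trimmed string is no
	// longer mistaken for an octal.
	return strconv.ParseInt(trimmed, 0, 64)
}

func main() {
	for _, s := range []string{"00987", "09", "000"} {
		v, err := decimalFromPadded(s)
		fmt.Printf("%q -> %d (err: %v)\n", s, v, err)
	}
}
```

One thing the sketch makes visible: trimming alone turns an all-zero string such as "000" into an empty string, so that case needs separate handling.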
Resurgence #
This problem of number-strings beginning with “0” considered as octals resurfaced recently in Hugo issue #4628.
Though, the reported error did not make it evident that that was the problem:
INFO 2018/04/15 18:49:36 found taxonomies: map[string]string{"category":"categories", "manufacturerletter":"manufacturerletters", "manufacturer":"manufacturers", "featured":"featured", "tag":"tags"}
panic: interface conversion: interface {} is float64, not int
goroutine 50 [running]:
github.com/gohugoio/hugo/hugolib.(*Site).assembleTaxonomies(0xc4204ce2c0)
/go/src/github.com/gohugoio/hugo/hugolib/site.go:1545 +0xee2
The issue reporter had 1500+ content files, and one or more of those files caused this uncaught exception (which is a separate issue, and is planned to be fixed in Hugo) to happen. So I had to spend quite some time doing “forensic debug” [1] to understand what caused that “interface conversion: interface {} is float64, not int”.
Debug #
The exception was thrown at this line (marked with a comment below):
for _, p := range s.Pages {
	vals := p.getParam(plural, !s.Info.preserveTaxonomyNames)

	weight := p.getParamToLower(plural + "_weight")
	if weight == nil {
		weight = 0
	}
	if vals != nil {
		if v, ok := vals.([]string); ok {
			for _, idx := range v {
				x := WeightedPage{weight.(int), p} // <- the panic is raised here
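Some background on that line: weight.(int) is a Go type assertion, and it panics when the value stored in the interface is not exactly an int. The sketch below shows one way such a conversion can be made non-panicking. weightToInt is a hypothetical helper, not Hugo code, and accepting float64 by truncation is an assumption made for the example.

```go
package main

import "fmt"

// weightToInt (hypothetical helper) converts a loosely typed front-matter
// value to int and reports failure instead of panicking.
func weightToInt(v interface{}) (int, bool) {
	switch w := v.(type) {
	case int:
		return w, true
	case float64:
		// Assumption for this sketch: drop any fractional part.
		return int(w), true
	default:
		return 0, false
	}
}

func main() {
	var weight interface{} = float64(4)

	// weight.(int) would panic at this point with
	// "interface conversion: interface {} is float64, not int".
	if w, ok := weightToInt(weight); ok {
		fmt.Println("weight:", w)
	} else {
		fmt.Println("weight is not a number")
	}
}
```

The two-value form of the assertion (w, ok := weight.(int)) would also avoid the panic, but it reports ok == false for a float64 instead of converting it.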
So it was evident that one of the taxonomy weight (manufacturers_weight in this case) values wasn’t getting cast to int.
So I grepped for anything non-int in those values, like ".", ",", "e" or "E", but found nothing.
Then doing
rg ':\s[0-9]{7,}(\.[0-9]+)*$'
in the content files, I saw that there were 4 files that had oddly high weight values like 4611000, and wondered if that was somehow the problem. But that wasn’t it either.
When I deleted all the manufacturers_weight lines in those 1500+ files, the error went away.
find . -name "*.md" -print0 | xargs -0 sed -i '/manufacturers_weight:.*/d'
So then I restored all of those deleted lines, and started deleting them again, this time in progression ..
- First deleting all the lines with values with 7 or more digits.. Error still present.
- Then deleting all lines with values with 6 digits.. Error still present.
- .. Error still present.
- Finally when I deleted all lines with values with 4 digits, the error went away!
- But by now, I had modified about 700 files in this process!
I had almost given up on debugging this further, when I decided to give the git diff one last glance.. and I found the pattern.. the freaking leading 0’s in some of those manufacturers_weight values!
I had a strong gut feeling that those zeros were the problem. So I once again restored the deleted lines in all the content files, typed out the below [2] with confidence ..
find . -name "*.md" -exec grep -P 'manufacturers_weight: 0[0-9]+' -l {} \; -exec sed -r -i 's/(manufacturers_weight: )0([0-9]+)/\1\2/' {} \;
.. and that error was of course gone! 🎉
This ended up with just 16 modified files with a diff like this:
...
modified content/movements/b/buren/buren-04.en.md
@@ -12,7 +12,7 @@ image: "Buren_04.jpg"
movementlistkey: "buren"
caliberkey: "04"
manufacturers: ["buren"]
-manufacturers_weight: 04
+manufacturers_weight: 4
categories: ["movements","movements_b","movements_b_buren_en"]
widgets:
relatedmovements: true
modified content/movements/c/citizen/citizen-0153.de.md
@@ -12,7 +12,7 @@ image: "Citizen_0153.jpg"
movementlistkey: "citizen"
caliberkey: "0153"
manufacturers: ["citizen"]
-manufacturers_weight: 0153
+manufacturers_weight: 153
categories: ["movements","movements_c","movements_c_citizen"]
widgets:
relatedmovements: true
...
That gave the issue reporter a workaround, so that they could at least get their site built.
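For anyone who would rather script that cleanup in Go, the substitution done by sed can be sketched with the regexp package. This is an illustration only: stripLeadingZeros and paddedWeight are hypothetical names, and the pattern uses 0+ so that every leading zero is removed, which is a small deviation from the sed expression (that one removes a single zero per run).

```go
package main

import (
	"fmt"
	"regexp"
)

// paddedWeight matches a manufacturers_weight value that starts with one or
// more zeros followed by at least one more digit.
var paddedWeight = regexp.MustCompile(`(manufacturers_weight: )0+([0-9]+)`)

// stripLeadingZeros (hypothetical helper) rewrites such a line without the
// leading zeros and leaves every other line untouched.
func stripLeadingZeros(line string) string {
	return paddedWeight.ReplaceAllString(line, "${1}${2}")
}

func main() {
	for _, line := range []string{
		"manufacturers_weight: 04",
		"manufacturers_weight: 0153",
		"manufacturers_weight: 4611000",
	} {
		fmt.Println(stripLeadingZeros(line))
	}
}
```

A lone value of 0 is left as it is, because the pattern requires at least one digit after the zeros.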
But I hope that this 0-leading octal absurdity gets fixed at the root level. People should once again be able to say with confidence, as they learned as kids, that “010” is the same thing as “10”.
Next Steps? #
- Hugo fixes this issue (4628) on its end by not letting this exception go uncaught, and instead letting the user know that they magically added a non-int-castable octal value in their content in X file on Y line.
- The Golang team gives some serious thought to this stupid (sorry about that) annoying decision:
  If base == 0, the base is implied by the string’s prefix: base 16 for "0x", base 8 for "0" ..
[1] I call this “forensic debug” because I don’t know Go, nor how and where to add debug statements within the hugo source code. So my approach was to figure out which content file/line caused that error.
[2] That command finds all the .md files in the current directory, returns a list of file names wherein the manufacturers_weight value begins with 0 using grep, and then surgically removes the leading zeros just in those short-listed files using sed.