<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-us"><generator uri="https://gohugo.io/" version="0.101.0">Hugo</generator><title type="html">string on A Scripter's Notes</title><subtitle type="html">Emacs, scripting and anything text oriented.</subtitle><link href="https://scripter.co/tags/string/" rel="alternate" type="text/html" title="HTML"/><link href="https://scripter.co/tags/string/index.xml" rel="alternate" type="application/rss+xml" title="RSS"/><link href="https://scripter.co/tags/string/atom.xml" rel="self" type="application/atom+xml" title="Atom"/><link href="https://scripter.co/tags/string/jf2feed.json" rel="alternate" type="application/jf2feed+json" title="jf2feed"/><updated>2026-04-22T08:24:58-04:00</updated><author><name>Kaushal Modi</name><email>kaushal.modi@gmail.com</email></author><id>https://scripter.co/tags/string/</id><entry><title type="html">grep -Po</title><link href="https://scripter.co/grep-po/?utm_source=atom_feed" rel="alternate" type="text/html"/><link href="https://scripter.co/golang-quirk-number-strings-starting-with-0-are-octals/?utm_source=atom_feed" rel="related" type="text/html" title='  Golang Quirk: Number-strings starting with "0" are Octals   '/><link href="https://scripter.co/generics-not-exactly-in-systemverilog/?utm_source=atom_feed" rel="related" type="text/html" title="Generics (not exactly) in SystemVerilog"/><link href="https://scripter.co/sidenotes-using-ox-hugo/?utm_source=atom_feed" rel="related" type="text/html" title="Sidenotes using ox-hugo"/><link href="https://scripter.co/sidenotes-using-only-css/?utm_source=atom_feed" rel="related" type="text/html" title="Sidenotes using only CSS"/><link href="https://scripter.co/notes/string-fns-nim-vs-python/?utm_source=atom_feed" rel="related" type="text/html" title="String Functions: Nim vs Python"/><id>https://scripter.co/grep-po/</id><author><name>Kaushal 
Modi</name></author><published>2022-02-16T21:34:00-05:00</published><updated>2022-02-16T21:34:00-05:00</updated><content type="html"><![CDATA[<blockquote>Using <code>grep</code> to do substring extraction in shell scripts.</blockquote><div class="ox-hugo-toc toc">
<div class="heading">Table of Contents</div>
<ul>
<li><a href="#grep-po-problem-statement">Problem statement</a></li>
<li><a href="#solution-using-grep-po">Solution using <code>grep -Po</code></a></li>
<li><a href="#arriving-to-this-solution">Arriving at this solution</a></li>
<li><a href="#summary">Summary</a></li>
</ul>
</div>
<!--endtoc-->
<p>I like <a href="https://en.wikipedia.org/wiki/Regular_expression">regular expressions</a>
<span class="sidenote-number"><small class="sidenote">
I recommend using <a href="https://regex101.com/">https://regex101.com/</a> to practice regular
expressions of different flavors (PCRE2, PCRE, Python, etc.) whether
or not you are new to using <abbr aria-label=" regular expression" tabindex=0>regex</abbr>.
</small></span>
as they allow me to be concise and specific about what I need to
search.</p>
<p>I have liked using regular expressions for many years, ever since
I learned Perl about fifteen years back. I am writing this post because
I still remember the delight I felt when I realized that I could use the
familiar Perl regular expressions to do string parsing in shell
scripts. I am not exactly sure, but I probably learned about this
<code>grep -Po</code> trick from <em>stackexchange</em> (<a href="#citeproc_bib_item_1">camh, 2011</a>).</p>

<h2 id="grep-po-problem-statement">Problem statement&nbsp;<a class="headline-hash no-text-decoration" href="#grep-po-problem-statement">#</a></h2>


<p>I could be parsing a log file with a line like <code>web report: https://foo.bar/detail.html</code> and I need to extract the
<code>https://foo.bar</code> part to a shell script variable.</p>

<h2 id="solution-using-grep-po">Solution using <code>grep -Po</code>&nbsp;<a class="headline-hash no-text-decoration" href="#solution-using-grep-po">#</a></h2>


<div class="note">
<p>This solution requires a GNU <code>grep</code> version supporting <code>-P</code>, that&rsquo;s
compiled with <code>libpcre</code>.
<span class="sidenote-number"><small class="sidenote">
<em>GNU grep</em> gained the PCRE (<code>-P</code>) feature back <a href="https://git.savannah.gnu.org/cgit/grep.git/commit/?id=05860b2d966701a5a9f70a650d32b30ae2612eeb">in 2000</a>.
</small></span>
That said, I have never come across or used a system
that did not have such a <code>grep</code> version installed.</p>
</div>
<p>I&rsquo;ll throw the solution out here and then dig into the details.</p>
<p><a id="code-snippet--grepPo-example"></a></p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;def\nabc&#34;</span> <span class="p">|</span> grep -Po <span class="s1">&#39;a\K.(?=c)&#39;</span> <span class="c1"># =&gt; b</span>
</span></span></code></pre></div><div class="src-block-caption">
  <span class="src-block-number"><a href="#code-snippet--grepPo-example">Code Snippet 1</a>:</span>
  Extracting "b" from "abc" using <code>grep -Po</code>
</div>
<p>The <em>grep</em> switches used here are:</p>
<dl>
<dt><code>-P</code></dt>
<dd>Use (P)erl regular expressions. This allows us to use the
<a href="https://www.regular-expressions.info/lookaround.html"><em>look around</em> regex</a> syntax like <code>(?=..)</code> and special characters like
<code>\K</code> (<a href="#citeproc_bib_item_2">“perlre - Perl regular expressions,” n.d.</a>).</dd>
<dt><code>-o</code></dt>
<dd>Print only the matched portion to the (o)utput</dd>
</dl>
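<p>Since <code>-P</code> depends on how <em>grep</em> was built, a script can probe for it before relying on it. The following is only a sketch: the function name <code>have_grep_P</code> is made up for this example, and the <code>sed</code> fallback is one possible substitute, not part of the solution above.</p>

```shell
# Hypothetical helper: succeeds only if this grep accepts the -P switch.
have_grep_P() {
    printf 'x\n' | grep -qP 'x' 2>/dev/null
}

if have_grep_P; then
    # Same regex as the solution: drop "a" with \K, require "c" via lookahead.
    middle=$(printf 'abc\n' | grep -Po 'a\K.(?=c)')
else
    # Fallback sketch with plain sed: capture the character between "a" and "c".
    middle=$(printf 'abc\n' | sed -n 's/.*a\(.\)c.*/\1/p')
fi
echo "${middle}"
```

<p>Either branch should print <code>b</code> for this input; the <code>sed</code> version is less general because it has no lookaround, so it only suits simple patterns like this one.</p>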

<h2 id="arriving-to-this-solution">Arriving at this solution&nbsp;<a class="headline-hash no-text-decoration" href="#arriving-to-this-solution">#</a></h2>


<p>Now I&rsquo;ll start with a basic example and build up to the <a href="#code-snippet--grepPo-example">above
solution</a>.</p>
<dl>
<dt>Problem</dt>
<dd>Let&rsquo;s say I have this text containing &ldquo;def&rdquo; and &ldquo;abc&rdquo;
and I want<span class="org-target" id="org-target--wanted-grep-output"></span> to output whatever character is between &ldquo;a&rdquo; and &ldquo;c&rdquo;.</dd>
</dl>
<!--listend-->
<ul>
<li>
<p>Below, the regular expression for matching any character between &ldquo;a&rdquo;
and &ldquo;c&rdquo; ( <code>'a.c'</code> ) is correct, but plain <em>grep</em> prints every whole line that contains a match.
In bash, <code>echo</code> does not expand <code>\n</code> by default, so the input here is the single line <code>def\nabc</code>, and all of it is printed.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;def\nabc&#34;</span> <span class="p">|</span> grep <span class="s1">&#39;a.c&#39;</span> <span class="c1"># =&gt; def\nabc</span>
</span></span></code></pre></div></li>
<li>
<p>Now we add the <em>grep</em> <code>-o</code> switch so that it outputs only the
matched portion. As the regex is <code>'a.c'</code>, the <code>-o</code> switch will
output every part of the input that matches it. So the output is
&ldquo;abc&rdquo;. It&rsquo;s still not what we <a href="#org-target--wanted-grep-output">wanted</a>.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;def\nabc&#34;</span> <span class="p">|</span> grep -o <span class="s1">&#39;a.c&#39;</span> <span class="c1"># =&gt; abc</span>
</span></span></code></pre></div></li>
<li>
<p>Now we bring in the powerful Perl regex feature <em>positive
lookahead</em>.
<span class="sidenote-number"><small class="sidenote">
Positive lookahead is used when you want to match something <span class="underline">only
if</span> it&rsquo;s followed by something else. Its syntax looks like <code>q(?=u)</code>
where that expression matches if a <code>q</code> is followed by a <code>u</code>, without
making the <code>u</code> part of the match &ndash; <a href="https://www.regular-expressions.info/lookaround.html">reference</a>.
</small></span>
With <code>(?=c)</code>, the &ldquo;c&rdquo; is no longer part of the match. But this is still not exactly what we want because &ldquo;a&rdquo; is still
considered as part of the match. Now the output is &ldquo;ab&rdquo;.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;abc&#34;</span> <span class="p">|</span> grep -Po <span class="s1">&#39;a.(?=c)&#39;</span> <span class="c1"># =&gt; ab</span>
</span></span></code></pre></div></li>
<li>
<p>Now we only need a construct that marks a point in the regex and
says &ldquo;don&rsquo;t consider anything before this as part of the
match&rdquo;. That is the <code>\K</code> special construct, described in the <a href="https://perldoc.perl.org/perlre#Lookaround-Assertions">Perl regular
expressions doc</a> as:</p>
<blockquote>
<p>There is a special form of this construct, called <code>\K</code> (available
since Perl 5.10.0), which causes the regex engine to &ldquo;keep&rdquo;
everything it had matched prior to the <code>\K</code> and not include it in
matched string. This effectively provides non-experimental
variable-length lookbehind of any length.</p>
</blockquote>
<p>And, thus we have the final solution:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;abc&#34;</span> <span class="p">|</span> grep -Po <span class="s1">&#39;a\K.(?=c)&#39;</span> <span class="c1"># =&gt; b</span>
</span></span></code></pre></div></li>
</ul>
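<p>To see what &ldquo;variable-length lookbehind&rdquo; buys in practice, here is a small sketch. The log line and the <code>id:</code> label are made up for illustration; the point is that the part before <code>\K</code> (<code>id:\s*</code>) can match any amount of whitespace and is still left out of the output.</p>

```shell
# Hypothetical log line with a variable amount of padding after the label.
line='id:    42 status: ok'

# Everything matched before \K ("id:" plus any whitespace) is dropped
# from the reported match, so only the digits are printed.
id=$(printf '%s\n' "${line}" | grep -Po 'id:\s*\K\d+')
echo "${id}"
```

<p>This should print <code>42</code>. A real lookbehind such as <code>(?&lt;=id:\s*)</code> is typically rejected by PCRE because its length is not fixed, which is why <code>\K</code> is the handier tool here.</p>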

<h2 id="summary">Summary&nbsp;<a class="headline-hash no-text-decoration" href="#summary">#</a></h2>


<p>Taking the example from the <a href="#grep-po-problem-statement">problem statement</a>, this will work:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="nv">string</span><span class="o">=</span><span class="s2">&#34;web report: https://foo.bar/detail.html&#34;</span>
</span></span><span class="line"><span class="cl"><span class="nv">substring</span><span class="o">=</span><span class="k">$(</span>grep -Po <span class="s1">&#39;web report:\s*\K.*?(?=/detail\.html)&#39;</span> <span class="o">&lt;&lt;&lt;</span> <span class="s2">&#34;</span><span class="si">${</span><span class="nv">string</span><span class="si">}</span><span class="s2">&#34;</span><span class="k">)</span>
</span></span><span class="line"><span class="cl"><span class="nb">echo</span> <span class="s2">&#34;</span><span class="si">${</span><span class="nv">substring</span><span class="si">}</span><span class="s2">&#34;</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">https://foo.bar
</span></span></code></pre></div>
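<p>In a real script the line usually comes from a log file and may be missing altogether. The following is a minimal sketch of that case, assuming a throwaway log file created with <code>mktemp</code> and made-up surrounding lines; <code>-m1</code> makes <em>grep</em> stop after the first matching line.</p>

```shell
# Build a small hypothetical log file for the demonstration.
log=$(mktemp)
printf '%s\n' 'build started' \
              'web report: https://foo.bar/detail.html' \
              'build done' > "${log}"

# grep exits non-zero when nothing matches; "|| true" keeps that from
# aborting scripts that run under "set -e".
url=$(grep -m1 -Po 'web report:\s*\K.*?(?=/detail\.html)' "${log}" || true)

if [ -n "${url}" ]; then
    echo "report base URL: ${url}"
else
    echo "no report line found" >&2
fi
rm -f "${log}"
```

<p>Checking for an empty result keeps a missing report line from silently turning into an empty variable later in the script.</p>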
<h2 id="references">References&nbsp;<a class="headline-hash no-text-decoration" href="#references">#</a></h2>


<div class="csl-bib-body">
  <div class="csl-entry"><a id="citeproc_bib_item_1"></a>camh. (2011). Can grep output only specified groupings that match? [Website]. In <i>Unix stackexchange</i>. <a href="https://unix.stackexchange.com/a/13472/57923">https://unix.stackexchange.com/a/13472/57923</a></div>
  <div class="csl-entry"><a id="citeproc_bib_item_2"></a>perlre - Perl regular expressions. (n.d.). [Website]. In <i>Perldoc 5.34.0</i>. Retrieved February 16, 2022, from <a href="https://perldoc.perl.org/perlre">https://perldoc.perl.org/perlre</a></div>
</div>
]]></content><category scheme="https://scripter.co/categories/unix" term="unix" label="unix"/><category scheme="https://scripter.co/categories/shell" term="shell" label="shell"/><category scheme="https://scripter.co/tags/grep" term="grep" label="grep"/><category scheme="https://scripter.co/tags/regex" term="regex" label="regex"/><category scheme="https://scripter.co/tags/string" term="string" label="string"/><category scheme="https://scripter.co/tags/perl" term="perl" label="perl"/><category scheme="https://scripter.co/tags/100daystooffload" term="100daystooffload" label="100DaysToOffload"/></entry><entry><title type="html">Golang Quirk: Number-strings starting with "0" are Octals</title><link href="https://scripter.co/golang-quirk-number-strings-starting-with-0-are-octals/?utm_source=atom_feed" rel="alternate" type="text/html"/><link href="https://scripter.co/notes/string-fns-nim-vs-python/?utm_source=atom_feed" rel="related" type="text/html" title="String Functions: Nim vs Python"/><link href="https://scripter.co/installing-go-toolchain/?utm_source=atom_feed" rel="related" type="text/html" title="Installing go toolchain"/><id>https://scripter.co/golang-quirk-number-strings-starting-with-0-are-octals/</id><author><name>Kaushal Modi</name></author><published>2018-04-18T16:29:00-04:00</published><updated>2018-04-18T16:29:00-04:00</updated><content type="html"><![CDATA[<blockquote>Someone in the Golang team thought that it would be a good idea to
consider all numbers (represented as strings) starting with &ldquo;0&rdquo; as
Octals.. so &ldquo;010&rdquo; is actually 8.. Really?</blockquote><div class="ox-hugo-toc toc">
<div class="heading">Table of Contents</div>
<ul>
<li><a href="#problem">Problem</a></li>
<li><a href="#cause">Cause</a></li>
<li><a href="#workaround">Workaround</a></li>
<li><a href="#resurgence">Resurgence</a></li>
<li><a href="#debug">Debug</a></li>
<li><a href="#next-steps">Next Steps?</a></li>
</ul>
</div>
<!--endtoc-->
<p>The aim of this post is to make a Golang quirk more widely
known, with an ulterior motive to eventually get it fixed
upstream, somehow..</p>
<div class="note">
<p><strong>Disclaimer</strong>: I don&rsquo;t code in Go. So I could very well be wrong
in saying <em>problem with Golang</em> vs <em>problem with specifically
<code>strconv</code> package</em>.</p>
</div>
<p>From my perspective, I see <code>strconv</code> as a standard Go package that
any (most?) Go coder would use to do <em>string → int</em> conversions. If
so, I don&rsquo;t grasp the rationale behind why the <code>strconv</code> developers
would make this strange decision.. strange because in <em>normal</em>
languages like Python, <code>int(&quot;010&quot;)</code> returns <code>10</code>.</p>

<h2 id="problem">Problem&nbsp;<a class="headline-hash no-text-decoration" href="#problem">#</a></h2>


<p>I learned about this issue for the first time from <a href="https://discourse.gohugo.io/t/unable-to-cast-09-of-type-string-to-int/9614/6?u=kaushalmodi">this Hugo Discourse
thread</a>. The synopsis is that someone is retrieving US city zip-codes
from a Hugo front-matter variable, and then using some conditional
logic based on the last 2 digits.</p>
<p>So the code was:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go-html-template" data-lang="go-html-template"><span class="line"><span class="cl"><span class="c">&lt;!-- Value of .Params.cityZipCode is &#34;75009&#34; --&gt;</span>
</span></span><span class="line"><span class="cl"><span class="cp">{{</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="o">(</span><span class="nx">int</span><span class="w"> </span><span class="o">(</span><span class="nx">last</span><span class="w"> </span><span class="nx">2</span><span class="w"> </span><span class="na">.Params.cityZipCode</span><span class="o">))</span><span class="w"> </span><span class="k">eq</span><span class="w"> </span><span class="nx">1</span><span class="w"> </span><span class="cp">}}</span>er<span class="cp">{{</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="cp">}}</span>e<span class="cp">{{</span><span class="w"> </span><span class="k">end</span><span class="w"> </span><span class="cp">}}</span>
</span></span></code></pre></div><p>The logic is simple.. Get the last two characters of
<code>.Params.cityZipCode</code>, which would be <code>&quot;09&quot;</code>, convert that string to a
number (<code>int</code>), and check if it is <code>1</code>.</p>
<p>But <em>of course</em> that didn&rsquo;t work:</p>
<blockquote>
<p>unable to cast &ldquo;09&rdquo; of type string to int</p>
</blockquote>

<h2 id="cause">Cause&nbsp;<a class="headline-hash no-text-decoration" href="#cause">#</a></h2>


<p>As I later learned, that&rsquo;s because of the <a href="https://golang.org/pkg/strconv/#ParseInt"><code>ParseInt</code> function from the
<code>strconv</code></a> package. There it says (emphasis mine):</p>
<blockquote>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="kd">func</span> <span class="nf">ParseInt</span><span class="p">(</span><span class="nx">s</span> <span class="kt">string</span><span class="p">,</span> <span class="nx">base</span> <span class="kt">int</span><span class="p">,</span> <span class="nx">bitSize</span> <span class="kt">int</span><span class="p">)</span> <span class="p">(</span><span class="nx">i</span> <span class="kt">int64</span><span class="p">,</span> <span class="nx">err</span> <span class="kt">error</span><span class="p">)</span>
</span></span></code></pre></div><p><code>ParseInt</code> interprets a string <code>s</code> in the given <code>base</code> (0, 2 to 36)
and <code>bit</code> size (0 to 64) and returns the corresponding value <code>i</code>.</p>
<p>If <code>base == 0</code>, the <code>base</code> is implied by the string&rsquo;s prefix: base 16
for <code>&quot;0x&quot;</code>, <strong>base 8 for <code>&quot;0&quot;</code></strong>, and base 10 otherwise. For bases 1,
below 0 or above 36 an error is returned.</p>
</blockquote>
<div class="verse">
<p>    Again.. What was the Golang team thinking?!<br /></p>
</div>
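<p>The prefix rule quoted above can be tried out without a Go toolchain. As an illustration only (this is shell arithmetic, not <code>strconv</code>), POSIX shells follow the same C-style convention for integer constants, so both the &ldquo;010 is 8&rdquo; surprise and the &ldquo;09&rdquo; failure can be reproduced from a terminal. The variable names are made up for this sketch.</p>

```shell
# Illustration only: shell arithmetic, not Go's strconv.
# A leading 0 marks an octal constant, so 010 evaluates to 8.
octal=$((010))
echo "${octal}"

# "09" is not a valid octal number, so the expansion fails.
# The subshell keeps that failure from aborting the script.
if ( : "$((09))" ) 2>/dev/null; then
    verdict="accepted"
else
    verdict="rejected"
fi
echo "09 was ${verdict}"
```

<p>The rejection of <code>09</code> mirrors the &ldquo;unable to cast&rdquo; error from the Hugo thread: the digit 9 simply does not exist in base 8.</p>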

<h2 id="workaround">Workaround&nbsp;<a class="headline-hash no-text-decoration" href="#workaround">#</a></h2>


<p>This led me to update the <a href="https://gohugo.io/functions/int/"><code>int</code> function</a> documentation for Hugo with
an ugly workaround:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go-html-template" data-lang="go-html-template"><span class="line"><span class="cl"><span class="cp">{{</span><span class="w"> </span><span class="nx">int</span><span class="w"> </span><span class="o">(</span><span class="s">&#34;00987&#34;</span><span class="w"> </span><span class="o">|</span><span class="w"> </span><span class="nx">strings</span><span class="na">.TrimLeft</span><span class="w"> </span><span class="s">&#34;0&#34;</span><span class="o">)</span><span class="w"> </span><span class="cp">}}</span>
</span></span></code></pre></div>
<h2 id="resurgence">Resurgence&nbsp;<a class="headline-hash no-text-decoration" href="#resurgence">#</a></h2>


<p>This problem of <em>number-strings beginning with &ldquo;0&rdquo; considered as
octals</em> resurfaced recently in Hugo issue #<a href="https://github.com/gohugoio/hugo/issues/4628">4628</a>.</p>
<p>Though, the reported error did not make it evident that that was the
problem:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">INFO 2018/04/15 18:49:36 found taxonomies: map[string]string{&#34;category&#34;:&#34;categories&#34;, &#34;manufacturerletter&#34;:&#34;manufacturerletters&#34;, &#34;manufacturer&#34;:&#34;manufacturers&#34;, &#34;featured&#34;:&#34;featured&#34;, &#34;tag&#34;:&#34;tags&#34;}
</span></span><span class="line"><span class="cl">panic: interface conversion: interface {} is float64, not int
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">goroutine 50 [running]:
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">github.com/gohugoio/hugo/hugolib.(*Site).assembleTaxonomies(0xc4204ce2c0)
</span></span><span class="line"><span class="cl">	/go/src/github.com/gohugoio/hugo/hugolib/site.go:1545 +0xee2
</span></span></code></pre></div><p>The issue reporter had 1500+ content files, and one or more of those
files caused this uncaught exception (<em>which is a separate issue, and
is planned to be fixed in Hugo</em>) to happen. So I had to spend quite
some time doing &ldquo;forensic debug&rdquo;<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> to understand what caused that
<em>&ldquo;interface conversion: interface {} is float64, not int&rdquo;</em>.</p>

<h2 id="debug">Debug&nbsp;<a class="headline-hash no-text-decoration" href="#debug">#</a></h2>


<ul>
<li>
<p>The exception was thrown at <a href="https://github.com/gohugoio/hugo/blob/74520d2cfd39bb4428182e26c57afa9df83ce7b5/hugolib/site.go#L1545">this line</a> (highlighted below):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-go" data-lang="go"><span class="line"><span class="cl"><span class="k">for</span> <span class="nx">_</span><span class="p">,</span> <span class="nx">p</span> <span class="o">:=</span> <span class="k">range</span> <span class="nx">s</span><span class="p">.</span><span class="nx">Pages</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="nx">vals</span> <span class="o">:=</span> <span class="nx">p</span><span class="p">.</span><span class="nf">getParam</span><span class="p">(</span><span class="nx">plural</span><span class="p">,</span> <span class="p">!</span><span class="nx">s</span><span class="p">.</span><span class="nx">Info</span><span class="p">.</span><span class="nx">preserveTaxonomyNames</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="nx">weight</span> <span class="o">:=</span> <span class="nx">p</span><span class="p">.</span><span class="nf">getParamToLower</span><span class="p">(</span><span class="nx">plural</span> <span class="o">+</span> <span class="s">&#34;_weight&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="nx">weight</span> <span class="o">==</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">weight</span> <span class="p">=</span> <span class="mi">0</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="nx">vals</span> <span class="o">!=</span> <span class="kc">nil</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="nx">v</span><span class="p">,</span> <span class="nx">ok</span> <span class="o">:=</span> <span class="nx">vals</span><span class="p">.([]</span><span class="kt">string</span><span class="p">);</span> <span class="nx">ok</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">            <span class="k">for</span> <span class="nx">_</span><span class="p">,</span> <span class="nx">idx</span> <span class="o">:=</span> <span class="k">range</span> <span class="nx">v</span> <span class="p">{</span>
</span></span><span class="line hl"><span class="cl">                <span class="nx">x</span> <span class="o">:=</span> <span class="nx">WeightedPage</span><span class="p">{</span><span class="nx">weight</span><span class="p">.(</span><span class="kt">int</span><span class="p">),</span> <span class="nx">p</span><span class="p">}</span>
</span></span></code></pre></div></li>
<li>
<p>So it was evident that one of the taxonomy weight
(<code>manufacturers_weight</code> in this case) values wasn&rsquo;t getting cast
to <code>int</code>.</p>
</li>
<li>
<p>So I grepped for anything <em>non-int</em> in those values, like <code>.</code>, <code>,</code>,
<code>e</code> or <code>E</code>, but found nothing.</p>
</li>
<li>
<p>Then doing <code>rg ':\s[0-9]{7,}(\.[0-9]+)*$'</code> in the content files, I
saw that there were 4 files that had oddly high weight values like
4611000, and wondered if that was somehow the problem. But that
wasn&rsquo;t it either.</p>
</li>
<li>
<p>When I deleted <strong>all</strong> the <code>manufacturers_weight</code> lines in those 1500+
files, the error went away.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">find . -name <span class="s2">&#34;*.md&#34;</span> -print0 <span class="p">|</span> xargs -0 sed -i <span class="s1">&#39;/manufacturers_weight:.*/d&#39;</span>
</span></span></code></pre></div></li>
<li>
<p>So then I restored all of those deleted lines, and started deleting
them again, this time in progression ..</p>
<ul>
<li>First deleting all the lines with values with 7 or more
digits.. <em>Error still present</em>.</li>
<li>Then deleting all lines with values with 6 digits.. <em>Error still present</em>.</li>
<li>.. <em>Error still present</em>.</li>
<li>Finally when I deleted all lines with values with 4 digits, the
error went away!</li>
<li>But by now, I had modified about 700 files in this process!</li>
</ul>
</li>
<li>
<p>I had almost given up on debugging this further, when I decided to
give the <code>git diff</code> one last glance.. and I found the pattern..</p>
<div class="verse">
<p>    .. the <em>freaking</em> leading 0&rsquo;s in some of those <code>manufacturers_weight</code> values!<br /></p>
</div>
</li>
</ul>
<p>I had a strong gut feeling that those zeros <strong>were</strong> the problem. So I
once again restored the deleted lines in all the content files,
typed out the below<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> with confidence ..</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-shell" data-lang="shell"><span class="line"><span class="cl">find . -name <span class="s2">&#34;*.md&#34;</span> -exec grep -P <span class="s1">&#39;manufacturers_weight: 0[0-9]+&#39;</span> -l <span class="o">{}</span> <span class="se">\;</span> -exec sed -r -i <span class="s1">&#39;s/(manufacturers_weight: )0([0-9]+)/\1\2/&#39;</span> <span class="o">{}</span> <span class="se">\;</span>
</span></span></code></pre></div><div class="verse">
<p>    .. and that error was of course gone! 🎉<br /></p>
</div>
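<p>A more cautious variant of the same idea is sketched below: list the affected files first, then edit them. It assumes GNU <code>grep</code>, <code>xargs</code> and <code>sed</code>, runs against two made-up files in a temporary directory rather than real content, and uses <code>0+</code> so that values with several leading zeros are handled too.</p>

```shell
# Work in a throwaway directory with two hypothetical content files.
demo=$(mktemp -d)
printf 'title: a\nmanufacturers_weight: 0153\n' > "${demo}/a.md"
printf 'title: b\nmanufacturers_weight: 42\n'   > "${demo}/b.md"

# Dry run: list only the files whose weight has a leading zero.
grep -rlP --include='*.md' 'manufacturers_weight: 0[0-9]+' "${demo}"

# Fix: strip every leading zero but keep the last digit (so "000" becomes "0").
# -Z and -0 pass NUL-separated names, which is safe for names with spaces.
grep -rlZP --include='*.md' 'manufacturers_weight: 0[0-9]+' "${demo}" \
    | xargs -0 -r sed -r -i 's/(manufacturers_weight: )0+([0-9])/\1\2/'

fixed=$(grep -h 'manufacturers_weight' "${demo}/a.md")
echo "${fixed}"
rm -rf "${demo}"
```

<p>Only <code>a.md</code> should be listed and modified; <code>b.md</code> has no leading zero and is left untouched.</p>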
<p>This ended up with just 16 modified files with a diff like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-diff" data-lang="diff"><span class="line"><span class="cl">...
</span></span><span class="line"><span class="cl">modified   content/movements/b/buren/buren-04.en.md
</span></span><span class="line"><span class="cl"><span class="gu">@@ -12,7 +12,7 @@ image: &#34;Buren_04.jpg&#34;
</span></span></span><span class="line"><span class="cl"><span class="gu"></span> movementlistkey: &#34;buren&#34;
</span></span><span class="line"><span class="cl"> caliberkey: &#34;04&#34;
</span></span><span class="line"><span class="cl"> manufacturers: [&#34;buren&#34;]
</span></span><span class="line"><span class="cl"><span class="gd">-manufacturers_weight: 04
</span></span></span><span class="line"><span class="cl"><span class="gd"></span><span class="gi">+manufacturers_weight: 4
</span></span></span><span class="line"><span class="cl"><span class="gi"></span> categories: [&#34;movements&#34;,&#34;movements_b&#34;,&#34;movements_b_buren_en&#34;]
</span></span><span class="line"><span class="cl"> widgets:
</span></span><span class="line"><span class="cl">   relatedmovements: true
</span></span><span class="line"><span class="cl">modified   content/movements/c/citizen/citizen-0153.de.md
</span></span><span class="line"><span class="cl"><span class="gu">@@ -12,7 +12,7 @@ image: &#34;Citizen_0153.jpg&#34;
</span></span></span><span class="line"><span class="cl"><span class="gu"></span> movementlistkey: &#34;citizen&#34;
</span></span><span class="line"><span class="cl"> caliberkey: &#34;0153&#34;
</span></span><span class="line"><span class="cl"> manufacturers: [&#34;citizen&#34;]
</span></span><span class="line"><span class="cl"><span class="gd">-manufacturers_weight: 0153
</span></span></span><span class="line"><span class="cl"><span class="gd"></span><span class="gi">+manufacturers_weight: 153
</span></span></span><span class="line"><span class="cl"><span class="gi"></span> categories: [&#34;movements&#34;,&#34;movements_c&#34;,&#34;movements_c_citizen&#34;]
</span></span><span class="line"><span class="cl"> widgets:
</span></span><span class="line"><span class="cl">   relatedmovements: true
</span></span><span class="line"><span class="cl">...
</span></span></code></pre></div><p>That gave the issue reporter a workaround so that they could
at least get their site built.</p>
<p>But I hope that this <em>0-leading octal</em> absurdity gets fixed at the
root level &mdash; People should once again say with confidence, as they
learned as kids, that &ldquo;010&rdquo; is the same thing as &ldquo;10&rdquo;.</p>

<h2 id="next-steps">Next Steps?&nbsp;<a class="headline-hash no-text-decoration" href="#next-steps">#</a></h2>


<ul>
<li>
<p>Hugo fixes this issue (<a href="https://github.com/gohugoio/hugo/issues/4628">4628</a>) on its end by catching this exception
and letting the user know that they magically added
a non-<code>int</code>-castable <em>octal</em> value to their content, in file X on
line Y.</p>
</li>
<li>
<p>The Golang team gives some serious thought to this <del>stupid</del> <em>(sorry
about that)</em> annoying decision:</p>
<blockquote>
<p>If <code>base == 0</code>, the <code>base</code> is implied by the string&rsquo;s prefix: base 16
for <code>&quot;0x&quot;</code>, <strong>base 8 for <code>&quot;0&quot;</code></strong> ..</p>
</blockquote>
</li>
</ul>
<div class="center"><b>§</b></div>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I call this &ldquo;forensic debug&rdquo; because I don&rsquo;t know Go, and how
and where to add debug statements within the <code>hugo</code> source code. So my
approach was to figure out which content file/line caused that error.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>That command finds all the <code>.md</code> files under the current
directory, uses <code>grep</code> to list the files in which the
<code>manufacturers_weight</code> value begins with <code>0</code>, and then
surgically removes the leading zeros in just those short-listed files
using <code>sed</code>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content><category scheme="https://scripter.co/categories/hugo" term="hugo" label="hugo"/><category scheme="https://scripter.co/series/golang-octals" term="golang-octals" label="Golang Octals"/><category scheme="https://scripter.co/tags/golang" term="golang" label="golang"/><category scheme="https://scripter.co/tags/octal" term="octal" label="octal"/><category scheme="https://scripter.co/tags/quirk" term="quirk" label="quirk"/><category scheme="https://scripter.co/tags/strconv" term="strconv" label="strconv"/><category scheme="https://scripter.co/tags/zero" term="zero" label="zero"/><category scheme="https://scripter.co/tags/string" term="string" label="string"/><category scheme="https://scripter.co/tags/sed" term="sed" label="sed"/><category scheme="https://scripter.co/tags/find" term="find" label="find"/><category scheme="https://scripter.co/tags/grep" term="grep" label="grep"/><category scheme="https://scripter.co/tags/go-template" term="go-template" label="go-template"/></entry></feed>