<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>python on foosel.net</title><link>https://foosel.net/tags/python/</link><description>Recent content in python on foosel.net</description><generator>Hugo</generator><language>en-us</language><copyright>Gina Häußge (foosel)</copyright><lastBuildDate>Fri, 28 Jul 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://foosel.net/tags/python/feed.xml" rel="self" type="application/rss+xml"/><item><title>TIL: How to make MkDocs support site_url relative URLs</title><link>https://foosel.net/til/2023-07-27-how-to-make-mkdocs-support-siteurl-relative-urls/</link><pubDate>Thu, 27 Jul 2023 00:00:00 +0000</pubDate><guid>https://foosel.net/til/2023-07-27-how-to-make-mkdocs-support-siteurl-relative-urls/</guid><description>&lt;p&gt;I&amp;rsquo;m currently &lt;em&gt;finally&lt;/em&gt; back on converting the &lt;a href="https://octoprint.org"&gt;OctoPrint&lt;/a&gt; docs to using Markdown and &lt;a href="https://mkdocs.org"&gt;MkDocs&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Since I have some images in the docs that I want to be able to reference without having to use relative URLs (&lt;code&gt;../../../../images/&lt;/code&gt;),
especially since that would tie things in OctoPrint&amp;rsquo;s source tree structure too close to things in its documentation tree structure that
might or might not end up being in a different repository in the future, I needed a way to use absolute URLs here (&lt;code&gt;/images/&lt;/code&gt;). But
since the docs will most likely also end up being hosted on a version specific subpath of &lt;code&gt;docs.octoprint.org&lt;/code&gt;, just using
(host) absolute URLs would break as well.&lt;/p&gt;</description><content:encoded><![CDATA[<p>I&rsquo;m currently <em>finally</em> back on converting the <a href="https://octoprint.org">OctoPrint</a> docs to using Markdown and <a href="https://mkdocs.org">MkDocs</a>.</p>
<p>Since I have some images in the docs that I want to be able to reference without having to use relative URLs (<code>../../../../images/</code>),
especially since that would tie things in OctoPrint&rsquo;s source tree structure too close to things in its documentation tree structure that
might or might not end up being in a different repository in the future, I needed a way to use absolute URLs here (<code>/images/</code>). But
since the docs will most likely also end up being hosted on a version specific subpath of <code>docs.octoprint.org</code>, just using
(host) absolute URLs would break as well.</p>
<p>There are <a href="https://github.com/mkdocs/mkdocs/issues/1592">several</a> <a href="https://github.com/mkdocs/mkdocs/issues/192">issues</a> on the MkDocs issue
tracker about workflow problems caused by this, but the suggested workarounds, like using <a href="https://mkdocs-macros-plugin.readthedocs.io/">macros</a>
to prefix a variable&rsquo;s contents to the affected URLs, didn&rsquo;t work for me because I also rely heavily on <a href="https://mkdocstrings.github.io/">mkdocstrings</a>,
and anything contained in docstrings is not processed by macros<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>.</p>
<p>So I got the idea to implement a minimal MkDocs plugin that would just turn all URLs contained in <code>href</code> and <code>src</code> attributes
that are prefixed with a custom <code>site:</code> scheme into <em>site relative</em> URLs. Example:</p>
<table>
  <thead>
      <tr>
          <th>URL</th>
          <th>site_url</th>
          <th>resulting URL</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>site:images/foo.png</code></td>
          <td><code>https://docs.octoprint.org/</code></td>
          <td><code>/images/foo.png</code></td>
      </tr>
      <tr>
          <td><code>site:images/foo.png</code></td>
          <td><code>https://docs.octoprint.org/1.9.x/</code></td>
          <td><code>/1.9.x/images/foo.png</code></td>
      </tr>
  </tbody>
</table>
<p>Using a <a href="https://www.mkdocs.org/user-guide/configuration/#hooks">hook</a> I could register a callback for the
<a href="https://www.mkdocs.org/dev-guide/plugins/#on_page_content"><code>on_page_content</code></a> event that would then replace all URLs as needed
in the generated page HTML.</p>
<p>And this is the resulting <code>site_urls.py</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python"><span style="display:flex;"><span><span style="color:#f92672">import</span> logging
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> urllib.parse
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> re
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">import</span> mkdocs.plugins
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>log <span style="color:#f92672">=</span> logging<span style="color:#f92672">.</span>getLogger(<span style="color:#e6db74">&#34;mkdocs&#34;</span>)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>SITE_URLS_REGEX <span style="color:#f92672">=</span> re<span style="color:#f92672">.</span>compile(<span style="color:#e6db74">r</span><span style="color:#e6db74">&#39;(href|src)=&#34;site:([^&#34;]+)&#34;&#39;</span>, re<span style="color:#f92672">.</span>IGNORECASE)
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">@mkdocs.plugins.event_priority</span>(<span style="color:#ae81ff">50</span>)
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">def</span> <span style="color:#a6e22e">on_page_content</span>(html, page, config, files):
</span></span><span style="display:flex;"><span>    site_url <span style="color:#f92672">=</span> config[<span style="color:#e6db74">&#34;site_url&#34;</span>]
</span></span><span style="display:flex;"><span>    path <span style="color:#f92672">=</span> urllib<span style="color:#f92672">.</span>parse<span style="color:#f92672">.</span>urlparse(site_url)<span style="color:#f92672">.</span>path
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> <span style="color:#f92672">not</span> path:
</span></span><span style="display:flex;"><span>        path <span style="color:#f92672">=</span> <span style="color:#e6db74">&#34;/&#34;</span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">if</span> <span style="color:#f92672">not</span> path<span style="color:#f92672">.</span>endswith(<span style="color:#e6db74">&#34;/&#34;</span>):
</span></span><span style="display:flex;"><span>        path <span style="color:#f92672">+=</span> <span style="color:#e6db74">&#34;/&#34;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">def</span> <span style="color:#a6e22e">_replace</span>(<span style="color:#66d9ef">match</span>):
</span></span><span style="display:flex;"><span>        param <span style="color:#f92672">=</span> <span style="color:#66d9ef">match</span><span style="color:#f92672">.</span>group(<span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span>        url <span style="color:#f92672">=</span> <span style="color:#66d9ef">match</span><span style="color:#f92672">.</span>group(<span style="color:#ae81ff">2</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> url<span style="color:#f92672">.</span>startswith(<span style="color:#e6db74">&#34;/&#34;</span>):
</span></span><span style="display:flex;"><span>            url <span style="color:#f92672">=</span> url[<span style="color:#ae81ff">1</span>:]
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        log<span style="color:#f92672">.</span>info(<span style="color:#e6db74">f</span><span style="color:#e6db74">&#34;Replacing site:</span><span style="color:#e6db74">{</span>match<span style="color:#f92672">.</span>group(<span style="color:#ae81ff">2</span>)<span style="color:#e6db74">}</span><span style="color:#e6db74"> with </span><span style="color:#e6db74">{</span>path<span style="color:#e6db74">}{</span>url<span style="color:#e6db74">}</span><span style="color:#e6db74">...&#34;</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#e6db74">f</span><span style="color:#e6db74">&#39;</span><span style="color:#e6db74">{</span>param<span style="color:#e6db74">}</span><span style="color:#e6db74">=&#34;</span><span style="color:#e6db74">{</span>path<span style="color:#e6db74">}{</span>url<span style="color:#e6db74">}</span><span style="color:#e6db74">&#34;&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> SITE_URLS_REGEX<span style="color:#f92672">.</span>sub(_replace, html)
</span></span></code></pre></div><p>that I&rsquo;ve registered as a hook in my <code>mkdocs.yaml</code> like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">hooks</span>:
</span></span><span style="display:flex;"><span>  - <span style="color:#ae81ff">site_urls.py</span>
</span></span></code></pre></div><p>Seems to work just fine, both for images and links! 😄</p>
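<p>For a quick sanity check outside of MkDocs, the replacement logic can be tried on its own. This is only a minimal sketch: <code>rewrite_site_urls</code> is a hypothetical standalone helper (not part of the hook or the published plugin) that takes <code>site_url</code> as a plain argument instead of reading it from the MkDocs config, and the sample attribute strings are made up:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import re
import urllib.parse

# Same pattern as in the hook: href/src attributes whose value starts with "site:"
SITE_URLS_REGEX = re.compile(r'(href|src)="site:([^"]+)"', re.IGNORECASE)


def rewrite_site_urls(html, site_url):
    """Hypothetical helper: rewrite site: URLs relative to the path of site_url."""
    path = urllib.parse.urlparse(site_url).path or "/"
    if not path.endswith("/"):
        path += "/"

    def _replace(match):
        attr, url = match.group(1), match.group(2)
        if url.startswith("/"):
            url = url[1:]
        return f'{attr}="{path}{url}"'

    return SITE_URLS_REGEX.sub(_replace, html)


# Made-up sample input, mirroring the two rows of the table above
print(rewrite_site_urls('src="site:images/foo.png"', "https://docs.octoprint.org/"))
print(rewrite_site_urls('src="site:images/foo.png"', "https://docs.octoprint.org/1.9.x/"))
</code></pre></div>
<p>This should print <code>src="/images/foo.png"</code> and <code>src="/1.9.x/images/foo.png"</code>, matching the table above.</p>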
<p><strong>Update 2023-07-28</strong>: I&rsquo;ve now published this as a proper plugin on PyPI, see <a href="https://pypi.org/project/mkdocs-site-urls/">mkdocs-site-urls</a>.
With that, all you need to do - given you are already on MkDocs 1.5 or newer - is install the plugin via <code>pip install mkdocs-site-urls</code> and
then add this to your <code>mkdocs.yaml</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">plugins</span>:
</span></span><span style="display:flex;"><span>  - <span style="color:#ae81ff">site-urls</span>
</span></span></code></pre></div><div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>As I found out when I wanted to add a <code>version_added</code> macro, which simply didn&rsquo;t render.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded></item><item><title>TIL: How to grep a log for multiline errors</title><link>https://foosel.net/til/2023-02-01-how-to-grep-a-log-for-multline-errors/</link><pubDate>Wed, 01 Feb 2023 00:00:00 +0000</pubDate><guid>https://foosel.net/til/2023-02-01-how-to-grep-a-log-for-multline-errors/</guid><description>&lt;p&gt;I just found myself in the position to have to &lt;code&gt;grep&lt;/code&gt; an OctoPrint log file for error log entries with attached Python stack traces. I wanted to not only get the starting line where the exception log output starts, but the full stack trace up until the next regular log line.&lt;/p&gt;
&lt;p&gt;The format of the lines in &lt;code&gt;octoprint.log&lt;/code&gt; is a simple &lt;code&gt;%(asctime)s - %(name)s - %(levelname)s - %(message)s&lt;/code&gt;, so a log with an error and attached exception looks like this:&lt;/p&gt;</description><content:encoded><![CDATA[<p>I just found myself in the position to have to <code>grep</code> an OctoPrint log file for error log entries with attached Python stack traces. I wanted to not only get the starting line where the exception log output starts, but the full stack trace up until the next regular log line.</p>
<p>The format of the lines in <code>octoprint.log</code> is a simple <code>%(asctime)s - %(name)s - %(levelname)s - %(message)s</code>, so a log with an error and attached exception looks like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>2023-01-30 17:50:45,704 - octoprint.events.fire - DEBUG - Firing event: Disconnecting (Payload: None)
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,704 - octoprint.events - DEBUG - Sending action to &lt;bound method PrinterStateConnection._onEvent of &lt;octoprint.server.util.sockjs.PrinterStateConnection object at 0x000001635CB6EE50&gt;&gt;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on action_command_notification
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on action_command_prompt
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on announcements
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on file_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on firmware_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on pluginmanager
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on softwareupdate
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on tracking
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,711 - octoprint.plugin - DEBUG - Calling on_event on mqtt
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,732 - octoprint.events.fire - DEBUG - Firing event: Disconnected (Payload: None)
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,735 - octoprint.events - DEBUG - Sending action to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,750 - octoprint.events - ERROR - Got an exception while sending event Disconnected (Payload: None) to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>Traceback (most recent call last):
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\events.py&#34;, line 197, in _work
</span></span><span style="display:flex;"><span>    listener(event, payload)
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1212, in &lt;lambda&gt;
</span></span><span style="display:flex;"><span>    octoprint.events.Events.DISCONNECTED, lambda e, p: run_autorefresh()
</span></span><span style="display:flex;"><span>                                                       ^^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1195, in run_autorefresh
</span></span><span style="display:flex;"><span>    autorefresh.stop()
</span></span><span style="display:flex;"><span>    ^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>AttributeError: &#39;RepeatedTimer&#39; object has no attribute &#39;stop&#39;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,753 - octoprint.events - DEBUG - Sending action to &lt;bound method PrinterStateConnection._onEvent of &lt;octoprint.server.util.sockjs.PrinterStateConnection object at 0x000001635CB6EE50&gt;&gt;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,755 - octoprint.plugin - DEBUG - Calling on_event on action_command_notification
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,756 - octoprint.server.util.sockjs - DEBUG - Socket message held back until permissions cleared, added to backlog: plugin
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,758 - octoprint.plugins.action_command_notification - INFO - Notifications cleared
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,758 - octoprint.plugin - DEBUG - Calling on_event on action_command_prompt
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,758 - octoprint.plugin - DEBUG - Calling on_event on announcements
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,759 - octoprint.plugin - DEBUG - Calling on_event on file_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,759 - octoprint.plugin - DEBUG - Calling on_event on firmware_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,759 - octoprint.server.util.sockjs - DEBUG - Socket message held back until permissions cleared, added to backlog: plugin
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,763 - octoprint.plugin - DEBUG - Calling on_event on pluginmanager
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,764 - octoprint.plugin - DEBUG - Calling on_event on softwareupdate
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,764 - octoprint.plugin - DEBUG - Calling on_event on tracking
</span></span></code></pre></div><p>What I now wanted is for <code>grep</code> to spit out just the <code>ERROR</code> line and the attached stack trace:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>2023-01-30 17:50:45,750 - octoprint.events - ERROR - Got an exception while sending event Disconnected (Payload: None) to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>Traceback (most recent call last):
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\events.py&#34;, line 197, in _work
</span></span><span style="display:flex;"><span>    listener(event, payload)
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1212, in &lt;lambda&gt;
</span></span><span style="display:flex;"><span>    octoprint.events.Events.DISCONNECTED, lambda e, p: run_autorefresh()
</span></span><span style="display:flex;"><span>                                                       ^^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1195, in run_autorefresh
</span></span><span style="display:flex;"><span>    autorefresh.stop()
</span></span><span style="display:flex;"><span>    ^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>AttributeError: &#39;RepeatedTimer&#39; object has no attribute &#39;stop&#39;
</span></span></code></pre></div><p>For this I needed a way to set <code>grep</code> to match multiple lines and do a (non-matching) look ahead for the end. It turns out that the secret to success here is to treat the whole input as one line, use Perl compatible regex mode, and make sure to set the multiline flag. After some fiddling around on <a href="https://regex101.com/r/qYOrnT/1">regex101.com</a> and reading up on <a href="https://perldoc.perl.org/perlre#Extended-Patterns">Perl&rsquo;s regex options</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, I came up with the following:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>grep -Pazo &#39;(?m)^\N+\- ERROR \-\N*\n(^\N*?\n)*?(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \- )&#39; octoprint.log
</span></span></code></pre></div><p>Let&rsquo;s walk through this:</p>
<ul>
<li><code>-P</code> enables Perl compatible regex mode</li>
<li><code>-a</code> makes <code>grep</code> process the file as text, even if it would otherwise be considered binary</li>
<li><code>-z</code> makes <code>grep</code> treat null bytes instead of newlines as line terminators, so the whole input is treated as a single line for finding matches<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></li>
<li><code>-o</code> only outputs the matched part of the line (otherwise we&rsquo;d get the whole file printed out)</li>
<li><code>(?m)</code> enables multiline mode</li>
<li><code>^\N+\- ERROR \-\N*\n</code> matches the first line of the error, which is the one that starts with the timestamp and package and contains the word <code>ERROR</code></li>
<li><code>(^\N*?\n)*?</code> non-greedily matches all following lines of the error, each consisting of any number of non-newline characters followed by a newline</li>
<li><code>(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \- )</code> is a positive look-ahead that matches a line starting with a timestamp again, which signifies the end of the error&rsquo;s lines</li>
</ul>
<p>Hooray, it works 🥳:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>❯ grep -Pazo &#39;(?m)^\N+\- ERROR \-\N*\n(^\N*?\n)*?(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \- )&#39; octoprint.log
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,750 - octoprint.events - ERROR - Got an exception while sending event Disconnected (Payload: None) to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>Traceback (most recent call last):
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\events.py&#34;, line 197, in _work
</span></span><span style="display:flex;"><span>    listener(event, payload)
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1212, in &lt;lambda&gt;
</span></span><span style="display:flex;"><span>    octoprint.events.Events.DISCONNECTED, lambda e, p: run_autorefresh()
</span></span><span style="display:flex;"><span>                                                       ^^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1195, in run_autorefresh
</span></span><span style="display:flex;"><span>    autorefresh.stop()
</span></span><span style="display:flex;"><span>    ^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>AttributeError: &#39;RepeatedTimer&#39; object has no attribute &#39;stop&#39;
</span></span></code></pre></div><p>(And yes, I&rsquo;ve fixed the error that led to this stack trace as well 😉)</p>
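<p>For comparison, the same idea can be sketched in Python with the <code>re</code> module. This is an illustration under a few assumptions rather than a drop-in replacement for the <code>grep</code> call: <code>find_errors</code> and the shortened sample log are made up, and the pattern uses a negative look-ahead per continuation line instead of the non-greedy match, so it also catches an error at the very end of the log. Unlike <code>grep -z</code>, it can report the line number each error starts on:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import re

# Prefix of a regular log line: timestamp followed by " - "
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - "

ERROR_WITH_TRACE = re.compile(
    r"^.* - ERROR - .*\n"          # the ERROR line itself
    rf"(?:(?!{TIMESTAMP}).*\n)*",  # all following lines that do not start with a timestamp
    re.MULTILINE,
)


def find_errors(log_text):
    """Return (line number, text) for each ERROR entry including its stack trace."""
    results = []
    for match in ERROR_WITH_TRACE.finditer(log_text):
        line_number = log_text.count("\n", 0, match.start()) + 1
        results.append((line_number, match.group(0)))
    return results


# Shortened, made-up log excerpt in the octoprint.log format
sample = (
    "2023-01-30 17:50:45,735 - octoprint.events - DEBUG - Sending action\n"
    "2023-01-30 17:50:45,750 - octoprint.events - ERROR - Got an exception\n"
    "Traceback (most recent call last):\n"
    "AttributeError: 'RepeatedTimer' object has no attribute 'stop'\n"
    "2023-01-30 17:50:45,753 - octoprint.events - DEBUG - Sending action\n"
)

for line_number, text in find_errors(sample):
    print(f"line {line_number}:")
    print(text, end="")
</code></pre></div>
<p>For the sample this reports a single error starting on line 2, consisting of the <code>ERROR</code> line and the two lines of the stack trace.</p>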
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I don&rsquo;t know about you, but I always forget about positive/negative look-ahead/behind and pattern-match modifiers.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The downside of this is that now <code>-n</code> (print line number of match) will not work anymore and just happily report line 1 for every single match.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded></item><item><title>On wrong assumptions</title><link>https://foosel.net/blog/2021-03-19-on-wrong-assumptions/</link><pubDate>Fri, 19 Mar 2021 00:00:00 +0000</pubDate><guid>https://foosel.net/blog/2021-03-19-on-wrong-assumptions/</guid><description>How I once spent two weeks barking up the wrong tree</description><content:encoded><![CDATA[<p><img src="https://foosel.net/blog/2021-03-19-on-wrong-assumptions/screenshot.jpg" alt="A shot of the screen displaying the diff of the fix" loading="lazy"></p><p><em>The original version of this post was published as a <a href="https://twitter.com/foosel/status/1242121324438355974">Twitter thread on March 23rd 2020</a>. I figured I should give it a more permanent home here since IMHO it was a quite fun story.</em></p>
<p>Since everyone can use some entertainment right now, how about a battle story on how a year ago I spent almost two weeks trying to wrap my head around a really weird issue of a lagging GCODE viewer and overall print progress reporting in <a href="https://octoprint.org">OctoPrint</a> and finally figuring it out?</p>
<p>Our story begins around the release of 1.4.0, when <a href="https://community.octoprint.org/t/curious-issue-with-print-progress/16304">a new topic on the community forum</a> showed up:</p>
<blockquote>
<h3 id="curious-issue-with-print-progress">Curious issue with print progress</h3>
<p>The print progress figures on my Octopi setup are lagging behind the actual print. [&hellip;] Nothing is broken - anything I throw at it (an Ender 3) prints fine but as a print progresses, the percentage complete, current layer, and sync&rsquo;d gcode viewer gradually lag behind what is actually being printed. For example, on a print with 400 layers, as the last layer is printed the reported progress and current layer is around 96% and 385 respectively. If I do a quick calculation of the displayed Printed/Total file size figures it works out at 96% but what it has actually printed is over 99%. When the print finishes the numbers jump to 100% and 400 and everything is fine.</p>
<p>[&hellip;]</p>
</blockquote>
<p>This was indeed a very curious issue, since due to the nature of the communication with the printer and buffering in the firmware the progress is usually slightly <em>ahead</em> rather than behind. Some quick testing on my end showed no reproduction, but more and more people chimed in with the same observation.</p>
<p>I was stumped.</p>
<p>My first approach was to collect information from those affected by it. Printer model, firmware version, installed plugins, used slicer and so on. It soon turned out that all affected installations were using Ultimaker Cura as the slicer.</p>
<p>A quick test by the OP with a different slicer confirmed that for him it indeed only occurred with GCODE sliced by Cura; the same file sliced with another slicer had everything work as designed. However, comparing the GCODE revealed no immediate differences that would explain this, and what actually is <em>in</em> the file also doesn&rsquo;t really play into progress tracking. My own experiments with Cura failed to reproduce the issue.</p>
<p>Convinced that the issue must be some sort of delay between the backend and the frontend &ndash; maybe due to network issues? &ndash; I whipped up a plugin (since deleted) to log progress on both ends to a log which could then be shared and analysed. The first results came in and guess what? I had barked up the wrong tree, the reported progress was identical. So back to square one.</p>
<p>I still couldn&rsquo;t reproduce it on my end and was starting to get really angry at this issue 😅 I finally threw a copy of some GCODE files now shared by the reporter of the issue on my own printer and <em>finally</em> I could reproduce. Which doesn&rsquo;t mean I had any idea WTF was going on though.</p>
<p>After many test prints, head scratching and going through the files with a comb I finally noticed something. The files with the issue had <code>CRLF</code> (or <code>\r\n</code>) line endings. Those without (including my own sliced files) had just <code>LF</code> (or <code>\n</code>) line endings.</p>
<p>So that made me go 🤨 Some cursing and breakpoint setting later I had proof that the reported progress in backend and frontend was flawed to begin with. I could see that a line was being reported with a file position that it actually was not located at in the file, and which instead belonged to a couple lines earlier. Which meant my positions were reported wrong right at the source &ndash; with a lag. And then it suddenly hit me.</p>
<p>But before I can tell you what was happening I need to give you some background on how OctoPrint reads GCODE files it&rsquo;s printing in order to understand what was going on. Printed files are read line by line because that is how they are sent to the printer. For that OctoPrint uses the <a href="https://docs.python.org/3/library/io.html?highlight=readline#io.IOBase.readline"><code>readline</code></a> method of the file stream. And that works by reading chunks of data from the file until a line separator is found, returning everything read up to this separator and saving the rest for the next line to be read. That means the file will have to be read further than what is returned. And that means that the position in the open file as reported by <a href="https://docs.python.org/3/library/io.html?highlight=readline#io.IOBase.tell"><code>tell</code></a> on the file stream will always be slightly ahead. For progress reporting in OctoPrint however I need to know the exact byte position of each line in the file. So what I do instead of relying on the internal and slightly ahead file position is that I increase my own position indicator by the length of the line read from the file. And this is where my problem was located.</p>
<p>It turns out that for some reason I wasn&rsquo;t getting the lines back from <code>readline</code> with the original line endings attached. Instead I always got <code>LF</code>, even for files with <code>CRLF</code>. And that means I was counting one byte short for every single line in <code>CRLF</code> terminated files. One byte short per line doesn&rsquo;t sound like much, but that adds up through a file with several hundred thousand lines, to a point where progress reporting will be off by whole layers, and more so the further into the print (and thus the file) you are.</p>
<p>But what was the reason for this popping up in 1.4.0? I hadn&rsquo;t modified the code in question at all. It had been the same since 2016 actually. Well, it turns out that a tiny change during the Python 3 compatibility migration done to a helper function I used in that code had interesting side effects: switching from <a href="https://docs.python.org/3/library/codecs.html#codecs.open"><code>codecs.open</code></a> to <a href="https://docs.python.org/3/library/io.html#io.open"><code>io.open</code></a>.</p>
<p>It turns out that <code>io.open</code> (and thus Python 3&rsquo;s built-in <code>open</code>) by default will open text files in &ldquo;universal newlines mode&rdquo; (see <a href="https://www.python.org/dev/peps/pep-0278/">PEP 278</a>), meaning it will happily parse every common line ending, but convert it to <code>LF</code> before returning the line. This is what caused my off-by-one issue in files with <code>CRLF</code> line endings.</p>
<p>And the fix? <a href="https://github.com/foosel/OctoPrint/commit/27bbab9582eb3a1a9fca8f2b203e88b1682fcdc5">Setting <code>newline=&quot;&quot;</code> on the open call</a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-diff" data-lang="diff"><span style="display:flex;"><span>diff --git a/src/octoprint/util/comm.py b/src/octoprint/util/comm.py
</span></span><span style="display:flex;"><span>index 67191a7af..a6dfc1e24 100644
</span></span><span style="display:flex;"><span><span style="color:#f92672">--- a/src/octoprint/util/comm.py
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e">+++ b/src/octoprint/util/comm.py
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">@@ -4078,7 +4078,7 @@ def start(self):
</span></span></span><span style="display:flex;"><span> 		&#34;&#34;&#34;
</span></span><span style="display:flex;"><span> 		PrintingFileInformation.start(self)
</span></span><span style="display:flex;"><span> 		with self._handle_mutex:
</span></span><span style="display:flex;"><span><span style="color:#f92672">-			self._handle = bom_aware_open(self._filename, encoding=&#34;utf-8&#34;, errors=&#34;replace&#34;)
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e">+			self._handle = bom_aware_open(self._filename, encoding=&#34;utf-8&#34;, errors=&#34;replace&#34;, newline=&#34;&#34;)
</span></span></span><span style="display:flex;"><span> 			self._pos = self._handle.tell()
</span></span><span style="display:flex;"><span> 			if self._handle.encoding.endswith(&#34;-sig&#34;):
</span></span><span style="display:flex;"><span> 				# Apparently we found an utf-8 bom in the file.
</span></span></code></pre></div><p>The moral of the story? Don&rsquo;t trust your file position calculations. I could have saved myself a lot of time on debugging this if I had just looked there <em>first</em> instead of assuming this code to be fine 😅</p>
<p>In the end, even a year later, I still have no idea why Cura produced <code>CRLF</code> code for some and <code>LF</code> for me, but I also never really looked hard. A UNIX vs Windows issue can be ruled out here since the affected parties and I were all using Windows. It made me learn something about <code>io.open</code> and was a valuable lesson on wrong assumptions, however!</p>
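<p>The effect is easy to reproduce in isolation. The following is a minimal sketch and not OctoPrint&rsquo;s actual code: <code>tracked_end_position</code> is a hypothetical helper that mimics the manual position counter by summing up the byte length of every line read, and the tiny GCODE file is made up. It compares the default universal newlines mode with <code>newline=""</code> on a <code>CRLF</code> terminated file:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-python" data-lang="python">import os
import tempfile


def tracked_end_position(path, newline):
    """Sum up the byte length of every line read, like a manual position counter."""
    pos = 0
    with open(path, encoding="utf-8", errors="replace", newline=newline) as handle:
        for line in handle:
            pos += len(line.encode("utf-8"))
    return pos


# Write a tiny GCODE-like file with CRLF line endings (made-up content)
with tempfile.NamedTemporaryFile("wb", suffix=".gcode", delete=False) as tmp:
    tmp.write(b"G28\r\nG1 X10 Y10\r\nG1 X20 Y20\r\nM84\r\n")
    path = tmp.name

actual_size = os.path.getsize(path)
translated = tracked_end_position(path, newline=None)  # universal newlines (default)
untranslated = tracked_end_position(path, newline="")  # line endings left untouched
os.remove(path)

print(actual_size, translated, untranslated)  # prints: 34 30 34
</code></pre></div>
<p>With four lines the counter in the default mode already ends up four bytes short of the real file size, one byte per line, while with <code>newline=""</code> it matches exactly.</p>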
]]></content:encoded></item></channel></rss>