<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>debugging on foosel.net</title><link>https://foosel.net/tags/debugging/</link><description>Recent content in debugging on foosel.net</description><generator>Hugo</generator><language>en-us</language><copyright>Gina Häußge (foosel)</copyright><lastBuildDate>Sun, 09 May 2021 00:00:00 +0000</lastBuildDate><atom:link href="https://foosel.net/tags/debugging/feed.xml" rel="self" type="application/rss+xml"/><item><title>A debugging story</title><link>https://foosel.net/blog/2021-05-09-a-debugging-story/</link><pubDate>Sun, 09 May 2021 00:00:00 +0000</pubDate><guid>https://foosel.net/blog/2021-05-09-a-debugging-story/</guid><description>Could this bug for once not be my fault?</description><content:encoded><![CDATA[<p><img src="https://foosel.net/blog/2021-05-09-a-debugging-story/issue_4117.png" alt="The screenshot showing the issue -- a rendering defect in OctoPrint's GCode viewer" loading="lazy"></p><p>About a week ago I got a new <a href="https://github.com/OctoPrint/OctoPrint/issues/4117">bug report</a> on <a href="https://octoprint.org">OctoPrint&rsquo;s</a> issue tracker:</p>
<blockquote>
<p><strong>GCode Viewer Visualisation Problem</strong></p>
<p><em>The problem</em></p>
<p>The visualisation in GCode viewer ist not correct. The print is OK.
See gcode file (zip) on Layer 43 to 47 and 49</p>
<p>And screenshot</p>
</blockquote>
<p>You already saw the included screenshot: it shows a spike in the GCode Viewer&rsquo;s visualization that
wasn&rsquo;t actually there in the print. My first attempt at reproduction failed spectacularly &ndash; the file looked exactly like
it was supposed to. Then, however, I noticed that the OP was using Google Chrome (adding the detected user agent
to the system information contained in OctoPrint&rsquo;s new System Info Bundles had already paid off!), so I tried that
instead of my usual Firefox, and lo and behold, I saw the issue.</p>
<p>Scrolling a bit through the file revealed further defects, as also mentioned by the OP, e.g. this one:</p>
<p><img src="https://foosel.net/blog/2021-05-09-a-debugging-story/issue_4117_2.png" alt="Another defect, this time a whole part of the outline is being misplaced" loading="lazy">
</p>
<p>At this point it was clear that this was a Chrome-only issue. But was it a bug in OctoPrint or possibly a browser
bug? Answering that required more information than was readily available, and the file was too big to quickly
glean anything from the GCode itself that could help narrow down the problem.</p>
<p>So the first step was to create a minimal GCode file that showed the same error. For this I looked up
the layer height the viewer reported for a layer with a visible defect and then narrowed things down to the affected lines
by using the horizontal command sliders to further limit the view. That way I quickly found that these were the
problematic lines:</p>
<pre tabindex="0"><code class="language-gcode" data-lang="gcode">G1 X173.595 Y103.9 E247.16716
G3 X173.600 Y126.097 I-105613.507 J39.645 E248.20080
G1 X169.552 Y126.098 E248.38933
</code></pre><p>More specifically, the error was caused by the contained <a href="https://reprap.org/wiki/G-code#G2_.26_G3:_Controlled_Arc_Move"><code>G3</code> command</a>,
which instructs the printer to move in a counter clockwise arc
from its current position to the given X and Y coordinates, with the center of said arc offset by the given I and J
parameters. In the case of these lines, that meant moving in an arc from <code>(173.595, 103.9)</code> to <code>(173.600, 126.097)</code>
with the arc&rsquo;s center at <code>x = 173.595 + (-105613.507) = -105439.912</code> and <code>y = 103.9 + 39.645 = 143.545</code>.
In other words, a rather short arc with an enormous radius of over 105 m that was more a straight line than
an arc really. And that line was being drawn too long, causing the weird spike in the rendition.</p>
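<p>As a rough cross-check of these numbers, here is a minimal Python sketch that derives the center and radius from the two quoted lines. It assumes the coordinates are in millimeters; the variable names are made up for this illustration and are not taken from OctoPrint.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import math

# Values copied from the quoted GCode lines (assumed to be millimeters).
start_x, start_y = 173.595, 103.9   # end point of the preceding G1
i, j = -105613.507, 39.645          # I and J offsets of the G3

# The arc center is the start point shifted by the I/J offsets.
center_x = start_x + i
center_y = start_y + j

# The radius is the distance between start point and center.
radius_mm = math.hypot(i, j)

print(f"center = ({center_x:.3f}, {center_y:.3f})")
print(f"radius = {radius_mm / 1000:.1f} m")
</code></pre>
<p>Run as is, this prints a center of <code>(-105439.912, 143.545)</code> and a radius of <code>105.6 m</code>.</p>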
<p>To understand how that could happen, however, we need to look at how the GCode viewer is implemented and how
arcs work in that implementation. At its core, the GCode viewer is an HTML5 2D canvas on which the path described in
a GCode file gets drawn. Commands like <code>G0</code> and <code>G1</code> that describe straight lines are drawn using <a href="https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/lineTo"><code>lineTo</code></a>,
while arcs as described by <code>G2</code> and <code>G3</code> are drawn using <a href="https://developer.mozilla.org/en-US/docs/Web/API/CanvasRenderingContext2D/arc"><code>arc</code></a>.</p>
<p><code>arc</code> takes six parameters: the <code>x</code> and <code>y</code> coordinates of the center of the arc, the radius <code>r</code>, the <code>startAngle</code> at which to start
drawing the arc, the <code>endAngle</code> at which to stop, and a flag that&rsquo;s <code>true</code> for counterclockwise and <code>false</code> or omitted for clockwise.
This doesn&rsquo;t directly match the data contained in the GCode itself, where we instead have three points defining the arc &ndash; a start
point, an end point, and the arc&rsquo;s center. So we need to translate these into the data required by the <code>arc</code> method. Using some trigonometry,
that is fairly straightforward:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-js" data-lang="js"><span style="display:flex;"><span><span style="color:#75715e">// given: G2/G3 X&lt;endX&gt; Y&lt;endY&gt; I&lt;i&gt; J&lt;j&gt;, &lt;startX&gt;, &lt;startY&gt;
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e">arcX</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">startX</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">i</span>;
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">arcY</span> <span style="color:#f92672">=</span> <span style="color:#a6e22e">startY</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">j</span>;
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">r</span> <span style="color:#f92672">=</span> Math.<span style="color:#a6e22e">sqrt</span>(<span style="color:#a6e22e">i</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">i</span> <span style="color:#f92672">+</span> <span style="color:#a6e22e">j</span> <span style="color:#f92672">*</span> <span style="color:#a6e22e">j</span>);
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">startAngle</span> <span style="color:#f92672">=</span> Math.<span style="color:#a6e22e">atan2</span>(<span style="color:#a6e22e">startY</span> <span style="color:#f92672">-</span> <span style="color:#a6e22e">arcY</span>, <span style="color:#a6e22e">startX</span> <span style="color:#f92672">-</span> <span style="color:#a6e22e">arcX</span>);
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">endAngle</span> <span style="color:#f92672">=</span> Math.<span style="color:#a6e22e">atan2</span>(<span style="color:#a6e22e">endY</span> <span style="color:#f92672">-</span> <span style="color:#a6e22e">arcY</span>, <span style="color:#a6e22e">endX</span> <span style="color:#f92672">-</span> <span style="color:#a6e22e">arcX</span>);
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">ccw</span> <span style="color:#f92672">=</span> (<span style="color:#a6e22e">command</span> <span style="color:#f92672">===</span> <span style="color:#e6db74">&#34;G3&#34;</span>)
</span></span></code></pre></div><p><img src="https://foosel.net/blog/2021-05-09-a-debugging-story/drawing.png" alt="The parameters and their relation" loading="lazy">
</p>
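<p>To get a feeling for the values this conversion produces for the problematic <code>G3</code> line, the same steps can be replayed outside the browser. The following is a hedged Python port of the snippet above, not the viewer&rsquo;s actual code; the function name <code>arc_parameters</code> is invented for this example.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import math


def arc_parameters(start_x, start_y, end_x, end_y, i, j, command):
    """Translate G2/G3 data into canvas-style arc parameters (illustrative port)."""
    arc_x = start_x + i
    arc_y = start_y + j
    r = math.hypot(i, j)
    start_angle = math.atan2(start_y - arc_y, start_x - arc_x)
    end_angle = math.atan2(end_y - arc_y, end_x - arc_x)
    ccw = command == "G3"
    return arc_x, arc_y, r, start_angle, end_angle, ccw


# Start point from the preceding G1, everything else from the G3 line itself.
arc_x, arc_y, r, start_angle, end_angle, ccw = arc_parameters(
    173.595, 103.9, 173.600, 126.097, -105613.507, 39.645, "G3"
)

sweep = end_angle - start_angle  # angle covered by the arc, in radians
print(f"sweep = {sweep:.6f} rad, arc length = {r * sweep:.3f} mm")
</code></pre>
<p>The swept angle comes out at roughly 0.0002 radians for an arc about 22.2 mm long, so the start and end angles differ only far behind the decimal point. That is exactly the kind of input where small numerical differences inside a renderer could become visible.</p>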
<p>My first guess was that the result of this conversion was somehow different between Firefox and Chrome, so I modified the GCode viewer to log
the calculated values and then compared the two outcomes. The values were completely identical between both browsers, so what was being fed
into the canvas <code>arc</code> command was identical and yet produced different results. Why?</p>
<p>My next approach was to add some more visual debug output to the viewer itself. I modified it so that the arc parameters as calculated would
also be drawn on the canvas, in the form of a geometrical pizza slice showing the arc&rsquo;s center, its &ldquo;legs&rdquo; and its rim. And this is where
I saw a difference in the rendered output. Where in Firefox the arc&rsquo;s rim and its legs met perfectly:</p>
<p><img src="https://foosel.net/blog/2021-05-09-a-debugging-story/arc_ff.png" alt="The arc in Firefox is rendered correctly" loading="lazy">
</p>
<p>in Chrome the rim overshot:</p>
<p><img src="https://foosel.net/blog/2021-05-09-a-debugging-story/arc_chrome.png" alt="The same arc in Chrome is rendered wrong" loading="lazy">
</p>
<p>So while the calculated parameters were correct and in both cases provided to the <code>arc</code> method just the same, Chrome was rendering the wrong segment length!</p>
<p>I suspected a rounding error and thus started searching for matching reports from other people. I couldn&rsquo;t find a specific bug report, but I came across a post on Stack Overflow that sounded mightily familiar: <a href="https://stackoverflow.com/questions/8603656/html5-canvas-arcs-not-rendering-correctly-in-google-chrome">HTML5 canvas arcs not rendering correctly in Google Chrome</a>, from 2011. A ten-year-old post&hellip; could it be?</p>
<p>Honestly, I still do not know whether this indeed describes the same issue, or if there&rsquo;s a Chrome ticket describing this behaviour &ndash; I&rsquo;ll continue to look, but first and foremost I was focused on fixing this problem in OctoPrint&rsquo;s GCode viewer. The Stack Overflow post provided a code snippet that reimplements <code>arc</code> using Bézier curves, and so I gave this a try. Long story short, OctoPrint&rsquo;s GCode Viewer as part of version 1.7.0+ will ship with a Chrome-only <code>arc</code> replacement that will be enabled by default, but can also be disabled in real time, with great effect:</p>
<p><img src="https://foosel.net/blog/2021-05-09-a-debugging-story/arc_fix.gif" alt="Enabling and disabling the arc workaround makes the defects disappear and reappear" loading="lazy">
</p>
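<p>For readers curious how an arc can be expressed with Bézier curves at all, here is a minimal Python sketch of the common textbook approximation. It is neither the Stack Overflow snippet nor the code shipped in OctoPrint; the function name <code>arc_to_beziers</code> and its signature are assumptions made for this illustration. The arc is split into segments of at most 90 degrees, and each segment becomes one cubic curve whose control points lie on the tangents at its end points.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import math


def arc_to_beziers(cx, cy, r, start_angle, sweep):
    """Approximate an arc by cubic Bezier segments of at most 90 degrees each.

    sweep is the signed angle in radians covered by the arc. Returns a list
    of (p0, p1, p2, p3) tuples, one per segment.
    """
    segments = max(1, math.ceil(abs(sweep) / (math.pi / 2)))
    step = sweep / segments
    # Distance factor that puts the middle of each curve exactly on the circle.
    k = 4.0 / 3.0 * math.tan(step / 4.0)
    curves = []
    for n in range(segments):
        a0 = start_angle + n * step
        a1 = a0 + step
        p0 = (cx + r * math.cos(a0), cy + r * math.sin(a0))
        p3 = (cx + r * math.cos(a1), cy + r * math.sin(a1))
        # Control points sit on the tangents at the two end points.
        p1 = (p0[0] - k * r * math.sin(a0), p0[1] + k * r * math.cos(a0))
        p2 = (p3[0] + k * r * math.sin(a1), p3[1] - k * r * math.cos(a1))
        curves.append((p0, p1, p2, p3))
    return curves


# A quarter of the unit circle needs just one segment.
curves = arc_to_beziers(0.0, 0.0, 1.0, 0.0, math.pi / 2)
p0, p1, p2, p3 = curves[0]
# Point on the curve at t = 0.5, to compare against the true radius.
mid = tuple((p0[n] + 3 * p1[n] + 3 * p2[n] + p3[n]) / 8 for n in range(2))
print(len(curves), math.hypot(mid[0], mid[1]))
</code></pre>
<p>On a canvas, each tuple would map to one <code>bezierCurveTo</code> call after moving to <code>p0</code>. The appeal of this route for a workaround is that the browser only ever receives explicit points and no longer has to derive them from a center, a radius and two nearly identical angles.</p>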
<p>And the moral of the story: It rarely is a browser bug. But sometimes, all signs say it indeed <em>is</em> and a workaround is the easiest solution.</p>
]]></content:encoded></item><item><title>On wrong assumptions</title><link>https://foosel.net/blog/2021-03-19-on-wrong-assumptions/</link><pubDate>Fri, 19 Mar 2021 00:00:00 +0000</pubDate><guid>https://foosel.net/blog/2021-03-19-on-wrong-assumptions/</guid><description>How I once spent two weeks barking up the wrong tree</description><content:encoded><![CDATA[<p><img src="https://foosel.net/blog/2021-03-19-on-wrong-assumptions/screenshot.jpg" alt="A shot of the screen displaying the diff of the fix" loading="lazy"></p><p><em>The original version of this post was published as a <a href="https://twitter.com/foosel/status/1242121324438355974">Twitter thread on March 23rd 2020</a>. I figured I should give it a more permanent home here since IMHO it was a quite fun story.</em></p>
<p>Since everyone can use some entertainment right now, how about a battle story on how a year ago I spent almost two weeks trying to wrap my head around a really weird issue of a lagging GCODE viewer and overall print progress reporting in <a href="https://octoprint.org">OctoPrint</a> and finally figuring it out?</p>
<p>Our story begins around the release of 1.4.0, when <a href="https://community.octoprint.org/t/curious-issue-with-print-progress/16304">a new topic on the community forum</a> showed up:</p>
<blockquote>
<h3 id="curious-issue-with-print-progress">Curious issue with print progress</h3>
<p>The print progress figures on my Octopi setup are lagging behind the actual print. [&hellip;] Nothing is broken - anything I throw at it (an Ender 3) prints fine but as a print progresses, the percentage complete, current layer, and sync&rsquo;d gcode viewer gradually lag behind what is actually being printed. For example, on a print with 400 layers, as the last layer is printed the reported progress and current layer is around 96% and 385 respectively. If I do a quick calculation of the displayed Printed/Total file size figures it works out at 96% but what it has actually printed is over 99%. When the print finishes the numbers jump to 100% and 400 and everything is fine.</p>
<p>[&hellip;]</p>
</blockquote>
<p>This was indeed a very curious issue, since due to the nature of the communication with the printer and buffering in the firmware the reported progress is usually slightly <em>ahead</em> rather than behind. Some quick testing on my end failed to reproduce it, but more and more people chimed in with the same observation.</p>
<p>I was stumped.</p>
<p>My first approach was to collect information from those affected by it. Printer model, firmware version, installed plugins, used slicer and so on. It soon turned out that all affected installations were using Ultimaker Cura as the slicer.</p>
<p>A quick test by the OP with a different slicer confirmed that for him it indeed only occurred with GCODE sliced by Cura; the same file run through another slicer had everything work as designed. However, comparing the GCODE revealed no immediate differences that would explain this, and what actually is <em>in</em> the file also doesn&rsquo;t really play into progress tracking. My own experiments with Cura failed to reproduce.</p>
<p>Convinced that the issue must be some sort of delay between the backend and the frontend &ndash; maybe due to network issues? &ndash; I whipped up a plugin (since deleted) to log progress on both ends to a log which could then be shared and analysed. The first results came in and guess what? I had barked up the wrong tree: the reported progress was identical. So back to square one.</p>
<p>I still couldn&rsquo;t reproduce it on my end and was starting to get really angry at this issue 😅 I finally threw a copy of some GCODE files now shared by the reporter of the issue on my own printer and <em>finally</em> I could reproduce. Which doesn&rsquo;t mean I had any idea WTF was going on though.</p>
<p>After many test prints, head scratching and going through the files with a fine-toothed comb I finally noticed something. The files with the issue had <code>CRLF</code> (or <code>\r\n</code>) line endings. Those without (including my own sliced files) had just <code>LF</code> (or <code>\n</code>) line endings.</p>
<p>So that made me go 🤨 Some cursing and breakpoint setting later I had proof that the reported progress in backend and frontend was flawed to begin with. I could see that a line was being reported with a file position that it actually was not located at in the file, and which instead belonged to a couple lines earlier. Which meant my positions were reported wrong right at the source &ndash; with a lag. And then it suddenly hit me.</p>
<p>But before I can tell you what was happening I need to give you some background on how OctoPrint reads GCODE files it&rsquo;s printing in order to understand what was going on. Printed files are read line by line because that is how they are sent to the printer. For that OctoPrint uses the <a href="https://docs.python.org/3/library/io.html?highlight=readline#io.IOBase.readline"><code>readline</code></a> method of the file stream. And that works by reading chunks of data from the file until a line separator is found, returning everything read up to this separator and saving the rest for the next line to be read. That means the file will have to be read further than what is returned. And that means that the position in the open file as reported by <a href="https://docs.python.org/3/library/io.html?highlight=readline#io.IOBase.tell"><code>tell</code></a> on the file stream will always be slightly ahead. For progress reporting in OctoPrint however I need to know the exact byte position of each line in the file. So what I do instead of relying on the internal and slightly ahead file position is that I increase my own position indicator by the length of the line read from the file. And this is where my problem was located.</p>
<p>It turns out that for some reason I wasn&rsquo;t getting the lines back from <code>readline</code> with the original line endings attached. Instead I always got <code>LF</code>, even for files with <code>CRLF</code>. And that means I was counting one byte short for every single line in <code>CRLF</code> terminated files. One byte short per line doesn&rsquo;t sound like much, but it adds up over a file with several hundred thousand lines, to the point where progress reporting is off by whole layers, and the further into the print (and thus the file) you are, the larger the lag.</p>
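<p>The drift is easy to reproduce in a few lines of Python. The sketch below is only an illustration under stated assumptions: it reads an in-memory <code>CRLF</code> file through a text wrapper with default settings and keeps its own position counter by adding up line lengths, which mimics the approach described above without using any OctoPrint code.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import io

# 1000 identical CRLF-terminated lines, held in memory instead of a real file.
line_count = 1000
data = b"G1 X10.000 Y20.000 E0.12345\r\n" * line_count

# With default settings, text mode hands every line back with a plain LF ending.
reader = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8")

tracked_pos = 0
for line in reader:
    # Self-maintained position: add the byte length of what was returned.
    tracked_pos += len(line.encode("utf-8"))

drift = len(data) - tracked_pos
print(f"real size: {len(data)}, tracked: {tracked_pos}, drift: {drift} bytes")
</code></pre>
<p>The counter ends up exactly one byte short per line, here 1000 bytes in total, which for lines of this length already corresponds to more than 30 lines of lag.</p>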
<p>But what was the reason for this popping up in 1.4.0? I hadn&rsquo;t modified the code in question at all. It had been the same since 2016 actually. Well, it turns out that a tiny change during the Python 3 compatibility migration done to a helper function I used in that code had interesting side effects: switching from <a href="https://docs.python.org/3/library/codecs.html#codecs.open"><code>codecs.open</code></a> to <a href="https://docs.python.org/3/library/io.html#io.open"><code>io.open</code></a>.</p>
<p>It turns out that <code>io.open</code> (and thus Python 3&rsquo;s built-in <code>open</code>) by default will open text files in &ldquo;universal newlines mode&rdquo; (see <a href="https://www.python.org/dev/peps/pep-0278/">PEP 278</a>), meaning it will happily parse every common line ending, but convert it to <code>LF</code> before returning. That conversion caused my off-by-one issue in files with <code>CRLF</code>.</p>
<p>And the fix? <a href="https://github.com/foosel/OctoPrint/commit/27bbab9582eb3a1a9fca8f2b203e88b1682fcdc5">Setting <code>newline=&quot;&quot;</code> on the open call</a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-diff" data-lang="diff"><span style="display:flex;"><span>diff --git a/src/octoprint/util/comm.py b/src/octoprint/util/comm.py
</span></span><span style="display:flex;"><span>index 67191a7af..a6dfc1e24 100644
</span></span><span style="display:flex;"><span><span style="color:#f92672">--- a/src/octoprint/util/comm.py
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e">+++ b/src/octoprint/util/comm.py
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">@@ -4078,7 +4078,7 @@ def start(self):
</span></span></span><span style="display:flex;"><span> 		&#34;&#34;&#34;
</span></span><span style="display:flex;"><span> 		PrintingFileInformation.start(self)
</span></span><span style="display:flex;"><span> 		with self._handle_mutex:
</span></span><span style="display:flex;"><span><span style="color:#f92672">-			self._handle = bom_aware_open(self._filename, encoding=&#34;utf-8&#34;, errors=&#34;replace&#34;)
</span></span></span><span style="display:flex;"><span><span style="color:#a6e22e">+			self._handle = bom_aware_open(self._filename, encoding=&#34;utf-8&#34;, errors=&#34;replace&#34;, newline=&#34;&#34;)
</span></span></span><span style="display:flex;"><span> 			self._pos = self._handle.tell()
</span></span><span style="display:flex;"><span> 			if self._handle.encoding.endswith(&#34;-sig&#34;):
</span></span><span style="display:flex;"><span> 				# Apparently we found an utf-8 bom in the file.
</span></span></code></pre></div><p>The moral of the story? Don&rsquo;t trust your file position calculations. I could have saved myself a lot of time on debugging this if I had just looked there <em>first</em> instead of assuming this code to be fine 😅</p>
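<p>The effect of the <code>newline=&quot;&quot;</code> argument from the diff can be checked in isolation. The following sketch uses Python&rsquo;s built-in <code>open</code> on a temporary file rather than OctoPrint&rsquo;s <code>bom_aware_open</code> helper, and the function <code>tracked_length</code> is a hypothetical stand-in for the position counter.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import os
import tempfile

# Write a small CRLF-terminated file in binary mode so nothing gets translated.
payload = b"G28\r\nG1 X10 Y10\r\nM104 S0\r\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".gcode") as tmp:
    tmp.write(payload)
    path = tmp.name


def tracked_length(**open_kwargs):
    """Sum the encoded length of every line returned by readline()."""
    total = 0
    with open(path, encoding="utf-8", errors="replace", **open_kwargs) as handle:
        while True:
            line = handle.readline()
            if not line:
                break
            total += len(line.encode("utf-8"))
    return total


default_total = tracked_length()          # universal newlines: CRLF becomes LF
fixed_total = tracked_length(newline="")  # line endings are returned untouched
os.remove(path)

print(len(payload), default_total, fixed_total)
</code></pre>
<p>This prints <code>26 23 26</code>: with the default settings the counter misses one byte for each of the three lines, while with <code>newline=&quot;&quot;</code> it matches the real file size again.</p>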
<p>In the end, even a year later, I still have no idea why Cura produced <code>CRLF</code> code for some and <code>LF</code> for me, but I also never really looked hard. A UNIX vs Windows issue can be ruled out here since the affected parties and I were all using Windows. Still, it taught me something about <code>io.open</code> and was a valuable lesson on wrong assumptions!</p>
]]></content:encoded></item></channel></rss>