<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>regex on foosel.net</title><link>https://foosel.net/tags/regex/</link><description>Recent content in regex on foosel.net</description><generator>Hugo</generator><language>en-us</language><copyright>Gina Häußge (foosel)</copyright><lastBuildDate>Wed, 01 Feb 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://foosel.net/tags/regex/feed.xml" rel="self" type="application/rss+xml"/><item><title>TIL: How to grep a log for multiline errors</title><link>https://foosel.net/til/2023-02-01-how-to-grep-a-log-for-multline-errors/</link><pubDate>Wed, 01 Feb 2023 00:00:00 +0000</pubDate><guid>https://foosel.net/til/2023-02-01-how-to-grep-a-log-for-multline-errors/</guid><description>&lt;p&gt;I just found myself in the position to have to &lt;code&gt;grep&lt;/code&gt; an OctoPrint log file for error log entries with attached Python stack traces. I wanted to not only get the starting line where the exception log output starts, but the full stack trace up until the next regular log line.&lt;/p&gt;
&lt;p&gt;The format of the lines in &lt;code&gt;octoprint.log&lt;/code&gt; is a simple &lt;code&gt;%(asctime)s - %(name)s - %(levelname)s - %(message)s&lt;/code&gt;, so a log with an error and attached exception looks like this:&lt;/p&gt;</description><content:encoded><![CDATA[<p>I just found myself in the position to have to <code>grep</code> an OctoPrint log file for error log entries with attached Python stack traces. I wanted to not only get the starting line where the exception log output starts, but the full stack trace up until the next regular log line.</p>
<p>The format of the lines in <code>octoprint.log</code> is a simple <code>%(asctime)s - %(name)s - %(levelname)s - %(message)s</code>, so a log with an error and attached exception looks like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>2023-01-30 17:50:45,704 - octoprint.events.fire - DEBUG - Firing event: Disconnecting (Payload: None)
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,704 - octoprint.events - DEBUG - Sending action to &lt;bound method PrinterStateConnection._onEvent of &lt;octoprint.server.util.sockjs.PrinterStateConnection object at 0x000001635CB6EE50&gt;&gt;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on action_command_notification
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on action_command_prompt
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on announcements
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,705 - octoprint.plugin - DEBUG - Calling on_event on file_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on firmware_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on pluginmanager
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on softwareupdate
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,706 - octoprint.plugin - DEBUG - Calling on_event on tracking
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,711 - octoprint.plugin - DEBUG - Calling on_event on mqtt
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,732 - octoprint.events.fire - DEBUG - Firing event: Disconnected (Payload: None)
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,735 - octoprint.events - DEBUG - Sending action to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,750 - octoprint.events - ERROR - Got an exception while sending event Disconnected (Payload: None) to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>Traceback (most recent call last):
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\events.py&#34;, line 197, in _work
</span></span><span style="display:flex;"><span>    listener(event, payload)
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1212, in &lt;lambda&gt;
</span></span><span style="display:flex;"><span>    octoprint.events.Events.DISCONNECTED, lambda e, p: run_autorefresh()
</span></span><span style="display:flex;"><span>                                                       ^^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1195, in run_autorefresh
</span></span><span style="display:flex;"><span>    autorefresh.stop()
</span></span><span style="display:flex;"><span>    ^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>AttributeError: &#39;RepeatedTimer&#39; object has no attribute &#39;stop&#39;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,753 - octoprint.events - DEBUG - Sending action to &lt;bound method PrinterStateConnection._onEvent of &lt;octoprint.server.util.sockjs.PrinterStateConnection object at 0x000001635CB6EE50&gt;&gt;
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,755 - octoprint.plugin - DEBUG - Calling on_event on action_command_notification
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,756 - octoprint.server.util.sockjs - DEBUG - Socket message held back until permissions cleared, added to backlog: plugin
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,758 - octoprint.plugins.action_command_notification - INFO - Notifications cleared
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,758 - octoprint.plugin - DEBUG - Calling on_event on action_command_prompt
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,758 - octoprint.plugin - DEBUG - Calling on_event on announcements
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,759 - octoprint.plugin - DEBUG - Calling on_event on file_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,759 - octoprint.plugin - DEBUG - Calling on_event on firmware_check
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,759 - octoprint.server.util.sockjs - DEBUG - Socket message held back until permissions cleared, added to backlog: plugin
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,763 - octoprint.plugin - DEBUG - Calling on_event on pluginmanager
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,764 - octoprint.plugin - DEBUG - Calling on_event on softwareupdate
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,764 - octoprint.plugin - DEBUG - Calling on_event on tracking
</span></span></code></pre></div><p>What I now wanted is for <code>grep</code> to spit out just the <code>ERROR</code> line and the attached stack trace:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>2023-01-30 17:50:45,750 - octoprint.events - ERROR - Got an exception while sending event Disconnected (Payload: None) to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>Traceback (most recent call last):
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\events.py&#34;, line 197, in _work
</span></span><span style="display:flex;"><span>    listener(event, payload)
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1212, in &lt;lambda&gt;
</span></span><span style="display:flex;"><span>    octoprint.events.Events.DISCONNECTED, lambda e, p: run_autorefresh()
</span></span><span style="display:flex;"><span>                                                       ^^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1195, in run_autorefresh
</span></span><span style="display:flex;"><span>    autorefresh.stop()
</span></span><span style="display:flex;"><span>    ^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>AttributeError: &#39;RepeatedTimer&#39; object has no attribute &#39;stop&#39;
</span></span></code></pre></div><p>For this I needed a way to set <code>grep</code> to match multiple lines and do a (non-matching) look ahead for the end. It turns out that the secret to success here is to treat the whole input as one line, use Perl compatible regex mode, and make sure to set the multiline flag. After some fiddling around on <a href="https://regex101.com/r/qYOrnT/1">regex101.com</a> and reading up on <a href="https://perldoc.perl.org/perlre#Extended-Patterns">Perl&rsquo;s regex options</a><sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, I came up with the following:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>grep -Pazo &#39;(?m)^\N+\- ERROR \-\N*\n(^\N*?\n)*?(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \- )&#39; octoprint.log
</span></span></code></pre></div><p>Let&rsquo;s walk through this:</p>
<ul>
<li><code>-P</code> enables Perl compatible regex mode</li>
<li><code>-a</code> enables text mode</li>
<li><code>-z</code> turns all newlines into null bytes and thus treats the whole input as a single line for finding matches<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></li>
<li><code>-o</code> only outputs the matched part of the line (otherwise we&rsquo;d get the whole file printed out)</li>
<li><code>(?m)</code> enables multiline mode</li>
<li><code>^\N+\- ERROR \-\N*\n</code> matches the first line of the error, which is the one that starts with the timestamp and package and contains the word <code>ERROR</code></li>
<li><code>(^\N*?\n)*?</code> non-greedily matches all following lines of the error, which are anything but a newline followed by a newline</li>
<li><code>(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \- )</code> is a positive look-ahead that matches a line starting with a timestamp again, which signifies the end of the error&rsquo;s lines</li>
</ul>
<p>Hooray, it works 🥳:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-plain" data-lang="plain"><span style="display:flex;"><span>❯ grep -Pazo &#39;(?m)^\N+\- ERROR \-\N*\n(^\N*?\n)*?(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} \- )&#39; octoprint.log
</span></span><span style="display:flex;"><span>2023-01-30 17:50:45,750 - octoprint.events - ERROR - Got an exception while sending event Disconnected (Payload: None) to &lt;function Server.run.&lt;locals&gt;.&lt;lambda&gt; at 0x000001635BDB2CA0&gt;
</span></span><span style="display:flex;"><span>Traceback (most recent call last):
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\events.py&#34;, line 197, in _work
</span></span><span style="display:flex;"><span>    listener(event, payload)
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1212, in &lt;lambda&gt;
</span></span><span style="display:flex;"><span>    octoprint.events.Events.DISCONNECTED, lambda e, p: run_autorefresh()
</span></span><span style="display:flex;"><span>                                                       ^^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>  File &#34;C:\Devel\OctoPrint\OctoPrint\src\octoprint\server\__init__.py&#34;, line 1195, in run_autorefresh
</span></span><span style="display:flex;"><span>    autorefresh.stop()
</span></span><span style="display:flex;"><span>    ^^^^^^^^^^^^^^^^
</span></span><span style="display:flex;"><span>AttributeError: &#39;RepeatedTimer&#39; object has no attribute &#39;stop&#39;
</span></span></code></pre></div><p>(And yes, I&rsquo;ve fixed the error that lead to this stack trace as well 😉)</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I don&rsquo;t know about you, but I always forget about positive/negative look-ahead/behind and pattern-match modifiers.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>The downside of this is that now <code>-n</code> (print line number of match) will not work anymore and just happily report line 1 for every single match.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded></item></channel></rss>