<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>I can exclude the NEON code for DD-WRT in Dropbear if it helps,
      but it would be better to nail down the problem; otherwise other
      programs would likely be affected too.<br>
    </p>
    <div class="moz-cite-prefix">On 28.03.2020 at 21:06, Horshack
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:BY5PR13MB33304958232D0035516D7CDBA4CD0@BY5PR13MB3330.namprd13.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;}</style>
      <div style="font-family: Calibri, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        As a postscript, I was able to refine the logic to produce the
        corrupted result almost instantaneously. I'm also able to get it
        to fail with an all-zero input dataset and a bitwise OR
        operation instead of the original squaring multiplication
        operations, which allows me to see what the actual corrupted
        loads are. The result is very interesting: sometimes the
        corrupted data is valid ARM instructions, other times valid
        kernel-space addresses, so it seems clear this is an addressing
        problem. Also interesting is that I see just one or a few
        corrupted words, which implies the corruption is in the
        interface between DCACHE and the processor rather than an
        errant fetch of a line into DCACHE from memory (otherwise the
        entire DCACHE line would hold corrupt data). You can see a
        sample of the failure output here: <a
href="https://github.com/horshack-dpreview/ipq8065-sqrbug/blob/master/SampleFailures.txt"
          moz-do-not-send="true">
https://github.com/horshack-dpreview/ipq8065-sqrbug/blob/master/SampleFailures.txt</a><br>
      </div>
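      <div style="font-family: Calibri, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        <br>
        A minimal Python sketch of why the all-zero/OR variant is
        useful. It is only an illustration, not the ARM assembly from
        the repository: with an all-zero dataset the OR accumulator must
        stay zero, so any nonzero word is exactly the data that was
        loaded incorrectly. The function names and the sample value are
        hypothetical.<br>
      </div>
      <pre>
```python
def or_fold(words):
    """OR every loaded word into an accumulator, like the reduced test loop."""
    acc = 0
    for word in words:
        acc |= word
    return acc


def corrupted_words(words):
    """Return (index, hex value) for each word that is not the expected zero."""
    return [(i, f"0x{w:08x}") for i, w in enumerate(words) if w != 0]


# Hypothetical loaded data: all zeros except one word that came back wrong.
loaded = [0] * 8
loaded[5] = 0xE59F0004  # made-up value for illustration only
print(hex(or_fold(loaded)), corrupted_words(loaded))
```
      </pre>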
      <div style="font-family: Calibri, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        <br>
      </div>
      <div style="font-family: Calibri, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        Finally, to rule out kernel code (such as an interrupt routine)
        running and corrupting registers or memory, I ported the test to
        a kernel module and ran the logic within a local_irq_disable()
        block, which disables interrupts, and with them preemption, on
        the core. It still fails. I created a separate repository for
        the kernel module version here:
        <a
          href="https://github.com/horshack-dpreview/ipq8065-sqrbug-driver"
          moz-do-not-send="true">https://github.com/horshack-dpreview/ipq8065-sqrbug-driver</a><br>
      </div>
      <div style="font-family: Calibri, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        <br>
      </div>
      <div>
        <hr tabindex="-1" style="display:inline-block; width:98%">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt"
            face="Calibri, sans-serif" color="#000000"><b>From:</b>
            Horshack <a class="moz-txt-link-rfc2396E" href="mailto:horshack@live.com">&lt;horshack@live.com&gt;</a><br>
            <b>Sent:</b> Tuesday, March 24, 2020 9:25 PM<br>
            <b>To:</b> Sebastian Gottschall
            <a class="moz-txt-link-rfc2396E" href="mailto:s.gottschall@dd-wrt.com">&lt;s.gottschall@dd-wrt.com&gt;</a>; <a class="moz-txt-link-abbreviated" href="mailto:dropbear@ucc.asn.au">dropbear@ucc.asn.au</a>
            <a class="moz-txt-link-rfc2396E" href="mailto:dropbear@ucc.asn.au">&lt;dropbear@ucc.asn.au&gt;</a><br>
            <b>Subject:</b> Re: SSH key exchange fails 30-70% of the
            time on Netgear X4S R7800</font>
          <div> </div>
        </div>
        <div dir="ltr">
          <div style="font-family:Calibri,Helvetica,sans-serif;
            font-size:12pt; color:rgb(0,0,0)">
            I excluded context switches as a possible culprit by looping
            until a corruption happened during which no context switches
            occurred while the test was running. That is, at the start
            of the test I saved the number of involuntary/voluntary
            context switches from /proc/&lt;pid&gt;/status, then checked
            those counts again after the failure. If they were different
            I restarted the test and kept looping until a failure
            happened in which the context switch counts were the same.<br>
          </div>
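          <div style="font-family:Calibri,Helvetica,sans-serif;
            font-size:12pt; color:rgb(0,0,0)">
            <br>
            The check described above can be sketched roughly as follows
            in Python. This is not the code from the test utility:
            run_test is a hypothetical callable standing in for one run
            of the squaring test, and the helper names are made up. The
            two field names are the ones Linux reports in
            /proc/&lt;pid&gt;/status.<br>
          </div>
          <pre>
```python
def parse_ctx_switches(status_text):
    """Return (voluntary, involuntary) counts from /proc/PID/status text."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counts[key] = int(value)
    return counts["voluntary_ctxt_switches"], counts["nonvoluntary_ctxt_switches"]


def read_ctx_switches(pid="self"):
    """Read the current counts for a process (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        return parse_ctx_switches(status.read())


def run_until_clean_failure(run_test, read_counts=read_ctx_switches,
                            max_attempts=1000):
    """Repeat run_test() until it fails with unchanged switch counts.

    run_test is a hypothetical callable that returns True on a correct result.
    Returns the attempt number of the clean failure, or None if none was seen.
    """
    for attempt in range(1, max_attempts + 1):
        before = read_counts()
        passed = run_test()
        after = read_counts()
        # Only a failure with identical counts rules out a context switch.
        if not passed and before == after:
            return attempt
    return None
```
          </pre>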
          <div>
            <div style="font-family:Calibri,Helvetica,sans-serif;
              font-size:12pt; color:rgb(0,0,0)">
              <br>
            </div>
            <hr tabindex="-1" style="display:inline-block; width:98%">
            <div id="x_divRplyFwdMsg" dir="ltr"><font
                style="font-size:11pt" face="Calibri, sans-serif"
                color="#000000"><b>From:</b>
                <a class="moz-txt-link-abbreviated" href="mailto:dropbear-bounces+horshack=live.com@ucc.asn.au">dropbear-bounces+horshack=live.com@ucc.asn.au</a>
                <a class="moz-txt-link-rfc2396E" href="mailto:dropbear-bounces+horshack=live.com@ucc.asn.au">&lt;dropbear-bounces+horshack=live.com@ucc.asn.au&gt;</a> on
                behalf of Sebastian Gottschall
                <a class="moz-txt-link-rfc2396E" href="mailto:s.gottschall@dd-wrt.com">&lt;s.gottschall@dd-wrt.com&gt;</a><br>
                <b>Sent:</b> Tuesday, March 24, 2020 9:13 PM<br>
                <b>To:</b> <a class="moz-txt-link-abbreviated" href="mailto:dropbear@ucc.asn.au">dropbear@ucc.asn.au</a>
                <a class="moz-txt-link-rfc2396E" href="mailto:dropbear@ucc.asn.au">&lt;dropbear@ucc.asn.au&gt;</a><br>
                <b>Subject:</b> Re: SSH key exchange fails 30-70% of the
                time on Netgear X4S R7800</font>
              <div> </div>
            </div>
            <div>
              <div class="x_x_moz-text-html" lang="x-unicode">
                <p style="margin-top: 0px; margin-bottom: 0px;">If the
                  corruption is caused by a context switch, the problem
                  may be in the kernel.<br>
                  Try disabling "CONFIG_KERNEL_MODE_NEON" in the kernel
                  config. This will disable some of the kernel's crypto
                  assembly code.<br>
                </p>
                <div class="x_x_moz-cite-prefix">On 24.03.2020 at 16:11,
                  Matt Johnston wrote:<br>
                </div>
                <blockquote type="cite">
                  <div class="">Good work narrowing down a test case
                    there.</div>
                  <div class="">That's an interesting finding - I guess
                    it might be worth posting on OpenWRT lists/forum to
                    try to find other testers.</div>
                  <div class="">Could it be power related, if the tight
                    multiplication loop is stressing it somehow? It
                    doesn't seem to be using the Neon instructions for
                    anything apart from loads/stores though - is there
                    something that the compiler should be doing when
                    mixing Neon and non-Neon operations?</div>
                  <div class=""><br class="">
                  </div>
                  <div class="">
                    <div>Cheers,</div>
                    <div>Matt</div>
                  </div>
                  <div><br class="">
                  </div>
                  <div>(Your emails got held up being over 100kB, I've
                    trimmed the reply below and let them through.
                    Apologies to everyone for the stale old one that got
                    let through with them just now, I wasn't looking
                    closely)</div>
                  <div><br class="">
                    <blockquote type="cite" class="">
                      <div class="">On Tue 24/3/2020, at 11:23 am,
                        Horshack &lt;<a
                          href="mailto:horshack@live.com" class=""
                          moz-do-not-send="true">horshack@live.com</a>&gt;
                        wrote:</div>
                      <br class="x_x_Apple-interchange-newline">
                      <div class="">
                        <div class="" style="font-style:normal;
                          font-variant-caps:normal; font-weight:normal;
                          letter-spacing:normal; text-align:start;
                          text-indent:0px; text-transform:none;
                          white-space:normal; word-spacing:0px;
                          text-decoration:none;
                          font-family:Calibri,Helvetica,sans-serif;
                          font-size:12pt">
                          I was able to isolate the issue to just a
                          handful of assembly instructions within
                          fast_s_mp_sqr(), related to the squaring loop.
                          I broke that code out into a separate utility
                          that reproduces the issue within a few
                          seconds. The failure is somewhat sensitive to
                          the data pattern and very sensitive to timing,
                          indicating a likely memory/data path issue
                          within my particular router. I'm guessing it's
                          the IPQ8065 and not the SDRAM because I can
                          get it to fail with a tiny data set that
                          easily fits within DCACHE. I can alter the
                          frequency of the failure with a single ARM
                          memory barrier instruction, which at first
                          implied a superscalar data ordering condition,
                          but the memory barrier also alters the timing
                          through the DCACHE, so that is likely the
                          effect it's having. I was able to exclude
                          VFP/Neon register corruption as the cause with
                          some test code. I also excluded any context
                          switch-specific issue by measuring the number
                          of context switches in
                          /proc/&lt;pid&gt;/status and catching a
                          failure where no switches had occurred. I also
                          modified the affinity so the utility runs on
                          just one processor to rule out a specific core
                          having the issue.<br class="">
                        </div>
                        <div class="" style="font-style:normal;
                          font-variant-caps:normal; font-weight:normal;
                          letter-spacing:normal; text-align:start;
                          text-indent:0px; text-transform:none;
                          white-space:normal; word-spacing:0px;
                          text-decoration:none;
                          font-family:Calibri,Helvetica,sans-serif;
                          font-size:12pt">
                          <br class="">
                        </div>
                        <div class="" style="font-style:normal;
                          font-variant-caps:normal; font-weight:normal;
                          letter-spacing:normal; text-align:start;
                          text-indent:0px; text-transform:none;
                          white-space:normal; word-spacing:0px;
                          text-decoration:none;
                          font-family:Calibri,Helvetica,sans-serif;
                          font-size:12pt">
                          I put the source and binary of my utility on
                          GitHub. If anyone on this mailing list has
                          this model of router, could you give it a try?
                          You only need ipq8065-sqrbug (the binary) and
                          run-ipq8065-sqrbug.sh (the script). Here's the
                          link to the repository:<span
                            class="x_x_Apple-converted-space"> </span><a
href="https://github.com/horshack-dpreview/ipq8065-sqrbug" class=""
                            moz-do-not-send="true">https://github.com/horshack-dpreview/ipq8065-sqrbug</a><br
                            class="">
                        </div>
                        <div class="" style="font-style:normal;
                          font-variant-caps:normal; font-weight:normal;
                          letter-spacing:normal; text-align:start;
                          text-indent:0px; text-transform:none;
                          white-space:normal; word-spacing:0px;
                          text-decoration:none;
                          font-family:Calibri,Helvetica,sans-serif;
                          font-size:12pt">
                          <br class="">
                        </div>
                        <div class="" style="font-family:Helvetica;
                          font-size:13px; font-style:normal;
                          font-variant-caps:normal; font-weight:normal;
                          letter-spacing:normal; text-align:start;
                          text-indent:0px; text-transform:none;
                          white-space:normal; word-spacing:0px;
                          text-decoration:none">
                          <div class=""
                            style="font-family:Calibri,Helvetica,sans-serif;
                            font-size:12pt"><br class="">
                          </div>
                          <hr tabindex="-1" class=""
                            style="display:inline-block;
                            width:620.328125px">
                          <div id="x_x_divRplyFwdMsg" dir="ltr" class=""><font
                              class="" style="font-size:11pt"
                              face="Calibri, sans-serif"><b class="">From:</b><span
                                class="x_x_Apple-converted-space"> </span>Horshack
                              ‪‬ &lt;<a href="mailto:horshack@live.com"
                                class="" moz-do-not-send="true">horshack@live.com</a>&gt;<br
                                class="">
                              <b class="">Sent:</b><span
                                class="x_x_Apple-converted-space"> </span>Saturday,
                              March 21, 2020 7:54 AM<br class="">
                              <b class="">To:</b><span
                                class="x_x_Apple-converted-space"> </span><a
                                href="mailto:dropbear@ucc.asn.au"
                                class="" moz-do-not-send="true">dropbear@ucc.asn.au</a><span
                                class="x_x_Apple-converted-space"> </span>&lt;<a
                                href="mailto:dropbear@ucc.asn.au"
                                class="" moz-do-not-send="true">dropbear@ucc.asn.au</a>&gt;<br
                                class="">
                              <b class="">Subject:</b><span
                                class="x_x_Apple-converted-space"> </span>SSH
                              key exchange fails 30-70% of the time on
                              Netgear X4S R7800</font>
                            <div class=""> </div>
                          </div>
                          <div dir="auto" class="">
                            <div dir="ltr" class="">Including mailing
                              list for my last two messages below...<br
                                class="">
                              <div dir="ltr" class=""><br class="">
                                Begin forwarded message:<br class="">
                                <br class="">
                              </div>
                              <blockquote type="cite" class="">
                                <div dir="ltr" class=""><b class="">From:</b><span
                                    class="x_x_Apple-converted-space"> </span>Horshack
                                  ‪‬ &lt;<a
                                    href="mailto:horshack@live.com"
                                    class="" moz-do-not-send="true">horshack@live.com</a>&gt;<br
                                    class="">
                                  <b class="">Date:</b><span
                                    class="x_x_Apple-converted-space"> </span>March
                                  21, 2020 at 7:35:18 AM PDT<br class="">
                                  <b class="">To:</b><span
                                    class="x_x_Apple-converted-space"> </span>Matt
                                  Johnston &lt;<a
                                    href="mailto:matt@ucc.asn.au"
                                    class="" moz-do-not-send="true">matt@ucc.asn.au</a>&gt;<br
                                    class="">
                                  <b class="">Cc:</b><span
                                    class="x_x_Apple-converted-space"> </span>"<a
                                    href="mailto:dropbear@ucc.asn.au"
                                    class="" moz-do-not-send="true">dropbear@ucc.asn.au</a>"
                                  &lt;<a
                                    href="mailto:dropbear@ucc.asn.au"
                                    class="" moz-do-not-send="true">dropbear@ucc.asn.au</a>&gt;<br
                                    class="">
                                  <b class="">Subject:</b><span
                                    class="x_x_Apple-converted-space"> </span><b
                                    class="">Re:  SSH key exchange fails
                                    30-70% of the time on Netgear X4S
                                    R7800</b><br class="">
                                  <br class="">
                                </div>
                              </blockquote>
                              <blockquote type="cite" class="">
                                <div dir="ltr" class="">
                                  <div class=""
                                    style="font-family:Calibri,Helvetica,sans-serif;
                                    font-size:12pt">Disassembly of
                                    fast_s_mp_sqr() and other libtommath
                                    functions reveals gcc is utilizing
                                    the ARM NEON SIMD instructions and
                                    registers for calculations involving
                                    libtommath's mp_word scalar. Based
                                    on the 64-bit word corruption I see,
                                    I'm guessing the SIMD registers
                                    aren't being preserved/restored
                                    properly somewhere, probably during
                                    a context switch, specifically
                                    s16–s31 (d8–d15, q4–q7), which AAPCS
                                    says must be preserved and which I
                                    see being used in the disassembly of
                                    fast_s_mp_sqr(). I'll write some
                                    test code later today to see if this
                                    is the case, and if so, try to track
                                    down where and why the registers
                                    aren't being preserved.<br class="">
                                  </div>
                                  <div class="">
                                    <div class=""
                                      style="font-family:Calibri,Helvetica,sans-serif;
                                      font-size:12pt"><br class="">
                                    </div>
                                    <hr tabindex="-1" class=""
                                      style="display:inline-block;
                                      width:610.53125px">
                                    <div id="x_x_x_divRplyFwdMsg"
                                      dir="ltr" class=""><font class=""
                                        style="font-size:11pt"
                                        face="Calibri, sans-serif"><b
                                          class="">From:</b><span
                                          class="x_x_Apple-converted-space"> </span>Horshack
                                        ‪‬ &lt;<a
                                          href="mailto:horshack@live.com"
                                          class=""
                                          moz-do-not-send="true">horshack@live.com</a>&gt;<br
                                          class="">
                                        <b class="">Sent:</b><span
                                          class="x_x_Apple-converted-space"> </span>Saturday,
                                        March 21, 2020 1:11 AM<br
                                          class="">
                                        <b class="">To:</b><span
                                          class="x_x_Apple-converted-space"> </span>Matt
                                        Johnston &lt;<a
                                          href="mailto:matt@ucc.asn.au"
                                          class=""
                                          moz-do-not-send="true">matt@ucc.asn.au</a>&gt;<br
                                          class="">
                                        <b class="">Cc:</b><span
                                          class="x_x_Apple-converted-space"> </span><a
href="mailto:dropbear@ucc.asn.au" class="" moz-do-not-send="true">dropbear@ucc.asn.au</a>
                                        &lt;<a
                                          href="mailto:dropbear@ucc.asn.au"
                                          class=""
                                          moz-do-not-send="true">dropbear@ucc.asn.au</a>&gt;<br
                                          class="">
                                        <b class="">Subject:</b><span
                                          class="x_x_Apple-converted-space"> </span>Re:
                                        SSH key exchange fails 30-70% of
                                        the time on Netgear X4S R7800</font>
                                      <div class=""> </div>
                                    </div>
                                    <div dir="ltr" class="">
                                      <div class=""
                                        style="font-family:Calibri,Helvetica,sans-serif;
                                        font-size:12pt">
                                        <div class=""
                                          style="font-family:Calibri,Helvetica,sans-serif;
                                          font-size:12pt">I have one of
                                          the failure paths isolated
                                          down to a single corrupt
                                          64-bit word in memory, which
                                          required a significant amount
                                          of code instrumentation to
                                          achieve. I implemented a code
                                          execution history buffer that
                                          gets filled at various
                                          checkpoints within
                                          s_mp_exptmod() and some of the
                                          modules called by it. To
                                          facilitate this history
                                          mechanism I packaged all of
                                          s_mp_exptmod()'s local
                                          variables inside a structure.
                                          Each history entry saves the
                                          local scalar vars plus crc32's
                                          of all the mp_int data
                                          structures, with a separate
                                          crc32 of the mp_int.dp payload
                                          (data). When a failure occurs,
                                          i.e. one or more of the three
                                          back-to-back debug invocations
                                          of s_mp_exptmod() yields a
                                          mismatching signed key result,
                                          I dump out the history
                                          elements for each of the
                                          invocations to determine the
                                          first code checkpoint where
                                          the failing invocation
                                          departed from the known
                                          correct invocation.<br class="">
                                        </div>
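                                        <div class=""
                                          style="font-family:Calibri,Helvetica,sans-serif;
                                          font-size:12pt"><br class="">
                                          The history comparison
                                          described above might look
                                          roughly like this in Python.
                                          It is a sketch of the idea
                                          only, not the instrumented C
                                          code: the record layout, the
                                          checkpoint labels and the
                                          function names are all
                                          assumptions made for
                                          illustration.<br class="">
                                        </div>
                                        <pre>
```python
import zlib


def checkpoint(history, label, scalars, payload):
    """Append one record: a label, the scalar locals, and a CRC32 of the payload."""
    history.append((label, tuple(scalars), zlib.crc32(payload)))


def first_divergence(good, bad):
    """Return the index of the first differing record, or None if identical."""
    for index, (good_rec, bad_rec) in enumerate(zip(good, bad)):
        if good_rec != bad_rec:
            return index
    if len(good) != len(bad):
        return min(len(good), len(bad))
    return None


# Two hypothetical runs over the same input; the second sees one corrupt byte.
good_run, bad_run = [], []
checkpoint(good_run, "after_setup", [4, 0], bytes(16))
checkpoint(bad_run, "after_setup", [4, 0], bytes(16))
checkpoint(good_run, "after_sqr", [4, 1], bytes(16))
checkpoint(bad_run, "after_sqr", [4, 1], bytes(15) + b"\x01")
print(first_divergence(good_run, bad_run))
```
                                        </pre>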
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </blockquote>
                            </div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                    <br class="">
                  </div>
                  <div>*snipped*</div>
                  <div><br class="">
                  </div>
                  <br class="">
                </blockquote>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
  </body>
</html>