<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-text-html" lang="x-unicode">
<p>If the corruption is caused by a context switch, the kernel may be
at fault.<br>
Try disabling "CONFIG_KERNEL_MODE_NEON" in the kernel config.<br>
This will disable some of the kernel's crypto
assembly code.<br>
</p>
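<p>A minimal sketch for checking how this option is set on the
running kernel, assuming the kernel exposes its configuration at
/proc/config.gz (which is only present when it was built with
CONFIG_IKCONFIG_PROC). The function names are illustrative.</p>
<pre>
```python
import gzip


def kernel_option_state(config_text, option):
    """Return the value of a kernel config option from .config-style text.

    Returns the string after '=' (for example 'y' or 'm'), 'n' if the
    option is explicitly marked "is not set", or None if it is absent.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


def running_kernel_option(option, path="/proc/config.gz"):
    # Assumption: the running kernel provides /proc/config.gz.
    with gzip.open(path, "rt") as f:
        return kernel_option_state(f.read(), option)
```
</pre>
<p>On a system that provides /proc/config.gz,
running_kernel_option("CONFIG_KERNEL_MODE_NEON") would report the
current setting.</p>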
<div class="moz-cite-prefix">On 24.03.2020 at 16:11, Matt
Johnston wrote:<br>
</div>
<blockquote type="cite"
cite="mid:1E0D44CA-5341-4C3A-924E-8C6BF850B64A@ucc.asn.au">
<div class="">Good work narrowing down a test case there.</div>
<div class="">That's an interesting finding - I guess it might
be worth posting on the OpenWRT lists/forum to try to find other
testers.</div>
<div class="">Could it be power related if the tight
multiplication loop is stressing it somehow? It doesn't seem
to be using Neon instructions for anything apart from
loads/stores though - is there something that the compiler
should be doing when mixing Neon and non-Neon operations?</div>
<div class=""><br class="">
</div>
<div class="">
<div>Cheers,</div>
<div>Matt</div>
</div>
<div><br class="">
</div>
<div>(Your emails got held up being over 100kB, I've trimmed the
reply below and let them through. Apologies to everyone for
the stale old one that got let through with them just now, I
wasn't looking closely)</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">On Tue 24/3/2020, at 11:23 am, Horshack
&lt;<a href="mailto:horshack@live.com" class="">horshack@live.com</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div style="font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; text-decoration: none;
font-family: Calibri, Helvetica, sans-serif; font-size:
12pt;" class="">I was able to isolate the issue to just
a handful of assembly instructions within
fast_s_mp_sqr(), related to the squaring loop. I broke
that code out into a separate utility that reproduces
the issue within a few seconds. The failure is somewhat
sensitive to the data pattern and very sensitive to
timing, which suggests a memory/data path issue
within my particular router. I'm guessing it's the
IPQ8065 and not the SDRAM because I can easily get it to
fail with a tiny data set that fits within the DCACHE. I
can alter the frequency of the failure with a single ARM
memory barrier instruction. At first that implied a
superscalar data ordering condition, but the memory
barrier also alters the timing through the DCACHE, so
that is likely the effect it's having. I was able to
exclude VFP/Neon register corruption as the cause
with some test code. I also excluded any
context-switch-specific issue by measuring the number of context
switches in /proc/&lt;pid&gt;/status and catching a
failure where no switches had occurred. I also modified
the affinity so the utility runs on just one processor,
to rule out a specific core having the issue.<br
class="">
</div>
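<div class="">A minimal Python sketch of the two checks described
above (counting context switches via /proc and pinning to one
processor), assuming a Linux /proc filesystem. The function names and
the workload callback are hypothetical and are not taken from the
ipq8065-sqrbug utility.</div>
<pre>
```python
import os


def parse_ctxt_switches(status_text):
    """Return (voluntary, nonvoluntary) context-switch counts parsed
    from the text of /proc/PID/status."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counts[key] = int(value.strip())
    return (counts["voluntary_ctxt_switches"],
            counts["nonvoluntary_ctxt_switches"])


def read_ctxt_switches(pid="self"):
    with open(f"/proc/{pid}/status") as f:
        return parse_ctxt_switches(f.read())


def run_checked(workload, expected, cpu=0):
    """Pin this process to one CPU, run workload() once, and return
    (result_matched, context_switches_during_run)."""
    os.sched_setaffinity(0, {cpu})  # Linux-only call
    before = read_ctxt_switches()
    result = workload()
    after = read_ctxt_switches()
    return result == expected, sum(after) - sum(before)
```
</pre>
<div class="">A run that returns (False, 0) would correspond to the
case mentioned above: a wrong result with no context switch in
between.</div>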
<div style="font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; text-decoration: none;
font-family: Calibri, Helvetica, sans-serif; font-size:
12pt;" class=""><br class="">
</div>
<div style="font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; text-decoration: none;
font-family: Calibri, Helvetica, sans-serif; font-size:
12pt;" class="">I put the source and binary of my
utility on GitHub. If anyone on this mailing list has
this router model, could you give it a try? You
only need ipq8065-sqrbug (the binary) and
run-ipq8065-sqrbug.sh (the script). Here's the link to the
repository:<span class="Apple-converted-space"> </span><a
href="https://github.com/horshack-dpreview/ipq8065-sqrbug" class="">https://github.com/horshack-dpreview/ipq8065-sqrbug</a><br
class="">
</div>
<div style="font-style: normal; font-variant-caps: normal;
font-weight: normal; letter-spacing: normal; text-align:
start; text-indent: 0px; text-transform: none;
white-space: normal; word-spacing: 0px;
-webkit-text-stroke-width: 0px; text-decoration: none;
font-family: Calibri, Helvetica, sans-serif; font-size:
12pt;" class=""><br class="">
</div>
<div style="caret-color: rgb(0, 0, 0); font-family:
Helvetica; font-size: 13px; font-style: normal;
font-variant-caps: normal; font-weight: normal;
letter-spacing: normal; text-align: start; text-indent:
0px; text-transform: none; white-space: normal;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
text-decoration: none;" class="">
<div style="font-family: Calibri, Helvetica, sans-serif;
font-size: 12pt;" class=""><br class="">
</div>
<hr tabindex="-1" style="display: inline-block; width:
620.328125px;" class="">
<div id="divRplyFwdMsg" dir="ltr" class=""><font
style="font-size: 11pt;" class="" face="Calibri,
sans-serif"><b class="">From:</b><span
class="Apple-converted-space"> </span>Horshack
&lt;<a href="mailto:horshack@live.com" class="">horshack@live.com</a>&gt;<br
class="">
<b class="">Sent:</b><span
class="Apple-converted-space"> </span>Saturday,
March 21, 2020 7:54 AM<br class="">
<b class="">To:</b><span
class="Apple-converted-space"> </span><a
href="mailto:dropbear@ucc.asn.au" class="">dropbear@ucc.asn.au</a><span
class="Apple-converted-space"> </span>&lt;<a
href="mailto:dropbear@ucc.asn.au" class="">dropbear@ucc.asn.au</a>&gt;<br
class="">
<b class="">Subject:</b><span
class="Apple-converted-space"> </span>SSH key
exchange fails 30-70% of the time on Netgear X4S
R7800</font>
<div class=""> </div>
</div>
<div dir="auto" class="">
<div dir="ltr" class="">Including mailing list for my
last two messages below...<br class="">
<div dir="ltr" class=""><br class="">
Begin forwarded message:<br class="">
<br class="">
</div>
<blockquote type="cite" class="">
<div dir="ltr" class=""><b class="">From:</b><span
class="Apple-converted-space"> </span>Horshack
&lt;<a href="mailto:horshack@live.com"
class="">horshack@live.com</a>&gt;<br class="">
<b class="">Date:</b><span
class="Apple-converted-space"> </span>March
21, 2020 at 7:35:18 AM PDT<br class="">
<b class="">To:</b><span
class="Apple-converted-space"> </span>Matt
Johnston &lt;<a href="mailto:matt@ucc.asn.au"
class="">matt@ucc.asn.au</a>&gt;<br class="">
<b class="">Cc:</b><span
class="Apple-converted-space"> </span>"<a
href="mailto:dropbear@ucc.asn.au" class="">dropbear@ucc.asn.au</a>"
&lt;<a href="mailto:dropbear@ucc.asn.au"
class="">dropbear@ucc.asn.au</a>&gt;<br
class="">
<b class="">Subject:</b><span
class="Apple-converted-space"> </span><b
class="">Re: SSH key exchange fails 30-70% of
the time on Netgear X4S R7800</b><br class="">
<br class="">
</div>
</blockquote>
<blockquote type="cite" class="">
<div dir="ltr" class="">
<div style="font-family: Calibri, Helvetica,
sans-serif; font-size: 12pt;" class="">Disassembly
of fast_s_mp_sqr() and other libtommath
functions reveals that gcc is using the ARM
NEON SIMD instructions and registers for
calculations involving libtommath's
mp_word scalar. Based on the 64-bit word
corruption I see, I'm guessing the SIMD
registers aren't being preserved/restored
properly somewhere, probably during a context
switch, specifically s16–s31 (d8–d15, q4–q7),
which the AAPCS says must be preserved and which I
see being used in the disassembly of
fast_s_mp_sqr(). I'll write some test code
later today to see if this is the case and, if
so, try to track down where and why the
registers aren't being preserved.<br class="">
</div>
<div class="">
<div style="font-family: Calibri, Helvetica,
sans-serif; font-size: 12pt;" class=""><br
class="">
</div>
<hr tabindex="-1" style="display:
inline-block; width: 610.53125px;" class="">
<div id="x_divRplyFwdMsg" dir="ltr" class=""><font
style="font-size: 11pt;" class=""
face="Calibri, sans-serif"><b class="">From:</b><span
class="Apple-converted-space"> </span>Horshack
&lt;<a href="mailto:horshack@live.com"
class="">horshack@live.com</a>&gt;<br
class="">
<b class="">Sent:</b><span
class="Apple-converted-space"> </span>Saturday,
March 21, 2020 1:11 AM<br class="">
<b class="">To:</b><span
class="Apple-converted-space"> </span>Matt
Johnston &lt;<a
href="mailto:matt@ucc.asn.au" class="">matt@ucc.asn.au</a>&gt;<br
class="">
<b class="">Cc:</b><span
class="Apple-converted-space"> </span><a
href="mailto:dropbear@ucc.asn.au"
class="">dropbear@ucc.asn.au</a> &lt;<a
href="mailto:dropbear@ucc.asn.au"
class="">dropbear@ucc.asn.au</a>&gt;<br
class="">
<b class="">Subject:</b><span
class="Apple-converted-space"> </span>Re:
SSH key exchange fails 30-70% of the time
on Netgear X4S R7800</font>
<div class=""> </div>
</div>
<div dir="ltr" class="">
<div style="font-family: Calibri, Helvetica,
sans-serif; font-size: 12pt;" class="">
<div style="font-family: Calibri,
Helvetica, sans-serif; font-size: 12pt;"
class="">I have one of the failure paths
isolated down to a single corrupt 64-bit
word in memory, which required a
significant amount of code
instrumentation to achieve. I
implemented a code execution history
buffer that gets filled at various
checkpoints within s_mp_exptmod() and
some of the modules called by it. To
facilitate this history mechanism I
packaged all of s_mp_exptmod()'s local
variables inside a structure. Each history
entry saves the local scalar variables
along with CRC32s of all the mp_int
data structures and a separate CRC32 of
the mp_int.dp payload (data). When a
failure occurs, i.e. one or more of the
three back-to-back debug invocations of
s_mp_exptmod() yields a mismatching signed
key result, I dump out the history
elements for each of the invocations to
determine the first code checkpoint
where the failing invocation departed from
the known correct invocation.<br
class="">
</div>
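<div style="font-family: Calibri,
Helvetica, sans-serif; font-size: 12pt;"
class="">A rough Python sketch of the checkpoint-history idea
described above, not the actual C instrumentation. It assumes each
big-integer digit fits in 64 bits, and the names checkpoint() and
first_divergence() are hypothetical.</div>
<pre>
```python
import zlib


def checkpoint(history, label, scalars, digits):
    """Append one history record: a checkpoint label, a copy of the
    scalar locals, and a CRC32 of the digit payload."""
    # Assumption: every digit is a non-negative integer below 2**64.
    payload = b"".join(d.to_bytes(8, "little") for d in digits)
    history.append((label, dict(scalars), zlib.crc32(payload)))


def first_divergence(good, bad):
    """Return the index of the first record where two histories
    differ, or None if they are identical."""
    for i, (a, b) in enumerate(zip(good, bad)):
        if a != b:
            return i
    if len(good) != len(bad):
        return min(len(good), len(bad))
    return None
```
</pre>
<div style="font-family: Calibri,
Helvetica, sans-serif; font-size: 12pt;"
class="">Recording a CRC32 instead of the full payload keeps each
history entry small while still making it very likely that a corrupted
word changes the recorded value.</div>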
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
<br class="">
</div>
<div>*snipped*</div>
<div><br class="">
</div>
<br class="">
</blockquote>
</div>
</body>
</html>