delay question

Rob Landley rob at landley.net
Thu Aug 25 22:28:47 WST 2011


On 08/24/2011 12:00 PM, Guylhem Aznar wrote:
> Hello
> 
> On Wed, Aug 24, 2011 at 07:46, Matt Johnston <matt at ucc.asn.au> wrote:
>> A slow CPU could take 5 seconds to do the big-number maths
>> for the key exchange. Is CPU usage high, and what processor
>> is it using?
> 
> CPU is 85% idle, and is:
> 
> # cat /proc/cpuinfo
> system type             : Ikanos Fusiv Core
> processor               : 0
> cpu model               : MIPS 24K V4.12
> BogoMIPS                : 330.75
...
> ~guylhem/sbin/strace -f -ff -r ~guylhem/bin/chroot ~guylhem
> /bin/dropbear -b /etc/dropbear/banner.txt -d
> /etc/dropbear/sshd_dsa_host_key -r /etc/dropbear/sshd_rsa_host_key -m
> 2 -F
> I see (xxx is the client ipv6):
> 
> [pid 12140]      6.115917 accept(3, {sa_family=AF_INET6,
> sin6_port=htons(55563), inet_pton(AF_INET6, "xxxxxxxxxxxxx",
> &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 5
> [pid 12140]      0.001617 pipe([6, 7])  = 6
> [pid 12140]      0.001625 fork(Process 12171 attached
> )        = 12171
> [pid 12140]      0.006739 close(7)      = 0
> [pid 12140]      0.001669 close(5)      = 0
> [pid 12140]      0.000427 _newselect(7, [3 6], NULL, NULL, NULL) = 1 (in [6])
> [pid 12140]      3.723798 close(6)      = 0
> 
> The 6 seconds are the delay between starting the strace and the ssh
> client.

You have a 6-second delay starting the ssh client, before it actually
makes a system call for strace to report?

Are you, perchance, running on NFS?

> The 3.72 second in the close seems to be the delay I am
> experiencing (tested several times)

That delay is it waiting on a pipe to the child process.  It might be
nice to know which child process that is.
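
For reference, here's a minimal sketch (not dropbear's actual source) of
the accept/pipe/fork/select pattern the trace shows; the sleep in the
child just stands in for whatever the per-connection process is doing:

/* Minimal sketch (not dropbear's actual source) of the parent-side
 * pattern visible in the trace: after accept(), the parent makes a
 * pipe, forks a per-connection child, and sits in select() on the
 * pipe's read end.  The pipe only becomes readable (EOF) when the
 * child exits or closes its write end, so the 3.7 seconds that
 * strace -r prints next to the parent's close() is really time the
 * parent spent blocked in select() waiting on the child. */
#include <stdio.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int pipefd[2];

    if (pipe(pipefd) < 0) {
        perror("pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }

    if (pid == 0) {
        /* Child: keeps the write end open while it works (the sleep
         * is a stand-in for key exchange / session setup). */
        close(pipefd[0]);
        sleep(4);
        _exit(0);   /* write end closes, parent's select() wakes up */
    }

    /* Parent: close the write end so EOF on the read end means
     * "child is gone", then block in select() waiting for that. */
    close(pipefd[1]);

    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(pipefd[0], &rfds);
    select(pipefd[0] + 1, &rfds, NULL, NULL, NULL);

    close(pipefd[0]);   /* the close() the trace attributes the delay to */
    waitpid(pid, NULL, 0);
    return 0;
}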

> 
> There is no difference with ssh -4 or ssh -6.
> 
> With tcpdump, I see a packet containing diffie-hellman-group-1 sent by
> the dropbear server, 2 packets returned by the client, one being sent
> by the dropbear server ... then the next one being sent approx 4
> seconds later. I can send you a .cap if you want
> 
> I am not sure what to make of these results. The "close" seems to be
> taking a lot of time.

On NFS, close() is a horrible synchronous operation with a built-in
flush.  Running on NFS is pretty much the only time close() can return a
nonzero error code, which is funny because what are you supposed to DO
about it?
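
If you want to see what that looks like, here's a contrived sketch (the
/mnt/nfs path is made up) showing about all a program can do: check
close()'s return value and report it.

/* Contrived sketch of the point above: with NFS close-to-open
 * consistency, close() may flush dirty data synchronously and can
 * fail (e.g. EIO), long after write() already reported success.
 * About all the caller can do is check and complain. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* "/mnt/nfs/testfile" is a made-up path for illustration. */
    int fd = open("/mnt/nfs/testfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    const char buf[] = "hello\n";
    if (write(fd, buf, sizeof(buf) - 1) < 0)
        perror("write");      /* may succeed even if the server is unhappy */

    if (close(fd) != 0)       /* on NFS this can block on the flush and fail */
        fprintf(stderr, "close failed: %s\n", strerror(errno));

    return 0;
}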

Don't use NFS.  Use v9fs with diod, or network block devices or ANYTHING
else really.  Even Samba/cifs is better than NFS.  Random performance
suckage with NFS is more or less normal.  (Random failures with NFS are
more or less normal.)

Rob

