Dropbear processes getting into uninterruptible I/O process "D" state

Matt Johnston matt at ucc.asn.au
Tue Oct 15 22:30:55 AWST 2019


Hi Binny,

I think regardless of what Dropbear's doing with pipes (closed sessions etc), there is probably something wrong with the Linux kernel.
As far as I know userspace can't trigger D state even intentionally (I'd be interested if anyone knows a way though).
-K is unrelated; it just sends some SSH traffic at a certain interval.

If you can reproduce it, "echo t > /proc/sysrq-trigger" might be helpful - you can look at "dmesg" for stack traces of all threads and see what the stuck processes are doing in the kernel.
Was there anything unusual in dmesg on the problematic machine?
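
Without strace on the device, a first look at which processes are stuck, and roughly where, can come from /proc alone. Below is a minimal Python sketch, assuming a Python interpreter is available on the target (otherwise the same files can be read with cat). The function names are illustrative only, and /proc/<pid>/wchan may be empty or "0" depending on how the kernel was built.

```python
import os


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar].strip())
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def read_text(path):
    """Return the stripped file contents, or '' if the file cannot be read."""
    try:
        with open(path, "r") as f:
            return f.read().strip()
    except OSError:
        return ""


def d_state_processes(proc_root="/proc"):
    """List (pid, comm, wchan) for every process currently in state 'D'."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        stat = read_text(os.path.join(proc_root, entry, "stat"))
        if not stat:
            continue  # process exited between listdir and read
        pid, comm, state = parse_stat(stat)
        if state == "D":
            # wchan names the kernel function the task is sleeping in,
            # when the kernel exposes it.
            wchan = read_text(os.path.join(proc_root, entry, "wchan"))
            found.append((pid, comm, wchan))
    return found


if __name__ == "__main__":
    for pid, comm, wchan in d_state_processes():
        print(pid, comm, wchan or "?")
```

This only reports the state and wait channel; the full kernel stack traces still come from the sysrq output in dmesg.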

Cheers,
Matt




> On Mon 14/10/2019, at 12:44 pm, Jeshan, Binny <Binny.Jeshan at netscout.com> wrote:
> 
> Dear Matt,
>  
> Thank you for your response. Our situation is described below.
> This has happened to one or two users out of a thousand deployed devices. We have not seen it reported in many years. When the issue was reported, we could only retrieve the logs below from the user's unit, which is in an end-customer deployment.
> We do not use NFS. When we checked the disk stats of all processes, nothing was holding on to the disk, and all other disk read/write operations were working normally.
> We do not have strace on our target device, nor can we easily reproduce the problem. It happened only once, and to really debug it we would have to simulate the condition in our lab.
>  
> After reading some user guides, I wonder whether the precaution below should be added, but I am not sure it will help in situations where open pipes exist in the process. I believe they are from common_session_init() and session_loop().
>  
> Would it be right to use the -K flag below for the dropbear process in my situation? I have a feeling that the open IPC pipes are also doing some sort of I/O operation that leads to the D state. The user in question also has a habit of not closing the SSH session properly with an exit or logout from the terminal, instead leaving it to close on idle or closing the terminal abruptly.
>  
> -K timeout_seconds
> Ensure that traffic is transmitted at a certain interval in seconds. This is useful for working around firewalls or routers that drop connections after a certain period of inactivity. The trade-off is that a session may be closed if there is a temporary lapse of network connectivity. A setting of 0 disables keepalives.
>  
> Please share your thoughts on the above. I am still working on recreating the problem by forcing these open pipes to stay in that state indefinitely. Any other suggestions would help.
>  
>  
> Thanks for your help again,
> Binny
>  
> From: Matt Johnston <matt at ucc.asn.au>
> Sent: Wednesday, October 9, 2019 6:56 PM
> To: Jeshan, Binny <Binny.Jeshan at netscout.com>; dropbear at ucc.asn.au; rwoodsmall at gmail.com
> Subject: Re: Dropbear processes getting into uninterruptible I/O process "D" state
>  
> Hi Binny,
> 
> I don't think it's related to 2019.78
> 
> Usually D state means something else is wrong on the system, bad NFS mounts or IO devices. Can you strace the stuck processes?
> 
> Cheers,
> Matt
> 
> On 9 October 2019 10:53:56 am GMT+07:00, "Jeshan, Binny" <Binny.Jeshan at netscout.com> wrote:
> Dear all in the mailing list,
>  
> We are seeing a problem with version 2018.76 of the Dropbear SSH server, which exhibits the following:
>  
>  
> The processes go into the uninterruptible “D” state and stay there; they can be neither killed nor shut down.
> 
>  
> The stuck processes in the bad state each show two open pipes that are not closed, as below.
>  
> 
>  
> The last process with PID 394 seems to work fine, and has no open IPCs.
>  
> When I look at the release notes of 2019.78, I see this:
>  
> “2019.78 - 27 March 2019
>  
> - Fix dbclient regression in 2019.77. After exiting the terminal would be left
>   in a bad state. Reported by Ryan Woodsmall”
>  
> Is the problem we are seeing the same one that you fixed? Please advise. Your feedback and ideas will help.
>  
>  
> Thanks
> Binny
