scp of sysfs file repeats data

Craig McQueen craig.mcqueen at innerrange.com
Wed Jun 24 11:35:42 AWST 2015


> From: Reimar Döffinger [mailto:Reimar.Doeffinger at gmx.de]
> 
> On 24.06.2015, at 01:49, Craig McQueen <craig.mcqueen at innerrange.com>
> wrote:
> 
> >> From: Peter Korsgaard [mailto:jacmet at gmail.com] On Behalf Of Peter
> >>
> >>>>>>> "Peter" == Peter Korsgaard <peter at korsgaard.com> writes:
> >>
> >>>>>>> "Matt" == Matt Johnston <matt at ucc.asn.au> writes:
> >>>> I see what you mean. I'll update scp to OpenSSH's latest -  >>
> >>>> there are a
> >> few changes to be merged.
> >>
> >>> FYI, openssh scp has a similar problem here:
> >>
> >>> open("vendor", O_RDONLY|O_NONBLOCK)     = 3
> >>> fstat(3, {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
> >>> fcntl(3, F_GETFL)                       = 0x8800 (flags
> >> O_RDONLY|O_NONBLOCK|O_LARGEFILE)
> >>> fcntl(3, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
> >>> write(6, "C0444 4096 vendor\n", 18)     = 18
> >>> read(3, "0x8086\n", 4096)               = 7
> >>> read(3, "", 4089)                       = 0
> >>> write(6,
> >> "0x8086\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> >> 4096) =
> >> 4096
> >>
> >>> This is with openssh 6.7p1-6.
> >>
> >> But thinking further about it, I guess that's just an artifact of the
> >> protocol and there isn't any way around it.
> >
> > I guess the question is, should the copied file's size be that reported in the
> directory listing, or what is actually successfully read from the file?
> >
> > If I use 'cp' to copy a sysfs file, then the copy contains the bytes actually
> read. E.g.:
> >
> > # ls -al /sys/class/net/eth0/address
> > -r--r--r--    1 root     root          4096 Jun 23 23:43 /sys/class/net/eth0/address
> > # cp /sys/class/net/eth0/address .
> > # ls -al
> > drwxr-xr-x    1 root     root          4096 Jun 23 23:43 .
> > drwxr-xr-x    1 root     root          4096 Jun 23 06:51 ..
> > -r--r--r--    1 root     root            18 Jun 23 23:43 address
> >
> > So with 'cp' I get a file length of 18 (the bytes successfully read), not 4096.
> >
> > As a user, I expect 'scp' to do the same thing. Does the protocol make that
> difficult somehow?
> 
> Well, as I understand it the file size is sent first.
>
> If you cannot trust stat, you'd have read the whole file first and even buffer it
> (in case it changes - though I guess that race condition also exists with current
> stat usage).

So in that case, 'scp' wouldn't work well with:

* Files that are appended or truncated while they're being 'scp'ed.
* sysfs files.
* procfs files.
* Named pipes.
* char devices.

It does sound as though the protocol was designed only with static regular files in mind. That's a pity.

> I see a few things that could (and probably should) be done:
> 1) if it works better, use lseek to end to get file size instead of stat

Not sure if that would work better.

> 2) when read fails, zero the buffer (so at least there are only extra 0s, not
> 'random' data)

That seems important, otherwise you could have a security issue vaguely like POODLE.

> 3) if the file size is less than some limit (64kB? - would be easier to make this
> the same as the normal buffer size, 4kB is a bit small anyway) do not rely on
> stat but just read the whole file to get the size without race conditions. Could
> be done by just always before anything else reading 64kB and if the read is
> short just skip anything else.

That would fix some use-cases, but perhaps make other use-case failures more obscure and surprising.

> 4) print a warning when read fails (if the protocol allows it)

Warnings are good as long as there is a human running it who can read it; not so good if it's a component of a larger software system.

> Sorry for the long text, especially as I have no intention to actually work on it.
> Hope someone is motivated to fix it :)

It would be great if the situation could be improved. It is certainly best for users to be able to trust essential system tools like scp.

Having file size sent in the protocol up-front does sound like an awkward protocol design, which cramps the solution space, unfortunately.

-- 
Craig McQueen



More information about the Dropbear mailing list