RFR: 8329961: Buffer overflow in os::Linux::kernel_version [v3]

Wed Apr 10 13:33:13 UTC 2024

On Wed, 10 Apr 2024 12:26:30 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> > > > > Hmm, wouldn't sscanf not be simpler and safer? No need to factor out the parser. IMHO no need to even add a gtest since parsing would be really simple and not loop based. E.g.
> > > > > ```
> > > > > if (sscanf(release, "%d.%d", &major, &minor) != 2) {
> > > > >   log_warning blabla
> > > > > }
> > > > > ```
> > > > > 
> > > > > 
> > > > >     
> > > > >       
> > > > >     
> > > > > 
> > > > >       
> > > > >     
> > > > > 
> > > > >     
> > > > >   
> > > > > As bonus, you avoid accidental conversion from hex numbers and such that strotol provides and that we don't really want here.
> > > > 
> > > > 
> > > > Hi, according to C11 standard (and my man pages) it is UB to call the scanf-family of functions with "invalid" data and `strtol` is recommended instead. So, unfortunately, it might not be safer.
> > > 
> > > 
> > > Can you point me to the relevant man page? If that were not to work, it would drastically limit the usefulness of scanf, and a lot of code in hotspot would be UB.
> > > My manpage:
> > > ```
> > > The format string consists of a sequence of directives which describe how to process the sequence of input characters. If processing of a directive fails, no further input is read, and scanf() returns. A "failure" can be either of the following: input failure, meaning that input characters were unavailable, or matching failure, meaning that the input was inappropriate (see below). 
> > > ```
> > > 
> > > 
> > >     
> > >       
> > >     
> > > 
> > >       
> > >     
> > > 
> > >     
> > >   
> > > Matching failures are part of the API.
> > > > Another thing: We shouldn't call `uname`. Reading `/proc/sys/kernel/osrelease` is sufficient.
> > > 
> > > 
> > > I agree with @robehn, uname is Posix and portable. May also be cached inside glibc, saving us a proc fs read.
> > 
> > 
> > See BUGS section: https://man7.org/linux/man-pages/man3/sscanf.3.html
> 
> Okay that. Note that deprecation of numerical conversions has been hugely controversial, and it was a step taken by glibc manpage maintainers. This is just about numerical overflow. E.g. 3333333333333333333333333333333.333333333333333333333333333333 will overflow int range, and return true. Are we honestly concerned about this here?
> 
> By avoiding sscanf, we swap very readable and simple-hence-safe code with manual parsing, which is both unreadable and, as this issue shows, in practice a lot unsafer than had we used sscanf to begin with.
> 
> > OK, I'm probably too paranoid regarding `uname`. I don't like that we've got a C string and there's no way of knowing whether it actually ends in a NUL byte or not. At least with a file you know that it has a specific length and can account for that. We probably have to assume that if someone is being malicious with `uname` then we're compromised regardless.
> 
> If you are paranoid, do this:
> 
> ```
> struct {
>   utsname uts; char c;
> } v;
> ```
> 
> then set v.c to 0 before calling uname.

Aha, it's just overflow issues? Well, then let's just use `sscanf`. Thanks for the hint, interesting code pattern! That won't work as the string isn't necessarily embedded within the struct, but it's an interesting strategy nonetheless. I'll just trust that it ends at some point.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18697#issuecomment-2047551964