RFR: 8329961: Buffer overflow in os::Linux::kernel_version [v3]

Thomas Stuefe stuefe at openjdk.org
Wed Apr 10 12:29:09 UTC 2024


On Wed, 10 Apr 2024 11:58:13 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:

> > > > Hmm, wouldn't sscanf not be simpler and safer? No need to factor out the parser. IMHO no need to even add a gtest since parsing would be really simple and not loop based. E.g.
> > > > ```
> > > > if (sscanf(release, "%d.%d", &major, &minor) != 2) {
> > > >   log_warning blabla
> > > > }
> > > > ```
> > > > 
> > > > 
> > > >     
> > > >       
> > > >     
> > > > 
> > > >       
> > > >     
> > > > 
> > > >     
> > > >   
> > > > As bonus, you avoid accidental conversion from hex numbers and such that strotol provides and that we don't really want here.
> > > 
> > > 
> > > Hi, according to C11 standard (and my man pages) it is UB to call the scanf-family of functions with "invalid" data and `strtol` is recommended instead. So, unfortunately, it might not be safer.
> > 
> > 
> > Can you point me to the relevant man page? If that were not to work, it would drastically limit the usefulness of scanf, and a lot of code in hotspot would be UB.
> > My manpage:
> > ```
> > The format string consists of a sequence of directives which describe how to process the sequence of input characters. If processing of a directive fails, no further input is read, and scanf() returns. A "failure" can be either of the following: input failure, meaning that input characters were unavailable, or matching failure, meaning that the input was inappropriate (see below). 
> > ```
> > 
> > 
> >     
> >       
> >     
> > 
> >       
> >     
> > 
> >     
> >   
> > Matching failures are part of the API.
> > > Another thing: We shouldn't call `uname`. Reading `/proc/sys/kernel/osrelease` is sufficient.
> > 
> > 
> > I agree with @robehn, uname is Posix and portable. May also be cached inside glibc, saving us a proc fs read.
> 
> See BUGS section: https://man7.org/linux/man-pages/man3/sscanf.3.html
> 

Okay that. Note that deprecation of numerical conversions has been hugely controversial, and it was a step taken by glibc manpage maintainers. This is just about numerical overflow. E.g. 3333333333333333333333333333333.333333333333333333333333333333 will overflow int range, and return true. Are we honestly concerned about this here?

By avoiding sscanf, we swap very readable and simple-hence-safe code with manual parsing, which is both unreadable and, as this issue shows, in practice a lot unsafer than had we used sscanf to begin with.

> OK, I'm probably too paranoid regarding `uname`. I don't like that we've got a C string and there's no way of knowing whether it actually ends in a NUL byte or not. At least with a file you know that it has a specific length and can account for that. We probably have to assume that if someone is being malicious with `uname` then we're compromised regardless.

If you are paranoid, do this:


struct {
  utsname uts; char c;
} v;


then set v.c to 0 before calling uname.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18697#issuecomment-2047409656


More information about the hotspot-runtime-dev mailing list