Null-terminated Unicode strings in java.io on Windows
Roman Kennke
roman.kennke at aicas.com
Mon Jan 21 22:01:04 UTC 2008
Hi Alan,
On Monday, 21.01.2008, at 21:52 +0000, Alan Bateman wrote:
> Roman Kennke wrote:
> > Hi,
> >
> > I'm trying to understand a piece of code in java.io. Let me try to
> > explain:
> >
> > When you look into WinNTFileSystem.c in the method
> > getBooleanAttributes(), you see that the file object is converted to a
> > WCHAR* using fileToNTPath(). In io_util.c, fileToNTPath(), the filename
> > string is extracted from the File object, and passed to pathToNTPath().
> >
> > This is where it gets interesting. The pathToNTPath() function first
> > converts the string into a jchar* using the macro WITH_UNICODE_STRING.
> > This macro uses GetStringChars() to do this conversion. Now this is
> > where I'm lost. Java strings are not null-terminated, and neither are
> > the jchar* returned by GetStringChars() (which is in itself a
> > long-discussed problem in the JNI spec, but that's another story). But back
> > in pathToNTPath() this jchar* is treated just like a null-terminated
> > string, for example, we call wcslen() to determine its length, which
> > relies on the string being null-terminated. Now I assume that this
> > works somehow, and I only see the following options:
> > 1. There's something in this picture that I don't see. Maybe the string
> > ends up null-terminated somehow?
> > 2. Maybe this works by accident because Hotspot terminates strings with
> > a null internally?
> > 3. Or this is a serious bug, that for some reason doesn't bomb all the
> > time. After all, it _does_ bomb in the JamaicaVM, which I'm trying to
> > port the code to...
> >
> > Any ideas? I'd be happy to get an explanation for this problem.
> >
> > Cheers, Roman
> >
> The GetStringChars implementation in HotSpot always returns a copy that
> is length+1 and zero terminated. There is a long-standing bug to clarify
> the JNI specification on this topic. I believe it should say that the
> returned array of Unicode characters is not required to be zero
> terminated and that one should use GetStringLength to determine the
> length. Steve Bohne (cc'ed) has done the recent maintenance on the JNI
> spec and may wish to comment. In any case, I did a quick cscope and
> aside from java.io, it only appears to impact a small number of places.
So this is indeed a bug, right? Do you think it makes sense to go out
and fix it?
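
For illustration, a fix could stop relying on the terminator and copy the
characters using an explicit length instead. The following is only a rough
sketch, not the actual io_util.c code: copy_terminated and unit16 are
hypothetical names (unit16 stands in for jchar so the snippet compiles
without jni.h). In real JNI code the length would come from GetStringLength
and the characters from GetStringChars, released afterwards with
ReleaseStringChars.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for JNI's jchar (an unsigned 16-bit UTF-16 code unit). */
typedef uint16_t unit16;

/*
 * Hypothetical helper: copy `len` UTF-16 code units from `chars` into a
 * freshly allocated buffer of len + 1 units and append a terminating 0.
 * The source is never read past `len`, so it need not be terminated.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static unit16 *copy_terminated(const unit16 *chars, size_t len)
{
    unit16 *out;

    assert(chars != NULL || len == 0);

    /* Guard against overflow in the size computation below. */
    if (len > (SIZE_MAX / sizeof(unit16)) - 1) {
        return NULL;
    }

    out = malloc((len + 1) * sizeof(unit16));
    if (out == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(out, chars, len * sizeof(unit16));
    }
    out[len] = 0; /* explicit terminator, independent of the VM */
    return out;
}
```

The resulting buffer can then be handed to code that expects a
null-terminated wide string, regardless of what the VM's GetStringChars
happens to return.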
/Roman
--
Dipl.-Inform. (FH) Roman Kennke, Software Engineer, http://kennke.org
aicas Allerton Interworks Computer Automated Systems GmbH
Haid-und-Neu-Straße 18 * D-76131 Karlsruhe * Germany
http://www.aicas.com * Tel: +49-721-663 968-0
USt-Id: DE216375633, Handelsregister HRB 109481, AG Karlsruhe
Geschäftsführer: Dr. James J. Hunt