[security-dev 00007]: Unicode Characters & RC4-HMAC w/ Kerberos&JGSS

Thu Jul 5 14:34:54 UTC 2007

[Crosspost from http://forum.java.sun.com/thread.jspa?threadID=5192018]

I just stumbled over an issue with Unicode characters in passwords
when using JGSS in our Windows domain.

I extracted my machine account password ($machine.acc) using the
Windows LsaRetrievePrivateData API via the Win32 Python Extensions.
The result was a Unicode string with one catch: it contained the
character '\ude09', a low surrogate with no high surrogate in front
of it. I don't know whether this is a Python issue, an issue with the
auto-generated password, or something else. Either way, the password
is not a valid Unicode string.
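
To make "not a valid Unicode string" concrete, here is a minimal
sketch that scans a char[] for an unpaired surrogate. The class and
method names are made up for illustration and are not part of JGSS;
only the java.lang.Character calls are real JDK API.

```java
class SurrogateCheck {
    /** Returns the index of the first unpaired surrogate, or -1 if there is none. */
    static int firstUnpairedSurrogate(char[] chars) {
        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            if (Character.isHighSurrogate(c)) {
                if (i + 1 < chars.length && Character.isLowSurrogate(chars[i + 1])) {
                    i++; // well-formed pair: skip the low half
                } else {
                    return i; // high surrogate without a following low surrogate
                }
            } else if (Character.isLowSurrogate(c)) {
                return i; // low surrogate without a preceding high surrogate
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical password fragment containing the lone '\ude09' described above.
        char[] password = {'a', '\ude09', 'b'};
        System.out.println(firstUnpairedSurrogate(password)); // prints 1
    }
}
```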

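Using this string in JGSS fails pre-authentication because the
UTF-16LE encoder in
sun.security.krb5.internal.crypto.dk.DkCrypto#charToUtf16 rejects the
unpaired surrogate and substitutes the bytes FD FF (U+FFFD, the
Unicode replacement character, in little-endian order). The derived
key therefore no longer matches the one the KDC holds.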

If, however, I use the following encoding instead, authentication
against our PDC works fine:

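DkCrypto:
// needs java.nio.ByteBuffer and java.nio.ByteOrder
static byte[] charToUtf16(char[] chars) {
  // Copy each 16-bit char as two little-endian bytes, without
  // validating surrogate pairs.
  ByteBuffer buffer =
      ByteBuffer.allocate(2 * chars.length).order(ByteOrder.LITTLE_ENDIAN);
  buffer.asCharBuffer().put(chars);
  return buffer.array();
}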

This is agnostic of surrogates and maybe closer to what the RFC describes:

"Each Windows UNICODE character is encoded in little-endian format of
2 octets each."
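
For anyone who wants to reproduce the difference, here is a small
self-contained comparison. It is only a sketch: viaCharsetEncoder is
my stand-in for an encoder-based conversion (String.getBytes with
UTF-16LE, using StandardCharsets from Java 7 or later), not the actual
DkCrypto code, and the class and method names are hypothetical.
viaRawCopy is the byte-wise copy proposed above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class Utf16Compare {
    // Assumption: stands in for a validating, encoder-based conversion.
    static byte[] viaCharsetEncoder(char[] chars) {
        return new String(chars).getBytes(StandardCharsets.UTF_16LE);
    }

    // Raw copy: every char becomes two little-endian bytes, surrogates untouched.
    static byte[] viaRawCopy(char[] chars) {
        ByteBuffer buffer =
            ByteBuffer.allocate(2 * chars.length).order(ByteOrder.LITTLE_ENDIAN);
        buffer.asCharBuffer().put(chars);
        return buffer.array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] sample = {'a', '\ude09'}; // 'a' followed by a lone low surrogate
        System.out.println("encoder : " + hex(viaCharsetEncoder(sample)));
        System.out.println("raw copy: " + hex(viaRawCopy(sample))); // 61 00 09 de
    }
}
```

For well-formed input the two methods should produce identical bytes;
they only diverge when the char[] contains an unpaired surrogate.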

Maybe someone who knows this area better than I do can judge whether
DkCrypto should be changed.

Thanks
Matthias