RFR 6722928: Support SSPI as a native GSS-API provider

Wed Dec 12 00:40:54 UTC 2018

My review of the first 16% of https://cr.openjdk.java.net/~weijun/6722928/webrev.02/ is below.

 - First, I noticed this in the gssapi.h in the JDK:

    typedef void * gss_name_t;
    typedef void * gss_cred_id_t;
    typedef void * gss_ctx_id_t;

   This is not good!  We long ago discovered that this is MUCH better as it
   engages C type safety rather than defeating it:

    typedef struct gss_name gss_name_t;
    typedef struct gss_cred gss_cred_id_t;
    typedef struct gss_ctx gss_ctx_id_t;

   RFC 2744 allows this.  When we made that change in Solaris we
   actually found at least one bug that was only not catastrophic by
   sheer luck.  Especially now that you're implementing a GSS-API
   provider in the OpenJDK, you'll really want to make this change to
   the OpenJDK's gssapi.h.

   Another benefit of this is that you then get to define those structs
   in your GSS provider / mechanism implementation, and then you don't
   need casts:

    struct gss_name {
        SEC_WCHAR* name;
    };

 - Comments on sspi.cpp are inline below:

    /*
     * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved.
     * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
     *
     * This code is free software; you can redistribute it and/or modify it
     * under the terms of the GNU General Public License version 2 only, as
     * published by the Free Software Foundation.  Oracle designates this
     * particular file as subject to the "Classpath" exception as provided
     * by Oracle in the LICENSE file that accompanied this code.
     *
     * This code is distributed in the hope that it will be useful, but WITHOUT
     * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
     * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
     * version 2 for more details (a copy is included in the LICENSE file that
     * accompanied this code).
     *
     * You should have received a copy of the GNU General Public License version
     * 2 along with this work; if not, write to the Free Software Foundation,
     * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
     *
     * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
     * or visit www.oracle.com if you need additional information or have any
     * questions.
     */

    // This library is client-side only, and only supports the default credentials.
    // It only speaks krb5 internally, and SPNEGO is supported by creating its own
    // NetTokenInit and parsing incoming NegTokenResp tokens. This ensures no other
    // mechanism (Ex: NTLM) is chosen.
    //
    // This library can be built directly with the following command:
    //   cl -I %OPENJDK%\src\java.security.jgss\share\native\libj2gss\ sspi.cpp \
    //      -link -dll -out:sspi_bridge.dll

    #define UNICODE
    #define _UNICODE

    #include <windows.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <Strsafe.h>

    #define GSS_DLL_FILE
    #include <gssapi.h>

    #define SECURITY_WIN32
    #include <sspi.h>

    #pragma comment(lib, "secur32.lib")

    // A debugging macro
    #define PP(fmt, ...) \
            if (trace) { \
--->            fprintf(stdout, "[SSPI:%ld] "fmt"\n", __LINE__, ##__VA_ARGS__); \
--->            fflush(stdout); \
            }

   Why stdout instead of stderr?

    #define SEC_SUCCESS(Status) (*minor_status = Status, (Status) >= 0)

--->#ifdef __cplusplus
    extern "C" {
    #endif /* __cplusplus */

   The file extension is .cpp.

    // When KRB5_TRACE is set, debug info goes to stdout. The value is ignored.
--->char* trace = getenv("KRB5_TRACE");

   Global variable initialization with non-const expressions is C++ but
   not valid C.  This is inside the extern "C" block.  How does this
   build?

   If you want this to be C, make the file extension .c and make sure it
   builds with a C compiler.

   A thread-local `trace' variable would be better anyways, then you
   could toggle tracing by setting/clearing the env var (the JDK doesn't
   really give you a way to do this, but the JGSS JNI shim could arrange
   to expose the debug property to the native C GSS provider somehow,
   such as by setting/clearing the env var).

    void
--->dump(LPSTR title, PBYTE data, DWORD len)

   You're only using dump() to dump a gss_OID.  The length of a gss_OID
   is an OM_uint32 (which can be larger than uint32_t, but here it's
   always the same as uint32_t).  Why not just use size_t instead of
   DWARD?

   For C constant string literals I'd use const char * instead of LPSTR.
   LPSTR is not const.  I like const correctness.

    {
        if (trace) {
            printf("==== %s ====\n", title);
            for (DWORD i = 0; i < len; i++) {
                if (i != 0 && i % 16 == 0) {
                    printf("\n");
                }
                printf("%02X ", *(data + i) & 0xff);
            }
            printf("\n");
        }
    }

    gss_OID_desc KRB5_OID = {9, "\x2a\x86\x48\x86\xf7\x12\x01\x02\x02"};
    gss_OID_desc SPNEGO_OID = {6, "\x2b\x06\x01\x05\x05\x02"};
    gss_OID_desc USER_NAME_OID = {10, "\x2a\x86\x48\x86\xf7\x12\x01\x02\x01\x01"};
    gss_OID_desc KRB5_NAME_OID = {10, "\x2a\x86\x48\x86\xf7\x12\x01\x02\x02\x01"};
    gss_OID_desc HOST_SERVICE_NAME_OID = {10, "\x2a\x86\x48\x86\xf7\x12\x01\x02\x01\x04"};
    gss_OID_desc EXPORT_NAME_OID = {6, "\x2b\x06\x01\x05\x06\x04"};

   I've checked KRB5_NAME_OID too, and it's correct.

    // gss_name_t is Name*. type is always KRB5_NAME
--->typedef struct {
        SEC_WCHAR* name;
    } Name;

   See commentary above about gssapi.h's declarations of GSS handle
   types.

   You should change gss_name_t to be a typedef of struct gss_name *,
   then you should change this to be

       struct gss_name {
           SEC_WCHAR* name;
       };

   You won't need to case any gss_name_t values.

    // gss_ctx_id_t is Context*
--->typedef struct {
        CredHandle* phCred;
        CtxtHandle hCtxt;
        DWORD cbMaxMessage;
        SecPkgContext_Sizes SecPkgContextSizes;
        SecPkgContext_NativeNames nnames;
        BOOLEAN established;
    } Context;

   See above commentary.

    // gss_cred_id_t is Credential*
--->typedef struct {
        CredHandle* phCred;
        long time;
    } Credential;

   See above commentary.

--->/* This section holds supporting functions that are not exported */

   Why not make these functions static then?

    // Prepend ASN.1 tag+length to data.
    //
    // [in, out] data: the data, which already had enough allocated space before
    //                 its head when called. Head moved backwards after called.
--->// [in, out] pLen: a counter to maintain
--->// [in]       len: the ASN.1 length (MUST be less than 2^16)

   Can you add support for up to 2^24?

    // [in]       tag: the ASN.1 tag
    void
--->add_der_head(char** data, int* pLen, int len, int tag)

   I don't like that there's no size of the buffer pointed to by *data.

   I'd rather have arguments for: a) the `size_t' length of `*data', b)
   the `size_t *' number of bytes that can be prepended to `*data'.

   And add_der_head() should have a non-void return value by which to
   indicate that it could not write because there wasn't enough room or
   because the length to write was bigger than the limit you implemented
   for DER length encoding.

   `tag' should be unsigned.

    {
        char* pos = *data;
        int lenlen;
        if (len <= 0x7f) {
            pos[-1] = (char)len;
            lenlen = 1;
        } else if (len < 256) {
            pos[-1] = (char)len;
            pos[-2] = (char)0x81;
            lenlen = 2;
        } else {
            pos[-1] = (char)(len & 0xff);
            pos[-2] = (char)(len >> 8);
            pos[-3] = (char)0x82;
            lenlen = 3;
        }
        pos[-1-lenlen] = (char)tag;
        *pLen += 1 + lenlen;
        *data = pos - 1 - lenlen;
    }

   My UNTESTED version, written here as a stream of thought and
   attempting to support all of the DER tag and length encoding rules:

        typedef enum {
            UNIVERSAL = 0,
            APPLICATION,
            CONTEXT,
            PRIVATE
        } asn1_tag_class;

        // Prepend ASN.1 tag+length to data.  See x.690 sections 8.1.2.x
        // and 8.1.3.x.
        //
        // x.690:
        //
        // https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-X.690-199407-S!!PDF-E&type=items
        //
        // [in, out]   data: `*data' is the data to prepend a DER Tag&Length
        //                   to, and which has `*pLen' bytes of room
        //                   available in front; mutated to point to the
        //                   prepended DER Tag
        // [in, out]   pLen: `*pLen' is the number of bytes of room
        //                   available in front of `*data'; mutated to
        //                   subtract the number of bytes prepended
        // [in]         len: the ASN.1 length (MUST be less than 2^16)
        // [in]         tag: the ASN.1 tag    (MUST be less than 30)
        // [in]       class: the class of the ASN.1 tag
        // [in] constructed: whether `*data' is a SEQUENCE or SET
        //                   encoding
        //
        // returns: number of bytes added (zero on failure)
        size_t
        add_der_head(unsigned char **data, size_t *pLen, size_t len, unsigned int tag,
                     asn1_tag_class class, int constructed)
        {
            unsigned char *pos = *data;
            size_t lenNeeded = 1;
            size_t tagNeeded = 1;
            int lenOctect1;
            int tagOctect1 = /* first three bits of first tag byte, x.690 section 8.1.2.3 */
                             (class & 0x03) << 6 |
                             (constructed ? 1 << 5 : 0);

            if (len <= 0x7f) {
                /* Short form definite length, x.690 section 8.1.3.4 */
                lenOctect1 = 0x80 | len;
            } else {
                size_t bits = 0;
                size_t n = len;

                /* Long form definite length, x.690 section 8.1.3.5 */
                while (n) {
                    lenNeeded++;
                    bits = (bits << 1) | 1;
                    n >>= 8;
                }

                if (bits > 0x7e)
                    return 0;
                lenOctect1 = 0x80 + bits;
            }

            if (tag <= 30) {
                /* Small tag, x.690 section 8.1.2.3 */
                tagOctect1 |= tag;
            } else {
                size_t n = tag;

                /* Large tag, x.690 section 8.1.2.4 */
                tagOctect1 |= 0x1f;
                while (n) {
                    tagNeeded++;
                    n >>= 7;
                }
            }

            if (*pLen < lenNeeded + tagNeeded)
                return 0;

            *pLen -= (lenNeeded + tagNeeded);

            while (lenNeeded > 1 && len) {
                *(pos--) = len & 0xff;
                len >>= 8;
            }
            *(pos--) = lenOctect1;

            while (tagNeeded > 1 && tag) {
                *(pos--) = tag & 0x7f;
                tag >>= 7;
            }
            *(pos--) = tagOctect1;

            *data = pos;
            return tagNeeded + lenNeeded;
        }

    // Prepend bytes to data.
    //
    // [in, out] data: the data, which already had enough allocated space before
    //                 its head when called. Head moved backwards after called.
    // [in, out] pLen: a counter to maintain
    // [in]        in: a block of bytes
    // [in]       len: length of in
    void
--->add_data(char** data, int* pLen, char* in, int len)

   I'd also make similar changes here.

    {
        char* pos = *data;
        memcpy_s(pos - len, len, in, len);
        *pLen += len;
        *data = pos - len;
    }

    // Wrap a krb5 token into NegTokenInit
    // Returns 1 for success, 0 for failure
    //
--->// Implements a very small subset of RFC 4178, negotiating exactly
--->// one mechanism (Kerberos, which is RFCs 1964 and 4121).
--->//
--->// https://tools.ietf.org/html/rfc4178

   The above are my additions.

    int
--->krb5_to_spnego(gss_buffer_t input, gss_buffer_t output)

   I've still got to review this carefully.

    {
        int len = (int)input->length;
        if (len > 60000) { // Too long, need 3 bytes DER length
            return 0;
        }
        // Just enough for a token with 2-bytes length
        int maxLen = len + 43;
--->    char* buffer = new char[maxLen];

   More non-C C++ (new not malloc()/calloc()) -- you don't have to make
   this C, but there's no need to keep the extern "C" if you don't,
   right?

        memcpy_s(buffer + maxLen - len, len, input->value, len);
        char* pos = buffer + maxLen - len;

--->

   It might be nice to explain that we're encoding from right to left.
   I know, it was obvious to me.

   It's also good to explain that we're excluding the mechListMIC field
   of NegTokenInit (which is OPTIONAL) because it's not needed when we
   only offer one mechanism.

        // mechToken [2] OCTET STRING  OPTIONAL,
--->    add_der_head(&pos, &len, len, 4);
--->    add_der_head(&pos, &len, len, 0xA2);

   This deserves some explanation.

   RFC 4178 says that SPNEGO uses DER and EXPLICIT tagging, which means
   we get two tag-lengths here: one for the Kerberos mechanism token as
   an OCTET STRING, which is tagged as [UNIVERSAL 4], and one for the
   mechToken field's explicit tag from the NegTokenInit SEQUENCE from
   RFC 4178, which is tagged as [CONTEXT 2].

   My variant of add_der_head() would make the tag class and tag (and
   constructed vs. primitive) all much more readable.

   Obviously, if you take my variant of add_der_head() and add_data(),
   then you'll need to check whether they return 0 or 1.

   However, you could check just the last call's return value because it
   doesn't matter if the all fail -- the last one can only succeed if
   the all succeeded.

   Also worth noting: we're skipping the reqFlags field of NegTokenInit,
   which is OPTIONAL.

        // mechTypes [0] MechTypeList,
        add_data(&pos, &len, (char*)KRB5_OID.elements, KRB5_OID.length);
--->    add_der_head(&pos, &len, KRB5_OID.length, 6);

   Do note in a comment that this is the OBJECT IDENTIFIER
   [UNIVERSAL 6] tag.

--->    add_der_head(&pos, &len, KRB5_OID.length + 2, 0x30);

   This is the SEQUENCE [UNIVERSAL 16] tag.

   The + 2 should really be + length returned by my version of
   add_der_head().

--->    add_der_head(&pos, &len, KRB5_OID.length + 4, 0xA0);

   This is the [CONTEXT 0] tag of the mechTypes field of NegTokenInit.

        // NegTokenInit ::= SEQUENCE { ... }
        add_der_head(&pos, &len, len, 0x30);

   Now I understand why your add_der_head() outputs the new length of
   the data.  With my version you'd be doing:

            len += add_der_head(&pos, &plen, len, 0, CONTEXT, 0);

--->    // negTokenInit [0] NegTokenInit,
        add_der_head(&pos, &len, len, 0xA0);

   Nice comment.  Yeah, this is the negTokenInit arm of the
   NegotiationToken CHOICE.

--->    // [APPLICATION 0] IMPLICIT SEQUENCE {
        //   thisMech MechType,
        //   innerContextToken ANY DEFINED BY thisMech

   This should say that this that is from RFC 2743, section 3.1.

        add_data(&pos, &len, (char*)SPNEGO_OID.elements, SPNEGO_OID.length);
        add_der_head(&pos, &len, SPNEGO_OID.length, 6);
        add_der_head(&pos, &len, len, 0x60);
        if (len != maxLen) {
            char* result = new char[len];
            memcpy_s(result, len, buffer + maxLen - len, len);
            delete[] buffer;
            buffer = result;
        }
        output->value = buffer;
        output->length = len;
        return 1;
    }

   OK, I've read the SPNEGO code.  That took hours to double check.

   I would very much prefer a safer add_der_head().  Please feel free to
   fixup my version and use it.

    // Skip ASN.1 tag and length bytes
    //
    // [in, out] data: An ASN.1 encoding. Head moved to data after call.
--->// Returns the length value

   s/ value/ of the value/

--->int

   I would prefer that this be a size_t and that on error this return 0
   (since the DER tag and length can never be fewer than two bytes, zero
   is a perfectly valid indication of failure by a function whose job is
   to parse a DER tag and length.

--->skip_tag_len(char** data)

   This function absolutely must take a `size_t' argument that is the
   length of the buffer pointed to by `*data', and the function body
   *must* check this length as it goes.

   If the encoded length can be decoded but the rest of the buffer is
   not long enough to contain a payload of the encoded value length,
   then this function *must* fail.

   This is why I'm very suspicious of hand-coded DER codecs.  It's
   extremely difficult to get right.  I've spent three plus hours
   reading ~1.5Kloc (16%) of sspi.cpp because that's how carefully one
   has to read hand-coded DER codecs.

   You might consider using the Heimdal ASN.1 compiler and library...

   An alternative would be to rototill the Java side of the JGSS stack
   to support Java-coded SPNEGO with JNI GSS/Kerberos.  That would
   require first accepting and pushing our contribution.

    {
        char* pos = *data;
--->    pos++;

   You really have to decode the tag byte so you can know how many bytes
   the tag takes up.

   Assume a malicious peer.

        int n = *pos & 0xff;

   Here it's find to not have more correct DER length decoding: the
   response token should never be very large at all and it's OK to
   return an error when they are.

        if (n < 128) {
            pos++;
        } else if (n == 0x81) {
            pos++;
            n = *pos & 0xff;
            pos++;
        } else if (n == 0x82) {
            pos++;
            n = *pos & 0xff;
            pos++;
            n = (n << 8) | (*pos & 0xff);
            pos++;
        } else {
            return -1;
        }
        *data = pos;
        return n;
    }

    // Extract a krb5 token from a NegTokenResp. Returns the negState,
    // or -1 if there is no negState.
    int
    spnego_to_krb5(gss_buffer_t input, gss_buffer_t output)
    {
        int result = -1;
--->    char* data = (char*)input->value;

   It's worth noting that we parse left to right -- the opposite of how
   we encode.

--->    skip_tag_len(&data);
--->    skip_tag_len(&data);

   You *really* have to check for errors every time you parse data fed
   to you by an attacker.  Not checking is not an option.

--->    while (data - (char*)input->value < (int)input->length) {

   This loop is looking for the responseToken field of NegTokenResp.

   But a loop is not appropriate.  This is a SEQUENCE, so look for the
   negState field, then the supportedMech field, and so on.  Yes, they
   are optional.  I would much prefer this sort of code:

        /* Skip negState */
        if (/* check that *data points to at least one byte */ &&
            *data == (char)0xA0) {
            if (skip_tag_len(&data, &datalen) == 0)
                goto fail; /* or whatever */
        }
        /* Skip supportedMech */
        if (/* check that *data points to at least one byte */ &&
            *data == (char)0xA1) {
            if (skip_tag_len(&data, &datalen) == 0)
                goto fail; /* or whatever */
        }
        if (/* check that *data points to at least one byte */ &&
            *data != (char)0xA2)
                goto fail; /* responseToken is missing! */

        output->length = skip_tag_len(&data, &datalen);
        if (output->length == 0)
            goto fail;
        output->value = data;
        ...

   Back to your code:

            if (*data == (char)0xA2) {
--->            skip_tag_len(&data);

   See above.

                output->length = skip_tag_len(&data);
                output->value = data;
--->            if (input->length - ((char*)output->value - (char*)input->value)
                        < output->length) {
                    // inner token has a length field longer than outer token
                    break;

   This will be unnecessary with my variant above.

                }
                return result;
--->        } else {

   This too.

                if (*data == (char)0xA0) {
                    result = data[4];
                }
                int tagAndLen = skip_tag_len(&data);
                data += tagAndLen;
            }
        }
        output->length = 0;
        output->value = NULL;
        return result;
    }

That's it for today.  That's about 16% of the way through.

Nico
--