An idea: Add a port layer

Tue May 17 01:25:53 PDT 2011

Hello all,

     I received a reply on the MacOS mailing list, so I am pasting the 
discussion here FYI.

     In OpenJDK, npt and hprof serve a similar purpose, but for JVM TI. 
Kelly, who worked on JVM TI, wrote a concept document for npt [1].
     I suppose we could start from it and work toward a well-defined portlib.

[1]

  Native Platform Toolkit (NPT) Concept Document

  Draft: WARNING: May Not Reflect Actual Implementation

    Problem Statement

Internal to the J2SE, anyone developing JNI code, or interfacing Java to 
native code, has run into the problem of implementing something in pure 
native code that is completely different between platforms, or 
represents a significant amount of code that ultimately isn't shared by 
anyone else. You either end up with lots of ugly '#ifdef' code that is 
hard to read and understand, or you duplicate the first platform's 
source and create separate copies for every platform, src/solaris/*, 
src/windows/*, etc. Many of these basic native code functions are 
trivial when looking at one native platform, but somewhat convoluted 
when dealing with multiple native platforms.  Making things worse, the 
same code is more often than not copied from one native library to the 
next.  This copying of native code is error prone and increases the 
maintenance burden unnecessarily.

In addition, certain native code libraries such as agent libraries could 
benefit from sharing some basic native functionality such as:

    * Better memory management functionality
    * A common native logging interface
    * Common error handling or stderr message printing
    * Native UTF-8 conversion functions (e.g. UTF-8 <-> Platform Encoding)
    * Common hash table functionality
    * etc.

Sometimes this functionality can be obtained by calling JNI functions or 
making calls into Java code, but not only is this inefficient at times, 
it isn't possible for many native libraries.  To call JNI or Java, you 
need a live JVM, and many native libraries (like the JVMTI agent 
libraries) need this functionality before the JVM is fully initialized.

Sometimes this functionality is provided by platform specific libraries, 
but they often differ in their interfaces.

Sometimes the basic functionality just doesn't exist in a way that can 
be used in an MT-safe and isolated way.

    Proposed Solution

Create a native library (libnpt.so or npt.dll) and a native interface to 
that library that gives JNI and Java native code users some helpful 
functionality in a clean, platform-neutral way. This would be kept 
internal to the J2SE and its many native libraries.  Exposing this 
library in any public way should not be considered at this time. 
However, demos should be able to selectively re-use some of the sources 
of this library, so some of those sources will be made available in the 
demos for the J2SE, but the demos will not use them as a shared library.

Comments are always welcome.

    Requirements

    * Must be easily extensible, always compatible from release to release
    * Library must be dynamically accessible (e.g. dlopen)
    * Call overhead should be minimal, good performance is expected from
      all these interfaces
    * Should leverage the native platform functionality whenever possible
    * Code written to NPT should be platform independent, easy to dlopen
    * Use of NPT should be allowed from C++ or C, similar to JNI and
      JVMTI interfaces
    * Must allow for multiple users of the library
    * All interfaces must be MT safe
    * All code must be compiler warning free, linted where possible, and
      fully prototyped
    * Should be easy to add another interface for sharing
    * The debug version of the library should do full argument
      consistency checking
    * This library should only have system library dependencies, e.g. libc
    * These functions should NOT require a running JVM, they are pure
      native code interfaces, however the jni.h typedefs and macros will
      be used:
          o All functions and function pointers should use JNICALL from
            jni.h for the safest calling mechanism
          o Should use the basic typedefs for Java types from jni.h,
            where possible

    Interface Details

The include file "npt.h" should provide some macros or inline functions 
that can be used to easily get the library loaded and the interface 
returned (this library loading is highly platform specific and error 
prone, we need to make this easier).

The library itself should have just two major extern interfaces 
visible, something like:

    JNIEXPORT void JNICALL nptInitialize(NptEnv **pnpt, char *npt_version,
                                         char *options);
    JNIEXPORT void JNICALL nptTerminate(NptEnv *npt, char *options);

But the #include file "npt.h" should also provide macros or inline 
functions such as NPT_INITIALIZE(), which automatically loads the "npt" 
native library, gets the address of nptInitialize() in the library, and 
returns the NptEnv* by calling through this pointer.  This is all very 
platform specific and error prone code.  Need to experiment on this...
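
As a rough illustration of the loading step that NPT_INITIALIZE() would hide, here is a minimal POSIX-only sketch. The helper name npt_lookup() and its signature are assumptions made for this example and are not part of the draft; a Windows build would use LoadLibrary()/GetProcAddress() instead of dlopen()/dlsym().

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (not in the draft): open a shared library and look
 * up one entry point.  Returns the symbol address, or NULL on failure.
 * On success *handle_out holds the library handle so that the caller can
 * dlclose() it later; on failure *handle_out is set to NULL. */
void *npt_lookup(const char *libname, const char *symbol, void **handle_out)
{
    void *handle;
    void *addr;

    assert(libname != NULL && symbol != NULL && handle_out != NULL);
    *handle_out = NULL;

    handle = dlopen(libname, RTLD_LAZY);
    if (handle == NULL) {
        return NULL;            /* library not found or not loadable */
    }
    addr = dlsym(handle, symbol);
    if (addr == NULL) {
        dlclose(handle);        /* library lacks the entry point */
        return NULL;
    }
    *handle_out = handle;
    return addr;
}
```

A macro such as NPT_INITIALIZE() could then cast the returned address to the nptInitialize() function pointer type and call through it, leaving the NptEnv* as NULL when the lookup fails.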

Where options are needed, they will be provided as character strings, 
which gives maximum extensibility and compatibility. The overhead 
for parsing these small strings is minimal.
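
To give a feel for how small that parsing cost is, here is a minimal sketch of looking up one key in a "key=value,key=value" option string (the format used in the heap option examples of this document). The function name npt_option_get() is an assumption made for illustration, not a proposed interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find "key=value" in a comma separated option
 * string and copy the value into 'value'.  Returns 1 if the key was
 * found and the value fits, 0 otherwise. */
int npt_option_get(const char *options, const char *key,
                   char *value, size_t value_size)
{
    size_t key_len;
    const char *p;

    assert(key != NULL && value != NULL && value_size > 0);
    if (options == NULL) {
        return 0;
    }
    key_len = strlen(key);
    p = options;
    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t item_len = (end != NULL) ? (size_t)(end - p) : strlen(p);

        /* The item matches when it starts with "key=" */
        if (item_len > key_len && strncmp(p, key, key_len) == 0
                && p[key_len] == '=') {
            size_t val_len = item_len - key_len - 1;
            if (val_len >= value_size) {
                return 0;       /* value does not fit in the buffer */
            }
            memcpy(value, p + key_len + 1, val_len);
            value[val_len] = '\0';
            return 1;
        }
        if (end == NULL) {
            break;              /* that was the last item */
        }
        p = end + 1;
    }
    return 0;
}
```

One pass over a string of a few dozen bytes is all it takes, and new keys can be added later without changing any function signature.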

      Proposed Example Usage

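#include <stdio.h>
#include <stdlib.h>

#include "npt.h"

int
main(void) {
     NptEnv *npt = NULL;

     /* start_up_options and shutdown_options are option strings supplied
      * by the caller; this draft does not define them. */
     NPT_INITIALIZE(&npt, NPT_VERSION_STRING, start_up_options);

     if ( npt != NULL ) {
         int new_len;
         char output[64];

         /* Convert 21 bytes of UTF-8 into the platform encoding. */
         new_len = npt->utf8ToPlatform("some utf-8 byte array", 21,
                                       output, 64);
     } else {
         fprintf(stderr, "NPT interface not available\n");
         exit(1);
     }

     NPT_TERMINATE(npt, shutdown_options);
     return 0;
}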

      Compatibility and Extensibility

The user should never know the size of the NptEnv or any of the objects 
returned by this interface (struct LogInst, struct HeapInst, etc.). The 
field offsets in these structs must never change so that older code 
compiled to an older interface will continue to work. The version string 
is a simple "major.minor.micro" version number and the runtime version 
of the library must be able to support the "major.minor" version of this 
library for the initialization to be successful. This should be checked 
at initialization automatically.
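
The version check described above could look roughly like the sketch below. The document does not spell out what "support" means, so the rule used here (same major number, runtime minor number at least the requested one, micro ignored) is an assumption, as is the function name npt_version_compatible().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical check: can a library whose version is 'runtime' serve a
 * caller compiled against 'requested'?  Both are "major.minor.micro"
 * strings.  Assumed rule: majors must match and the runtime minor must
 * not be older than the requested minor; micro is ignored.
 * Returns 1 if compatible, 0 if not or if a string cannot be parsed. */
int npt_version_compatible(const char *requested, const char *runtime)
{
    int req_major, req_minor;
    int run_major, run_minor;

    assert(requested != NULL && runtime != NULL);
    if (sscanf(requested, "%d.%d", &req_major, &req_minor) != 2) {
        return 0;
    }
    if (sscanf(runtime, "%d.%d", &run_major, &run_minor) != 2) {
        return 0;
    }
    return run_major == req_major && run_minor >= req_minor;
}
```

Under this rule, new fields can only ever be appended to NptEnv in a new minor version, which is what keeps the existing field offsets stable for older callers.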

      UTF-8 Related

Here are a few possible UTF related interfaces (fields in NptEnv):

/* UTF-8 to and from Platform encoding */
int  (JNICALL *utf8ToPlatform)(struct UtfInst *ui, jbyte *utf8, int len,
                               char *output, int outputMaxLen);
int  (JNICALL *utf8FromPlatform)(struct UtfInst *ui, char *str, int len,
                                 jbyte *output, int outputMaxLen);

/* UTF-8 to Unicode, Unicode to UTF-8 Modified or Standard */
int  (JNICALL *utf8ToUtf16)(struct UtfInst *ui, jbyte *utf8, int len,
                            jchar *output, int outputMaxLen);
int  (JNICALL *utf16ToUtf8m)(struct UtfInst *ui, jchar *utf16, int len,
                             jbyte *output, int outputMaxLen);
int  (JNICALL *utf16ToUtf8s)(struct UtfInst *ui, jchar *utf16, int len,
                             jbyte *output, int outputMaxLen);

/* UTF-8 Standard to UTF-8 Modified */
int  (JNICALL *utf8sToUtf8mLength)(struct UtfInst *ui, jbyte *string,
                                   int length);
void (JNICALL *utf8sToUtf8m)(struct UtfInst *ui, jbyte *string, int length,
                             jbyte *new_string, int new_length);

/* UTF-8 Modified to UTF-8 Standard */
int  (JNICALL *utf8mToUtf8sLength)(struct UtfInst *ui, jbyte *string,
                                   int length);
void (JNICALL *utf8mToUtf8s)(struct UtfInst *ui, jbyte *string, int length,
                             jbyte *new_string, int new_length);
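
To make the Standard to Modified UTF-8 pair more concrete, here is a minimal stand-alone sketch of what the two functions might do. It assumes well-formed Standard UTF-8 input, uses unsigned char instead of jbyte so that it compiles without jni.h, and drops the UtfInst argument; the function names are illustrative only. Modified UTF-8 differs from Standard UTF-8 in two ways: a NUL byte is written as C0 80, and a 4-byte sequence is written as a surrogate pair of two 3-byte sequences.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char u8;

/* True if s[i] starts a complete 4-byte Standard UTF-8 sequence. */
static int is_four_byte(const u8 *s, int i, int length)
{
    return (s[i] & 0xF8) == 0xF0 && i + 3 < length;
}

/* Number of bytes the Modified UTF-8 form of s will need. */
int utf8s_to_utf8m_length(const u8 *s, int length)
{
    int i = 0, out = 0;

    assert(s != NULL && length >= 0);
    while (i < length) {
        if (s[i] == 0x00)                   { out += 2; i += 1; }
        else if (is_four_byte(s, i, length)) { out += 6; i += 4; }
        else                                { out += 1; i += 1; }
    }
    return out;
}

/* Write one UTF-16 code unit as a 3-byte sequence; returns new index. */
static int put_unit(u8 *out, int j, unsigned unit)
{
    out[j++] = (u8)(0xE0 | (unit >> 12));
    out[j++] = (u8)(0x80 | ((unit >> 6) & 0x3F));
    out[j++] = (u8)(0x80 | (unit & 0x3F));
    return j;
}

/* Convert; out_length must be the value returned by the length call. */
void utf8s_to_utf8m(const u8 *s, int length, u8 *out, int out_length)
{
    int i = 0, j = 0;

    assert(s != NULL && out != NULL);
    assert(out_length == utf8s_to_utf8m_length(s, length));
    (void)out_length;
    while (i < length) {
        if (s[i] == 0x00) {
            out[j++] = 0xC0;                /* NUL becomes C0 80 */
            out[j++] = 0x80;
            i += 1;
        } else if (is_four_byte(s, i, length)) {
            unsigned long cp = ((unsigned long)(s[i] & 0x07) << 18)
                             | ((unsigned long)(s[i + 1] & 0x3F) << 12)
                             | ((unsigned long)(s[i + 2] & 0x3F) << 6)
                             |  (unsigned long)(s[i + 3] & 0x3F);
            unsigned long v = cp - 0x10000UL;
            j = put_unit(out, j, (unsigned)(0xD800 + (v >> 10)));
            j = put_unit(out, j, (unsigned)(0xDC00 + (v & 0x3FF)));
            i += 4;
        } else {
            out[j++] = s[i];                /* 1-3 byte forms are shared */
            i += 1;
        }
    }
}
```

The two-call pattern (ask for the length, then convert into a buffer of that size) mirrors the utf8sToUtf8mLength/utf8sToUtf8m pair above and keeps allocation in the caller's hands.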

      Heap Management

One of the more dangerous parts of writing native code is the handling 
of the heap; too much Java programming usually causes JNI programmers to 
be sloppy :^). The use of malloc() and free(), and all their relations, 
continues to be an error prone activity, sometimes causing failures in 
code that is completely unrelated to the buggy code, or causing memory 
leaks that can be hard to track down.  Performance is also an issue with 
managing the memory used by the various native code libraries: too many 
malloc() calls can slow down the agent code, and sometimes slow down the 
Java Virtual Machine, which could also be using malloc().  If we provide 
a way to create multiple heap instances, and to do allocations from 
blocks or chunks of memory, then specific individual free()'s could be 
replaced with a global free of the entire heap or the blocks.

These are the kinds of Heap interfaces I was thinking about:

/* Create a new heap, or delete the entire heap (managed and unmanaged)
 *   example heapInitialize options:
 *     "init=4096,incr=1024,limit=0x00ffffff,zap=yes,watch=full"
 *   example heapTerminate options: "verify=yes,zap=no"
 *   Environment variables could be used to dynamically add to these
 *   options.
 */
struct HeapInst* (JNICALL *heapInitialize)(char *options);
void             (JNICALL *heapTerminate)(struct HeapInst *heap, char *options);

/* Allocate memory from a specific heap, individually managed or unmanaged
 *   (Unmanaged means less overhead and tracking; you can't realloc or
 *    individually free these pieces of memory)
 */
const void * (JNICALL *heapAlloc)(struct HeapInst *heap, int size);
void *       (JNICALL *heapAllocManaged)(struct HeapInst *heap, int size);
void *       (JNICALL *heapReallocManaged)(struct HeapInst *heap, void *ptr,
                                           int size);
void *       (JNICALL *heapFreeManaged)(struct HeapInst *heap, void *ptr);

Often simple agent libraries only need to allocate incremental amounts 
of space that is never freed until it's time to report or terminate. 
Using the above interfaces, a single call to heapTerminate() is all that 
is needed.  Various protection code can be added internally and the 
options could be used to turn on tracing or logging of the calls and the 
status of the heaps.
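
The "allocate incrementally, free everything at once" pattern could be sketched as a simple chunked arena, shown below. This only illustrates the unmanaged case under stated assumptions: the names (Arena, arena_alloc, etc.) are invented for the example, the sizes are passed as plain arguments rather than as an option string, and none of the zap/watch/verify protections are included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Strictest alignment we assume callers need. */
typedef union { long long ll; long double ld; void *p; } ArenaAlign;
#define ARENA_ROUND(n) \
    ((((n) + sizeof(ArenaAlign) - 1) / sizeof(ArenaAlign)) * sizeof(ArenaAlign))

typedef struct ArenaChunk {
    struct ArenaChunk *next;    /* previously filled chunk, or NULL */
    size_t used;                /* bytes handed out from this chunk */
    size_t size;                /* payload capacity in bytes */
} ArenaChunk;

typedef struct Arena {
    ArenaChunk *chunks;         /* newest chunk first */
    size_t incr;                /* minimum payload size of a new chunk */
} Arena;

static ArenaChunk *arena_new_chunk(size_t size, ArenaChunk *next)
{
    ArenaChunk *c = malloc(ARENA_ROUND(sizeof(ArenaChunk)) + size);
    if (c == NULL) return NULL;
    c->next = next;
    c->used = 0;
    c->size = size;
    return c;
}

/* Create an arena whose first chunk holds 'init' bytes. */
Arena *arena_initialize(size_t init, size_t incr)
{
    Arena *a = malloc(sizeof *a);
    if (a == NULL) return NULL;
    a->incr = ARENA_ROUND(incr);
    a->chunks = arena_new_chunk(ARENA_ROUND(init), NULL);
    if (a->chunks == NULL) { free(a); return NULL; }
    return a;
}

/* Unmanaged allocation: bump a pointer, add a chunk when out of room. */
void *arena_alloc(Arena *a, size_t size)
{
    ArenaChunk *c;
    void *p;

    assert(a != NULL);
    size = ARENA_ROUND(size);
    c = a->chunks;
    if (c->size - c->used < size) {
        size_t chunk_size = size > a->incr ? size : a->incr;
        c = arena_new_chunk(chunk_size, a->chunks);
        if (c == NULL) return NULL;
        a->chunks = c;
    }
    p = (char *)c + ARENA_ROUND(sizeof(ArenaChunk)) + c->used;
    c->used += size;
    return p;
}

/* Release every chunk in one pass; returns how many were released. */
int arena_terminate(Arena *a)
{
    int released = 0;
    ArenaChunk *c;

    assert(a != NULL);
    c = a->chunks;
    while (c != NULL) {
        ArenaChunk *next = c->next;
        free(c);
        released++;
        c = next;
    }
    free(a);
    return released;
}
```

The cost of an allocation is a size round-up and a pointer bump, and the caller never has to pair each allocation with its own free().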

Macros could be provided for dynamic stack allocated space, e.g. 
NptHeapLocalAlloc() and NptHeapLocalFree(), that could use the Solaris 
alloca() when available, or use heapAllocManaged() and heapFreeManaged() 
when alloca() functionality wasn't available. e.g.

#ifndef solaris
     #define NptHeapLocalAlloc(heap, size)   heapAllocManaged(heap, size)
     #define NptHeapLocalFree(heap, ptr)     heapFreeManaged(heap, ptr)
#else
     #define NptHeapLocalAlloc(heap, size)   ((void*)alloca(size))
     #define NptHeapLocalFree(heap, ptr)
#endif

      Native Logging

Logging events or tracing executed code in the native world is a bit 
tricky. Synchronization in a pure native world is very different from 
platform to platform, and every native library that has logging or 
tracing sends its log to a separate file or place. There is little 
consistency right now in the available logging and tracing on the native 
side.  If we could generate the standard ULF (Uniform Logging Format) 
and somehow merge these logs (maybe even with the Java logging), it seems 
like this would be a good thing. But we need a common interface. And 
maybe the synchronization is unnecessary if we can design the interface 
correctly; I'd prefer to not have any synchronization in it.

struct LogInst;

/* Initialize or terminate a logging session */
struct LogInst* (JNICALL *InitializeLog)(char *options);
struct LogInst* (JNICALL *TerminateLog)(char *options);

/* Uniform Logging Format Entry (ULF)
 * "[#|Date&Time&Zone|LogLevel|ProductName|ModuleID
 *  |OptionalKey1=Value1;OptionalKeyN=ValueN|MessageID:MessageText|#]\n"
 */
struct LogInst* (JNICALL *Log)(struct LogInst *log, int level,
         const char *module, const char *optional,
         const char *messageID, const char *message);
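
As an illustration of the ULF entry layout in the comment above, here is a minimal sketch that formats one entry into a caller-supplied buffer. The function name ulf_format(), the string-typed level, and the explicit date/time argument are assumptions made so the example stays self-contained; it takes no locks and touches no shared state, which is in the spirit of keeping synchronization out of the interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical formatter: write one ULF entry into buf.
 * Returns the number of characters written (excluding the NUL), or -1
 * if the entry does not fit in 'size' bytes. */
int ulf_format(char *buf, size_t size,
               const char *datetime, const char *level,
               const char *product, const char *module,
               const char *optional,
               const char *messageID, const char *message)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "[#|%s|%s|%s|%s|%s|%s:%s|#]\n",
                 datetime, level, product, module, optional,
                 messageID, message);
    if (n < 0 || (size_t)n >= size) {
        return -1;              /* output error or truncated entry */
    }
    return n;
}
```

Because each entry is built in the caller's own buffer, several threads can format entries concurrently; only the final write to the shared log destination would need any ordering.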

This hasn't been fully thought out yet. It's possible the VM itself 
could use this library. However, I don't think that's a big issue: since 
the VM developers always copy their libjvm.so into a jdk install area, 
it should just work if libnpt is there.

      Error Messages

Error messages from native code to stderr (or sometimes stdout) are rare, 
but usually inconsistent. I haven't any proposal here yet, but this 
seems like a potential area where we could benefit from more 
consistency and sharing.
Something I noticed with the JNI calls that start up a JavaVM is that 
there is some kind of stderr/stdout re-direction option, something I 
suspect is slightly broken when you consider the native libraries doing 
arbitrary printf's or fprintf's to stderr.
Another issue I've seen is that most classnames printed out in 
stderr/stdout messages are UTF-8 bytes, yet printf/fprintf really 
expect the default platform encoding. This seems to be an I18n issue 
that could be fixed by localizing the error messages here somehow.
If you add in the Modified UTF-8 vs. Standard UTF-8 complications, this 
is a bit of a mess.
Linux gets away with it because its default encoding is UTF-8; 
Solaris and Windows seem to have some problems here.
The Logging messages would have the same issue with encodings.

      Hash Lookup Table

Functions like hsearch(3C) on Solaris are old and not MT-safe; we sure 
could use some functionality in this area.
It seems like everybody has implemented their own native code hash table 
logic. :^(
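
For comparison with hsearch(3C), which keeps a single table in hidden global state, here is a minimal sketch of a table whose state lives entirely in a caller-owned instance. All names (HashInst, hashPut, etc.) are invented for this example. The sketch has no internal locking: threads can use separate tables freely, but sharing one table across threads would still need a lock supplied by the caller or by the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HASH_BUCKETS 64

typedef struct HashEntry {
    struct HashEntry *next;     /* next entry in the same bucket */
    char *key;                  /* private copy of the key */
    void *value;                /* caller-owned value pointer */
} HashEntry;

typedef struct HashInst {
    HashEntry *buckets[HASH_BUCKETS];
} HashInst;

/* FNV-1a hash of a NUL terminated string, reduced to a bucket index. */
static unsigned hash_bucket(const char *s)
{
    unsigned long h = 2166136261UL;
    while (*s != '\0') {
        h ^= (unsigned char)*s++;
        h = (h * 16777619UL) & 0xFFFFFFFFUL;
    }
    return (unsigned)(h % HASH_BUCKETS);
}

HashInst *hashInitialize(void)
{
    return calloc(1, sizeof(HashInst));
}

/* Insert or replace; returns 0 on success, -1 if out of memory. */
int hashPut(HashInst *h, const char *key, void *value)
{
    unsigned b;
    HashEntry *e;

    assert(h != NULL && key != NULL);
    b = hash_bucket(key);
    for (e = h->buckets[b]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) { e->value = value; return 0; }
    }
    e = malloc(sizeof *e);
    if (e == NULL) return -1;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) { free(e); return -1; }
    strcpy(e->key, key);
    e->value = value;
    e->next = h->buckets[b];
    h->buckets[b] = e;
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
void *hashGet(const HashInst *h, const char *key)
{
    const HashEntry *e;

    assert(h != NULL && key != NULL);
    for (e = h->buckets[hash_bucket(key)]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) return e->value;
    }
    return NULL;
}

void hashTerminate(HashInst *h)
{
    int i;

    assert(h != NULL);
    for (i = 0; i < HASH_BUCKETS; i++) {
        HashEntry *e = h->buckets[i];
        while (e != NULL) {
            HashEntry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
    }
    free(h);
}
```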

      Dictionary

Using the above Hash Lookup Table, we could create a shared dictionary 
mechanism.
I know it's hard to get a bunch of engineers to agree on the interface, 
but I'd rather re-use something like this than spend months getting it 
right and then maintaining all the code.

      Platform Specific

Things like getting the current directory, definitions of 'errno', etc.

Just browsing the src/windows and src/solaris directory trees in the 
J2SE workspace should yield a long list of functions that needed to be 
completely different between Windows and Solaris, and you will likely 
find multiple copies of the same basic coding.
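
As a small example of the kind of function meant here, getting the current directory could be wrapped once so that callers never see the platform difference. The wrapper name npt_current_directory() is an assumption made for illustration; the underlying calls are getcwd() on POSIX systems and _getcwd() on Windows.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <direct.h>
#else
#include <unistd.h>
#endif

/* Hypothetical wrapper: copy the current working directory into buf.
 * Returns 0 on success, -1 if it cannot be determined or does not fit. */
int npt_current_directory(char *buf, size_t size)
{
    assert(buf != NULL);
    if (size == 0) {
        return -1;
    }
#ifdef _WIN32
    return _getcwd(buf, (int)size) != NULL ? 0 : -1;
#else
    return getcwd(buf, size) != NULL ? 0 : -1;
#endif
}
```

The single #ifdef lives inside the library, so code written against the wrapper stays free of platform conditionals.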

On 2011-5-16 21:11, Jing LV wrote:
> Hello Mario,
>
>       That's good news! Your rich experience will greatly help everyone
> on this topic! :)
>       In the early days I was also developing a portlib, and I focused on
> a common native interface covering the differences between windows, linux
> and some other platforms. I understand service providers, but applying
> c/c++ implementations as providers is a bit new to me; it is really a new
> approach we can consider.
>
>       If I understand correctly, then at least we can count on a realtime
> (QNX) system layer :D  I also posted on the macos list (if there is some
> other platform list please tell me) to see if we can win others' interest
> in this topic.
>
> On 2011-5-16 14:29, neugens.limasoftware at gmail.com wrote:
>> This is a very good idea.
>>
>> I have a lot of experience in porting Java to weird OSes and I can tell
>> you I had lots of trouble trying to unify the various native layers.
>>
>> On some OSes, a common native interface should be enough, but I think a
>> better solution is to have service providers together with native
>> abstractions, similar to the concept we already have in the current
>> filesystem api.
>>
>> In either case, this is definitely a great idea and you can count on
>> me for some manpower :) I have access to a QNX box for the moment
>> (which is mostly compatible, needs some minor things in the network
>> interfaces, and the graphics layer of course), but I can help with
>> other OS as well if I get access to them somehow.
>>
>> Mario
>> -- 
>> Sent from HTC Desire...
>>
>> pgp key: http://subkeys.pgp.net/ PGP Key ID: 80F240CF
>> Fingerprint: BA39 9666 94EC 8B73 27FA  FC7C 4086 63E3 80F2 40CF
>>
>> http://www.icedrobot.org
>>
>> Proud GNU Classpath developer: http://www.classpath.org/
>> Read About us at: http://planet.classpath.org
>> OpenJDK: http://openjdk.java.net/projects/caciocavallo/
>>
>> Please, support open standards:
>> http://endsoftpatents.org/
>>
>>
>> ----- Reply message -----
>> From: "Jing LV"<lvjing at linux.vnet.ibm.com>
>> Date: Mon, May 16, 2011 04:08
>> Subject: An idea: Add a port layer
>> To:<bsd-port-dev at openjdk.java.net>
>>
>> Hello BSD developers,
>>
>> I see that OpenJDK will have more platforms: besides BSD and MacOS, there
>> are discussions about AIX. This is great news for the community, as well
>> as a new challenge: managing different native implementations for newly
>> added platforms as well as the earlier ones. The challenges we may face
>> are:
>> 1. In the current implementation, we have native implementations in the
>> windows/linux/solaris directories, which define the same JNI methods.
>> But they actually have the same or very similar logic. This is not very
>> manageable. If some logic changes, we need to modify the implementation
>> on all platforms. This may cause problems, as no one knows all the
>> platform differences, and the platform developers need to understand
>> the logic before making the modification. It may be extra work for
>> developers like the BSD/AIX engineers.
>> 2. Different platforms offer different system APIs, and even different
>> versions of a system have different APIs; in the current implementation
>> I see some code like
>> #ifdef someplatform
>> use some API
>> #endif
>> This increases the complexity of the code and makes the code ugly. Also,
>> developers may have much trouble reading and modifying it when necessary.
>> 3. OpenJDK is working on Project Jigsaw/modularization; it may run into
>> trouble if the native API and logic are separated at the platform level
>> rather than at the functional level.
>>
>> I am wondering about adding a port layer and leaving all API differences
>> in this layer. The JNI developers can then use a unified API; for example,
>> "int write(fd, byte[])" should work on all platforms OpenJDK supports,
>> including BSD/linux, windows, MacOS etc. This may help us developers:
>> 1. The platform developers can focus on covering the API differences and
>> need not care about the upper-level logic. For example, they focus on
>> writing some given bytes into the given fd, and do not care what the fd
>> is or how to deal with the buffers etc., so we write it only once, and
>> only update for new APIs when necessary. Meanwhile, the classlib
>> developers can use a unified system API and focus on the logic; we write
>> the code once for all platforms. It saves time and effort on both sides.
>> 2. The code is then clear: no #ifdef is required. This helps developers
>> read and understand it, and makes it much easier to modify.
>> 3. It may help to modularize the jdk as well.
>>
>> A new portlib may also have some problems, like modifications to the
>> current code, and performance. We need to define the port layer well, and
>> write an excellent build script to avoid performance degradation from the
>> layer. However, in the long run, the port layer will really help the
>> developers as well as the JDK, as it does in some other open source JDKs.
>>
>> I believe the port layer would help BSD developers a lot in code
>> maintenance and when adding new features. I'd like to hear your
>> opinions/comments/suggestions on this topic.
>>
>> Thanks!
>>
>>
>> -- 
>> Best Regards,
>> Jimmy, Jing LV
>>
>>
>>
>>
>

-- 
Best Regards,
Jimmy, Jing LV
