From cowwoc at bbs.darktech.org  Thu Jan  3 03:25:02 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Wed, 2 Jan 2008 19:25:02 -0800 (PST)
Subject: Fixing bug #4128333: Serializing strings restricted to 64k bytes
Message-ID: <14591177.post@talk.nabble.com>


Hi,

Bug URL: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4128333

I'd like to start a discussion on how we can possibly solve this bug in a
backwards-compatible way. Here is my personal proposal but I'd love to hear
your own ideas!

1) Add a new method to DataInputStream/DataOutputStream for
encoding/decoding longer Strings (ideally this new encoding should have no
fixed limit). I think this should be done independently of Serialization as
this is needed by other clients.

2) Add a method to ObjectOutputStream to enable the new encoding format
(which is not backwards compatible). The default would be to use the old
encoding format but developers of new applications would be encouraged to
use the new format. I recommend ObjectOutputStream.setMinimumVersion(enum).
The default would be ObjectOutputStream.setMinimumVersion(JDK1_1) which
indicates the format is backwards-compatible to JDK 1.1 but we would add
ObjectOutputStream.setMinimumVersion(JDK1_7) for the new file format.
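
As a rough illustration of step 1, here is a minimal sketch of one possible
encoding: a 4-byte length prefix followed by standard UTF-8 bytes. The class
and method names (LongStringCodec, writeLongUTF, readLongUTF) are hypothetical
and not part of any JDK API, and the int prefix still caps the encoded length
at about 2 GB, so it only approximates "no fixed limit".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; none of these names exist in the JDK.
final class LongStringCodec {
    // Writes a 4-byte length followed by standard UTF-8 bytes
    // (not the "modified UTF-8" with a 2-byte length that writeUTF uses).
    static void writeLongUTF(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads a string written by writeLongUTF.
    static String readLongUTF(DataInputStream in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative length: " + length);
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Round-trips a string through an in-memory stream.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                writeLongUTF(out, s);
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return readLongUTF(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping the new format in separate methods leaves the existing
writeUTF/readUTF wire format untouched, which is the backwards-compatibility
property the proposal is after.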

Please let me know what you think!
Gili
-- 
View this message in context: http://www.nabble.com/Fixing-bug--4128333%3A-Serializing-strings-restricted-to-64k-bytes-tp14591177p14591177.html
Sent from the OpenJDK Core Libraries mailing list archive at Nabble.com.


From cowwoc at bbs.darktech.org  Thu Jan  3 03:32:29 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Wed, 2 Jan 2008 19:32:29 -0800 (PST)
Subject: Fixing bug #4128333: Serializing strings restricted to 64k bytes
In-Reply-To: <14591177.post@talk.nabble.com>
References: <14591177.post@talk.nabble.com>
Message-ID: <14591181.post@talk.nabble.com>


I see now that ObjectOutputStream.useProtocolVersion(int) already exists:
http://java.sun.com/javase/6/docs/api/java/io/ObjectOutputStream.html#useProtocolVersion(int)

so the only thing we'd need to do in step 2 is add
ObjectStreamConstants.PROTOCOL_VERSION_3.
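
For reference, a minimal sketch of how the existing hook is used today.
PROTOCOL_VERSION_2 is a real constant; PROTOCOL_VERSION_3 is only the proposal
above and does not exist. The class name ProtocolVersionDemo is made up for
the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

// Hypothetical demo class; only the java.io calls are real API.
final class ProtocolVersionDemo {
    static byte[] writeWithVersion2(Object value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                // Must be called before any object is written,
                // otherwise an IllegalStateException is thrown.
                out.useProtocolVersion(ObjectStreamConstants.PROTOCOL_VERSION_2);
                out.writeObject(value);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```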


From peter.jones at sun.com  Fri Jan  4 16:04:25 2008
From: peter.jones at sun.com (Peter Jones)
Date: Fri, 4 Jan 2008 11:04:25 -0500
Subject: Fixing bug #4128333: Serializing strings restricted to 64k bytes
In-Reply-To: <14591177.post@talk.nabble.com>
References: <14591177.post@talk.nabble.com>
Message-ID: <20080104160425.GA10619@east>

If your concern is primarily about Java object serialization, note
that it has supported serializing strings with UTF-8 encoding larger
than 64KB since J2SE 1.3:

        http://bugs.sun.com/view_bug.do?bug_id=4217676

I presume that 4128333 remains open for Data{Input,Output}Stream only.
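
The difference can be checked with a small sketch like the one below. The
class and helper names are invented for the example; only the java.io calls
are real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class contrasting the two APIs.
final class LongStringLimits {
    // Builds a string of n copies of c.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    // Returns true if DataOutputStream.writeUTF refuses the string.
    static boolean writeUTFRejects(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);
            return false;
        } catch (UTFDataFormatException e) {
            return true; // encoded form exceeds 65535 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serializes and deserializes the string with object serialization.
    static String serializeRoundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(s);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (String) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```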

-- Peter


On Wed, Jan 02, 2008 at 07:25:02PM -0800, cowwoc wrote:
> 
> Hi,
> 
> Bug URL: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4128333
> 
> I'd like to start a discussion on how we can possibly solve this bug in a
> backwards-compatible way. Here is my personal proposal but I'd love to hear
> your own ideas!
> 
> 1) Add a new method to DataInputStream/DataOutputStream for
> encoding/decoding longer Strings (ideally this new encoding should have no
> fixed limit). I think this should be done independently of Serialization as
> this is needed by other clients.
> 
> 2) Add a method to ObjectOutputStream to enable the new encoding format
> (which is not backwards compatible). The default would be to use the old
> encoding format but developers of new applications would be encouraged to
> use the new format. I recommend ObjectOutputStream.setMinimumVersion(enum).
> The default would be ObjectOutputStream.setMinimumVersion(JDK1_1) which
> indicates the format is backwards-compatible to JDK 1.1 but we would add
> ObjectOutputStream.setMinimumVersion(JDK1_7) for the new file format.
> 
> Please let me know what you think!
> Gili


From jackieict at gmail.com  Sat Jan  5 12:01:35 2008
From: jackieict at gmail.com (zhang Jackie)
Date: Sat, 5 Jan 2008 20:01:35 +0800
Subject: RMI benchmark
Message-ID: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com>

Hi, everyone!
      Recently I have wanted to compare the performance of RMI against my own
version, which has a few small changes. Can you point me to some microbenchmarks
or other suites used to measure the performance of RMI? I googled the keywords
"RMI benchmark" but couldn't find any.

From richard.warburton at gmail.com  Sat Jan  5 12:16:24 2008
From: richard.warburton at gmail.com (Richard Warburton)
Date: Sat, 5 Jan 2008 12:16:24 +0000
Subject: RMI benchmark
In-Reply-To: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com>
References: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com>
Message-ID: <749b5dd60801050416r18a25611lfc4bf10c372cbd7a@mail.gmail.com>

On Jan 5, 2008 12:01 PM, zhang Jackie <jackieict at gmail.com> wrote:
> Hi, everyone!
>       Recently ,I want to have a performance comparision on RMI and my own
> version with little changes. Can you give me some microbenchmarks and some
> other suites used for estimate the performance of RMI? I googled the keyword
> "RMI benchmark", but cant get one.

The KaRMI system was originally meant to be a faster drop-in
replacement for RMI, and used to have some benchmarks associated with
it.  Unfortunately I can no longer find them.  It is available
at:

http://svn.ipd.uni-karlsruhe.de/trac/javaparty/wiki/KaRMI

  Richard Warburton


From linuxhippy at gmail.com  Mon Jan  7 00:11:24 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Mon, 7 Jan 2008 01:11:24 +0100
Subject: Performance regression in java.util.zip.Deflater
In-Reply-To: <476B0ABA.6030102@sun.com>
References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com>
	<476ADDAF.2070409@sun.com>
	<194f62550712201336y3380808bv3726d891873be277@mail.gmail.com>
	<476AEDCD.6080504@sun.com>
	<194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com>
	<476B0ABA.6030102@sun.com>
Message-ID: <194f62550801061611x7e363a61q95e74b89db6a17ff@mail.gmail.com>

Hello again,

I implemented two prototypes of the striding to see how they perform
and how complex the code would be. Both prototypes implement the
striding on the Java side (calling a JNI method for each stride), which I
plan to change to minimize overhead and hide the striding (unless Sun
would prefer to keep it in Java).

The first prototype uses two direct ByteBuffers and copies the data
to/from the input/output arrays through them, so the whole input/output
data is copied only once.
The second prototype uses striding (1 KB chunks) inside the critical
section. I also measured how long the critical section is held in the
worst case.

Buffers, 2k/1k stride size (input buffer: 2k, output buffer: 1k):
1.) Compress 50 MB with level=0 / 100-byte output array: 603 ms
2.) Compress 50 MB with level=1 / 100-byte output array: 277 ms
3.) Compress 50 MB with level=9 / 1 KB output array:     784 ms

Critical, 1k stride size (no copying):
1.) Compress 50 MB with level=0 / 100-byte output array: 720 ms
2.) Compress 50 MB with level=1 / 100-byte output array: 270 ms
3.) Compress 50 MB with level=9 / 1 KB output array:     778 ms

The first two measurements are worst-case scenarios that measure the
overhead of striding when the output buffer is far too small. Here
the copying approach is even faster in the level=0 case (maybe
GetPrimitiveArrayCritical has more overhead than GetDirectBufferAddress).
Measurement 3 is a real-world example with high compression, where the
copying overhead should not be high, yet it does show up (though only as
a few percent). I did many more measurements (I don't remember exactly
what I measured, it was some time ago) and my conclusion was that,
especially for slightly larger buffers (e.g. 8k/4k), the copying overhead
is really low; oprofile also showed ~2-5% in memcpy.
Because the non-copying critical-section approach has to use small
strides, the two are almost equally fast; in real-world use cases the
non-copying approach was a few ms faster.

However, there is one thing I don't like about the copying solution: it is
quite complex, whereas the critical-section approach is quite clean.

I benchmarked how long the critical section is held with compression
level 9 and incompressible data (which I assume is the worst case) and
1 KB strides, in µs:
530
351
615
339
2
1
2
292
3470 (worst case over all runs)
256
341

So on my Core2Duo (2 GHz) I see worst cases of about 3 ms, including
JNI overhead, with 1 KB strides. Making the strides smaller won't help, as
zlib waits until it has enough data to compress (that's why there are
2 µs calls, which I assume only move data into zlib's
compression buffer).

On the hotspot-runtime list I started a thread about "how evil"
GetPrimitiveArrayCritical is; they said it only blocks the GC. I
don't know whether 3 ms is problematic. However, keeping in mind that
Deflater is quite slow anyway, I guess the copying overhead is not
relevant.

So to sum it up, for Deflater I would recommend either the
non-copying/critical solution or a copying solution, both of which work in
strides. The copying solution would allocate the stride buffers in
deflater_init() and free them in deflater_end(), doing the looping and
copying on the native side.
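
To make the striding idea concrete, here is a minimal sketch at the level of
the public java.util.zip API. It is not the native prototype described above:
it only shows the loop shape (feed one stride, drain the output, repeat). The
class name StridedDeflate and the strideSize parameter are assumptions made
for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class; only the java.util.zip calls are real API.
final class StridedDeflate {
    // Compresses input by handing it to the Deflater one stride at a time.
    static byte[] compressInStrides(byte[] input, int level, int strideSize) {
        Deflater deflater = new Deflater(level);
        try {
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            byte[] out = new byte[strideSize];
            for (int pos = 0; pos < input.length; pos += strideSize) {
                int len = Math.min(strideSize, input.length - pos);
                deflater.setInput(input, pos, len);
                // Drain until the deflater has consumed this stride.
                while (!deflater.needsInput()) {
                    int n = deflater.deflate(out);
                    result.write(out, 0, n);
                }
            }
            deflater.finish();
            while (!deflater.finished()) {
                int n = deflater.deflate(out);
                result.write(out, 0, n);
            }
            return result.toByteArray();
        } finally {
            deflater.end(); // release native zlib state
        }
    }

    // Inflates data whose uncompressed length is known, for checking the result.
    static byte[] decompress(byte[] compressed, int expectedLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] result = new byte[expectedLength];
            int total = 0;
            while (!inflater.finished() && total < expectedLength) {
                int n = inflater.inflate(result, total, expectedLength - total);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unexpected input
                }
                total += n;
            }
            return Arrays.copyOf(result, total);
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }
}
```

In this shape each deflate() call works on at most one stride of input, which
is what bounds the time spent per call.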

However, for Inflater, which is a lot faster (and has more predictable
pause times), I would not recommend a copying approach. The remaining
question seems to be how long a pause is tolerable. Any ideas?

I would be interested in some ideas and feedback. What do you think
would be a good solution?

Thank you in advance, lg Clemens

PS: The striding+GetPrimitive... is even used by NIO for copying
java-arrays into direct-ByteBuffers:
    while (length > 0) {
        size = (length > MBYTE ? MBYTE : length);
        GETCRITICAL(bytes, env, dst);
        memcpy(bytes + dstPos, (void *)srcAddr, size);
        RELEASECRITICAL(bytes, env, dst, 0);
    ................

From Alan.Bateman at Sun.COM  Mon Jan  7 08:35:36 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Mon, 07 Jan 2008 08:35:36 +0000
Subject: Performance regression in java.util.zip.Deflater
In-Reply-To: <194f62550801061611x7e363a61q95e74b89db6a17ff@mail.gmail.com>
References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com>
	<476ADDAF.2070409@sun.com>
	<194f62550712201336y3380808bv3726d891873be277@mail.gmail.com>
	<476AEDCD.6080504@sun.com>
	<194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com>
	<476B0ABA.6030102@sun.com>
	<194f62550801061611x7e363a61q95e74b89db6a17ff@mail.gmail.com>
Message-ID: <4781E458.1010907@sun.com>

Clemens Eisserer wrote:
> :
>
> PS: The striding+GetPrimitive... is even used by NIO for copying
> java-arrays into direct-ByteBuffers:
>     while (length > 0) {
> 	size = (length > MBYTE ? MBYTE : length);
> 	GETCRITICAL(bytes, env, dst);
>  	memcpy(bytes + dstPos, (void *)srcAddr, size);
> 	RELEASECRITICAL(bytes, env, dst, 0);
>     ................
>   
Yes, NIO uses JNI critical sections when copying to/from arrays, but as 
an FYI, we hope to eliminate this native code soon. The replacement uses 
the Unsafe interface to do the copying and will be much faster than the 
current native implementation. To allow for safepoint polling (in the 
VM) it also copies very large arrays/buffers in strides.

-Alan.


From peter.jones at sun.com  Mon Jan  7 16:26:00 2008
From: peter.jones at sun.com (Peter Jones)
Date: Mon, 7 Jan 2008 11:26:00 -0500
Subject: RMI benchmark
In-Reply-To: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com>
References: <13432ab00801050401v4fc4c90dha223feb3b0b2cd34@mail.gmail.com>
Message-ID: <20080107162559.GA1873@east>

On Sat, Jan 05, 2008 at 08:01:35PM +0800, zhang Jackie wrote:
> Hi, everyone!
>       Recently ,I want to have a performance comparision on RMI and my own
> version with little changes. Can you give me some microbenchmarks and some
> other suites used for estimate the performance of RMI? I googled the keyword
> "RMI benchmark", but cant get one.

You can find a microbenchmark suite for RMI, as well as object
serialization, in the "test" tree of the jdk7/jdk repository--
relative to the jdk7 forest, look here:

        jdk/test/java/rmi/reliability/benchmark

There isn't much documentation there, but the script here:

        jdk/test/java/rmi/reliability/scripts/create_benchmark_jars.ksh

shows how to create two JAR files, rmibench.jar and serialbench.jar
from the sources.  Running either JAR with "-h" prints a usage
message.  Each has a default config file that can be altered to
customize which of the microbenchmarks to execute, how many
repetitions of each to run, how much warmup to do, etc.  The RMI suite
can be run all in one VM or in two separate VMs, a "client" and a
"server", possibly on different hosts.

-- Peter


From cowwoc at bbs.darktech.org  Mon Jan  7 18:57:19 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Mon, 7 Jan 2008 10:57:19 -0800 (PST)
Subject: Proposal for improving performance of TreeMap and others
Message-ID: <14673283.post@talk.nabble.com>


I noticed that TreeMap (and maybe other classes) requires a user to either
pass in a Comparator or ensure that all keys implement Comparable. The
TreeMap code then uses a utility method whenever it needs to compare two
keys:


/**
 * Compares two keys using the correct comparison method for this TreeMap.
 */
final int compare(Object k1, Object k2) {
  return comparator == null ? ((Comparable<? super  K>) k1)
  .compareTo((K) k2) : comparator.compare((K) k1, (K) k2);
}

The problem with the above method is that it checks whether comparator is
null once per comparison instead of once when the TreeMap is constructed.
Instead I propose that this check only take place once in the constructors
and the rest of the code assume that a comparator exists. If a comparator is
not provided then you can simply define one as follows:

// Fallback used when the caller supplies no comparator: natural ordering.
comparator = new Comparator<K>()
    {
      @SuppressWarnings("unchecked")
      public int compare(K first, K second)
      {
        return ((Comparable<K>) first).compareTo(second);
      }
    };

This solution should be backwards compatible while improving performance. At
least, that's my guess. There is always the chance that the JIT is smart
enough to optimize away this comparison but I'd rather not rely on JIT
implementation details. I also believe the resulting code is more readable.
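
A self-contained sketch of the idea, outside of TreeMap: the null check is
done once in the constructor and every later comparison goes straight through
the stored comparator. The class name ResolvedComparator is hypothetical and
this is not the actual TreeMap code.

```java
import java.util.Comparator;

// Hypothetical holder that resolves its comparator once, at construction time.
final class ResolvedComparator<K> {
    private final Comparator<? super K> comparator;

    ResolvedComparator(Comparator<? super K> supplied) {
        // The only null check: fall back to natural ordering if none was given.
        if (supplied != null) {
            this.comparator = supplied;
        } else {
            this.comparator = ResolvedComparator.<K>naturalOrder();
        }
    }

    // Comparator that delegates to Comparable; throws ClassCastException
    // at comparison time if the keys are not Comparable.
    @SuppressWarnings("unchecked")
    private static <T> Comparator<T> naturalOrder() {
        return new Comparator<T>() {
            public int compare(T first, T second) {
                return ((Comparable<? super T>) first).compareTo(second);
            }
        };
    }

    // No branch on null here; every call is a plain comparator call.
    int compare(K k1, K k2) {
        return comparator.compare(k1, k2);
    }
}
```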

What do you think?


From linuxhippy at gmail.com  Mon Jan  7 19:56:39 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Mon, 7 Jan 2008 20:56:39 +0100
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <14673283.post@talk.nabble.com>
References: <14673283.post@talk.nabble.com>
Message-ID: <194f62550801071156m7648135du57331ea91e26fa80@mail.gmail.com>

> This solution should be backwards compatible while improving performance. At
> least, that's my guess. There is always the chance that the JIT is smart
> enough to optimize away this comparison but I'd rather not rely on JIT
> implementation details. I also believe the resulting code is more readable.
>
> What do you think?

From a performance point of view there's no (real) difference, I guess, but
I have to agree that the code is more readable and cleaner.
On the other hand it's one more class that has to be shipped and loaded
at startup.
I like your approach, I just don't know the real pros and cons...

lg Clemens


From Thomas.Hawtin at Sun.COM  Mon Jan  7 19:59:53 2008
From: Thomas.Hawtin at Sun.COM (Thomas Hawtin)
Date: Mon, 07 Jan 2008 19:59:53 +0000
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <14673283.post@talk.nabble.com>
References: <14673283.post@talk.nabble.com>
Message-ID: <478284B9.6020603@Sun.COM>

cowwoc wrote:
> I noticed that TreeMap (and maybe other classes) require a user to either
> pass in a Comparator or ensure that all keys must implement Comparable. The
> TreeMap code then uses a utility method whenever it needs to compare two
> keys:

I'm not going to comment about performance, but there is a problem with 
serialisation.

TreeMap.comparator is final (and non-transient).

TreeMaps serialised with earlier versions will be deserialised with null 
comparator. So, comparator would either need to be made non-final or 
sun.misc.Unsafe used.

For the serialisation case, it would be necessary to change writeObject 
to use putFields rather than defaultWriteObject (not very nice, but not 
half as nasty as I originally thought).
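
A minimal sketch of the putFields idea on a stand-in class (not TreeMap
itself): in memory the comparator is never null, but on the wire null is still
written for natural ordering, so the stream format stays the same. For the
read side this sketch uses readResolve to rebuild the object, which is a
different workaround from the non-final field or Unsafe options above. All
class names here are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

// Hypothetical stand-in for a collection with a final comparator field.
final class OrderedHolder<K> implements Serializable {
    private static final long serialVersionUID = 1L;

    // Singleton natural-ordering comparator, recognisable by identity.
    private enum NaturalOrder implements Comparator<Object> {
        INSTANCE;

        @SuppressWarnings("unchecked")
        public int compare(Object first, Object second) {
            return ((Comparable<Object>) first).compareTo(second);
        }
    }

    private final Comparator<? super K> comparator;

    OrderedHolder(Comparator<? super K> supplied) {
        if (supplied != null) {
            this.comparator = supplied;
        } else {
            this.comparator = NaturalOrder.INSTANCE;
        }
    }

    boolean usesNaturalOrdering() {
        return comparator == NaturalOrder.INSTANCE;
    }

    int compare(K a, K b) {
        return comparator.compare(a, b);
    }

    // Write null for natural ordering, as older versions would have.
    private void writeObject(ObjectOutputStream out) throws IOException {
        ObjectOutputStream.PutField fields = out.putFields();
        fields.put("comparator", usesNaturalOrdering() ? null : comparator);
        out.writeFields();
    }

    // Default deserialization may leave the final field null;
    // rebuild through the constructor so the invariant holds again.
    private Object readResolve() {
        return new OrderedHolder<K>(comparator);
    }

    @SuppressWarnings("unchecked")
    static <T> OrderedHolder<T> roundTrip(OrderedHolder<T> holder) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(holder);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (OrderedHolder<T>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```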

Tom Hawtin


From Martin.Buchholz at Sun.COM  Mon Jan  7 21:04:30 2008
From: Martin.Buchholz at Sun.COM (Martin Buchholz)
Date: Mon, 07 Jan 2008 13:04:30 -0800
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <14673283.post@talk.nabble.com>
References: <14673283.post@talk.nabble.com>
Message-ID: <478293DE.9090600@sun.com>

The authors of TreeMap have thought about
eliding comparator null checks:


    /**
     * Version of getEntry using comparator. Split off from getEntry
     * for performance. (This is not worth doing for most methods,
     * that are less dependent on comparator performance, but is
     * worthwhile here.)
     */
    final Entry<K,V> getEntryUsingComparator(Object key) {
        K k = (K) key;
        Comparator<? super K> cpr = comparator;
        if (cpr != null) {
            Entry<K,V> p = root;
            while (p != null) {
                int cmp = cpr.compare(k, p.key);
                if (cmp < 0)
                    p = p.left;
                else if (cmp > 0)
                    p = p.right;
                else
                    return p;
            }
        }
        return null;
    }

As to whether using an explicit Comparator for the "natural ordering"
is a performance improvement, this depends very much on
the implementation of the JIT and the degree of polymorphism of
the call site, and on the prevalence of TreeMaps using "natural
ordering".  At the very least, a null check is very cheap, so it is
unlikely that the proposed change will be a significant performance
improvement, while, on the other hand, there is a good chance that
it will decrease performance for TreeMaps using "natural ordering".

Aside: It's probably a good idea for the comparator for
"natural ordering" to be available via some static method.

Martin


cowwoc wrote:
> I noticed that TreeMap (and maybe other classes) require a user to either
> pass in a Comparator or ensure that all keys must implement Comparable. The
> TreeMap code then uses a utility method whenever it needs to compare two
> keys:
> 
> 
> /**
>  * Compares two keys using the correct comparison method for this TreeMap.
>  */
> final int compare(Object k1, Object k2) {
>   return comparator == null ? ((Comparable<? super  K>) k1)
>   .compareTo((K) k2) : comparator.compare((K) k1, (K) k2);
> }
> 
> The problem with the above method is that it checks whether comparator is
> null once per comparison instead of once when the TreeMap is constructed.
> Instead I propose that this check only take place once in the constructors
> and the rest of the code assume that a comparator exists. If a comparator is
> not provided then you can simply define one as follows:
> 
> comparator = new Comparator<K>()
>     {
>       @SuppressWarnings("unchecked")
>       public int compare(K first, K second)
>       {
>         return ((Comparable<K>) first).compareTo(second);
>       }
>     });
> 
> This solution should be backwards compatible while improving performance. At
> least, that's my guess. There is always the chance that the JIT is smart
> enough to optimize away this comparison but I'd rather not rely on JIT
> implementation details. I also believe the resulting code is more readable.
> 
> What do you think?


From nradov at axolotl.com  Mon Jan  7 21:31:58 2008
From: nradov at axolotl.com (Nick Radov)
Date: Mon, 7 Jan 2008 13:31:58 -0800
Subject: core classes still need to be declared final?
Message-ID: <OF6D270966.3877EC37-ON882573C9.006C1027-882573C9.0076489F@axolotl.com>

Is it still necessary for the core Java classes such as java.lang.Integer 
to be declared final? I understand that may have been necessary in the 
early days for performance reasons, but modern JVMs no longer provide much 
of a performance benefit for final classes. For certain applications it 
would really be helpful to be able to subclass some of those core classes.

For example, one application I'm working on deals with integer values that 
must be between 0 and 9999 inclusive. I would like to be able to create a 
custom Integer subclass which enforces that limit in the constructor, but 
currently that isn't possible. While I could create a new class that acts 
as a wrapper around Integer, the syntax would be much more awkward and 
that would also make it much more difficult to interface with other 
third-party classes.

Nick Radov | Research and Development Manager | Axolotl Corp
www.axolotl.com, d: 408.920.0800 x116, f: 408.920.0880
160 West Santa Clara St., Suite 1000, San Jose, CA, 95113
 

From cowwoc at bbs.darktech.org  Mon Jan  7 21:35:32 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Mon, 07 Jan 2008 16:35:32 -0500
Subject: core classes still need to be declared final?
In-Reply-To: <OF6D270966.3877EC37-ON882573C9.006C1027-882573C9.0076489F@axolotl.com>
References: <OF6D270966.3877EC37-ON882573C9.006C1027-882573C9.0076489F@axolotl.com>
Message-ID: <47829B24.2050506@bbs.darktech.org>


	My understanding is that this has nothing to do with performance. 
Certain classes, such as String, are declared final for security reasons.

	In the case of Integer I would suggest using composition. It's not as 
nice but it'll work.
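
A minimal sketch of what that composition could look like for the 0..9999
case. The class name BoundedInt and its methods are made up for the example.

```java
// Hypothetical immutable value type wrapping an int restricted to 0..9999.
final class BoundedInt implements Comparable<BoundedInt> {
    static final int MIN = 0;
    static final int MAX = 9999;

    private final int value;

    private BoundedInt(int value) {
        this.value = value;
    }

    // Factory that enforces the range on construction.
    static BoundedInt of(int value) {
        if (value < MIN || value > MAX) {
            throw new IllegalArgumentException(
                "value must be between " + MIN + " and " + MAX + ": " + value);
        }
        return new BoundedInt(value);
    }

    // Hands the plain int back for third-party APIs that expect int/Integer.
    int intValue() {
        return value;
    }

    @Override
    public int compareTo(BoundedInt other) {
        return Integer.compare(value, other.value);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BoundedInt && ((BoundedInt) obj).value == value;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(value);
    }

    @Override
    public String toString() {
        return Integer.toString(value);
    }
}
```

Callers pass intValue() wherever an int or Integer is expected, so the range
check lives in one place without subclassing Integer.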

Gili

Nick Radov wrote:
> 
> Is it still necessary for the core Java classes such as 
> java.lang.Integer to be declared final? I understand that may have been 
> necessary in the early days for performance reasons, but modern JVMs no 
> longer provide much of a performance benefit for final classes. For 
> certain applications it would really be helpful to be able to subclass 
> some of those core classes.
> 
> For example, one application I'm working on deals with integer values 
> that must be between 0 and 9999 inclusive. I would like to be able to 
> create a custom Integer subclass which enforces that limit in the 
> constructor, but currently that isn't possible. While I could create a 
> new class that acts as a wrapper around Integer, the syntax would be 
> much more awkward and that would also make it much more difficult to 
> interface with other third-party classes.
> 
> *Nick Radov | Research and Development Manager | Axolotl Corp*
> www.axolotl.com <http://www.axolotl.com/>, d: 408.920.0800 x116, f: 
> 408.920.0880
> 160 West Santa Clara St., Suite 1000, San Jose, CA, 95113
>  


From cowwoc at bbs.darktech.org  Mon Jan  7 22:00:33 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Mon, 7 Jan 2008 14:00:33 -0800 (PST)
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <478293DE.9090600@sun.com>
References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com>
Message-ID: <14676918.post@talk.nabble.com>


I guess you're right. It is probably as likely that the JIT will optimize
away the null check as it is that it will optimize away the
NullPointerException check. One exception, though, is when production
systems run using -Xverify:none. In such a case, wouldn't my approach run
faster?

I still think that my proposed code is somehow more consistent/cleaner on a
design-level but I guess that's just me :)

As an aside, are there standard benchmarks for testing the impact of this
change? I'd love to know whether it actually produces any performance
difference in practice.

Gili


Martin Buchholz wrote:
> 
> The authors of TreeMap have thought about
> eliding comparator null checks:
> 
> 
>     /**
>      * Version of getEntry using comparator. Split off from getEntry
>      * for performance. (This is not worth doing for most methods,
>      * that are less dependent on comparator performance, but is
>      * worthwhile here.)
>      */
>     final Entry<K,V> getEntryUsingComparator(Object key) {
> 	K k = (K) key;
>         Comparator<? super K> cpr = comparator;
>         if (cpr != null) {
>             Entry<K,V> p = root;
>             while (p != null) {
>                 int cmp = cpr.compare(k, p.key);
>                 if (cmp < 0)
>                     p = p.left;
>                 else if (cmp > 0)
>                     p = p.right;
>                 else
>                     return p;
>             }
>         }
>         return null;
>     }
> 
> As to whether using an explicit Comparator for the "natural ordering"
> is a performance improvement, this depends very much on
> the implementation of the JIT and the degree of polymorphism of
> the call site, and on the prevalance of TreeMaps using "natural
> ordering".  At the very least, a null check is very cheap, so it is
> unlikely that the proposed change will be a significant performance
> improvement, while, on the other hand, there is a good chance that
> it will decrease performance for TreeMaps using "natural ordering".
> 
> Aside: It's probably a good idea for the comparator for
> "natural ordering" to be available via some static method.
> 
> Martin
> 
> 
> cowwoc wrote:
>> I noticed that TreeMap (and maybe other classes) require a user to either
>> pass in a Comparator or ensure that all keys must implement Comparable.
>> The
>> TreeMap code then uses a utility method whenever it needs to compare two
>> keys:
>> 
>> 
>> /**
>>  * Compares two keys using the correct comparison method for this
>> TreeMap.
>>  */
>> final int compare(Object k1, Object k2) {
>>   return comparator == null ? ((Comparable<? super  K>) k1)
>>   .compareTo((K) k2) : comparator.compare((K) k1, (K) k2);
>> }
>> 
>> The problem with the above method is that it checks whether comparator is
>> null once per comparison instead of once when the TreeMap is constructed.
>> Instead I propose that this check only take place once in the
>> constructors
>> and the rest of the code assume that a comparator exists. If a comparator
>> is
>> not provided then you can simply define one as follows:
>> 
>> comparator = new Comparator<K>()
>>     {
>>       @SuppressWarnings("unchecked")
>>       public int compare(K first, K second)
>>       {
>>         return ((Comparable<K>) first).compareTo(second);
>>       }
>>     });
>> 
>> This solution should be backwards compatible while improving performance.
>> At
>> least, that's my guess. There is always the chance that the JIT is smart
>> enough to optimize away this comparison but I'd rather not rely on JIT
>> implementation details. I also believe the resulting code is more
>> readable.
>> 
>> What do you think?
> 
> 



From linuxhippy at gmail.com  Mon Jan  7 22:38:11 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Mon, 7 Jan 2008 23:38:11 +0100
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <14676918.post@talk.nabble.com>
References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com>
	<14676918.post@talk.nabble.com>
Message-ID: <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>

Hi cowwoc,

> I guess you're right. It is probably as likely that the JIT will optimize
> away the null check as it is that it will optimize away the
> NullPointerException check. One exception, though, is when production
> systems run using -Xverify:none. In such a case, wouldn't my approach run
> faster?
I don't think it will optimize the null check away; however, it is so
cheap that it most likely will not matter at all compared to all the
other operations happening there. It's maybe 5 instructions compared to
thousands or even more.
-Xverify:none only disables bytecode verification at class-loading
time and has no influence (as far as I know) on the performance of the
generated code.

> I still think that my proposed code is somehow more consistent/cleaner on a
> design-level but I guess that's just me :)
I also like it more; it's cleaner in my opinion :)

> As an aside, are there standard benchmarks for testing the impact of this
> change? I'd love to know whether it actually produces any performance
> difference in practice.
From my experience I would guess that you won't notice the
change; the measurement noise will be higher.

lg Clemens


From Martin.Buchholz at Sun.COM  Mon Jan  7 22:51:58 2008
From: Martin.Buchholz at Sun.COM (Martin Buchholz)
Date: Mon, 07 Jan 2008 14:51:58 -0800
Subject: core classes still need to be declared final?
In-Reply-To: <OF6D270966.3877EC37-ON882573C9.006C1027-882573C9.0076489F@axolotl.com>
References: <OF6D270966.3877EC37-ON882573C9.006C1027-882573C9.0076489F@axolotl.com>
Message-ID: <4782AD0E.6000206@sun.com>

Subclassability is a problem with "value-oriented" computing.

If security or extreme reliability is a concern, then
existing apis that took Integers or Strings as arguments would
have to make defensive copies on import or export, as they have
to do with arrays today.  Since existing classes depend on
the immutability of Integers and Strings, these must forever remain
non-subclassable, at least by untrusted application code.

Inheritance is the one cornerstone of object-oriented computing that
has disappointed us, now that we have gained experience with it,
since it seriously constrains the evolution of superclasses.
Prefer composition to inheritance.
Especially so with immutable "value" types.

For the particular case of range-restricted integers,
I have some sympathy.  It would be nice if the platform offered
such things.

Martin

Nick Radov wrote:
> 
> Is it still necessary for the core Java classes such as
> java.lang.Integer to be declared final? I understand that may have been
> necessary in the early days for performance reasons, but modern JVMs no
> longer provide much of a performance benefit for final classes. For
> certain applications it would really be helpful to be able to subclass
> some of those core classes.
> 
> For example, one application I'm working on deals with integer values
> that must be between 0 and 9999 inclusive. I would like to be able to
> create a custom Integer subclass which enforces that limit in the
> constructor, but currently that isn't possible. While I could create a
> new class that acts as a wrapper around Integer, the syntax would be
> much more awkward and that would also make it much more difficult to
> interface with other third-party classes.
> 
> *Nick Radov | Research and Development Manager | Axolotl Corp*
> www.axolotl.com <http://www.axolotl.com/>, d: 408.920.0800 x116, f:
> 408.920.0880
> 160 West Santa Clara St., Suite 1000, San Jose, CA, 95113


From forax at univ-mlv.fr  Mon Jan  7 23:38:34 2008
From: forax at univ-mlv.fr (Rémi Forax)
Date: Tue, 08 Jan 2008 00:38:34 +0100
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
References: <14673283.post@talk.nabble.com>
	<478293DE.9090600@sun.com>	<14676918.post@talk.nabble.com>
	<194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
Message-ID: <4782B7FA.60303@univ-mlv.fr>

Clemens Eisserer wrote:
> Hi cowwoc,
>
>   
>> I guess you're right. It is probably as likely that the JIT will optimize
>> away the null check as it is that it will optimize away the
>> NullPointerException check. One exception, though, is when production
>> systems run using -Xverify:none. In such a case, wouldn't my approach run
>> faster?
>>     
> I don't think it will optimize the null-check away, 
HotSpot has removed the null check and installed a signal handler instead
since its v2 (around 2000/01, if my memory serves me well).

> however it is so
> cheap that it most likely will not weight at all, compared to all the
> other operations happening there. Its maybe 5 instructions compared to
> thousands or even more.
> -Xverify:none only disables bytecode verification at class-loading
> time and has no influence (as far as I know) on the performance of the
> generated code.
>   
Yes, and there is an option to remove the null check that is only available
on a debug VM.
...
>
> lg Clemens
>   
Rémi


From cowwoc at bbs.darktech.org  Mon Jan  7 23:51:52 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Mon, 7 Jan 2008 15:51:52 -0800 (PST)
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com>
	<14676918.post@talk.nabble.com>
	<194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
Message-ID: <14679084.post@talk.nabble.com>


Something very weird is going on. I tried profiling a minimal testcase and
there is a considerable amount of "missing time". I am using a dev build of
Netbeans 6.1 and it says:

MyComparator.compare(Object, Object) 19670ms
\-> MyComparator.compare(Integer, Integer) 10229ms
  \-> Self Time 3001ms
  \-> Integer.compareTo(Integer) 1575ms
\-> Self Time 3788ms

I spot at least three problems:

1) The individual item times do not add up to the total (but they do for
other stack-traces).
2) Comparator.compare() self-time consumes more CPU than Integer.compareTo()
even though it only invokes a method while the latter does actual
computation.
3) Why is extra time consumed moving from MyComparator.compare(Object,
Object) to (Integer, Integer)? It looks like Generics is doing something at
runtime which consumes a large amount of cpu.
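
Whatever the cause of the timings, the (Object, Object) frame itself
can be inspected independently of the profiler: for a class that
implements Comparator<Integer>, javac emits a synthetic bridge method
compare(Object, Object) that casts its arguments and delegates to
compare(Integer, Integer). A minimal sketch, assuming a comparator
shaped like the one profiled (MyComparator and BridgeDemo below are
stand-ins, not the actual test case):

```java
import java.lang.reflect.Method;
import java.util.Comparator;

public class BridgeDemo {
    // Hypothetical stand-in for the MyComparator in the profile above.
    static class MyComparator implements Comparator<Integer> {
        @Override public int compare(Integer a, Integer b) { return a.compareTo(b); }
    }

    /** Describes a two-argument method and whether the compiler synthesized it as a bridge. */
    static String describe(Method m) {
        Class<?>[] p = m.getParameterTypes();
        return m.getName() + "(" + p[0].getSimpleName() + ", " + p[1].getSimpleName()
            + ") bridge=" + m.isBridge();
    }

    public static void main(String[] args) {
        // Lists both compare methods declared by the class; order is unspecified.
        for (Method m : MyComparator.class.getDeclaredMethods())
            if (m.getName().equals("compare"))
                System.out.println(describe(m));
    }
}
```

This only shows that the extra frame exists in the bytecode; it says
nothing about how much time the JIT leaves in it after inlining.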

Gili




From linuxhippy at gmail.com  Tue Jan  8 15:10:20 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Tue, 8 Jan 2008 16:10:20 +0100
Subject: [PATCH] Performance bug in String(byte[],int,int,Charset)
In-Reply-To: <474992D4.3010908@univ-mlv.fr>
References: <47474D15.4060504@gmail.com> <474992D4.3010908@univ-mlv.fr>
Message-ID: <194f62550801080710n5cf8cfe9r365926f89a019c95@mail.gmail.com>

Hello again,

> By the way, using clone() seems better than Arrays.copyOf() here.
>
> byte[] b = ba.clone();

Why? I remember that I've seen some benchmarks where array.clone() was
way slower than creating a new array and using System.arraycopy()
(which is exactly what copyOf does). However this may have changed ;)
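
For reference, the three variants being compared produce the same
result and differ only in how the copy is made. A small sketch (the
class and method names are just for illustration):

```java
import java.util.Arrays;

public class CopyDemo {
    /** Copy via clone(): length and element type come from the source array. */
    static byte[] viaClone(byte[] src) { return src.clone(); }

    /** Copy via Arrays.copyOf() with the same length as the source. */
    static byte[] viaCopyOf(byte[] src) { return Arrays.copyOf(src, src.length); }

    /** Copy via explicit allocation plus System.arraycopy(). */
    static byte[] viaArraycopy(byte[] src) {
        byte[] dst = new byte[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        byte[] ba = {1, 2, 3};
        System.out.println(Arrays.equals(viaClone(ba), viaCopyOf(ba)));      // prints "true"
        System.out.println(Arrays.equals(viaCopyOf(ba), viaArraycopy(ba)));  // prints "true"
    }
}
```

Which of them is fastest depends on the VM version, so this shows only
the equivalence, not the performance.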

lg Clemens


From cowwoc at bbs.darktech.org  Tue Jan  8 16:23:50 2008
From: cowwoc at bbs.darktech.org (cowwoc)
Date: Tue, 08 Jan 2008 11:23:50 -0500
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <478391A0.3020901@sun.com>
References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com>
	<14676918.post@talk.nabble.com>
	<194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
	<14679084.post@talk.nabble.com> <478391A0.3020901@sun.com>
Message-ID: <4783A396.5020001@bbs.darktech.org>


	That's good news, I guess ;) because in my minimal testcase that had 
nothing to do with TreeMap it looked like using a Comparator to wrap 
natural ordering degraded performance by an order of magnitude... which 
is really bad :)

	If the same isn't true for the actual TreeMap this change might be 
worth considering for its code-cleanup potential.
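
	To make clear what I mean by "a Comparator to wrap natural 
ordering", here is a minimal sketch. NaturalOrder and NaturalOrderDemo 
are hypothetical names for illustration only; this is not the actual 
TreeMap change, just the idea that such a comparator gives the same 
ordering as passing no comparator at all.

```java
import java.util.Comparator;
import java.util.TreeMap;

public class NaturalOrderDemo {
    /** Hypothetical wrapper: a Comparator that just delegates to compareTo(). */
    static final class NaturalOrder<T extends Comparable<T>> implements Comparator<T> {
        @Override public int compare(T a, T b) { return a.compareTo(b); }
    }

    /** Inserts the same three entries, out of order, into the given map. */
    static TreeMap<Integer, String> fill(TreeMap<Integer, String> map) {
        map.put(3, "c");
        map.put(1, "a");
        map.put(2, "b");
        return map;
    }

    public static void main(String[] args) {
        // No comparator: TreeMap falls back to the keys' natural ordering.
        TreeMap<Integer, String> plain = fill(new TreeMap<Integer, String>());
        // Explicit comparator that wraps the same natural ordering.
        TreeMap<Integer, String> wrapped =
            fill(new TreeMap<Integer, String>(new NaturalOrder<Integer>()));
        System.out.println(plain.keySet().equals(wrapped.keySet()));  // prints "true"
    }
}
```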

Gili

charlie hunt wrote:
> It's likely what you are observing in #2 & #3 and possibly in #1 also is 
> an artifact of inlining and possibly other JIT (dynamic) compiler 
> optimizations.
> 
> You might consider re-running your experiment with inlining disabled,
> -XX:-Inline.
> 
> Or, alternatively try running your experiment (with inlining enabled) 
> with Sun Studio Collector / Analyzer.  Then, when viewing the results in 
> the Analyzer, filter (View > Filter Data), the samples so that you are 
> looking at a portion of samples after the code is warmed up.  And, also 
> look at the results in machine view mode (View > Set Data Presentation > 
> Formats > View Mode > Machine).   NOTE: In machine mode you can also 
> view the generated assembly code for each method.  So, you can really 
> get down to the specifics of what's being executed.
> 
> Fwiw, I did a comparison run of a TreeMap with your suggested changes 
> including removing "if (comparator == null)" checks with one of our 
> favorite SPEC benchmarks which does a pretty good job at exercising 
> TreeMap.compare().  Even with 18 degrees of freedom I found the changes
> to yield no statistically significant improvement. I didn't look at, or
> compare, the generated assembly code for both versions of
> TreeMap.compare(), though that might be kind of interesting.
> 
> So, from a performance perspective, it appears this SPEC benchmark shows 
> no change in performance.
> 
> hths,
> 
> charlie ...


From charlie.hunt at sun.com  Tue Jan  8 15:07:12 2008
From: charlie.hunt at sun.com (charlie hunt)
Date: Tue, 08 Jan 2008 09:07:12 -0600
Subject: Proposal for improving performance of TreeMap and others
In-Reply-To: <14679084.post@talk.nabble.com>
References: <14673283.post@talk.nabble.com> <478293DE.9090600@sun.com>
	<14676918.post@talk.nabble.com>
	<194f62550801071438r7cef7d61l3f19872940b88ab@mail.gmail.com>
	<14679084.post@talk.nabble.com>
Message-ID: <478391A0.3020901@sun.com>

It's likely what you are observing in #2 & #3 and possibly in #1 also is 
an artifact of inlining and possibly other JIT (dynamic) compiler 
optimizations.

You might consider re-running your experiment with inlining disabled,
-XX:-Inline.

Or, alternatively try running your experiment (with inlining enabled) 
with Sun Studio Collector / Analyzer.  Then, when viewing the results in 
the Analyzer, filter (View > Filter Data), the samples so that you are 
looking at a portion of samples after the code is warmed up.  And, also 
look at the results in machine view mode (View > Set Data Presentation > 
Formats > View Mode > Machine).   NOTE: In machine mode you can also 
view the generated assembly code for each method.  So, you can really 
get down to the specifics of what's being executed.

Fwiw, I did a comparison run of a TreeMap with your suggested changes 
including removing "if (comparator == null)" checks with one of our 
favorite SPEC benchmarks which does a pretty good job at exercising 
TreeMap.compare().  Even with 18 degrees of freedom I found the changes
to yield no statistically significant improvement. I didn't look at, or
compare, the generated assembly code for both versions of
TreeMap.compare(), though that might be kind of interesting.

So, from a performance perspective, it appears this SPEC benchmark shows 
no change in performance.

hths,

charlie ...



From Martin.Buchholz at Sun.COM  Wed Jan  9 06:01:11 2008
From: Martin.Buchholz at Sun.COM (Martin Buchholz)
Date: Tue, 08 Jan 2008 22:01:11 -0800
Subject: [PATCH] Performance bug in String(byte[],int,int,Charset)
In-Reply-To: <194f62550801080710n5cf8cfe9r365926f89a019c95@mail.gmail.com>
References: <47474D15.4060504@gmail.com> <474992D4.3010908@univ-mlv.fr>
	<194f62550801080710n5cf8cfe9r365926f89a019c95@mail.gmail.com>
Message-ID: <47846327.5060901@sun.com>

The slowness of array.clone() has been fixed as of jdk6 and 5.0u6.

6428387: array clone() much slower than Arrays.copyOf
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6428387

The rest of this message is the latest version of my private
microbenchmark to measure the fix:

import java.util.*;

public class ArrayCopyMicroBenchmark {
    abstract static class Job {
	private final String name;
	Job(String name) { this.name = name; }
	String name() { return name; }
	abstract void work() throws Throwable;
    }

    private static void collectAllGarbage() {
	try {
	    for (int i = 0; i < 2; i++) {
		System.gc(); System.runFinalization(); Thread.sleep(10);
	    }
	} catch (InterruptedException e) { throw new Error(e); }
    }

    /**
     * Runs each job for long enough that all the runtime compilers
     * have had plenty of time to warm up, i.e. get around to
     * compiling everything worth compiling.
     * Returns array of average times per job per run.
     */
    private static long[] time0(Job ... jobs) throws Throwable {
	final long warmupNanos = 10L * 1000L * 1000L * 1000L;
	long[] nanoss = new long[jobs.length];
	for (int i = 0; i < jobs.length; i++) {
	    collectAllGarbage();
	    long t0 = System.nanoTime();
	    long t;
	    int j = 0;
	    do { jobs[i].work(); j++; }
	    while ((t = System.nanoTime() - t0) < warmupNanos);
	    nanoss[i] = t/j;
	}
	return nanoss;
    }

    private static void time(Job ... jobs) throws Throwable {

	long[] warmup = time0(jobs); // Warm up run
	long[] nanoss = time0(jobs); // Real timing run

	final String nameHeader = "Method";
	int nameWidth = nameHeader.length();
	for (Job job : jobs)
	    nameWidth = Math.max(nameWidth, job.name().length());

	final String millisHeader = "Millis";
	int millisWidth = millisHeader.length();
	for (long nanos : nanoss)
	    millisWidth =
		Math.max(millisWidth,
			 String.format("%d", nanos/(1000L * 1000L)).length());

	final String ratioHeader = "Ratio";
	int ratioWidth = ratioHeader.length();

	String format = String.format("%%-%ds %%%dd %%.3f%%n",
				      nameWidth, millisWidth);
	String headerFormat = String.format("%%-%ds %%-%ds %%-%ds%%n",
					    nameWidth, millisWidth, ratioWidth);
	System.out.printf(headerFormat, "Method", "Millis", "Ratio");

	// Print out absolute and relative times, calibrated against first job
	for (int i = 0; i < jobs.length; i++) {
	    long millis = nanoss[i]/(1000L * 1000L);
	    double ratio = (double)nanoss[i] / (double)nanoss[0];
	    System.out.printf(format, jobs[i].name(), millis, ratio);
	}
    }

    private static int intArg(String[] args, int i, int defaultValue) {
	return args.length > i ? Integer.parseInt(args[i]) : defaultValue;
    }

    private static void deoptimize(Object[] a) {
	for (Object x : a)
	    if (x == null)
		throw new Error();
    }

    public static void main(String[] args) throws Throwable {
	final int iterations = intArg(args, 0, 100000);
	final int size       = intArg(args, 1, 1000);
	final Object[] array = new Object[size];
	final Random rnd = new Random();
	for (int i = 0; i < array.length; i++)
	    array[i] = rnd.nextInt(size);

	time(
	    new Job("arraycopy") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++) {
		    Object[] t = new Object[size];
		    System.arraycopy(a, 0, t, 0, size);
		    a = t;}
		deoptimize(a);}},
	    new Job("copyOf") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++)
		    a = Arrays.copyOf(a, size);
		deoptimize(a);}},
	    new Job("clone") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++)
		    a = a.clone();
		deoptimize(a);}},
	    new Job("loop") { void work() {
		Object[] a = array;
		for (int i = 0; i < iterations; i++) {
		    Object[] t = new Object[size];
		    for (int j = 0; j < size; j++)
			t[j] = a[j];
		    a = t;}
		deoptimize(a);}}
	    );
    }
}



From linuxhippy at gmail.com  Wed Jan  9 11:53:28 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Wed, 9 Jan 2008 12:53:28 +0100
Subject: Performance regression in java.util.zip.Deflater
In-Reply-To: <476B1916.2060502@sun.com>
References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com>
	<476ADDAF.2070409@sun.com>
	<194f62550712201336y3380808bv3726d891873be277@mail.gmail.com>
	<476AEDCD.6080504@sun.com>
	<194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com>
	<476B0ABA.6030102@sun.com>
	<194f62550712201702n6f44efd5hda27c397e8d1ce96@mail.gmail.com>
	<476B1916.2060502@sun.com>
Message-ID: <194f62550801090353x484a856bl3b6bfdc1e65cf58d@mail.gmail.com>

Hi again,

I've finished a very early draft of the native stride+copy
implementation of Deflater.
It's still very early and has not been tested much (so don't worry, I
don't think this should go in as is ;) ), but it seems to perform quite
well.
I'm just posting it ... well ... to get some criticism and advice ;)

I don't like the code as it's far too messy in my opinion; maybe
somebody has better ideas for cleaning it up. Furthermore, I don't know
whether it breaks corner cases.

lg Clemens
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.java
Type: application/octet-stream
Size: 14566 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080109/b339d5ae/Deflater.java>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.c
Type: application/octet-stream
Size: 10878 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080109/b339d5ae/Deflater.c>

From linuxhippy at gmail.com  Wed Jan  9 12:23:41 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Wed, 9 Jan 2008 13:23:41 +0100
Subject: Performance regression in java.util.zip.Deflater
In-Reply-To: <194f62550801090353x484a856bl3b6bfdc1e65cf58d@mail.gmail.com>
References: <194f62550712201120p1d10ac45xf86eb9cacd2eee87@mail.gmail.com>
	<476ADDAF.2070409@sun.com>
	<194f62550712201336y3380808bv3726d891873be277@mail.gmail.com>
	<476AEDCD.6080504@sun.com>
	<194f62550712201520p30d7b15wa8f2005749a77243@mail.gmail.com>
	<476B0ABA.6030102@sun.com>
	<194f62550712201702n6f44efd5hda27c397e8d1ce96@mail.gmail.com>
	<476B1916.2060502@sun.com>
	<194f62550801090353x484a856bl3b6bfdc1e65cf58d@mail.gmail.com>
Message-ID: <194f62550801090423s2cd83a1aia4c81541c1e28c04@mail.gmail.com>

Sorry, I sent the wrong files and also found some bugs. I will re-send the
updated files soon - sorry for the traffic.

lg Clemens



From linuxhippy at gmail.com  Wed Jan  9 15:11:47 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Wed, 9 Jan 2008 16:11:47 +0100
Subject: Early version of striding Deflater
Message-ID: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com>

Hello again,

I've finished an early version of the java.util.zip.Deflater
implementation that uses striding. It's at an early stage and quite
likely buggy.
It passes FlaterTest and a simple test I wrote myself, but it may
behave differently in corner cases.

I would be happy to receive some comments as well as criticism ;)

lg Clemens

PS: Sorry for the traffic lately.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.c
Type: application/octet-stream
Size: 7184 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080109/7c3cfb29/Deflater.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.java
Type: application/octet-stream
Size: 11495 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080109/7c3cfb29/Deflater.java>

From linuxhippy at gmail.com  Thu Jan 10 02:41:36 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Thu, 10 Jan 2008 03:41:36 +0100
Subject: 6539727 is no bug
Message-ID: <194f62550801091841h6ff93771ic616ca17ba39dee7@mail.gmail.com>

Hi again,

While going through the bug database I noticed that 6539727 is not a bug
- the code is just misusing Deflater.
The old Deflater implementation did/does not return any bytes from a call
in which deflateParams() has to be invoked; on the next call, deflate() is
invoked again and further data is processed.

The problem is that the reporter expects Deflater to always
return bytes, without calling finished().
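
For illustration, a minimal sketch of a usage pattern that copes with
this. The class and method names are made up for the example; only the
java.util.zip calls are real API. The caller loops until finished()
reports true and treats a deflate() call that returns 0 bytes as
normal rather than as an error.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateLoopDemo {
    /** Compresses input, tolerating deflate() calls that return 0 bytes. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();                    // no more input will follow
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        while (!deflater.finished()) {        // finished(), not the byte count, ends the loop
            int n = deflater.deflate(buf);    // n may be 0 on an individual call
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates the data again so the round trip can be checked. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[64];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) break;  // truncated data: stop, don't spin
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = decompress(compress(original));
        System.out.println(Arrays.equals(original, roundTrip));  // prints "true"
    }
}
```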

lg Clemens


From David.Bristor at Sun.COM  Fri Jan 11 20:51:04 2008
From: David.Bristor at Sun.COM (Dave Bristor)
Date: Fri, 11 Jan 2008 12:51:04 -0800
Subject: Early version of striding Deflater
In-Reply-To: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com>
References: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com>
Message-ID: <4787D6B8.5030805@sun.com>

Hi Clemens,

Thanks for the code drop!  I was not yet able to spend much Quality Time with 
it.  At a glance, it seems like the fix might work.  The general ideas seem 
sound, and I appreciate your effort in addressing it.  So please don't take 
the feedback below as anything but encouragement to help get this bug fixed!

That said...it is a low priority bug (for us, anyway; a P4 out of 5) and we 
have bigger fish to fry.  I'll attend to it as time permits... Then too, there 
are some issues to resolve re the provided code.  I know it's an "Early 
version", but some changes to it would make it easier to examine.

For example, Deflater.java was completely reformatted.  When diffing the code, 
every change of indentation, javadoc removal, spaces to tabs, moving of {, etc.
shows up.  We don't allow such changes into the JDK sources.  Spaces only, and 
despite whatever awful "standards" (or not!) already in use in the file, 
please stick to them.  Make every change be the best/smallest one which 
directly addresses the bug.  This makes it easier for all concerned to examine 
the relevant changes.  There are similar issues in Deflater.c.

The files provided lack the GNU copyright file headers.  My guess is that
they originated in the src.zip of a binary distribution.  Regardless of their
source, could you please instead use files from the Mercurial repository at
hg.openjdk.java.net/jdk7/tl?

The above will make it easier to review future changes to the files.  Enough
of that boring stuff, here's some feedback on the "interesting" part of the
changes!

In Deflater.java, new method rangeCheck() is used in a couple of places, but 
it is not an adequate replacement for the code previously in 
Deflater.setDictionary(); it incompatibly changes the error condition checking 
semantics (we can't omit the check on strm).  We strive to not introduce 
incompatible changes, even small ones like this.

Another incompatible change is to the semantics of Deflater.deflate()...I 
think.  In Deflater.c, it seems that deflateBytes will use setParams if 
necessary and then compress...I reference your email about 6539727 in this 
regard (You are completely right about that being a non-bug, BTW and I'll 
update it shortly: thanks!)  I think your solution would have been the Right 
Thing to do way back when, but we don't want to make an incompatible change 
now.  (I haven't reviewed this thoroughly enough; see my notes re formatting & 
priorities above.)

With a change of this sort, we really do need tests along with a fix.  Have 
you started writing any test cases?

Finally, it seems that this solution obviates the need for the striding in 
DeflaterOutputStream...IIRC, that is some of the original motivation for this 
work.  If you have suggested changes to that class as well, please include them.

I appreciate the work you've put in, and again, I hope not to dissuade you.
But we have certain standards to which we must adhere, and it's not a very
high priority for us now, so we have to minimize the time we spend on it.

Thanks,
	Dave



From linuxhippy at gmail.com  Sun Jan 13 23:32:47 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Mon, 14 Jan 2008 00:32:47 +0100
Subject: Early version of striding Deflater
In-Reply-To: <4787D6B8.5030805@sun.com>
References: <194f62550801090711q35d8a5f1wb5a4a29480b40f9b@mail.gmail.com>
	<4787D6B8.5030805@sun.com>
Message-ID: <194f62550801131532v4a3b443bt550beb6bd34549cb@mail.gmail.com>

Hi Dave,

Thanks a lot for your reply.
To make it short: of course I understand that this is low-priority
(for me too; it's a fun-only fix because someone on forums.java.net
mentioned it), so don't hurry.
Sorry that I wasted your time with my messy files. They were taken
from my "playground", which is why they were in such bad shape; they
were only intended to give an idea of which "road" I was taking. I
attached the new files, taken from the Mercurial repositories and
modified only at the affected places.

> With a change of this sort, we really do need tests along with a fix.  Have
> you started writing any test cases?
I completely agree - I have some simple test cases which test more or
less only very basic functionality of Deflater, and they work well
(FlaterTest passes too).
I'll write some more tests which cover exotic use cases like changing
the compression level, ... during compression.

I have some open questions:
1.) Is the separate-structure approach to holding the stride buffers OK?
2.) Any suggestions for the following names: 1. strm field in the class
(defAdr), 2. defAdr parameter, 3. defptr - long_to_ptr of defAdr, 4.
def_data - name of the structure
3.) I am not really used to programming in C. Are the address operations
which I used to get members of the new struct def_data OK?

Thanks for your patience, lg Clemens

Some notes and changes, in random order:
* Changed deflateBytes back to the old behaviour of returning after the
call to deflateParams
* Verified that it's OK to call deflateParams when there's not enough
space in the output buffer to flush all "old" data out (thanks to Mark
Adler)
* I changed the method signature of the native method compared to the
original, because some variables were read from JNI code whereas they
could simply have been passed down as method parameters. I think
it's "cleaner" to pass them.
* Allocation of the stride buffers together with the z_stream
structure. z_stream is really large, so the two stride buffers should
not add that much overhead. However, this has the advantage of avoiding
malloc/free calls and also being able to fill the input stride buffer
once for several calls of the native method.
* Renamed the strm address parameter to defadr, because it no longer
really points to a strm. I did not rename the Java field "strm"
because I did not have an idea for a proper name.
* Removed striding from DeflaterOutputStream (looked at how the code looked
in 1.4.2).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.java
Type: application/octet-stream
Size: 14312 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080114/35b82f0f/Deflater.java>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Deflater.c
Type: application/octet-stream
Size: 8251 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080114/35b82f0f/Deflater.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DeflaterOutputStream.java
Type: application/octet-stream
Size: 5643 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080114/35b82f0f/DeflaterOutputStream.java>

From roman.kennke at aicas.com  Mon Jan 21 21:12:51 2008
From: roman.kennke at aicas.com (Roman Kennke)
Date: Mon, 21 Jan 2008 22:12:51 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
Message-ID: <1200949971.6264.48.camel@mercury>

Hi,

I'm trying to understand a piece of code in java.io . Let me try to
explain:

When you look into WinNTFileSystem.c in the method
getBooleanAttributes(), you see that the file object is converted to a
WCHAR* using fileToNTPath(). In io_util.c, fileToNTPath(), the filename
string is extracted from the File object, and passed to pathToNTPath().

This is where it gets interesting. The pathToNTPath() function first
converts the string into a jchar* using the macro WITH_UNICODE_STRING.
This macro uses GetStringChars() to do this conversion. Now this is
where I'm lost. Java strings are not null-terminated, and neither are
the jchar* returned by GetStringChars() (which is in itself a long
discussed problem in the JNI spec, but that's another story). But back
in pathToNTPath() this jchar* is treated just like a null-terminated
string, for example, we call wcslen() to determine its length, which
relies on the string being null-terminated. Now I assume that this
works somehow, and I only see the following options:
1. There's something in this picture that I don't see. Maybe the string
ends up null-terminated somehow?
2. Maybe this works by accident because Hotspot terminates strings with
a null internally?
3. Or this is a serious bug, that for some reason doesn't bomb all the
time. After all, it _does_ bomb in the JamaicaVM, where I'm trying to
port the code to...

Any ideas? I'd be happy to get an explanation for this problem.

Cheers, Roman

-- 
Dipl.-Inform. (FH) Roman Kennke, Software Engineer, http://kennke.org
aicas Allerton Interworks Computer Automated Systems GmbH
Haid-und-Neu-Straße 18 * D-76131 Karlsruhe * Germany
http://www.aicas.com   * Tel: +49-721-663 968-0
USt-Id: DE216375633, Handelsregister HRB 109481, AG Karlsruhe
Geschäftsführer: Dr. James J. Hunt


From Alan.Bateman at Sun.COM  Mon Jan 21 21:52:09 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Mon, 21 Jan 2008 21:52:09 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1200949971.6264.48.camel@mercury>
References: <1200949971.6264.48.camel@mercury>
Message-ID: <47951409.70805@sun.com>

Roman Kennke wrote:
> Hi,
>
> I'm trying to understand a piece of code in java.io . Let me try to
> explain:
>
> When you look into WinNTFileSystem.c in the method
> getBooleanAttributes(), you see that the file object is converted to a
> WCHAR* using fileToNTPath(). In io_util.c, fileToNTPath(), the filename
> string is extracted from the File object, and passed to pathToNTPath().
>
> This is where it gets interesting. The pathToNTPath() function first
> converts the string into a jchar* using the macro WITH_UNICODE_STRING.
> This macro uses GetStringChars() to do this conversion. Now this is
> where I'm lost. Java strings are not null-terminated, and neither are
> the jchar* returned by GetStringChars() (which is in itself a long
> discussed problem in the JNI spec, but that's another story). But back
> in pathToNTPath() this jchar* is treated just like a null-terminated
> string, for example, we call wcslen() to determine its length, which
> relies on the string being null-terminated. Now I assume that this
> works somehow, and I only see the following options:
> 1. There's something in this picture that I don't see. Maybe the string
> ends up null-terminated somehow?
> 2. Maybe this works by accident because Hotspot terminates strings with
> a null internally?
> 3. Or this is a serious bug, that for some reason doesn't bomb all the
> time. After all, it _does_ bomb in the JamaicaVM, where I'm trying to
> port the code to...
>
> Any ideas? I'd be happy to get an explanation for this problem.
>
> Cheers, Roman
>   
The GetStringChars implementation in HotSpot always returns a copy that 
is length+1 and zero terminated. There is a long-standing bug to clarify 
the JNI specification on this topic. I believe it should say that the 
returned array of Unicode characters is not required to be zero 
terminated and that one should use GetStringLength to determine the 
length. Steve Bohne (cc'ed) has done the recent maintenance on the JNI 
spec and may wish to comment. In any case, I did a quick cscope and 
aside from java.io, it only appears to impact a small number of places.

-Alan.


From roman.kennke at aicas.com  Mon Jan 21 22:01:04 2008
From: roman.kennke at aicas.com (Roman Kennke)
Date: Mon, 21 Jan 2008 23:01:04 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <47951409.70805@sun.com>
References: <1200949971.6264.48.camel@mercury>  <47951409.70805@sun.com>
Message-ID: <1200952864.6264.53.camel@mercury>

Hi Alan,

On Monday, 21.01.2008, at 21:52 +0000, Alan Bateman wrote:
> Roman Kennke wrote:
> > Hi,
> >
> > I'm trying to understand a piece of code in java.io . Let me try to
> > explain:
> >
> > When you look into WinNTFileSystem.c in the method
> > getBooleanAttributes(), you see that the file object is converted to a
> > WCHAR* using fileToNTPath(). In io_util.c, fileToNTPath(), the filename
> > string is extracted from the File object, and passed to pathToNTPath().
> >
> > This is where it gets interesting. The pathToNTPath() function first
> > converts the string into a jchar* using the macro WITH_UNICODE_STRING.
> > This macro uses GetStringChars() to do this conversion. Now this is
> > where I'm lost. Java strings are not null-terminated, and neither are
> > the jchar* returned by GetStringChars() (which is in itself a long
> > discussed problem in the JNI spec, but that's another story). But back
> > in pathToNTPath() this jchar* is treated just like a null-terminated
> > string, for example, we call wcslen() to determine its length, which
> > relies on the string being null-terminated. Now I assume that this
> > works somehow, and I only see the following options:
> > 1. There's something in this picture that I don't see. Maybe the string
> > ends up null-terminated somehow?
> > 2. Maybe this works by accident because Hotspot terminates strings with
> > a null internally?
> > 3. Or this is a serious bug, that for some reason doesn't bomb all the
> > time. After all, it _does_ bomb in the JamaicaVM, where I'm trying to
> > port the code to...
> >
> > Any ideas? I'd be happy to get an explanation for this problem.
> >
> > Cheers, Roman
> >   
> The GetStringChars implementation in HotSpot always returns a copy that 
> is length+1 and zero terminated. There is a long-standing bug to clarify 
> the JNI specification on this topic. I believe it should say that the 
> returned array of Unicode characters is not required to be zero 
> terminated and that one should use GetStringLength to determine the 
> length. Steve Bohne (cc'ed) has done the recent maintenance on the JNI 
> spec and may wish to comment. In any case, I did a quick cscope and 
> aside from java.io, it only appears to impact a small number of places.

So this is indeed a bug, right? Do you think it makes sense to go out
and fix it?

/Roman

-- 
Dipl.-Inform. (FH) Roman Kennke, Software Engineer, http://kennke.org
aicas Allerton Interworks Computer Automated Systems GmbH
Haid-und-Neu-Straße 18 * D-76131 Karlsruhe * Germany
http://www.aicas.com   * Tel: +49-721-663 968-0
USt-Id: DE216375633, Handelsregister HRB 109481, AG Karlsruhe
Geschäftsführer: Dr. James J. Hunt


From Tim.Bell at Sun.COM  Mon Jan 21 22:45:06 2008
From: Tim.Bell at Sun.COM (Tim Bell)
Date: Mon, 21 Jan 2008 14:45:06 -0800
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1200952864.6264.53.camel@mercury>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury>
Message-ID: <47952072.2060600@sun.com>

Alan Bateman wrote (about GetStringChars):

> [...] is length+1 and zero terminated. There is a long-standing bug to clarify the JNI specification on this topic. I believe it should say that the returned array of Unicode characters is not required to be zero terminated and that one should use GetStringLength to determine the length.

Roman Kennke wrote:

> So this is indeed a bug, right? Do you think it makes sense to go out and fix it?

I'd start here:

   4616318 Spec for JNI's GetStringChars() is incomplete
   http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4616318

HTH - Tim


From roman.kennke at aicas.com  Mon Jan 21 22:57:42 2008
From: roman.kennke at aicas.com (Roman Kennke)
Date: Mon, 21 Jan 2008 23:57:42 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <47952072.2060600@sun.com>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury>  <47952072.2060600@sun.com>
Message-ID: <1200956262.6264.65.camel@mercury>

Hi,

On Monday, 21.01.2008, at 14:45 -0800, Tim Bell wrote:
> Alan Bateman wrote (about GetStringChars):
> 
> > [...] is length+1 and zero terminated. There is a long-standing bug to clarify the JNI specification on this topic. I believe it should say that the returned array of Unicode characters is not required to be zero terminated and that one should use GetStringLength to determine the length.
> 
> Roman Kennke wrote:
> 
> > So this is indeed a bug, right? Do you think it makes sense to go out and fix it?
> 
> I'd start here:
> 
>    4616318 Spec for JNI's GetStringChars() is incomplete
>    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4616318

Hmm, I'm not talking about fixing the spec (I actually read that bug
report while searching for clarification on the spec). When the spec
doesn't say _that_ the returned array is zero terminated, I think we
should assume that it isn't (and the trend seems to be that the spec
should be clarified by saying that an implementation isn't required to
return a zero-terminated array, but this is only speculation). What I'm
asking is, should we fix the java.io C code to deal with
non-zero-terminated jchar arrays? Unfortunately, this probably means
allocating additional buffers, because we really need zero-terminated
strings here (AFAICS).
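
As a rough illustration of the "additional buffer" approach, here is a
sketch written in Java rather than C, so it only models what the native
code would do. It copies exactly length characters into a buffer of
length + 1 and appends the terminator explicitly, instead of scanning the
source for a zero. The class and method names are made up for this
example; in the real java.io C code the length would presumably come from
GetStringLength(), and the characters could be copied with something like
GetStringRegion().

```java
import java.util.Arrays;

public class TerminatedCopySketch {
    // Hypothetical helper: copy the string's characters into a new buffer
    // with room for one extra element, which serves as the terminator.
    static char[] toTerminated(String s) {
        // Arrays.copyOf pads with '\u0000', so the last element is zero.
        return Arrays.copyOf(s.toCharArray(), s.length() + 1);
    }

    // Models what wcslen() does: scan until the first zero element.
    static int scanLength(char[] buf) {
        int n = 0;
        while (n < buf.length && buf[n] != '\u0000') {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        char[] buf = toTerminated("C:\\temp");
        // The scan is only safe because the terminator was added explicitly.
        System.out.println(buf.length + " " + scanLength(buf));
    }
}
```

The cost is one extra allocation and copy per call, which is the overhead
I was worried about.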

/Roman

-- 
Dipl.-Inform. (FH) Roman Kennke, Software Engineer, http://kennke.org
aicas Allerton Interworks Computer Automated Systems GmbH
Haid-und-Neu-Straße 18 * D-76131 Karlsruhe * Germany
http://www.aicas.com   * Tel: +49-721-663 968-0
USt-Id: DE216375633, Handelsregister HRB 109481, AG Karlsruhe
Geschäftsführer: Dr. James J. Hunt


From program.spe at home.pl  Tue Jan 22 07:35:59 2008
From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=)
Date: Tue, 22 Jan 2008 08:35:59 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1200956262.6264.65.camel@mercury>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury>  <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
Message-ID: <1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>


On 21-01-2008 (Mon), at 23:57 +0100, Roman Kennke wrote:
> Hi,
> 
> On Monday, 21.01.2008, at 14:45 -0800, Tim Bell wrote:
> > Alan Bateman wrote (about GetStringChars):
> > 
> > > [...] is length+1 and zero terminated. There is a long-standing bug to clarify the JNI specification on this topic. I believe it should say that the returned array of Unicode characters is not required to be zero terminated and that one should use GetStringLength to determine the length.
> > 
> > Roman Kennke wrote:
> > 
> > > So this is indeed a bug, right? Do you think it makes sense to go out and fix it?
> > 
> > I'd start here:
> > 
> >    4616318 Spec for JNI's GetStringChars() is incomplete
> >    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4616318
> 
> Hmm, I'm not talking about fixing the spec (I've read that bug report
> while searching for clarification on the spec actually). When the spec
> doesn't tell _that_ the returned array is zero terminated, I think we
> should assume that it isn't (and it seems to be the trend that the spec
> should be clarified by saying that an implementation isn't required to
> return a zero-terminated array, but this is only speculation). What I'm
> asking is, should we fix the java.io C code to deal with
> non-zero-terminated jchar arrays? Unfortunately, this probably means
> allocating additional buffers, because we really need zero terminated
> strings here (AFAICS).

If the specification gets fixed so that the GetStringChars (GSC) result
MUST be zero-terminated, your VM will cease to be conformant,
so it will be fixed and no additional buffers will be needed.

Chris


From forax at univ-mlv.fr  Thu Jan 24 12:05:44 2008
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Thu, 24 Jan 2008 13:05:44 +0100
Subject: Selector cleanup
Message-ID: <47987F18.4040309@univ-mlv.fr>

Hi all, I am currently developing a small web server, and I think the code
related to selectors can be improved just by changing some small pieces of it.
To be crystal clear, I don't want to re-implement all the selector-related
code, but just patch some parts of the current code.

There are some allocations in the JDK API that can be removed;
the code was badly retrofitted to 1.5, and a lot of fields can be declared final.
Some methods/fields still use raw types and don't take
advantage of autoboxing.

Furthermore, there is some divergence between the Windows and *nix
code that I don't understand.
For example, WindowsSelectorImpl and PollSelectorImpl both use a pipe to
implement wakeup, but WindowsSelectorImpl relies on Pipe
and PollSelectorImpl on IOUtil.initPipe().
I think this code should be the same.

in WindowsSelectorImpl:
  - updateSelectedKeys() uses an iterator to traverse the array
    (an ArrayList). It should use an indexed loop instead
    to avoid the Iterator allocation.
  - the field threads should be declared as an ArrayList
    because adjustThreadsCount() supposes that it can be iterated
    using an indexed loop.
    Furthermore, it can be generified like this:
    private final ArrayList<Thread> threads = new ArrayList<Thread>();
  - class FdMap:
    I don't see why FdMap needs to be a class; all its methods can be moved
    to member methods of WindowsSelectorImpl without problems.
    Furthermore, the constructor of FdMap is private (get/put/remove too),
    so the compiler stupidly inserts accessor methods (access$000 etc.).
    OK, the main point here: when the code was retrofitted to 1.5,
    the new Integer() was not changed to Integer.valueOf(),
    which shares small integers and avoids allocation if the file
    descriptor values are small.
  - In class MapEntry, ski should be declared final.
  - close(): setting selectedKeys to null doesn't allow the Set to be collected
    because publicSelectedKeys contains a reference to it.

in PollSelectorImpl:
  - interruptLock should be final.
  - close(), see WindowsSelectorImpl

in EpollSelectorImpl:
  - as in poll, interruptLock should be final.
  - the HashMap fdTokey should be generified and final.
  - close(), see WindowsSelectorImpl
  - implRegister/implDereg
    - They should use Integer.valueOf() instead of new Integer().
    - IOUtil.fdVal() is used inconsistently: in implRegister but not
      in implDereg.

  - EPollArrayWrapper
    - updateList is a LinkedList, a doubly linked list that stores
      Updator objects.
      I think it's more efficient to add a next field to the Updator
      object and link the updators by hand, in order to avoid creating
      LinkedList$Entry objects.
    - Updator.opcode and Updator.fd should be declared final.

  - SelectorImpl:
    key and selectedKeys should be LinkedHashSet instead of Set
    because they are frequently iterated.

Let's discuss this before I submit patches.
Rémi


From Alan.Bateman at Sun.COM  Thu Jan 24 13:11:42 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Thu, 24 Jan 2008 13:11:42 +0000
Subject: Selector cleanup
In-Reply-To: <47987F18.4040309@univ-mlv.fr>
References: <47987F18.4040309@univ-mlv.fr>
Message-ID: <47988E8E.3010403@sun.com>

Rémi Forax wrote:
> Hi all, i currently develop a small web server and  I think codes related
> to selectors can be improved just by changing some small pieces of code.
> To be crystal clear, i don't want to re-implement all selector related 
> stuffs but
> just patch some parts of the actual code.
>
> There are some allocations in JDK API  that can be removed,
> the code was badly retrofited to 1.5 and lot of field can be declared 
> final.
> Some methods/fields still 'use' raw types and doesn't take
> advantage of autoboxing.
You're right. Much of the code here dates back to 1.4 and we haven't 
gone back to clean-up things like this.

>
> Futhermore, there is some divergence between Windows and *nix
> code i don't understand.
> By example, WindowsSelectorImpl and PollSelectorImpl uses a pipe to
> implements wakeup but WindowsSelectorImpl  relies on Pipe
> and PollSelectorImpl on IOUtil.initPipe().
> I think this code should be the same.
Ideally we would use a socketpair for the wakeup mechanism but Windows 
doesn't support it. For this reason, Pipe is implemented as a loopback 
connection and this works okay for the wakeup mechanism too. One thing 
to mention is that PollSelectorImpl is only used now when running on the 
Linux 2.4 kernel (it's not used with the 2.6 kernel and isn't used on 
Solaris). I just mention this as someday it might become obsolete and we 
can remove it.
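
Whatever the platform plumbing is (a loopback connection behind Pipe on
Windows, a pipe from IOUtil.initPipe() elsewhere), the behaviour visible
to the application is that of Selector.wakeup(). A minimal sketch of that
contract, with invented class and method names: a wakeup() issued while
no selection is in progress makes the next select() return immediately.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

public class WakeupSketch {
    // Calls wakeup() before select(); the select() call then returns at once
    // instead of blocking. No channels are registered, so it reports 0 keys.
    static int selectAfterWakeup() {
        try {
            Selector selector = Selector.open();
            try {
                selector.wakeup();
                // The timeout is only a safety net for this sketch.
                return selector.select(10000);
            } finally {
                selector.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("selected keys: " + selectAfterWakeup());
    }
}
```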

>
> in WindowsSelectorImpl:
>  - updateSelectedKeys() use an iterator to traverse the array
>    (an ArrayList). It should use an indexed loop instead
>    to avoid Iterator allocation.
>  - field threads should be declared as an ArrayList
>    because adjustThreadsCount() supose that i can be iterate
>    using an indexed loop.
>    Furthermore, it can be generified like this:
>    private final ArrayList<Thread> threads = new ArrayList<Thread>();
These clean-ups seem reasonable.
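
For reference, the indexed-loop form being proposed looks roughly like
the sketch below. The class and method names are hypothetical and the
loop body is only a stand-in for whatever the selector code does with
each element; the point is that get(i) on an ArrayList does not allocate
an Iterator.

```java
import java.util.ArrayList;

public class IndexedLoopSketch {
    // Walks the list by index rather than through an Iterator.
    // Declaring the parameter as ArrayList (not List) makes the
    // constant-time get(i) assumption explicit.
    static int countAlive(ArrayList<Thread> threads) {
        int alive = 0;
        for (int i = 0, n = threads.size(); i < n; i++) {
            if (threads.get(i).isAlive()) {
                alive++;
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        ArrayList<Thread> threads = new ArrayList<Thread>();
        threads.add(new Thread());  // never started, so not alive
        threads.add(Thread.currentThread());
        System.out.println(countAlive(threads));
    }
}
```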

> - class FDMap,
>   I don't see why FdMap need to be a class, all methods can be moved
>   as member methods of WindowsSelectorImpl without problems.
>   Futhermore, the constructor of FdMap is private (get/put/remove too)
>   so the compiler stupidly inserts accessor methods (access$000 etc.).
>   Ok, the main point, here when the code was retrofited to 1.5,
>   The new Integer() was not transformed to use Integer.valueOf()
>   to share small integers and avoid allocation if file descriptor 
> value are small.
These are SOCKET types rather than file descriptors and unlikely to be 
in the range that Integer caches (actually it should be a Long but that 
is a story for another day).
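
To make the caching point concrete, here is a small sketch (the class and
method names are made up). Integer.valueOf() is documented to always
cache values from -128 to 127; outside that range it may or may not
return a shared object, so large handle values would typically still
allocate.

```java
public class BoxingCacheSketch {
    // True when two valueOf() calls for the same int yield the same object.
    static boolean sharedBox(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        // Guaranteed to be shared: within the documented -128..127 range.
        System.out.println("small value shared: " + sharedBox(5));
        // Outside the guaranteed range the result depends on the JVM.
        System.out.println("large value shared: " + sharedBox(1000000));
    }
}
```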

> - In class MapEntry, ski should be declared final.
> - close(), set selectedKeys() to null doesn't allow the Set to be 
> collected
>    because publicSelectedKeys contains() a reference to it.
>
> in PollSelectorImpl:
>  - interruptLock should be final.
>  - close(), see WindowsSelectorImpl
>
> in EpollSelectorImpl:
>  - like in poll, interruptLock should be final.
>  - hashMap fdTokey should be generified and final.
>  - close(), see WindowsSelectorImpl
>  - implRegister/implDereg
>    - They should use Integer.valueOf() instead of new Integer().
>    - IOUtil.fdVal() is used spuriously, in implRegister but not
>      in implDereg.
These are integers so there could be some benefit (but probably very 
hard to measure).

>
>  - EPollArrayWrapper
>    - updateList is a LinkedList, a double linked list that stores 
> Updator object,
>      I think it's more efficient to add a field next in the Updator 
> object and
>      link updator by hand in order to avoid to create LinkedList$Entry .
Maybe but probably very hard to measure.
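
For the sake of discussion, the hand-linked variant could look something
like the sketch below. This is not the EPollArrayWrapper code; the queue
class and its methods are invented, and only the Updator field names
(opcode, fd, next) come from the proposal. Each update carries its own
next reference, so no separate LinkedList entry object is created per
element.

```java
public class UpdateQueueSketch {
    // Hypothetical stand-in for the Updator class; opcode and fd are final
    // as suggested, and next links the pending updates together.
    static final class Updator {
        final int opcode;
        final int fd;
        Updator next;

        Updator(int opcode, int fd) {
            this.opcode = opcode;
            this.fd = fd;
        }
    }

    private Updator head;
    private Updator tail;

    // Appends an update at the tail of the queue.
    void add(int opcode, int fd) {
        Updator u = new Updator(opcode, fd);
        if (tail == null) {
            head = u;
        } else {
            tail.next = u;
        }
        tail = u;
    }

    // Removes and returns the oldest update, or null if the queue is empty.
    Updator poll() {
        Updator u = head;
        if (u != null) {
            head = u.next;
            if (head == null) {
                tail = null;
            }
            u.next = null;
        }
        return u;
    }

    public static void main(String[] args) {
        UpdateQueueSketch queue = new UpdateQueueSketch();
        queue.add(1, 10);
        queue.add(2, 11);
        for (Updator u = queue.poll(); u != null; u = queue.poll()) {
            System.out.println("opcode=" + u.opcode + " fd=" + u.fd);
        }
    }
}
```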

>   - Updataor.opcode and Updator.fd should be declared final.
>     - SelectorImpl:
>     key and selectedKeys should be LinkedHashSet instead of Set
>     because they are frequently iterated.
>   let discuss about that before I submit patchs.
The clean-ups you suggest seem reasonable so I would suggest going ahead 
and sending a patch. I'm happy to review and work with you to get the 
clean-ups integrated (once OpenJDK/jdk7 re-opens for changes of course).

-Alan.

PS: I don't know anything about your "small web server" but the simple 
server in com.sun.net.httpserver may be useful.


From forax at univ-mlv.fr  Thu Jan 24 15:41:52 2008
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Thu, 24 Jan 2008 16:41:52 +0100
Subject: Selector cleanup
In-Reply-To: <47988E8E.3010403@sun.com>
References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com>
Message-ID: <4798B1C0.9020907@univ-mlv.fr>

Alan Bateman wrote:
> Rémi Forax wrote:
>> Hi all, i currently develop a small web server and  I think codes 
>> related
>> to selectors can be improved just by changing some small pieces of code.
>> To be crystal clear, i don't want to re-implement all selector 
>> related stuffs but
>> just patch some parts of the actual code.
>>
>> There are some allocations in JDK API  that can be removed,
>> the code was badly retrofited to 1.5 and lot of field can be declared 
>> final.
>> Some methods/fields still 'use' raw types and doesn't take
>> advantage of autoboxing.
> You're right. Much of the code here dates back to 1.4 and we haven't 
> gone back to clean-up things like this.
>
>>
>> Futhermore, there is some divergence between Windows and *nix
>> code i don't understand.
>> By example, WindowsSelectorImpl and PollSelectorImpl uses a pipe to
>> implements wakeup but WindowsSelectorImpl  relies on Pipe
>> and PollSelectorImpl on IOUtil.initPipe().
>> I think this code should be the same.
> Ideally we would use a socketpair for the wakeup mechanism but Windows 
> doesn't support it. For this reason, Pipe is implemented as a loopback 
> connection and this works okay for the wakeup mechanism too. One thing 
> to mention is that PollSelectorImpl is only used now when running on 
> the Linux 2.4 kernel (it's not used with the 2.6 kernel and isn't used 
> on Solaris). I just mention this as someday it might become obsolete 
> and we can remove it.
ok.
>
>>
>> in WindowsSelectorImpl:
>>  - updateSelectedKeys() use an iterator to traverse the array
>>    (an ArrayList). It should use an indexed loop instead
>>    to avoid Iterator allocation.
>>  - field threads should be declared as an ArrayList
>>    because adjustThreadsCount() supose that i can be iterate
>>    using an indexed loop.
>>    Furthermore, it can be generified like this:
>>    private final ArrayList<Thread> threads = new ArrayList<Thread>();
> These clean-ups seem reasonable.
>
>> - class FDMap,
>>   I don't see why FdMap need to be a class, all methods can be moved
>>   as member methods of WindowsSelectorImpl without problems.
>>   Futhermore, the constructor of FdMap is private (get/put/remove too)
>>   so the compiler stupidly inserts accessor methods (access$000 etc.).
>>   Ok, the main point, here when the code was retrofited to 1.5,
>>   The new Integer() was not transformed to use Integer.valueOf()
>>   to share small integers and avoid allocation if file descriptor 
>> value are small.
> These are SOCKET types rather than file descriptors and unlikely to be 
> in the range that Integer caches (actually it should be a Long but 
> that is a story for another day).
OK, no valueOf(); I'm not an expert in the Windows API.
But do you agree that class FdMap is not necessary?
>
>> - In class MapEntry, ski should be declared final.
>> - close(), set selectedKeys() to null doesn't allow the Set to be 
>> collected
>>    because publicSelectedKeys contains() a reference to it.
>>
>> in PollSelectorImpl:
>>  - interruptLock should be final.
>>  - close(), see WindowsSelectorImpl
>>
>> in EpollSelectorImpl:
>>  - like in poll, interruptLock should be final.
>>  - hashMap fdTokey should be generified and final.
>>  - close(), see WindowsSelectorImpl
>>  - implRegister/implDereg
>>    - They should use Integer.valueOf() instead of new Integer().
>>    - IOUtil.fdVal() is used spuriously, in implRegister but not
>>      in implDereg.
> These are integers so there could be some benefit (but probably very 
> hard to measure).
Yes, very hard to measure until you spawn 1k threads with one selector each.
BTW, if you take a look at EPollArrayWrapper, idleSet already uses boxing.
>
>>
>>  - EPollArrayWrapper
>>    - updateList is a LinkedList, a double linked list that stores 
>> Updator object,
>>      I think it's more efficient to add a field next in the Updator 
>> object and
>>      link updator by hand in order to avoid to create LinkedList$Entry .
> Maybe but probably very hard to measure.
>
>>   - Updataor.opcode and Updator.fd should be declared final.
>>     - SelectorImpl:
>>     key and selectedKeys should be LinkedHashSet instead of Set
>>     because they are frequently iterated.
>>   let discuss about that before I submit patchs.
> The clean-ups you suggest seem reasonable so I would suggest going 
> ahead and sending a patch. 
I will do that.
> I'm happy to review and work with you to get the clean-ups integrated 
> (once OpenJDK/jdk7 re-opens for changes of course).
Do you have any idea when OpenJDK will reopen?
>
> -Alan.
>
> PS: I don't know anything about your "small web server" but the simple 
> server in com.sun.net.httpserver may be useful.
My small server is a research project that embeds a non-blocking parser
in a web server and claims to have the same performance as Grizzly.
I will post a blog entry about it when all the benchmarks are finished.

Rémi


From Alan.Bateman at Sun.COM  Thu Jan 24 16:50:07 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Thu, 24 Jan 2008 16:50:07 +0000
Subject: Selector cleanup
In-Reply-To: <4798B1C0.9020907@univ-mlv.fr>
References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com>
	<4798B1C0.9020907@univ-mlv.fr>
Message-ID: <4798C1BF.8010205@sun.com>

Rémi Forax wrote:
> ok, no valueOf(), i'm not an expert in Windows API.
> But are you agree that class FdMap is not necessary.
I agree and I assume you will replace it with an embedded Map. I suspect 
it will be hard to see a difference (with the server VM anyway).

> yes, very hard to mesure until you span 1k thread with one selector each.
The typical NIO server tends to handle lots of concurrent connections 
with a relatively small number of threads (one per core for example) and 
a small number of Selectors. It sounds like your server might be 
different. Selector creation is relatively expensive so you might run 
into issues there.

> btw if you take a look to EPollArrayWrapper, idlSet already use boxing.
The idle set is almost always empty and, aside from one case, there 
shouldn't be any boxing when the set is empty.

> Do you have any idea when openjdk will be reopen ?
Mark and others are working hard to make this happen very soon. As I 
understand it they have some infrastructure work to complete before they 
can allow changesets to be pushed.


> My small server is a research project that embeds a non-blocking 
> parser in a webserver and
> claims to have the same performance than grizzly. I will post a blog 
> entry about it
> when all benchmarks will be finished.
I look forward to it.

-Alan.


From mark at klomp.org  Fri Jan 25 13:14:32 2008
From: mark at klomp.org (Mark Wielaard)
Date: Fri, 25 Jan 2008 13:14:32 +0000 (UTC)
Subject: Null-terminated Unicode strings in java.io on Windows
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
	<1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>
Message-ID: <loom.20080125T130942-864@post.gmane.org>

Krzysztof Żelechowski <program.spe at ...> writes:
> If the specification gets fixed so that GSC result MUST be z-term, 
> your VM will cease being conformant 
> so it will be fixed and no additional buffers will be needed. 

Eh, that doesn't seem right at all.
The specification currently doesn't guarantee that the result is a jchar array
that is zero terminated. So you can expect current runtimes not to do this. As
Roman said at least JamaicaVM doesn't do this. I just checked the
implementations gcj and jamvm, they both also don't make any such guarantee
(cacao does seem to add an extra 0 at the end of the result it returns though).
So "clarifying the spec" would break a lot of code of currently conforming
implementations. The code relying on this behavior seems to be just buggy and
should be fixed imho.

Cheers,

Mark


From forax at univ-mlv.fr  Fri Jan 25 13:20:29 2008
From: forax at univ-mlv.fr (=?ISO-8859-1?Q?R=E9mi_Forax?=)
Date: Fri, 25 Jan 2008 14:20:29 +0100
Subject: Selector cleanup
In-Reply-To: <4798C1BF.8010205@sun.com>
References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com>
	<4798B1C0.9020907@univ-mlv.fr> <4798C1BF.8010205@sun.com>
Message-ID: <4799E21D.6050908@univ-mlv.fr>

Alan Bateman wrote:
> Rémi Forax wrote:
>> ok, no valueOf(), i'm not an expert in Windows API.
>> But are you agree that class FdMap is not necessary.
> I agree and I assume you will replace it with an embedded Map. I 
> suspect it will be hard to see a difference (with the server VM anyway).
I'm pretty sure we will see no perf difference, but it will use less memory.
>
>> yes, very hard to mesure until you span 1k thread with one selector 
>> each.
> The typical NIO server tends to handle lots of concurrent connections 
> so with a relatively small number of threads (one per core for 
> example) and a small number of Selectors. It sounds like your server 
> might be different. Selector creation is relatively expensive so you 
> might run into issues there.
Selectors are pre-created during startup, so no problem.
We have observed that a selector doesn't work well with a lot of keys. 
That's why I use more threads than one per core.

I have found someone else saying the same thing:
see http://blogs.sun.com/oleksiys/entry/multiple_selector_read_threads_in
>
>> btw if you take a look to EPollArrayWrapper, idlSet already use boxing.
> The idle set is almost always empty and an aside from one case, there 
> shouldn't be any boxing when the set is empty.
I don't agree; reading the code, the idle set is used when
setInterestOps(0) is called.
I'm not sure that case is infrequent.
For example, you can find this code in Grizzly:
  // disable OP_READ on key before doing anything else
  key.interestOps(key.interestOps() & (~SelectionKey.OP_READ));

see 
http://weblogs.java.net/blog/jfarcand/archive/2006/06/tricks_and_tips.html
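
To spell out why that line matters here, a small sketch with made-up
class and method names: when OP_READ is the only interest, masking it out
leaves an interest set of 0, which as far as I can tell from the code is
the case that goes through the idle set.

```java
import java.nio.channels.SelectionKey;

public class InterestOpsSketch {
    // Same bit arithmetic as the Grizzly line: clear OP_READ, keep the rest.
    static int withoutRead(int ops) {
        return ops & ~SelectionKey.OP_READ;
    }

    public static void main(String[] args) {
        int readOnly = SelectionKey.OP_READ;
        int readWrite = SelectionKey.OP_READ | SelectionKey.OP_WRITE;
        // A read-only key ends up with no interest at all.
        System.out.println(withoutRead(readOnly) == 0);
        // A read/write key keeps OP_WRITE.
        System.out.println(withoutRead(readWrite) == SelectionKey.OP_WRITE);
    }
}
```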

>
>> Do you have any idea when openjdk will be reopen ?
> Mark and others are working hard to make this happen very soon. As I 
> understand it they have some infrastructure work to complete before 
> they can allow changesets to be pushed.
>
>
>> My small server is a research project that embeds a non-blocking 
>> parser in a webserver and
>> claims to have the same performance than grizzly. I will post a blog 
>> entry about it
>> when all benchmarks will be finished.
> I look forward to it.
>
> -Alan.
Rémi


From program.spe at home.pl  Fri Jan 25 13:28:10 2008
From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=)
Date: Fri, 25 Jan 2008 14:28:10 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <loom.20080125T130942-864@post.gmane.org>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
	<1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>
	<loom.20080125T130942-864@post.gmane.org>
Message-ID: <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>


On 25-01-2008 (Fri), at 13:14 +0000, Mark Wielaard wrote:
> Krzysztof Żelechowski <program.spe at ...> writes:
> > If the specification gets fixed so that GSC result MUST be z-term, 
> > your VM will cease being conformant 
> > so it will be fixed and no additional buffers will be needed. 
> 
> Eh, that doesn't seem right at all.
> The specification currently doesn't guarantee that the result is a jchar array
> that is zero terminated. So you can expect current runtimes not to do this. As
> Roman said at least JamaicaVM doesn't do this. I just checked the
> implementations gcj and jamvm, they both also don't make any such guarantee
> (cacao does seem to add an extra 0 at the end of the result it returns though).
> So "clarifying the spec" would break a lot of code of currently conforming
> implementations. The code relying on this behavior seems to be just buggy and
> should be fixed imho.

The specification is buggy
in that it does not take into account the operating system interface 
and makes correct memory management inefficient 
for the benefit of sparing one byte per buffer 
where an OS call is not needed.
Ridiculous.
The developers at Sun 
found the correct way to interpret the specification; 
the other ones followed it blindfolded.  It is now time to repent.

Cheers,
Chris


From Alan.Bateman at Sun.COM  Fri Jan 25 13:59:25 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Fri, 25 Jan 2008 13:59:25 +0000
Subject: Selector cleanup
In-Reply-To: <4799E21D.6050908@univ-mlv.fr>
References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com>
	<4798B1C0.9020907@univ-mlv.fr> <4798C1BF.8010205@sun.com>
	<4799E21D.6050908@univ-mlv.fr>
Message-ID: <4799EB3D.5070807@sun.com>

Rémi Forax wrote:
> :
> We have observed that a selector doesn't work well with a lot of keys.
Is this just Windows? I ask because the Selector implementations on 
Solaris and Linux scale very well and there are many people using it on 
servers that are handling thousands of concurrent connections.

>> The idle set is almost always empty and, aside from one case, there 
>> shouldn't be any boxing when the set is empty.
> I don't agree: reading the code, the idle set is used when setInterestOps(0)
> is called.
> I'm not sure that case is infrequent.
> For example, you can find this code in Grizzly:
>
>  // disable OP_READ on key before doing anything else
>  key.interestOps(key.interestOps() & (~SelectionKey.OP_READ));
>
> see 
> http://weblogs.java.net/blog/jfarcand/archive/2006/06/tricks_and_tips.html 
>
I've only observed it on a few occasions. As it happens, that fragment of 
Grizzly code is what led us to add the idle set, as I missed this case 
in the original implementation (see 
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933 for details).
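
[Editor's sketch] The case under discussion is a key whose interest set
becomes empty once OP_READ is masked out. The following is a minimal
illustration using only the public java.nio API. The class and method names
are hypothetical, and a Pipe is used merely as a convenient selectable
channel. It shows nothing about the idle set itself, which is internal to
the Selector implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class; not part of Grizzly or the JDK.
final class InterestOpsDemo {

    /**
     * Registers the read end of a pipe for OP_READ, masks OP_READ out again
     * (the Grizzly idiom quoted above) and returns the resulting interest set.
     */
    static int interestOpsAfterDisablingRead() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.source().configureBlocking(false);
                SelectionKey key =
                        pipe.source().register(selector, SelectionKey.OP_READ);
                // disable OP_READ on key before doing anything else
                key.interestOps(key.interestOps() & (~SelectionKey.OP_READ));
                return key.interestOps();
            } finally {
                pipe.source().close();
                pipe.sink().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // OP_READ was the only registered operation, so nothing is left.
        System.out.println(interestOpsAfterDisablingRead());
    }
}
```

Because OP_READ was the only operation registered, the key ends up with an
interest set of 0, which is the situation the idle set is said to handle.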

-Alan.


From mark at klomp.org  Fri Jan 25 14:40:18 2008
From: mark at klomp.org (Mark Wielaard)
Date: Fri, 25 Jan 2008 14:40:18 +0000 (UTC)
Subject: Null-terminated Unicode strings in java.io on Windows
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
Message-ID: <loom.20080125T143515-476@post.gmane.org>

Hi Roman,
Roman Kennke <roman.kennke at ...> writes:
> Hmm, I'm not talking about fixing the spec (I've read that bug report
> while searching for clarification on the spec actually). When the spec
> doesn't tell _that_ the returned array is zero terminated, I think we
> should assume that it isn't (and it seems to be the trend that the spec
> should be clarified by saying that an implementation isn't required to
> return a zero-terminated array, but this is only speculation). What I'm
> asking is, should we fix the java.io C code to deal with
> non-zero-terminated jchar arrays? Unfortunately, this probably means
> allocating additional buffers, because we really need zero terminated
> strings here (AFAICS).

If you rewrite WITH_UNICODE_STRING to not use the runtime to allocate and
deallocate the jchar array through GetStringChars and ReleaseStringChars but
allocate and deallocate the jchar array yourself using GetStringLength (+ 1)
and then fill it through GetStringRegion() it looks like you don't really need
to allocate any additional buffers.
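
[Editor's sketch] The real change would be made in the native C code with
GetStringLength and GetStringRegion. The following Java sketch only mirrors
the buffer arithmetic of that suggestion: the caller allocates length + 1
elements and fills the first length of them, so the terminator comes for
free. The class and method names are hypothetical, and String.getChars
stands in for GetStringRegion purely as an analogy.

```java
// Hypothetical illustration; the actual fix belongs in the java.io C code.
final class ZeroTerminatedCopy {

    /**
     * Returns a caller-owned buffer of s.length() + 1 chars: the characters
     * of the string followed by a single 0 terminator.
     */
    static char[] toZeroTerminated(String s) {
        int length = s.length();               // analogue of GetStringLength
        char[] buffer = new char[length + 1];  // Java zero-fills new arrays
        s.getChars(0, length, buffer, 0);      // analogue of GetStringRegion
        return buffer;                         // buffer[length] is still 0
    }

    public static void main(String[] args) {
        char[] path = toZeroTerminated("C:\\temp");
        System.out.println(path.length);
    }
}
```

In C the buffer would have to be zero-terminated explicitly, since memory
from malloc is not zero-filled the way a new Java array is.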

Cheers,

Mark


From Alan.Bateman at Sun.COM  Fri Jan 25 15:05:13 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Fri, 25 Jan 2008 15:05:13 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1200952864.6264.53.camel@mercury>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury>
Message-ID: <4799FAA9.70804@sun.com>

Roman Kennke wrote:
> Hi Alan,
>
> On Monday, 21.01.2008, at 21:52 +0000, Alan Bateman wrote:
>   
>> :
>> The GetStringChars implementation in HotSpot always returns a copy that 
>> is length+1 and zero terminated. There is a long-standing bug to clarify 
>> the JNI specification on this topic. I believe it should say that the 
>> returned array of Unicode characters is not required to be zero 
>> terminated and that one should use GetStringLength to determine the 
>> length. Steve Bohne (cc'ed) has done the recent maintenance on the JNI 
>> spec and may wish to comment. In any case, I did a quick cscope and 
>> aside from java.io, it only appears to impact a small number of places.
>>     
>
> So this is indeed a bug, right? Do you think it makes sense to go out
> and fix it?
>
>   
This is one of those issues that has gone unnoticed for years because we don't 
test with other VMs and also the Windows code isn't used when porting to 
other platforms. So I'd suggest just doing it.  Mark Wielaard's mail 
provides a good suggestion. You'll probably want to check other areas of 
the code too (src/windows/native/java/lang/ProcessImpl_md.c for example) 
for other cases.

-Alan.


From rob.lougher at gmail.com  Fri Jan 25 17:08:39 2008
From: rob.lougher at gmail.com (Robert Lougher)
Date: Fri, 25 Jan 2008 17:08:39 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
Message-ID: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>

Hi,

Apologies if you receive this twice.  I sent it via nabble and it's
now stuck awaiting moderation so I've subscribed.

<quote author="Krzysztof Żelechowski-2">

On Fri, 25-01-2008 at 13:14 +0000, Mark Wielaard wrote:
> Krzysztof Żelechowski <program.spe at ...> writes:
> > If the specification gets fixed so that GSC result MUST be z-term,
> > your VM will cease being conformant
> > so it will be fixed and no additional buffers will be needed.
>
> Eh, that doesn't seem right at all.
> The specification currently doesn't guarantee that the result is a jchar array
> that is zero terminated. So you can expect current runtimes not to do this. As
> Roman said at least JamaicaVM doesn't do this. I just checked the
> implementations gcj and jamvm, they both also don't make any such guarantee
> (cacao does seem to add an extra 0 at the end of the result it returns though).
> So "clarifying the spec" would break a lot of code of currently conforming
> implementations. The code relying on this behavior seems to be just buggy and
> should be fixed imho.

The specification is buggy
in that it does not take into account the operating system interface
and makes correct memory management inefficient
for the benefit of sparing one byte per buffer
where an OS call is not needed.
Ridiculous.
The developers at Sun
found the correct way to interpret the specification;
the other ones followed it blindfolded.  It is now time to repent.
</quote>

Wrong!  Requiring null termination will make things less efficient.
This is because Strings within Java are not null-terminated.  So to
add the null the VM will have to copy the String chars into a new
buffer.

A more efficient approach is to simply return a pointer to the String
chars themselves.  However, this will not be null-terminated.

The JNI specification allows a VM to either copy the chars or return a
direct pointer.  The extra isCopy parameter can be used to find out
what it did.

The point is, if the programmer doesn't need a null-terminated string,
not copying is _much_ more efficient.  The programmer can always copy
and add the null if they need to.  But forcing the VM to
null-terminate will require a copy and slow it down in all cases.

If I was updating the spec, I would change it so that if a copy is
returned it is always null terminated.  If it isn't a copy then it may
or may not be.  It's likely no VMs will need changing, as I suspect
the ones that do not null-terminate are returning direct pointers
(e.g. JamVM).

And I doubt Sun makes a copy because of the null.  Giving out direct
heap pointers causes problems for VMs that move objects within the
heap (e.g. a compacting GC).  Either you've got to "pin" the object so
it can't move or you always copy.  Sun probably chose the latter.  In
JamVM, I decided to pin the String (it's unpinned in
ReleaseStringChars).

Rob.

P.S.  I hope your blindfold has been removed :) When implementing a VM
few things are as straight-forward as they may seem.

From roman at kennke.org  Fri Jan 25 17:17:03 2008
From: roman at kennke.org (Roman Kennke)
Date: Fri, 25 Jan 2008 18:17:03 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
Message-ID: <1201281423.6277.86.camel@mercury>

Hi,

> The specification is buggy
> in that it does not take into account the operating system interface
> and makes correct memory management inefficient
> for the benefit of sparing one byte per buffer
> where an OS call is not needed.
> Ridiculous.
> The developers at Sun
> found the correct way to interpret the specification;
> the other ones followed it blindfolded.  It is now time to repent.
> </quote>
> 
> Wrong!  Requiring null termination will make things less efficient.
> This is because Strings within Java are not null-terminated.

Unless the VM stores all strings with 0-termination internally, which is
possible, but arguably more inefficient on another level.

> If I was updating the spec, I would change it so that if a copy is
> returned it is always null terminated.  If it isn't a copy then it may
> or may not be.  It's likely no VMs will need changing, as I suspect
> the ones that do not null-terminate are returning direct pointers
> (e.g. JamVM).

Maybe we should all go to the original old bug (gosh! from 2001!) and
make some noise?

/Roman

-- 
http://kennke.org/blog/


From roman at kennke.org  Fri Jan 25 17:19:51 2008
From: roman at kennke.org (Roman Kennke)
Date: Fri, 25 Jan 2008 18:19:51 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
	<1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>
	<loom.20080125T130942-864@post.gmane.org>
	<1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>
Message-ID: <1201281591.6277.88.camel@mercury>

Hi,

> The specification is buggy
> in that it does not take into account the operating system interface 
> and makes correct memory management inefficient 
> for the benefit of sparing one byte per buffer 
> where an OS call is not needed.
> Ridiculous.

Tom Tromey pointed out another possible problem on IRC: What if the
string itself contains the 0? Unlikely, but possible in the Java world.
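
[Editor's sketch] A small illustration of the embedded-zero problem, with
hypothetical names. A Java String may legally contain the char 0, so a
consumer that scans for a terminator sees a shorter string than
String.length() reports.

```java
import java.util.Arrays;

// Hypothetical illustration of a C-style scan over Java string data.
final class EmbeddedZero {

    /** Length a C-style scan would report: the index of the first 0 char. */
    static int cStyleLength(char[] zeroTerminated) {
        int i = 0;
        while (i < zeroTerminated.length && zeroTerminated[i] != 0) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String name = "ab\0cd";  // five chars, the third one is 0
        // Copy with one extra element; Arrays.copyOf pads it with 0.
        char[] buffer = Arrays.copyOf(name.toCharArray(), name.length() + 1);
        System.out.println(name.length() + " vs " + cStyleLength(buffer));
    }
}
```

Code that hands such a buffer to an OS call would silently operate on the
truncated name unless it checks for embedded zeros first.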

Cheers, Roman
-- 
http://kennke.org/blog/


From program.spe at home.pl  Fri Jan 25 17:23:14 2008
From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=)
Date: Fri, 25 Jan 2008 18:23:14 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
Message-ID: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>


On Fri, 25-01-2008 at 17:08 +0000, Robert Lougher wrote:
> Hi,

Hi-aye.

> 
> Apologies if you receive this twice.  I sent it via nabble and it's
> now stuck awaiting moderation so I've subscribed.
> 
> <quote author="Krzysztof Żelechowski-2">
> 
> On Fri, 25-01-2008 at 13:14 +0000, Mark Wielaard wrote:
> > Krzysztof Żelechowski <program.spe at ...> writes:
> > > If the specification gets fixed so that GSC result MUST be z-term,
> > > your VM will cease being conformant
> > > so it will be fixed and no additional buffers will be needed.
> >
> > Eh, that doesn't seem right at all.
> > The specification currently doesn't guarantee that the result is a jchar array
> > that is zero terminated. So you can expect current runtimes not to do this. As
> > Roman said at least JamaicaVM doesn't do this. I just checked the
> > implementations gcj and jamvm, they both also don't make any such guarantee
> > (cacao does seem to add an extra 0 at the end of the result it returns though).
> > So "clarifying the spec" would break a lot of code of currently conforming
> > implementations. The code relying on this behavior seems to be just buggy and
> > should be fixed imho.
> 
> The specification is buggy
> in that it does not take into account the operating system interface
> and makes correct memory management inefficient
> for the benefit of sparing one byte per buffer
> where an OS call is not needed.
> Ridiculous.
> The developers at Sun
> found the correct way to interpret the specification;
> the other ones followed it blindfolded.  It is now time to repent.
> </quote>
> 
> Wrong!  Requiring null termination will make things less efficient.
> This is because Strings within Java are not null-terminated.  

They are not z-term in the sense that they may contain zero inside, 
but nothing more.  
The implementation is free 
to affix zero to each and every string buffer 
and make that zero unavailable to Java as required by the specification.
It is an easy thing to do because strings are immutable.

> So to
> add the null the VM will have to copy the String chars into a new
> buffer.
> 
> A more efficient approach is to simply return a pointer to the String
> chars themselves.  However, this will not be null-terminated.

It depends on the implementation, as described above.

> 
> The JNI specification allows a VM to either copy the chars or return a
> direct pointer.  The extra isCopy parameter can be used to find out
> what it did.
> 
> The point is, if the programmer doesn't need a null-terminated string,
> not copying is _much_ more efficient.  The programmer can always copy
> and add the null if they need to.  But forcing the VM to
> null-terminate will require a copy and slow it down in all cases.

No, it will not, 
because all string buffers will have an inaccessible zero at the end.

> 
> If I was updating the spec, I would change it so that if a copy is
> returned it is always null terminated.  If it isn't a copy then it may
> or may not be.  It's likely no VMs will need changing, as I suspect
> the ones that do not null-terminate are returning direct pointers
> (e.g. JamVM).

If I was updating the spec, 
I would say that 
strings are required to be inaccessibly z-term as above 
if the underlying OS expects them to be in most cases.

> 
> And I doubt Sun makes a copy because of the null.  

So do I, they apparently need not.

> 
> Rob.
> 
> P.S.  I hope your blindfold has been removed :) When implementing a VM
> few things are as straight-forward as they may seem.

So do I (that my blindfold has been removed).

Chris


From program.spe at home.pl  Fri Jan 25 17:26:49 2008
From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=)
Date: Fri, 25 Jan 2008 18:26:49 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201281591.6277.88.camel@mercury>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
	<1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>
	<loom.20080125T130942-864@post.gmane.org>
	<1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>
	<1201281591.6277.88.camel@mercury>
Message-ID: <1201282009.6482.20.camel@a1dmin.vola.spe.com.pl>


On Fri, 25-01-2008 at 18:19 +0100, Roman Kennke wrote:
> Hi,
> 
> > The specification is buggy
> > in that it does not take into account the operating system interface 
> > and makes correct memory management inefficient 
> > for the benefit of sparing one byte per buffer 
> > where an OS call is not needed.
> > Ridiculous.
> 
> Tom Tromey pointed out another possible problem on IRC: What if the
> string itself contains the 0? Unlikely, but possible in the Java world.

I understand that parameters passed to the OS 
are subject to the limitations of the OS.
Not containing a zero inside may be just one of them.  
The Java specification claims nowhere 
that every string can be used to name every object.

Chris


From rob.lougher at gmail.com  Fri Jan 25 17:30:07 2008
From: rob.lougher at gmail.com (Robert Lougher)
Date: Fri, 25 Jan 2008 17:30:07 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
Message-ID: <d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>

On 1/25/08, Krzysztof Żelechowski <program.spe at home.pl> wrote:
>
> On Fri, 25-01-2008 at 17:08 +0000, Robert Lougher wrote:
> > Hi,
>
> Hi-aye.
>
> >
> > Apologies if you receive this twice.  I sent it via nabble and it's
> > now stuck awaiting moderation so I've subscribed.
> >
> > <quote author="Krzysztof Żelechowski-2">
> >
> > On Fri, 25-01-2008 at 13:14 +0000, Mark Wielaard wrote:
> > > Krzysztof Żelechowski <program.spe at ...> writes:
> > > > If the specification gets fixed so that GSC result MUST be z-term,
> > > > your VM will cease being conformant
> > > > so it will be fixed and no additional buffers will be needed.
> > >
> > > Eh, that doesn't seem right at all.
> > > The specification currently doesn't guarantee that the result is a jchar array
> > > that is zero terminated. So you can expect current runtimes not to do this. As
> > > Roman said at least JamaicaVM doesn't do this. I just checked the
> > > implementations gcj and jamvm, they both also don't make any such guarantee
> > > (cacao does seem to add an extra 0 at the end of the result it returns though).
> > > So "clarifying the spec" would break a lot of code of currently conforming
> > > implementations. The code relying on this behavior seems to be just buggy and
> > > should be fixed imho.
> >
> > The specification is buggy
> > in that it does not take into account the operating system interface
> > and makes correct memory management inefficient
> > for the benefit of sparing one byte per buffer
> > where an OS call is not needed.
> > Ridiculous.
> > The developers at Sun
> > found the correct way to interpret the specification;
> > the other ones followed it blindfolded.  It is now time to repent.
> > </quote>
> >
> > Wrong!  Requiring null termination will make things less efficient.
> > This is because Strings within Java are not null-terminated.
>
> They are not z-term in the sense that they may contain zero inside,
> but nothing more.
> The implementation is free
> to affix zero to each and every string buffer
> and make that zero unavailable to Java as required by the specification.
> It is an easy thing to do because strings are immutable.
>
> > So to
> > add the null the VM will have to copy the String chars into a new
> > buffer.
> >
> > A more efficient approach is to simply return a pointer to the String
> > chars themselves.  However, this will not be null-terminated.
>
> It depends on the implementation, as described above.

No it doesn't.  An implementation would have to be truly stupid to
internally null-terminate.  How many Strings are in the heap?  How
many will the programmer access via GetStringChars?  The null will be
an overhead on all Strings for the sake of a minuscule percentage.

From program.spe at home.pl  Fri Jan 25 17:42:23 2008
From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=)
Date: Fri, 25 Jan 2008 18:42:23 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
Message-ID: <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>


On Fri, 25-01-2008 at 17:30 +0000, Robert Lougher wrote:
> No it doesn't.  An implementation would have to be truly stupid to
> internally null-terminate.  How many Strings are in the heap?  How
> many will the programmer access via GetStringChars?  The null will be
> an overhead on all Strings for the sake of a minuscule percentage.

Please observe: 

1. 
the amount of memory needed to manage the allocation 
is greater than the number of bytes 
needed to store one additional character, 
so the relative impact on memory usage will not be dramatic.

2. The string usually has many more characters than one.
That means, if strings take 10 characters on the average, 
the overhead is 10%, in the impossible worst case, as explained below.
This is an overhead I (and most programmers) can live with.

3. Memory is allocated in chunks.  
The size and alignment of the chunk is subject to various limitations.
If the characters of the string do not fill the chunk entirely, 
there is a good chance 
that there will be space for the terminating zero anyway.

Yours truly,
Chris


From roman at kennke.org  Fri Jan 25 17:44:54 2008
From: roman at kennke.org (Roman Kennke)
Date: Fri, 25 Jan 2008 18:44:54 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201282009.6482.20.camel@a1dmin.vola.spe.com.pl>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
	<1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>
	<loom.20080125T130942-864@post.gmane.org>
	<1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>
	<1201281591.6277.88.camel@mercury>
	<1201282009.6482.20.camel@a1dmin.vola.spe.com.pl>
Message-ID: <1201283094.9468.4.camel@mercury>

Heyo,

> > The specification is buggy
> > > in that it does not take into account the operating system interface 
> > > and makes correct memory management inefficient 
> > > for the benefit of sparing one byte per buffer 
> > > where an OS call is not needed.
> > > Ridiculous.
> > 
> > Tom Tromey pointed out another possible problem on IRC: What if the
> > string itself contains the 0? Unlikely, but possible in the Java world.
> 
> I understand that parameters passed to the OS 
> are subject to the limitations of the OS.
> Not containing a zero inside may be just one of them.  
> The Java specification claims nowhere 
> that every string can be used to name every object.

Yeah, but GetStringChars() is a general-purpose JNI function and not at
all tied to the OS. Passing the string on to the OS for I/O purposes is
just one use case. Zero-terminating a Java string really doesn't seem right.
If you need it zero-terminated, then you can always do this in your code
by copying the string into a static buffer or similar (as suggested
somewhere else in this thread). This is by no means incorrect memory
management, it only requires a little more thinking.

/Roman

-- 
http://kennke.org/blog/


From roman at kennke.org  Fri Jan 25 17:54:01 2008
From: roman at kennke.org (Roman Kennke)
Date: Fri, 25 Jan 2008 18:54:01 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
Message-ID: <1201283641.9468.11.camel@mercury>

Hi,

> Please observe: 
> 
> 1. 
> the amount of memory needed to manage the allocation 
> is greater than the number of bytes 
> needed to store one additional character, 
> so the relative impact on memory usage will not be dramatic.

This is just ridiculous. An average Java app has tons of Strings in
it, most of which are _not_ used in GetStringChars. Allocating one
additional jchar for each String surely _does_ have an impact. Especially on
embedded systems (this is what I'm working on).

> 2. The string usually has many more characters than one.
> That means, if strings take 10 characters on the average, 
> the overhead is 10%, in the impossible worst case, as explained below.
> This is an overhead I (and most programmers) can live with.

Yeah, but not in the embedded/mobile world.

> 3. Memory is allocated in chunks.  
> The size and alignment of the chunk is subject to various limitations.
> If the characters of the string do not fill the chunk entirely, 
> there is a good chance 
> that there will be space for the terminating zero anyway.

Yeah, and if not? I can only speak for Jamaica, where memory is
allocated in chunks of 32 bytes, or 16 chars. There's a 1 out of 16
(actually, 2 out of 16 because of some internal stuff) chance that
there's no trailing space for the zero, so should we allocate another
32 bytes, only to get this zero termination for a JNI method that's only
rarely used? So much for the no-impact statement above...

Cheers, Roman

-- 
http://kennke.org/blog/


From rob.lougher at gmail.com  Fri Jan 25 17:55:34 2008
From: rob.lougher at gmail.com (Robert Lougher)
Date: Fri, 25 Jan 2008 17:55:34 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
Message-ID: <d58509a80801250955m391337a5gd2855c16f7c51ac5@mail.gmail.com>

Hi Chris,

On 1/25/08, Krzysztof Żelechowski <program.spe at home.pl> wrote:
>
> On Fri, 25-01-2008 at 17:30 +0000, Robert Lougher wrote:
> > No it doesn't.  An implementation would have to be truly stupid to
> > internally null-terminate.  How many Strings are in the heap?  How
> > many will the programmer access via GetStringChars?  The null will be
> > an overhead on all Strings for the sake of a minuscule percentage.
>
> Please observe:
>
> 1.
> the amount of memory needed to manage the allocation
> is greater than the number of bytes
> needed to store one additional character,
> so the relative impact on memory usage will not be dramatic.
>
> 2. The string usually has many more characters than one.
> That means, if strings take 10 characters on the average,
> the overhead is 10%, in the impossible worst case, as explained below.
> This is an overhead I (and most programmers) can live with.
>
> 3. Memory is allocated in chunks.
> The size and alignment of the chunk is subject to various limitations.
> If the characters of the string do not fill the chunk entirely,
> there is a good chance
> that there will be space for the terminating zero anyway.
>

Yes, you're absolutely right.  However, consider for the sake of
argument the memory manager aligned on 4 byte boundaries.  Consider we
have 4 strings.  The first is 1 byte long, the second 2 bytes and so
on.  The first three strings will absorb the null due to alignment.
The fourth however, will require an extra 4 bytes because of the same
alignment. So we have a 4 byte overhead for 4 strings, or 1 byte per
string.
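
[Editor's sketch] The arithmetic of this example can be checked with a few
lines of Java. The names are hypothetical. As in the example, sizes are in
bytes, the terminator costs one byte, and the allocator is assumed to round
every request up to the alignment.

```java
// Hypothetical helper that reproduces the 4-strings example above.
final class TerminatorOverhead {

    /** Rounds a request up to the next multiple of the alignment. */
    static int roundUp(int bytes, int alignment) {
        return ((bytes + alignment - 1) / alignment) * alignment;
    }

    /** Extra bytes allocated when one terminator byte is appended. */
    static int extraBytes(int payloadBytes, int alignment) {
        return roundUp(payloadBytes + 1, alignment)
                - roundUp(payloadBytes, alignment);
    }

    /** Sum of the extra bytes over a set of string sizes. */
    static int totalExtraBytes(int[] payloads, int alignment) {
        int total = 0;
        for (int payload : payloads) {
            total += extraBytes(payload, alignment);
        }
        return total;
    }

    public static void main(String[] args) {
        int[] sizes = {1, 2, 3, 4};
        for (int size : sizes) {
            System.out.println(size + " byte(s): +" + extraBytes(size, 4));
        }
        System.out.println("total: +" + totalExtraBytes(sizes, 4));
    }
}
```

Only the 4-byte string spills into a new aligned unit, which gives the
4 bytes of overhead for 4 strings stated above.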

Rob.


> Yours truly,
> Chris
>
>

From program.spe at home.pl  Fri Jan 25 18:14:28 2008
From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=)
Date: Fri, 25 Jan 2008 19:14:28 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201283641.9468.11.camel@mercury>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
	<1201283641.9468.11.camel@mercury>
Message-ID: <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>


On Fri, 25-01-2008 at 18:54 +0100, Roman Kennke wrote:
> Hi,
> 
> > Please observe: 
> > 
> > 1. 
> > the amount of memory needed to manage the allocation 
> > is greater than the number of bytes 
> > needed to store one additional character, 
> > so the relative impact on memory usage will not be dramatic.
> 
> This is just ridiculous. An average Java app has tons of Strings in
> it, most of which are _not_ used in GetStringChars. Allocating one
> additional jchar for each String surely _does_ have an impact. Especially on
> embedded systems (this is what I'm working on).

I never said there will be no impact.

Aside: wouldn't it be cheaper if the device worked without Java on it?  
(another ridiculous question, I am afraid)

> 
> > 2. The string usually has many more characters than one.
> > That means, if strings take 10 characters on the average, 
> > the overhead is 10%, in the impossible worst case, as explained below.
> > This is an overhead I (and most programmers) can live with.
> 
> Yeah, but not in the embedded/mobile world.

Well, in that case it seems a fork is needed.
The desktop code can assume that string buffers are z-term.
The mobile code has to copy.

> 
> > 3. Memory is allocated in chunks.  
> > The size and alignment of the chunk is subject to various limitations.
> > If the characters of the string do not fill the chunk entirely, 
> > there is a good chance 
> > that there will be space for the terminating zero anyway.
> 
> Yeah, and if not? I can only speak for Jamaica, where memory is
> allocated in chunks of 32 bytes, or 16 chars. There's a 1 out of 16
> (actually, 2 out of 16 because of some internal stuff) chance that
> there's no trailing space for the zero, so should we allocate another
> 32 bytes, only to get this zero termination for a JNI method that's only
> rarely used? So much for the no-impact statement above...

So that accumulated memory cost is linear in the number of strings?  
Good point, statement 3 is invalid.

Chris


From program.spe at home.pl  Fri Jan 25 18:18:34 2008
From: program.spe at home.pl (Krzysztof =?UTF-8?Q?=C5=BBelechowski?=)
Date: Fri, 25 Jan 2008 19:18:34 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201283094.9468.4.camel@mercury>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
	<1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>
	<loom.20080125T130942-864@post.gmane.org>
	<1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>
	<1201281591.6277.88.camel@mercury>
	<1201282009.6482.20.camel@a1dmin.vola.spe.com.pl>
	<1201283094.9468.4.camel@mercury>
Message-ID: <1201285114.6482.46.camel@a1dmin.vola.spe.com.pl>


On Fri, 25-01-2008 at 18:44 +0100, Roman Kennke wrote:
> Heyo,
> 
> > > The specification is buggy
> > > > in that it does not take into account the operating system interface 
> > > > and makes correct memory management inefficient 
> > > > for the benefit of sparing one byte per buffer 
> > > > where an OS call is not needed.
> > > > Ridiculous.
> > > 
> > > Tom Tromey pointed out another possible problem on IRC: What if the
> > > string itself contains the 0? Unlikely, but possible in the Java world.
> > 
> > I understand that parameters passed to the OS 
> > are subject to the limitations of the OS.
> > Not containing a zero inside may be just one of them.  
> > The Java specification claims nowhere 
> > that every string can be used to name every object.
> 
> Yeah, but GetStringChars() is a general-purpose JNI function and not at
> all tied to the OS. Passing the string on to the OS for I/O purposes is
> just one use case. Zero-terminating a Java string really doesn't seem right.
> If you need it zero-terminated, then you can always do this in your code
> by copying the string into a static buffer or similar (as suggested
> somewhere else in this thread). This is by no means incorrect memory
> management, it only requires a little more thinking.
> 

Static buffers are not reentrant, and they are unwieldy:
they are either too large or too small.

It has been argued that excessive copying is inefficient 
and can be easily avoided with proper setup.

Chris


From rob.lougher at gmail.com  Fri Jan 25 19:01:22 2008
From: rob.lougher at gmail.com (Robert Lougher)
Date: Fri, 25 Jan 2008 19:01:22 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
	<1201283641.9468.11.camel@mercury>
	<1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>
Message-ID: <d58509a80801251101u624f0282g38edb5136c8fad45@mail.gmail.com>

Hi,

This is getting a bit hostile for no reason....  Thinking about
alignment gives an interesting solution.

1) Strings are not null-terminated
2) For most strings the alignment gives the VM room to terminate in
place when GetStringChars is called
3) Copy strings that can't be terminated in place.

On average, you'll need to copy 1/<alignment> strings.
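
To make the 1/<alignment> estimate concrete, here is a small model of the
idea. It assumes a hypothetical object layout (a fixed-size header followed
by 2-byte chars, with the total size rounded up to the alignment); the class
name, method names, header size and alignment are illustrative assumptions,
not the layout of any particular VM.

```java
final class InPlaceTermination {
    // Padding bytes left after the char data once the object size is rounded
    // up to alignBytes. Layout is an assumption: header, then 2 bytes per char.
    static int paddingBytes(int length, int headerBytes, int alignBytes) {
        int used = headerBytes + 2 * length;
        return (alignBytes - used % alignBytes) % alignBytes;
    }

    // A 2-byte terminator fits in place only if at least 2 padding bytes exist.
    static boolean canTerminateInPlace(int length, int headerBytes, int alignBytes) {
        return paddingBytes(length, headerBytes, alignBytes) >= 2;
    }

    // Counts the lengths in [0, maxLength) that would have to be copied.
    static int countNeedingCopy(int maxLength, int headerBytes, int alignBytes) {
        int needCopy = 0;
        for (int len = 0; len < maxLength; len++) {
            if (!canTerminateInPlace(len, headerBytes, alignBytes)) {
                needCopy++;
            }
        }
        return needCopy;
    }
}
```

With a 16-byte header and 8-byte alignment, lengths that are a multiple of 4
leave no padding, so one string length in four needs a copy. That matches
1/<alignment> when the alignment is counted in 2-byte chars.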

Rob.


From linuxhippy at gmail.com  Fri Jan 25 19:29:44 2008
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Fri, 25 Jan 2008 20:29:44 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <d58509a80801251101u624f0282g38edb5136c8fad45@mail.gmail.com>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
	<1201283641.9468.11.camel@mercury>
	<1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801251101u624f0282g38edb5136c8fad45@mail.gmail.com>
Message-ID: <194f62550801251129p625dac1cm28e3c53de3c7dde8@mail.gmail.com>

Hi there,

> This is getting a bit hostile for no reason....  Thinking about
> alignment gives an interesting solution.
>
> 1) Strings are not null-terminated
> 2) For most strings the alignment gives the VM room to terminate in
> place when GetStringChars is called
> 3) Copy strings that can't be terminated in place.

However, as far as I know GetStringChars() always returns a copy,
because HotSpot does not support pinning (or at least I think so), at
least for the moving GCs. So whether or not one more byte is allocated on
the JNI side should not make much difference, even if it's never needed.

lg Clemens


From rob.lougher at gmail.com  Fri Jan 25 16:45:51 2008
From: rob.lougher at gmail.com (Robert Lougher)
Date: Fri, 25 Jan 2008 08:45:51 -0800 (PST)
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>
References: <1200949971.6264.48.camel@mercury> <47951409.70805@sun.com>
	<1200952864.6264.53.camel@mercury> <47952072.2060600@sun.com>
	<1200956262.6264.65.camel@mercury>
	<1200987359.6488.3.camel@a1dmin.vola.spe.com.pl>
	<loom.20080125T130942-864@post.gmane.org>
	<1201267690.6482.4.camel@a1dmin.vola.spe.com.pl>
Message-ID: <15091812.post@talk.nabble.com>


Hi,


Krzysztof Żelechowski-2 wrote:
> 
> 
> On Fri, 25-01-2008 at 13:14 +0000, Mark Wielaard wrote:
>> Krzysztof Żelechowski <program.spe at ...> writes:
>> > If the specification gets fixed so that GSC result MUST be z-term, 
>> > your VM will cease being conformant 
>> > so it will be fixed and no additional buffers will be needed. 
>> 
>> Eh, that doesn't seem right at all.
>> The specification currently doesn't guarantee that the result is a jchar
>> array that is zero terminated. So you can expect current runtimes not to
>> do this. As Roman said at least JamaicaVM doesn't do this. I just checked
>> the implementations gcj and jamvm, they both also don't make any such
>> guarantee (cacao does seem to add an extra 0 at the end of the result it
>> returns though).
>> So "clarifying the spec" would break a lot of code of currently conforming
>> implementations. The code relying on this behavior seems to be just buggy
>> and should be fixed imho.
> 
> The specification is buggy
> in that it does not take into account the operating system interface 
> and makes correct memory management inefficient 
> for the benefit of sparing one byte per buffer 
> where an OS call is not needed.
> Ridiculous.
> The developers at Sun 
> found the correct way of interpreting the specification; 
> the other ones followed it blindfolded.  It is now time to repent.
> 

Wrong!  Requiring null termination will make things more inefficient.  This
is because Strings within Java are not null-terminated.  So to add the null
the VM will have to copy the String chars into a new buffer.

A more efficient approach is to simply return a pointer to the String chars
themselves.  However, this will not be null-terminated.

The JNI specification allows a VM to either copy the chars or return a
direct pointer.  The extra isCopy parameter can be used to find out what it
did.

The point is, if the programmer doesn't need a null-terminated string, not
copying is _much_ more efficient.  The programmer can always copy and add
the null if they need to.  But forcing the VM to null-terminate will require
a copy and slow it down in all cases.
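
The "copy and add the null" fallback can be modelled in plain Java. In
native code the analogous JNI calls would be GetStringLength and
GetStringRegion into a caller-owned buffer. The sketch below is only a
model; the class and method names are hypothetical.

```java
final class ZeroTerminatedCopy {
    // Copies the UTF-16 code units of s into a fresh buffer that is one
    // element longer than the string. Java zero-initializes arrays, so the
    // extra last slot already holds the terminator.
    static char[] copyWithTerminator(String s) {
        char[] buf = new char[s.length() + 1];
        s.getChars(0, s.length(), buf, 0);
        return buf;
    }

    // Length as a consumer of zero-terminated data would see it: scanning
    // stops at the first zero. Expects a buffer from copyWithTerminator,
    // which always ends with a zero.
    static int lengthSeenByNative(char[] zeroTerminated) {
        int n = 0;
        while (zeroTerminated[n] != '\0') {
            n++;
        }
        return n;
    }
}
```

Because the buffer is allocated per call, it is sized exactly and not shared
between callers. Note that a string with an embedded zero is still cut short
by the consumer: lengthSeenByNative(copyWithTerminator("a\0b")) yields 1.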

If I was updating the spec, I would change it so that if a copy is returned
it is always null terminated.  If it isn't a copy then it may or may not be. 
It's likely no VMs will need changing, as I suspect the ones that do not
null-terminate are returning direct pointers (e.g. JamVM).

And I doubt Sun makes a copy because of the null.  Giving out direct heap
pointers causes problems for VMs that move objects within the heap (e.g. a
compacting GC).  Either you've got to "pin" the object so it can't move or
you always copy.  Sun probably chose the latter.  In JamVM, I decided to pin
the String (it's unpinned in ReleaseStringChars).

Rob.

P.S.  I hope your blindfold has been removed :) When implementing a VM few
things are as straight-forward as they may seem.



From matthias at mernst.org  Fri Jan 25 19:52:42 2008
From: matthias at mernst.org (Matthias Ernst)
Date: Fri, 25 Jan 2008 20:52:42 +0100
Subject: Selector cleanup
In-Reply-To: <22ec15240801251151h1ddb1a44oe037096586596d79@mail.gmail.com>
References: <47987F18.4040309@univ-mlv.fr> <47988E8E.3010403@sun.com>
	<4798B1C0.9020907@univ-mlv.fr> <4798C1BF.8010205@sun.com>
	<4799E21D.6050908@univ-mlv.fr> <4799EB3D.5070807@sun.com>
	<22ec15240801251151h1ddb1a44oe037096586596d79@mail.gmail.com>
Message-ID: <22ec15240801251152g5b5142a9o4737d9990b8a9a3d@mail.gmail.com>

On Jan 25, 2008 2:59 PM, Alan Bateman <Alan.Bateman at sun.com> wrote:

> > I don't agree: reading the code, the idle set is used when setInterestOps(0)
> > is called.
> > I'm not sure that case is infrequent.

> I've only observed it on a few occasions.

Hmm? I thought a boilerplate selection loop looks more or less like this:

while(true) {
 select();
 for k in selectedKeys: {
   k.setInterestOps(0);    <===========
   pool.execute({
     newInterest = handle(k);
     k.interestOps(newInterest);
   });
 }
}
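
A runnable version of one pass of the loop above might look like the
following sketch. It uses only the public java.nio.channels API; the class
name, the dispatchOnce and demo methods, and the handler parameter are
hypothetical names introduced for illustration, and the demo drives the loop
with a Pipe and a same-thread executor purely to keep it self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.concurrent.Executor;
import java.util.function.ToIntFunction;

final class SelectLoopSketch {
    // One pass: select, park each ready key, hand it to the pool.
    static int dispatchOnce(Selector selector, Executor pool,
                            ToIntFunction<SelectionKey> handler) throws IOException {
        selector.select();
        int dispatched = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey k = it.next();
            it.remove();
            if (!k.isValid()) {
                continue;
            }
            k.interestOps(0); // the step marked with the arrow above
            pool.execute(() -> {
                int newInterest = handler.applyAsInt(k);
                k.interestOps(newInterest);
                selector.wakeup(); // so a blocked select() re-reads the interest set
            });
            dispatched++;
        }
        return dispatched;
    }

    // Self-contained check: one readable pipe end, handled on the calling thread.
    static boolean demo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.source().configureBlocking(false);
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
                pipe.sink().write(ByteBuffer.wrap(new byte[] {42}));
                int[] interestSeenByHandler = {-1};
                int n = dispatchOnce(selector, Runnable::run, k -> {
                    interestSeenByHandler[0] = k.interestOps();
                    try {
                        ((ReadableByteChannel) k.channel()).read(ByteBuffer.allocate(16));
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                    return SelectionKey.OP_READ;
                });
                return n == 1 && interestSeenByHandler[0] == 0
                        && key.interestOps() == SelectionKey.OP_READ;
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this shape every ready key passes through interestOps(0) once per event,
which is why the zero-interest case would be hit on each dispatch.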

Matthias


From rob.lougher at gmail.com  Fri Jan 25 19:54:33 2008
From: rob.lougher at gmail.com (Robert Lougher)
Date: Fri, 25 Jan 2008 19:54:33 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <194f62550801251129p625dac1cm28e3c53de3c7dde8@mail.gmail.com>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
	<1201283641.9468.11.camel@mercury>
	<1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801251101u624f0282g38edb5136c8fad45@mail.gmail.com>
	<194f62550801251129p625dac1cm28e3c53de3c7dde8@mail.gmail.com>
Message-ID: <d58509a80801251154y45a45aa3t87b44691a924dad6@mail.gmail.com>

On 1/25/08, Clemens Eisserer <linuxhippy at gmail.com> wrote:
> Hi there,
>
> > This is getting a bit hostile for no reason....  Thinking about
> > alignment gives an interesting solution.
> >
> > 1) Strings are not null-terminated
> > 2) For most strings the alignment gives the VM room to terminate in
> > place when GetStringChars is called
> > 3) Copy strings that can't be terminated in place.
>
> However, as far as I know GetStringChars() always returns a copy,
> because HotSpot does not support pinning (or at least I think so), at
> least for the moving GCs. So whether or not one more byte is allocated on
> the JNI side should not make much difference, even if it's never needed.
>

Yes, I already mentioned Sun probably chose to copy to avoid pinning.
I did the opposite in JamVM and pinned the String to avoid the copy.
It appears that other VMs such as gcj and Jamaica also do not copy the
string in GetStringChars (however, I do not know if they have a moving
GC or not).

The above was a solution to the problem of null-terminating the string
chars without having to copy.

Rob.

> lg Clemens
>


From roman at kennke.org  Fri Jan 25 20:37:41 2008
From: roman at kennke.org (Roman Kennke)
Date: Fri, 25 Jan 2008 21:37:41 +0100
Subject: [PATCH] Move Solaris specific classes to solaris/
Message-ID: <1201293461.9468.21.camel@mercury>

Hi,

There are some classes in the jdk/share tree that seem to be
Solaris-specific. I suggest moving them to the jdk/solaris tree instead. Or am I
wrong here?

/Roman

-- 
http://kennke.org/blog/
-------------- next part --------------
# HG changeset patch
# User Roman Kennke <kennke at aicas.com>
# Date 1201293270 -3600
# Node ID db9384d2f46857b26ae306b4a0e1d25a049c634e
# Parent  2b6c2ce8cd88445d9e3ea709069bf26d53039223
Moved Solaris specific NIO Java classes to the solaris subdir

diff -r 2b6c2ce8cd88 -r db9384d2f468 src/share/classes/sun/nio/ch/AbstractPollSelectorImpl.java
--- a/src/share/classes/sun/nio/ch/AbstractPollSelectorImpl.java	Tue Dec 18 15:30:58 2007 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,187 +0,0 @@
-/*
- * Copyright 2001-2004 Sun Microsystems, Inc.  All Rights Reserved.
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This code is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License version 2 only, as
- * published by the Free Software Foundation.  Sun designates this
- * particular file as subject to the "Classpath" exception as provided
- * by Sun in the LICENSE file that accompanied this code.
- *
- * This code is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
- * version 2 for more details (a copy is included in the LICENSE file that
- * accompanied this code).
- *
- * You should have received a copy of the GNU General Public License version
- * 2 along with this work; if not, write to the Free Software Foundation,
- * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
- *
- * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
- * CA 95054 USA or visit www.sun.com if you need additional information or
- * have any questions.
- */
-
-package sun.nio.ch;
-
-import java.io.IOException;
-import java.nio.channels.*;
-import java.nio.channels.spi.*;
-import java.util.*;
-import sun.misc.*;
-
-
-/**
- * An abstract selector impl.
- */
-
-abstract class AbstractPollSelectorImpl
-    extends SelectorImpl
-{
-
-    // The poll fd array
-    PollArrayWrapper pollWrapper;
-
-    // Initial capacity of the pollfd array
-    protected final int INIT_CAP = 10;
-
-    // The list of SelectableChannels serviced by this Selector
-    protected SelectionKeyImpl[] channelArray;
-
-    // In some impls the first entry of channelArray is bogus
-    protected int channelOffset = 0;
-
-    // The number of valid channels in this Selector's poll array
-    protected int totalChannels;
-
-    // True if this Selector has been closed
-    private boolean closed = false;
-
-    AbstractPollSelectorImpl(SelectorProvider sp, int channels, int offset) {
-        super(sp);
-        this.totalChannels = channels;
-        this.channelOffset = offset;
-    }
-
-    void putEventOps(SelectionKeyImpl sk, int ops) {
-        pollWrapper.putEventOps(sk.getIndex(), ops);
-    }
-
-    public Selector wakeup() {
-        pollWrapper.interrupt();
-        return this;
-    }
-
-    protected abstract int doSelect(long timeout) throws IOException;
-
-    protected void implClose() throws IOException {
-        if (!closed) {
-            closed = true;
-            // Deregister channels
-            for(int i=channelOffset; i<totalChannels; i++) {
-                SelectionKeyImpl ski = channelArray[i];
-                assert(ski.getIndex() != -1);
-                ski.setIndex(-1);
-                deregister(ski);
-                SelectableChannel selch = channelArray[i].channel();
-                if (!selch.isOpen() && !selch.isRegistered())
-                    ((SelChImpl)selch).kill();
-            }
-            implCloseInterrupt();
-            pollWrapper.free();
-            pollWrapper = null;
-            selectedKeys = null;
-            channelArray = null;
-            totalChannels = 0;
-        }
-    }
-
-    protected abstract void implCloseInterrupt() throws IOException;
-
-    /**
-     * Copy the information in the pollfd structs into the opss
-     * of the corresponding Channels. Add the ready keys to the
-     * ready queue.
-     */
-    protected int updateSelectedKeys() {
-        int numKeysUpdated = 0;
-        // Skip zeroth entry; it is for interrupts only
-        for (int i=channelOffset; i<totalChannels; i++) {
-            int rOps = pollWrapper.getReventOps(i);
-            if (rOps != 0) {
-                SelectionKeyImpl sk = channelArray[i];
-                pollWrapper.putReventOps(i, 0);
-                if (selectedKeys.contains(sk)) {
-                    if (sk.channel.translateAndSetReadyOps(rOps, sk)) {
-                        numKeysUpdated++;
-                    }
-                } else {
-                    sk.channel.translateAndSetReadyOps(rOps, sk);
-                    if ((sk.nioReadyOps() & sk.nioInterestOps()) != 0) {
-                        selectedKeys.add(sk);
-                        numKeysUpdated++;
-                    }
-                }
-            }
-        }
-        return numKeysUpdated;
-    }
-
-    protected void implRegister(SelectionKeyImpl ski) {
-        // Check to see if the array is large enough
-        if (channelArray.length == totalChannels) {
-            // Make a larger array
-            int newSize = pollWrapper.totalChannels * 2;
-            SelectionKeyImpl temp[] = new SelectionKeyImpl[newSize];
-            // Copy over
-            for (int i=channelOffset; i<totalChannels; i++)
-                temp[i] = channelArray[i];
-            channelArray = temp;
-            // Grow the NativeObject poll array
-            pollWrapper.grow(newSize);
-        }
-        channelArray[totalChannels] = ski;
-        ski.setIndex(totalChannels);
-        pollWrapper.addEntry(ski.channel);
-        totalChannels++;
-        keys.add(ski);
-    }
-
-    protected void implDereg(SelectionKeyImpl ski) throws IOException {
-        // Algorithm: Copy the sc from the end of the list and put it into
-        // the location of the sc to be removed (since order doesn't
-        // matter). Decrement the sc count. Update the index of the sc
-        // that is moved.
-        int i = ski.getIndex();
-        assert (i >= 0);
-        if (i != totalChannels - 1) {
-            // Copy end one over it
-            SelectionKeyImpl endChannel = channelArray[totalChannels-1];
-            channelArray[i] = endChannel;
-            endChannel.setIndex(i);
-            pollWrapper.release(i);
-            PollArrayWrapper.replaceEntry(pollWrapper, totalChannels - 1,
-                                          pollWrapper, i);
-        } else {
-            pollWrapper.release(i);
-        }
-        // Destroy the last one
-        channelArray[totalChannels-1] = null;
-        totalChannels--;
-        pollWrapper.totalChannels--;
-        ski.setIndex(-1);
-        // Remove the key from keys and selectedKeys
-        keys.remove(ski);
-        selectedKeys.remove(ski);
-        deregister((AbstractSelectionKey)ski);
-        SelectableChannel selch = ski.channel();
-        if (!selch.isOpen() && !selch.isRegistered())
-            ((SelChImpl)selch).kill();
-    }
-
-    static {
-        Util.load();
-    }
-
-}
diff -r 2b6c2ce8cd88 -r db9384d2f468 src/share/classes/sun/nio/ch/DevPollSelectorProvider.java
--- a/src/share/classes/sun/nio/ch/DevPollSelectorProvider.java	Tue Dec 18 15:30:58 2007 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,42 +0,0 @@
-/*
- * Copyright 2001-2003 Sun Microsystems, Inc.  All Rights Reserved.
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This code is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License version 2 only, as
- * published by the Free Software Foundation.  Sun designates this
- * particular file as subject to the "Classpath" exception as provided
- * by Sun in the LICENSE file that accompanied this code.
- *
- * This code is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
- * version 2 for more details (a copy is included in the LICENSE file that
- * accompanied this code).
- *
- * You should have received a copy of the GNU General Public License version
- * 2 along with this work; if not, write to the Free Software Foundation,
- * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
- *
- * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
- * CA 95054 USA or visit www.sun.com if you need additional information or
- * have any questions.
- */
-
-package sun.nio.ch;
-
-import java.io.IOException;
-import java.nio.channels.*;
-import java.nio.channels.spi.*;
-
-public class DevPollSelectorProvider
-    extends SelectorProviderImpl
-{
-    public AbstractSelector openSelector() throws IOException {
-        return new DevPollSelectorImpl(this);
-    }
-
-    public Channel inheritedChannel() throws IOException {
-        return InheritedChannel.getChannel();
-    }
-}
diff -r 2b6c2ce8cd88 -r db9384d2f468 src/share/classes/sun/nio/ch/PollSelectorProvider.java
--- a/src/share/classes/sun/nio/ch/PollSelectorProvider.java	Tue Dec 18 15:30:58 2007 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,42 +0,0 @@
-/*
- * Copyright 2001-2003 Sun Microsystems, Inc.  All Rights Reserved.
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This code is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License version 2 only, as
- * published by the Free Software Foundation.  Sun designates this
- * particular file as subject to the "Classpath" exception as provided
- * by Sun in the LICENSE file that accompanied this code.
- *
- * This code is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
- * version 2 for more details (a copy is included in the LICENSE file that
- * accompanied this code).
- *
- * You should have received a copy of the GNU General Public License version
- * 2 along with this work; if not, write to the Free Software Foundation,
- * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
- *
- * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
- * CA 95054 USA or visit www.sun.com if you need additional information or
- * have any questions.
- */
-
-package sun.nio.ch;
-
-import java.io.IOException;
-import java.nio.channels.*;
-import java.nio.channels.spi.*;
-
-public class PollSelectorProvider
-    extends SelectorProviderImpl
-{
-    public AbstractSelector openSelector() throws IOException {
-        return new PollSelectorImpl(this);
-    }
-
-    public Channel inheritedChannel() throws IOException {
-        return InheritedChannel.getChannel();
-    }
-}
diff -r 2b6c2ce8cd88 -r db9384d2f468 src/solaris/classes/sun/nio/ch/AbstractPollSelectorImpl.java
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/solaris/classes/sun/nio/ch/AbstractPollSelectorImpl.java	Fri Jan 25 21:34:30 2008 +0100
@@ -0,0 +1,187 @@
+/*
+ * Copyright 2001-2004 Sun Microsystems, Inc.  All Rights Reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.  Sun designates this
+ * particular file as subject to the "Classpath" exception as provided
+ * by Sun in the LICENSE file that accompanied this code.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
+ * CA 95054 USA or visit www.sun.com if you need additional information or
+ * have any questions.
+ */
+
+package sun.nio.ch;
+
+import java.io.IOException;
+import java.nio.channels.*;
+import java.nio.channels.spi.*;
+import java.util.*;
+import sun.misc.*;
+
+
+/**
+ * An abstract selector impl.
+ */
+
+abstract class AbstractPollSelectorImpl
+    extends SelectorImpl
+{
+
+    // The poll fd array
+    PollArrayWrapper pollWrapper;
+
+    // Initial capacity of the pollfd array
+    protected final int INIT_CAP = 10;
+
+    // The list of SelectableChannels serviced by this Selector
+    protected SelectionKeyImpl[] channelArray;
+
+    // In some impls the first entry of channelArray is bogus
+    protected int channelOffset = 0;
+
+    // The number of valid channels in this Selector's poll array
+    protected int totalChannels;
+
+    // True if this Selector has been closed
+    private boolean closed = false;
+
+    AbstractPollSelectorImpl(SelectorProvider sp, int channels, int offset) {
+        super(sp);
+        this.totalChannels = channels;
+        this.channelOffset = offset;
+    }
+
+    void putEventOps(SelectionKeyImpl sk, int ops) {
+        pollWrapper.putEventOps(sk.getIndex(), ops);
+    }
+
+    public Selector wakeup() {
+        pollWrapper.interrupt();
+        return this;
+    }
+
+    protected abstract int doSelect(long timeout) throws IOException;
+
+    protected void implClose() throws IOException {
+        if (!closed) {
+            closed = true;
+            // Deregister channels
+            for(int i=channelOffset; i<totalChannels; i++) {
+                SelectionKeyImpl ski = channelArray[i];
+                assert(ski.getIndex() != -1);
+                ski.setIndex(-1);
+                deregister(ski);
+                SelectableChannel selch = channelArray[i].channel();
+                if (!selch.isOpen() && !selch.isRegistered())
+                    ((SelChImpl)selch).kill();
+            }
+            implCloseInterrupt();
+            pollWrapper.free();
+            pollWrapper = null;
+            selectedKeys = null;
+            channelArray = null;
+            totalChannels = 0;
+        }
+    }
+
+    protected abstract void implCloseInterrupt() throws IOException;
+
+    /**
+     * Copy the information in the pollfd structs into the opss
+     * of the corresponding Channels. Add the ready keys to the
+     * ready queue.
+     */
+    protected int updateSelectedKeys() {
+        int numKeysUpdated = 0;
+        // Skip zeroth entry; it is for interrupts only
+        for (int i=channelOffset; i<totalChannels; i++) {
+            int rOps = pollWrapper.getReventOps(i);
+            if (rOps != 0) {
+                SelectionKeyImpl sk = channelArray[i];
+                pollWrapper.putReventOps(i, 0);
+                if (selectedKeys.contains(sk)) {
+                    if (sk.channel.translateAndSetReadyOps(rOps, sk)) {
+                        numKeysUpdated++;
+                    }
+                } else {
+                    sk.channel.translateAndSetReadyOps(rOps, sk);
+                    if ((sk.nioReadyOps() & sk.nioInterestOps()) != 0) {
+                        selectedKeys.add(sk);
+                        numKeysUpdated++;
+                    }
+                }
+            }
+        }
+        return numKeysUpdated;
+    }
+
+    protected void implRegister(SelectionKeyImpl ski) {
+        // Check to see if the array is large enough
+        if (channelArray.length == totalChannels) {
+            // Make a larger array
+            int newSize = pollWrapper.totalChannels * 2;
+            SelectionKeyImpl temp[] = new SelectionKeyImpl[newSize];
+            // Copy over
+            for (int i=channelOffset; i<totalChannels; i++)
+                temp[i] = channelArray[i];
+            channelArray = temp;
+            // Grow the NativeObject poll array
+            pollWrapper.grow(newSize);
+        }
+        channelArray[totalChannels] = ski;
+        ski.setIndex(totalChannels);
+        pollWrapper.addEntry(ski.channel);
+        totalChannels++;
+        keys.add(ski);
+    }
+
+    protected void implDereg(SelectionKeyImpl ski) throws IOException {
+        // Algorithm: Copy the sc from the end of the list and put it into
+        // the location of the sc to be removed (since order doesn't
+        // matter). Decrement the sc count. Update the index of the sc
+        // that is moved.
+        int i = ski.getIndex();
+        assert (i >= 0);
+        if (i != totalChannels - 1) {
+            // Copy end one over it
+            SelectionKeyImpl endChannel = channelArray[totalChannels-1];
+            channelArray[i] = endChannel;
+            endChannel.setIndex(i);
+            pollWrapper.release(i);
+            PollArrayWrapper.replaceEntry(pollWrapper, totalChannels - 1,
+                                          pollWrapper, i);
+        } else {
+            pollWrapper.release(i);
+        }
+        // Destroy the last one
+        channelArray[totalChannels-1] = null;
+        totalChannels--;
+        pollWrapper.totalChannels--;
+        ski.setIndex(-1);
+        // Remove the key from keys and selectedKeys
+        keys.remove(ski);
+        selectedKeys.remove(ski);
+        deregister((AbstractSelectionKey)ski);
+        SelectableChannel selch = ski.channel();
+        if (!selch.isOpen() && !selch.isRegistered())
+            ((SelChImpl)selch).kill();
+    }
+
+    static {
+        Util.load();
+    }
+
+}
diff -r 2b6c2ce8cd88 -r db9384d2f468 src/solaris/classes/sun/nio/ch/DevPollSelectorProvider.java
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/solaris/classes/sun/nio/ch/DevPollSelectorProvider.java	Fri Jan 25 21:34:30 2008 +0100
@@ -0,0 +1,42 @@
+/*
+ * Copyright 2001-2003 Sun Microsystems, Inc.  All Rights Reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.  Sun designates this
+ * particular file as subject to the "Classpath" exception as provided
+ * by Sun in the LICENSE file that accompanied this code.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
+ * CA 95054 USA or visit www.sun.com if you need additional information or
+ * have any questions.
+ */
+
+package sun.nio.ch;
+
+import java.io.IOException;
+import java.nio.channels.*;
+import java.nio.channels.spi.*;
+
+public class DevPollSelectorProvider
+    extends SelectorProviderImpl
+{
+    public AbstractSelector openSelector() throws IOException {
+        return new DevPollSelectorImpl(this);
+    }
+
+    public Channel inheritedChannel() throws IOException {
+        return InheritedChannel.getChannel();
+    }
+}
diff -r 2b6c2ce8cd88 -r db9384d2f468 src/solaris/classes/sun/nio/ch/PollSelectorProvider.java
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/solaris/classes/sun/nio/ch/PollSelectorProvider.java	Fri Jan 25 21:34:30 2008 +0100
@@ -0,0 +1,42 @@
+/*
+ * Copyright 2001-2003 Sun Microsystems, Inc.  All Rights Reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.  Sun designates this
+ * particular file as subject to the "Classpath" exception as provided
+ * by Sun in the LICENSE file that accompanied this code.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
+ * CA 95054 USA or visit www.sun.com if you need additional information or
+ * have any questions.
+ */
+
+package sun.nio.ch;
+
+import java.io.IOException;
+import java.nio.channels.*;
+import java.nio.channels.spi.*;
+
+public class PollSelectorProvider
+    extends SelectorProviderImpl
+{
+    public AbstractSelector openSelector() throws IOException {
+        return new PollSelectorImpl(this);
+    }
+
+    public Channel inheritedChannel() throws IOException {
+        return InheritedChannel.getChannel();
+    }
+}

From Alan.Bateman at Sun.COM  Fri Jan 25 21:10:14 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Fri, 25 Jan 2008 21:10:14 +0000
Subject: [PATCH] Move Solaris specific classes to solaris/
In-Reply-To: <1201293461.9468.21.camel@mercury>
References: <1201293461.9468.21.camel@mercury>
Message-ID: <479A5036.9060701@sun.com>

Roman Kennke wrote:
> Hi,
>
> there are some classes in the jdk/share tree, that seem to be Solaris
> specific. I suggest moving them to the jdk/solaris tree instead. Or am I
> wrong here?
>
> /Roman
>
>   
Yes, they should be in the src/solaris tree (although only 
DevPollSelectorProvider is Solaris specific).

-Alan.


From mark at klomp.org  Fri Jan 25 22:16:51 2008
From: mark at klomp.org (Mark Wielaard)
Date: Fri, 25 Jan 2008 22:16:51 +0000 (UTC)
Subject: Null-terminated Unicode strings in java.io on Windows
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
	<1201283641.9468.11.camel@mercury>
	<1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801251101u624f0282g38edb5136c8fad45@mail.gmail.com>
Message-ID: <loom.20080125T220539-473@post.gmane.org>

Hi Robert,

Robert Lougher <rob.lougher at ...> writes:
> This is getting a bit hostile for no reason....  Thinking about
> alignment gives an interesting solution.
> 
> 1) Strings are not null-terminated
> 2) For most strings the alignment gives the VM room to terminate in
> place when GetStringChars is called
> 3) Copy strings that can't be terminated in place.

Note that Strings have a backing [j]char array which can be shared between
different Strings, and often are when read in in one go and then split in
different sub-String objects. All these Strings have a shared slice of this
backing jchar array, so there isn't any place to terminate it because that place
will overlap with another slice that can belong to another String.
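
To make the sharing concrete, here is a minimal sketch in Java. It is a toy model only: the Slice class and its method names are hypothetical, not the layout of java.lang.String or of any particular VM. It assumes strings are (backing array, offset, count) views, as described above.

```java
/** Toy model (hypothetical names) of strings as slices of one shared char array. */
class SharedSliceDemo {

    /** A string value: a view (offset, count) into a backing array, with no terminator stored. */
    static final class Slice {
        final char[] backing;
        final int offset;
        final int count;

        Slice(char[] backing, int offset, int count) {
            this.backing = backing;
            this.offset = offset;
            this.count = count;
        }

        /** Returns a view into the same backing array; no characters are copied. */
        Slice sub(int begin, int end) {
            return new Slice(backing, offset + begin, end - begin);
        }

        /** Index in the backing array where an in-place terminator would have to be written. */
        int terminatorIndex() {
            return offset + count;
        }

        /** Copies the characters into a fresh array one element longer; the last element stays 0. */
        char[] toTerminatedCopy() {
            char[] copy = new char[count + 1];
            System.arraycopy(backing, offset, copy, 0, count);
            return copy;
        }
    }

    public static void main(String[] args) {
        Slice whole = new Slice("key=value".toCharArray(), 0, 9);
        Slice key = whole.sub(0, 3);
        // Writing a 0 at this index would overwrite the '=' that 'whole' still needs.
        System.out.println("terminator for key would land on index " + key.terminatorIndex()
                + ", which holds '" + whole.backing[key.terminatorIndex()] + "'");
        System.out.println("copied length: " + key.toTerminatedCopy().length);
    }
}
```

In this model only a slice that ends exactly at the end of an over-allocated backing array could be terminated in place; every other slice has to be copied.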

You should know, because I learned all I know about this and pinning of the
backing storage of a String (not the String object itself) by reading your jamvm
code! :)

BTW. I would really recommend anybody wanting to know how the VM and JNI specs
truly work/can be implemented in practice take a look at jamvm, it is a truly
remarkably clear, concise and small implementation. Nothing bad about other
runtimes, but jamvm is small enough that you can read the code, sit down with
the spec and compare them almost directly to get a really nice insight in how
things are/can be done.

Cheers,

Mark


From rob.lougher at gmail.com  Sat Jan 26 00:00:51 2008
From: rob.lougher at gmail.com (Robert Lougher)
Date: Sat, 26 Jan 2008 00:00:51 +0000
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <loom.20080125T220539-473@post.gmane.org>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
	<1201283641.9468.11.camel@mercury>
	<1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801251101u624f0282g38edb5136c8fad45@mail.gmail.com>
	<loom.20080125T220539-473@post.gmane.org>
Message-ID: <d58509a80801251600y783644a5g54235b2ab02046f3@mail.gmail.com>

Hi Mark,

On Jan 25, 2008 10:16 PM, Mark Wielaard <mark at klomp.org> wrote:
> Hi Robert,
>
> Robert Lougher <rob.lougher at ...> writes:
> > This is getting a bit hostile for no reason....  Thinking about
> > alignment gives an interesting solution.
> >
> > 1) Strings are not null-terminated
> > 2) For most strings the alignment gives the VM room to terminate in
> > place when GetStringChars is called
> > 3) Copy strings that can't be terminated in place.
>
> Note that Strings have a backing [j]char array which can be shared between
> different Strings, and often are when read in in one go and then split in
> different sub-String objects. All these Strings have a shared slice of this
> backing jchar array, so there isn't any place to terminate it because that place
> will overlap with another slice that can belong to another String.
>

Whoops, you're right :)

> You should know, because I learned all I know about this and pinning of the
> backing storage of a String (not the String object itself) by reading your jamvm
> code! :)
>
> BTW. I would really recommend anybody wanting to know how the VM and JNI specs
> truly work/can be implemented in practice take a look at jamvm, it is a truly
> remarkably clear, concise and small implementation. Nothing bad about other
> runtimes, but jamvm is small enough that you can read the code, sit down with
> the spec and compare them almost directly to get a really nice insight in how
> things are/can be done.
>

How many beers did we agree I'll buy you at FOSDEM? ;)

Rob.

> Cheers,
>
> Mark
>
>


From program.spe at home.pl  Mon Jan 28 08:24:26 2008
From: program.spe at home.pl (Krzysztof Żelechowski)
Date: Mon, 28 Jan 2008 09:24:26 +0100
Subject: Null-terminated Unicode strings in java.io on Windows
In-Reply-To: <loom.20080125T220539-473@post.gmane.org>
References: <d58509a80801250908p6038b8ffsb49996e60d167e37@mail.gmail.com>
	<1201281794.6482.17.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801250930v78ac0703v1e561fda7e2f2359@mail.gmail.com>
	<1201282944.6482.30.camel@a1dmin.vola.spe.com.pl>
	<1201283641.9468.11.camel@mercury>
	<1201284868.6482.45.camel@a1dmin.vola.spe.com.pl>
	<d58509a80801251101u624f0282g38edb5136c8fad45@mail.gmail.com>
	<loom.20080125T220539-473@post.gmane.org>
Message-ID: <1201508666.6550.9.camel@a1dmin.vola.spe.com.pl>


On Fri, 25-01-2008 at 22:16 +0000, Mark Wielaard wrote:
> Hi Robert,
> 
> Robert Lougher <rob.lougher at ...> writes:
> > This is getting a bit hostile for no reason....  Thinking about
> > alignment gives an interesting solution.
> > 
> > 1) Strings are not null-terminated
> > 2) For most strings the alignment gives the VM room to terminate in
> > place when GetStringChars is called
> > 3) Copy strings that can't be terminated in place.
> 
> Note that Strings have a backing [j]char array which can be shared between
> different Strings, and often are when read in in one go and then split in
> different sub-String objects. All these Strings have a shared slice of this
> backing jchar array, so there isn't any place to terminate it because that place
> will overlap with another slice that can belong to another String.

That changes the picture dramatically.
Indeed, taking a prefix of an unmodifiable string
requires copying data, just as in the C language.
I should have thought of that earlier,
particularly because I have run into this problem
when I tried to straighten out the source code for gmake.
I suppose K&R (or whoever they inherited the concept from)
chose to zero-terminate strings
because they found that writing a 0 at the end
takes less memory than keeping a separate pointer to the end.
(This is true for character data only;
that is why strings are so special.)

Sorry for wasting your time.
Chris


From msa at allman.ms  Wed Jan 30 09:20:06 2008
From: msa at allman.ms (Michael Allman)
Date: Wed, 30 Jan 2008 01:20:06 -0800 (PST)
Subject: purpose of FileDispatcher.preClose()
Message-ID: <20080130011129.U28274@yvyyl.pfbsg.arg>

Hello,

Can someone with knowledge of such matters explain what 
FileDispatcher.preClose() is supposed to do on Solaris/Linux.  I mean, I 
see the code, but I don't understand why it exists or what problem it's 
supposed to avoid or something.

I ask because I'm trying to fix a file-locking problem on soylatte and it 
seems the solution to that problem is to remove this code (on that 
platform).  But before I charge ahead, I need a better understanding of 
why this code exists.

In particular, I'm really interested in the stuff that happens in 
FileDispatcher.c, functions Java_sun_nio_ch_FileDispatcher_init and 
Java_sun_nio_ch_FileDispatcher_preClose0.  They're setting something up 
that looks important, but I just don't get it.

Cheers,

Michael


From Alan.Bateman at Sun.COM  Wed Jan 30 11:35:07 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Wed, 30 Jan 2008 11:35:07 +0000
Subject: purpose of FileDispatcher.preClose()
In-Reply-To: <20080130011129.U28274@yvyyl.pfbsg.arg>
References: <20080130011129.U28274@yvyyl.pfbsg.arg>
Message-ID: <47A060EB.8080200@sun.com>

Michael Allman wrote:
> Hello,
>
> Can someone with knowledge of such matters explain what 
> FileDispatcher.preClose() is supposed to do on Solaris/Linux.  I mean, 
> I see the code, but I don't understand why it exists or what problem 
> it's supposed to avoid or something.
>
> I ask because I'm trying to fix a file-locking problem on soylatte and 
> it seems the solution to that problem is to remove this code (on that 
> platform).  But before I charge ahead, I need a better understanding 
> of why this code exists.
>
> In particular, I'm really interested in the stuff that happens in 
> FileDispatcher.c, functions Java_sun_nio_ch_FileDispatcher_init and 
> Java_sun_nio_ch_FileDispatcher_preClose0.  They're setting something 
> up that looks important, but I just don't get it.
In a multi-threaded application it is always difficult to know when you 
can safely close and release a file descriptor (or other resource). If 
one thread is using a file descriptor to read or write and another 
thread releases (closes) it then it is possible for the first thread to
read or write to the wrong file or socket in the event that the file 
descriptor is recycled quickly. The approach that we use in both classic 
networking and NIO is to use a two-step process. In the first step we 
duplicate (dup2) the file descriptor to another that is one end of a 
half shutdown socket pair. Other threads that are reading or writing but 
haven't called the read or write system calls yet will get an immediate 
EOF or pipe error when they do so. As the threads complete the read or 
write method then they examine their state. If there is a close pending 
then the last one releases the file descriptor.  Hopefully this brief 
overview gives you some idea what this code is about. The 
FileDispatcher#init method is where the socketpair is created, and that
preClose0 method does the dup2. I haven't been following the Soylatte 
port very closely so I'm curious what problem you are seeing - when you 
say "file locking" do you mean FileChannel#lock? If so then the issue 
may be that the asynchronous close mechanism isn't completely extended 
to FileChannel yet.
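
The bookkeeping half of this two-step close can be sketched in Java as follows. This is a minimal model under stated assumptions, not the JDK code: the class and method names are hypothetical, and the dup2 onto the half-shutdown socket is represented only by a comment, since it happens in native code.

```java
/** Hypothetical model of a two-step close: mark the close as pending, release when the last user leaves. */
class DeferredClose {
    private final Runnable release;   // stands in for the real close(fd)
    private int inFlight = 0;         // threads currently inside a read or write
    private boolean closePending = false;
    private boolean released = false;

    DeferredClose(Runnable release) {
        this.release = release;
    }

    /** Called before a read or write; returns false if a close has already been requested. */
    synchronized boolean begin() {
        if (closePending) {
            return false;
        }
        inFlight++;
        return true;
    }

    /** Called after a read or write returns; the last thread out performs the deferred release. */
    void end() {
        boolean doRelease;
        synchronized (this) {
            inFlight--;
            doRelease = closePending && inFlight == 0 && !released;
            if (doRelease) {
                released = true;
            }
        }
        if (doRelease) {
            release.run();
        }
    }

    /** Step one of the close. The native code would dup2 the half-shutdown socket over the fd here. */
    void close() {
        boolean doRelease;
        synchronized (this) {
            if (closePending) {
                return;
            }
            closePending = true;
            doRelease = inFlight == 0;
            if (doRelease) {
                released = true;
            }
        }
        if (doRelease) {
            release.run();
        }
    }
}
```

The point of the model is that the descriptor number is never handed back to the operating system while any thread might still use it, so it cannot be recycled underneath a reader or writer.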

-Alan.


From msa at allman.ms  Wed Jan 30 06:30:14 2008
From: msa at allman.ms (Michael Allman)
Date: Tue, 29 Jan 2008 22:30:14 -0800 (PST)
Subject: [PATCH] FileChannelImpl.c.Java_sun_nio_ch_FileChannelImpl_truncate0
Message-ID: <20080129220705.T49011@yvyyl.pfbsg.arg>

This must have been on somebody's plate for a long time.

Attached please find a patch to correct an apparently unreported bug.  At 
least, I couldn't find one.  The problem is that if a FileChannel is 
truncated and its position was previously set beyond the new length of the 
file, the position should be but isn't set to the new length of the file.
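
The expected behaviour can be checked with a small stand-alone program. This is only a sketch, not the Truncate regression test itself; the class and method names are made up, and it uses the java.nio.file API from later JDK releases for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical check of the channel position after truncating a file. */
class TruncatePositionCheck {

    /** Writes initialSize zero bytes, seeks to position, truncates to newSize, and returns the resulting position. */
    static long positionAfterTruncate(int initialSize, long position, long newSize) {
        try {
            Path tmp = Files.createTempFile("truncate", ".bin");
            try (FileChannel fc = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer data = ByteBuffer.allocate(initialSize);
                while (data.hasRemaining()) {
                    fc.write(data);
                }
                fc.position(position);
                fc.truncate(newSize);
                return fc.position();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Per the FileChannel.truncate contract, a position beyond the new size is set to that size.
        System.out.println(positionAfterTruncate(100, 80, 50));
    }
}
```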

Heads up.  I have kinda sorta tested this patch.  I run a Mac OS X Leopard 
system.  I have tested this patch on that system, as applied to the 
soylatte source code repository.  More info on soylatte here: 
http://landonf.bikemonkey.org/static/soylatte/.  The gist of it is that 
soylatte is a port of Sun's JDK 6 to Mac OS X.  My test procedure was as 
follows:

1.  Get jdk7/jdk/test/java/nio/channels/FileChannel/Truncate.java from the 
OpenJDK repository.

2.  Compile and run Truncate on soylatte 1.0.1 (which is based on Sun's 
JDK 6 something).  Test reports failure as such:

Exception in thread "main" java.lang.RuntimeException: Position greater 
than size
 	at Truncate.main(Truncate.java:68)

3.  Run Truncate on a patched version of soylatte (patch essentially 
identical to attached file).  Test completes normally without output.  I 
guess this means it passed.

I'm sending this in as a patch to OpenJDK and not soylatte because I know 
this is a problem on Solaris, too.  That is, I ran Truncate on jdk6u4 on 
solaris 11 and it failed.

Obviously, this is not the only way to fix this problem.  We could also do 
this with a patch to FileChannelImpl.java.  I'll let whoever's in charge 
here make that call.

So, I hope this is helpful.  I am ready and willing to respond to 
feedback.  I have tried to follow the guidelines in 
http://openjdk.java.net/contribute/.

Cheers,

Michael

(CCing Landon Fuller because he runs the Soylatte project.)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: truncate.patch
Type: text/x-diff
Size: 1406 bytes
Desc: 
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080129/56383b28/truncate.patch>

From Tim.Bell at Sun.COM  Wed Jan 30 19:44:20 2008
From: Tim.Bell at Sun.COM (Tim Bell)
Date: Wed, 30 Jan 2008 11:44:20 -0800
Subject: [PATCH]
	FileChannelImpl.c.Java_sun_nio_ch_FileChannelImpl_truncate0
In-Reply-To: <20080129220705.T49011@yvyyl.pfbsg.arg>
References: <20080129220705.T49011@yvyyl.pfbsg.arg>
Message-ID: <47A0D394.8010603@sun.com>

Michael Allman wrote:
> This must have been on somebody's plate for a long time.
> 
> Attached please find a patch

Thanks for sending your suggested fix our way.  I will do some additional searching to see if I 
can locate an existing Bug-ID for this issue.

I don't find your name on the SCA list:
   https://sca.dev.java.net/CA_signatories

Please sign the Sun Contributor's Agreement.  You will find the latest version of the SCA here:
   http://www.sun.com/software/opensource/sca.pdf

The FAQ about the SCA and its ramifications is here:
   http://www.sun.com/software/opensource/contributor_agreement.jsp

After reading and signing the agreement, fax it to +1-408-715-2540, or scan it and e-mail the 
result to sun_ca (at) sun.com.

If you have already done this, please contact me offline.  It may be sitting in our queue of 
incoming SCAs.

Thanks, and Best Regards-

Tim Bell


From msa at allman.ms  Wed Jan 30 20:14:11 2008
From: msa at allman.ms (Michael Allman)
Date: Wed, 30 Jan 2008 12:14:11 -0800 (PST)
Subject: purpose of FileDispatcher.preClose()
In-Reply-To: <47A060EB.8080200@sun.com>
References: <20080130011129.U28274@yvyyl.pfbsg.arg> <47A060EB.8080200@sun.com>
Message-ID: <20080130114658.S99697@yvyyl.pfbsg.arg>

On Wed, 30 Jan 2008, Alan Bateman wrote:

> Michael Allman wrote:
>> Hello,
>> 
>> Can someone with knowledge of such matters explain what 
>> FileDispatcher.preClose() is supposed to do on Solaris/Linux.  I mean, I 
>> see the code, but I don't understand why it exists or what problem it's 
>> supposed to avoid or something.
>> 
>> I ask because I'm trying to fix a file-locking problem on soylatte and it 
>> seems the solution to that problem is to remove this code (on that 
>> platform).  But before I charge ahead, I need a better understanding of why 
>> this code exists.
>> 
>> In particular, I'm really interested in the stuff that happens in 
>> FileDispatcher.c, functions Java_sun_nio_ch_FileDispatcher_init and 
>> Java_sun_nio_ch_FileDispatcher_preClose0.  They're setting something up 
>> that looks important, but I just don't get it.
> In a multi-threaded application it is always difficult to know when you can 
> safely close and release a file descriptor (or other resource). If one thread 
> is using a file descriptor to read or write and another thread releases 
> (closes) it then it is possible for the first thread to read or write to the 
> wrong file or socket in the event that the file descriptor is recycled 
> quickly. The approach that we use in both classic networking and NIO is to 
> use a two-step process. In the first step we duplicate (dup2) the file 
> descriptor to another that is one end of a half shutdown socket pair. Other 
> threads that are reading or writing but haven't called the read or write 
> system calls yet will get an immediate EOF or pipe error when they do so. As 
> the threads complete the read or write method then they examine their state. 
> If there is a close pending then the last one releases the file descriptor. 
> Hopefully this brief overview gives you some idea what this code is about. 
> The FileDispatcher#init method is where the socketpair is created, and that 
> preClose0 method does the dup2. I haven't been following the Soylatte port 
> very closely so I'm curious what problem you are seeing - when you say "file 
> locking" do you mean FileChannel#lock? If so then the issue may be that the 
> asynchronous close mechanism isn't completely extended to FileChannel yet.

I think I get it.  So let me explain the problem I'm seeing here.

If I close a file channel on which I have acquired (but not released) a 
file lock, I get an IOException: Bad file descriptor.  For example, the 
Lock regression test does this and fails (on soylatte).

I think the problem here is that FileChannelImpl.implCloseChannel() calls 
nd.preClose(fd) before the block that releases its file locks.  On 
non-windows, nd.preClose(fd) doesn't just "pre close" fd, it closes it. 
Then implCloseChannel() tries to release its file locks.  fd now points to 
a socket descriptor and on Solaris/Linux, such attempt seems to be 
harmless.  On Mac OS X, it complains with the EBADF error code.

It seems that the preClose semantics are not correctly handled by the 
FileChannelImpl.implCloseChannel() method.  On non-windows, it attempts to 
release file locks that no longer exist (because preClose() releases 
them).  It seems that the file lock release block should be moved into 
NativeDispatcher.preClose().  It will be run on Windows, but will not be 
run on non-Windows.  That seems correct to me, given that on non-Windows, 
preClose0 releases the file locks.

Obviously, this kind of change is much more than a soylatte patch.  It 
changes code that already works on Windows, Solaris, and Linux.  But if my 
analysis is correct, it looks like it's just a silent bug.

Thoughts?

Michael


From Alan.Bateman at Sun.COM  Wed Jan 30 20:59:13 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Wed, 30 Jan 2008 20:59:13 +0000
Subject: [PATCH]
	FileChannelImpl.c.Java_sun_nio_ch_FileChannelImpl_truncate0
In-Reply-To: <20080129220705.T49011@yvyyl.pfbsg.arg>
References: <20080129220705.T49011@yvyyl.pfbsg.arg>
Message-ID: <47A0E521.6070007@sun.com>

Michael Allman wrote:
> This must have been on somebody's plate for a long time.
>
> Attached please find a patch to correct an apparently unreported bug.  
> At least, I couldn't find one.  
I think this is the bug you are looking for:
    http://bugs.sun.com/view_bug.do?bug_id=6191269

It is fixed in jdk7/OpenJDK. If I understand correctly you are running 
jdk7/OpenJDK's regression tests on a jdk6 port. In that case you will 
likely see other failures because there are many fixes and updated tests 
in jdk7/OpenJDK that aren't in jdk6.

-Alan.


From Alan.Bateman at Sun.COM  Wed Jan 30 21:12:12 2008
From: Alan.Bateman at Sun.COM (Alan Bateman)
Date: Wed, 30 Jan 2008 21:12:12 +0000
Subject: purpose of FileDispatcher.preClose()
In-Reply-To: <20080130114658.S99697@yvyyl.pfbsg.arg>
References: <20080130011129.U28274@yvyyl.pfbsg.arg> <47A060EB.8080200@sun.com>
	<20080130114658.S99697@yvyyl.pfbsg.arg>
Message-ID: <47A0E82C.2020004@sun.com>

Michael Allman wrote:
> :
>
> I think the problem here is that FileChannelImpl.implCloseChannel() 
> calls nd.preClose(fd) before the block that releases its file locks.  
> On non-windows, nd.preClose(fd) doesn't just "pre close" fd, it closes 
> it. Then implCloseChannel() tries to release its file locks.  fd now 
> points to a socket descriptor and on Solaris/Linux, such attempt seems 
> to be harmless.  On Mac OS X, it complains with the EBADF error code.
Yes, this is a known issue but hasn't been a problem to date. I don't 
know Mac OS X well but if closing a file causes all advisory locks on 
the file to be removed then the simplest solution for your port is 
probably to just comment out the call to release0 that is called from 
the inner class in implCloseChannel. As you've found this will otherwise 
attempt the unlock on the dup'ed file descriptor and fail.

-Alan.


From eliasen at mindspring.com  Thu Jan 31 04:22:57 2008
From: eliasen at mindspring.com (Alan Eliasen)
Date: Wed, 30 Jan 2008 21:22:57 -0700
Subject: BigInteger performance improvements
Message-ID: <47A14D21.8020807@mindspring.com>


   I'm planning on tackling the performance issues in the BigInteger
class.  In short, inefficient algorithms are used for
multiplication, exponentiation, conversion to strings, etc.  I intend to
improve this by adding algorithms with better asymptotic behavior that
will work better for large numbers, while preserving the existing
algorithms for use with smaller numbers.

   This encompasses a lot of different bug reports:

4228681:  Some BigInteger operations are slow with very large numbers
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4228681

   (This was closed but never fixed.)


4837946: Implement Karatsuba multiplication algorithm in BigInteger
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4837946

    I've already done the work on this one.  My implementation is
intended to be easy to read, understand, and check.  It significantly
improves multiplication performance for large numbers.
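
For readers unfamiliar with the technique, a minimal sketch of Karatsuba multiplication on top of the public BigInteger API might look like the following. This is not the patch itself; the class name and the threshold are assumptions, and the threshold is untuned.

```java
import java.math.BigInteger;

/** Illustrative Karatsuba multiplication using only public BigInteger methods. */
class KaratsubaSketch {
    /** Below this size the built-in multiply is used (assumed value, not tuned). */
    private static final int THRESHOLD_BITS = 2048;

    static BigInteger multiply(BigInteger x, BigInteger y) {
        // Work on magnitudes and restore the sign at the end.
        if (x.signum() < 0 || y.signum() < 0) {
            BigInteger product = multiply(x.abs(), y.abs());
            return (x.signum() * y.signum() < 0) ? product.negate() : product;
        }
        int n = Math.max(x.bitLength(), y.bitLength());
        if (n <= THRESHOLD_BITS) {
            return x.multiply(y);
        }
        int half = n / 2;

        // Split each operand: x = xh * 2^half + xl, y = yh * 2^half + yl.
        BigInteger xh = x.shiftRight(half);
        BigInteger xl = x.subtract(xh.shiftLeft(half));
        BigInteger yh = y.shiftRight(half);
        BigInteger yl = y.subtract(yh.shiftLeft(half));

        // Three recursive products instead of four.
        BigInteger high = multiply(xh, yh);
        BigInteger low = multiply(xl, yl);
        BigInteger cross = multiply(xh.add(xl), yh.add(yl)).subtract(high).subtract(low);

        // x*y = high * 2^(2*half) + cross * 2^half + low
        return high.shiftLeft(2 * half).add(cross.shiftLeft(half)).add(low);
    }
}
```

A version inside BigInteger itself would work directly on the int[] magnitude and avoid the temporary objects created here.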


4646474: BigInteger.pow() algorithm slow in 1.4.0
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4646474

   This will be improved in a couple ways:

   * Rewrite pow() to use the above Karatsuba multiplication
   * Implement Karatsuba squaring
   * Find a good threshold for Karatsuba squaring
   * Rewrite pow() to use Karatsuba squaring
   * Add an optimization to use left-shifting for multiples of 2 in the
base.  This improves speed by thousands of times for things like
Mersenne numbers.
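
The left-shifting idea in the last item can be sketched as follows, assuming it means factoring the power of two out of the base first. The class name is hypothetical, and the existing pow() is reused for the odd part.

```java
import java.math.BigInteger;

/** Illustrative pow() that handles the power-of-two part of the base with a single shift. */
class ShiftPowSketch {

    static BigInteger pow(BigInteger base, int exponent) {
        if (exponent < 0) {
            throw new ArithmeticException("Negative exponent");
        }
        if (base.signum() == 0) {
            return exponent == 0 ? BigInteger.ONE : BigInteger.ZERO;
        }
        // base = odd * 2^twos, so base^exponent = odd^exponent * 2^(twos * exponent).
        int twos = base.getLowestSetBit();
        BigInteger odd = base.shiftRight(twos);

        long shift = (long) twos * exponent;   // long arithmetic avoids int overflow
        if (shift > Integer.MAX_VALUE) {
            throw new ArithmeticException("Result too large");
        }
        return odd.pow(exponent).shiftLeft((int) shift);
    }
}
```

When the base is an exact power of two, the odd part is 1 or -1, so the whole computation reduces to one shift.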


4641897: BigInteger.toString() algorithm slow for large numbers
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4641897

    This method uses a very inefficient algorithm for large numbers.
 I plan to replace it with a recursive divide-and-conquer algorithm
devised by Schoenhage and Strassen.  I have developed and tested this in
my own software.  This operates hundreds or thousands of times faster
than the current version for large numbers.  It will also benefit from
faster multiplication and exponentiation.
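
The divide-and-conquer idea can be sketched in Java as below. This is a simplified illustration for radix 10 only, with hypothetical names and an assumed cutoff; it recomputes powers of ten on every call, whereas a real implementation would presumably cache them.

```java
import java.math.BigInteger;

/** Illustrative recursive decimal conversion: split the number near the middle of its digits. */
class RecursiveToStringSketch {
    /** Numbers at most this many bits long are converted by the built-in toString (assumed cutoff). */
    private static final int SMALL_BITS = 256;

    static String toDecimalString(BigInteger n) {
        if (n.signum() < 0) {
            return "-" + toDecimalString(n.negate());
        }
        StringBuilder sb = new StringBuilder();
        append(n, sb, 0);
        return sb.toString();
    }

    /** Appends n (non-negative) in decimal, left-padded with zeros to at least minDigits characters. */
    private static void append(BigInteger n, StringBuilder sb, int minDigits) {
        if (n.bitLength() <= SMALL_BITS) {
            String s = n.toString();
            for (int i = s.length(); i < minDigits; i++) {
                sb.append('0');
            }
            sb.append(s);
            return;
        }
        // Choose k so that 10^k is roughly the square root of n (log10(2) is about 0.30103).
        int k = (int) (n.bitLength() * 0.30103 / 2);
        BigInteger[] qr = n.divideAndRemainder(BigInteger.TEN.pow(k));

        append(qr[0], sb, minDigits - k);   // high part, padded only if the caller requires it
        append(qr[1], sb, k);               // low part is always exactly k digits
    }
}
```

Each level halves the operand size, so the cost is dominated by the large divisions near the top; the gain over a digit-at-a-time loop therefore depends on multiplication and division that are themselves faster than quadratic.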


   In the future, we should also add multiplication routines that are
even more efficient for very large numbers, such as Toom-Cook
multiplication, which is more efficient than Karatsuba multiplication
for even larger numbers.

   Has anyone else worked on these?  Is this the right group?

   I will probably submit the Karatsuba multiplication patch soon.
Would it be preferable to implement *all* of these parts first and
submit one large patch?

-- 
  Alan Eliasen              |  "Furious activity is no substitute
  eliasen at mindspring.com    |    for understanding."
  http://futureboy.us/      |           --H.H. Williams