RFR: 8352316: More MergeStoreBench

Shaojin Wen swen at openjdk.org
Thu Mar 27 05:20:10 UTC 2025


On Wed, 26 Mar 2025 01:57:23 GMT, Shaojin Wen <swen at openjdk.org> wrote:

>> Added benchmarks for String.getBytes, String.getChars, StringBuilder.append, and System.arraycopy with constant arguments, to verify whether MergeStore applies in these cases.
>
> I'm a developer of fastjson2. According to third-party benchmarks from https://github.com/fabienrenaud/java-json-benchmark, our library demonstrates the best performance. I would like to contribute some of these optimization techniques to OpenJDK, ideally by having C2 (the JIT compiler) directly support them.
> 
> Below is an example related to this PR. We have a JavaBean that needs to be serialized to a JSON string:
> 
> 
> * JavaBean
> 
> class Bean {
> 	public int value;
> }
> 
> 
> * Target JSON Output
> 
> {"value":123}
> 
> 
> * CodeGen-Generated JSONSerializer
> fastjson2 uses ASM to generate a serializer class like the following. The methods writeNameValue0, writeNameValue1, and writeNameValue2 are candidate implementations. Among them, writeNameValue2 is the fastest when the field name length is 8, as it leverages UNSAFE.putLong for direct memory operations:
> 
> 
> class BeanJSONSerializer {
> 	private static final String name = "\"value\":";
> 	private static final byte[] nameBytes = name.getBytes();
> 	private static final long nameLong = UNSAFE.getLong(nameBytes, ARRAY_BYTE_BASE_OFFSET);
> 
> 	int writeNameValue0(byte[] bytes, int off, int value) {
> 		name.getBytes(0, 8, bytes, off);
> 		off += 8;
> 		return writeInt32(bytes, off, value);
> 	}
> 
> 	int writeNameValue1(byte[] bytes, int off, int value) {
> 		System.arraycopy(nameBytes, 0, bytes, off, 8);
> 		off += 8;
> 		return writeInt32(bytes, off, value);
> 	}
> 
> 
> 	int writeNameValue2(byte[] bytes, int off, int value) {
> 		UNSAFE.putLong(bytes, ARRAY_BYTE_BASE_OFFSET + off, nameLong);
> 		off += 8;
> 		return writeInt32(bytes, off, value);
> 	}
> }
> 
> 
> We propose that the C2 compiler could optimize cases where the field name length is 4 or 8 bytes by automatically using direct memory operations similar to writeNameValue2. This would eliminate the need for manual unsafe operations in user code and improve serialization performance for common patterns.

> @wenshao Do you have any insight from this benchmark? What was your motivation for it?
> 
> I also wonder if an IR test for some of the cases would be helpful. IR tests give us more info about what the compiler produced, and if there is a change in VM behaviour the IR test catches it in regular testing. Benchmarks are not run regularly, and regressions would therefore not be caught.

I submitted this benchmark to show that constant-length copies currently written with System.arraycopy or String.getBytes run faster when replaced by Unsafe.putInt/putLong. I hope C2 can perform this optimization automatically.
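
To make the comparison concrete without Unsafe, here is a minimal standalone sketch of the three shapes for the 8-byte name `"value":`. The class name NameWriteSketch and its method names are hypothetical and are not taken from the benchmark in this PR. The sketch swaps UNSAFE.putLong for a byte-array view VarHandle, which is a supported API with the same intent. It only demonstrates that the three variants write identical bytes; whether C2 actually merges the stores in the first two variants depends on the JDK build and is not verified here.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch class; names are illustrative, not from MergeStoreBench.
class NameWriteSketch {
    // The 8-byte field-name prefix from the example above: "value":
    static final byte[] NAME_BYTES = "\"value\":".getBytes(StandardCharsets.ISO_8859_1);

    // Views a byte[] as longs in native byte order (stand-in for UNSAFE.getLong/putLong).
    static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());

    // The eight name bytes packed into one long, read once at class initialization.
    static final long NAME_LONG = (long) LONG_VIEW.get(NAME_BYTES, 0);

    // Variant 1: constant-length System.arraycopy from a constant source array.
    static int writeNameArraycopy(byte[] bytes, int off) {
        System.arraycopy(NAME_BYTES, 0, bytes, off, 8);
        return off + 8;
    }

    // Variant 2: eight adjacent constant byte stores, the pattern that a
    // store-merging optimization would ideally turn into one 8-byte store.
    static int writeNameByteStores(byte[] bytes, int off) {
        bytes[off]     = '"';
        bytes[off + 1] = 'v';
        bytes[off + 2] = 'a';
        bytes[off + 3] = 'l';
        bytes[off + 4] = 'u';
        bytes[off + 5] = 'e';
        bytes[off + 6] = '"';
        bytes[off + 7] = ':';
        return off + 8;
    }

    // Variant 3: a single explicit 8-byte store, analogous to writeNameValue2.
    static int writeNameLong(byte[] bytes, int off) {
        LONG_VIEW.set(bytes, off, NAME_LONG);
        return off + 8;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        int end = writeNameLong(buf, 0);
        System.out.println(new String(buf, 0, end, StandardCharsets.ISO_8859_1));
    }
}
```

The same long is read and written through the same VarHandle, so the result does not depend on the platform's byte order. Plain get/set on a byte-array view VarHandle also permit unaligned offsets, which matters here because off is arbitrary.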

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24108#issuecomment-2756725833


More information about the hotspot-compiler-dev mailing list