Null-terminated Unicode strings in java.io on Windows

Robert Lougher rob.lougher at gmail.com
Fri Jan 25 17:55:34 UTC 2008


Hi Chris,

On 1/25/08, Krzysztof Żelechowski <program.spe at home.pl> wrote:
>
> On Fri, 25-01-2008 at 17:30 +0000, Robert Lougher wrote:
> > No it doesn't.  An implementation would have to be truly stupid to
> > internally null-terminate.  How many Strings are in the heap?  How
> > many will the programmer access via GetStringChars?  The null will be
> > an overhead on every String for the sake of a minuscule percentage.
>
> Please observe:
>
> 1. The amount of memory needed to manage the allocation
> is greater than the number of bytes
> needed to store one additional character,
> so the relative impact on memory usage will not be dramatic.
>
> 2. A string usually has many more characters than one.
> That means that if strings average 10 characters,
> the overhead is 10% even in the impossible worst case, as explained below.
> This is an overhead I (and most programmers) can live with.
>
> 3. Memory is allocated in chunks.
> The size and alignment of a chunk are subject to various limitations.
> If the characters of the string do not fill the chunk entirely,
> there is a good chance
> that there will be space for the terminating zero anyway.
>

Yes, you're absolutely right.  However, for the sake of argument,
consider a memory manager that aligns allocations on 4-byte
boundaries.  Suppose we have 4 strings: the first is 1 byte long, the
second 2 bytes, and so on.  The first three strings will absorb the
null in their alignment padding.  The fourth, however, will require an
extra 4 bytes because of the same alignment.  So we have a 4-byte
overhead for 4 strings, or 1 byte per string on average.
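
The arithmetic can be checked with a minimal sketch.  The class and
method names are made up for illustration, and the 4-byte alignment
and one-byte terminator are only the figures assumed for the sake of
argument; a real JVM allocator and its 16-bit chars will give
different numbers.

```java
class TerminatorOverhead {
    // Round size up to the next multiple of align.
    // Assumption: align is a power of two.
    static int alignUp(int size, int align) {
        return (size + align - 1) & ~(align - 1);
    }

    // Extra bytes allocated when one terminator byte is appended
    // to a string of the given length (hypothetical 1-byte units).
    static int overhead(int length, int align) {
        return alignUp(length + 1, align) - alignUp(length, align);
    }

    public static void main(String[] args) {
        int align = 4;  // assumed allocator alignment
        int total = 0;
        for (int length = 1; length <= 4; length++) {
            int extra = overhead(length, align);
            total += extra;
            System.out.println(length + " byte string: +" + extra + " bytes");
        }
        System.out.println("total: " + total + " bytes for 4 strings");
    }
}
```

Lengths 1 to 3 cost nothing extra because the terminator fits in the
padding; length 4 spills into the next 4-byte unit.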

Rob.



> Yours truly,
> Chris
>
>


More information about the core-libs-dev mailing list