Null-terminated Unicode strings in java.io on Windows
Robert Lougher
rob.lougher at gmail.com
Fri Jan 25 17:55:34 UTC 2008
Hi Chris,
On 1/25/08, Krzysztof Żelechowski <program.spe at home.pl> wrote:
>
> On Fri, 25-01-2008 at 17:30 +0000, Robert Lougher wrote:
> > No, it doesn't. An implementation would have to be truly stupid to
> > internally null-terminate. How many Strings are in the heap? How
> > many will the programmer access via GetStringChars? The null would be
> > an overhead on all Strings for the benefit of a minuscule percentage.
>
> Please observe:
>
> 1. The amount of memory needed to manage the allocation
> is greater than the number of bytes
> needed to store one additional character,
> so the relative impact on memory usage will not be dramatic.
>
> 2. A string usually has many more characters than one.
> That means, if strings take 10 characters on average,
> the overhead is 10% in the impossible worst case, as explained below.
> This is an overhead I (and most programmers) can live with.
>
> 3. Memory is allocated in chunks.
> The size and alignment of the chunk are subject to various limitations.
> If the characters of the string do not fill the chunk entirely,
> there is a good chance
> that there will be space for the terminating zero anyway.
>
Yes, you're absolutely right. However, consider for the sake of
argument a memory manager that aligns on 4-byte boundaries. Suppose we
have 4 strings: the first is 1 byte long, the second 2 bytes, and so
on up to 4 bytes. The first three strings absorb the null in their
alignment padding. The fourth, however, requires an extra 4 bytes
because of that same alignment. So we have a 4-byte overhead for 4
strings, or 1 byte per string on average.
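
To make the arithmetic concrete, here is a minimal Java sketch. It
assumes a hypothetical allocator that does nothing more than round each
string body up to a multiple of 4 bytes, and a 1-byte terminator as in
the example above. The class and method names are illustrative only and
do not come from any VM.

```java
// Hypothetical model: string bodies are rounded up to a multiple of `align` bytes.
final class AlignmentOverhead {
    // Round size up to the next multiple of align (align must be > 0).
    static int alignedSize(int size, int align) {
        return (size + align - 1) / align * align;
    }

    // Extra bytes actually allocated when one terminator byte is appended.
    static int terminatorCost(int length, int align) {
        return alignedSize(length + 1, align) - alignedSize(length, align);
    }

    public static void main(String[] args) {
        int align = 4;
        int total = 0;
        // Strings of 1, 2, 3 and 4 bytes, as in the example.
        for (int length = 1; length <= 4; length++) {
            int cost = terminatorCost(length, align);
            total += cost;
            System.out.println(length + " byte(s): +" + cost);
        }
        System.out.println("total: " + total + " bytes over 4 strings");
    }
}
```

Under these assumptions only the 4-byte string pays for the terminator,
and it pays a full 4 bytes, which is where the average of 1 byte per
string comes from.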
Rob.
> Yours truly,
> Chris
>
>