Text Block / String manipulation with Constant Functions.

Aaron Scott-Boddendijk talden at gmail.com
Wed Aug 14 01:21:31 UTC 2019


Quite a number of discussions (on and off these lists) about the proposed
Text Blocks and earlier proposals run into the combinatorial problem of
supporting all the ways in which the string should be converted from its
in-source representation for differing uses.

I wonder if we could take a page from a feature I've found useful in Rust
('const functions') and in C++ (constexpr functions).

For Java this would mean annotating methods and providing compiler support
such that, if the compiler finds that a call acts entirely on constant
inputs, the call is guaranteed to have zero runtime cost: the result is
computed during compilation and only the result is stored in the output.

Eg
|     var s = "   test   ".trim();

If trim() were marked as a 'const function' then the actual String in the
class file would be "test".
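
For comparison, here is a minimal sketch of what javac does today. The class
and field names are made up for illustration. Concatenation of literals is
already a constant expression and is folded into one string in the class
file, whereas a call such as trim() is not a constant expression and is
executed at runtime.

```java
// Illustrative only; ConstantFoldingDemo and its members are hypothetical names.
final class ConstantFoldingDemo {
    // A constant expression: javac stores the single string "test" in the class file.
    static final String FOLDED = "te" + "st";

    // Not a constant expression today: the literal "   test   " is stored in the
    // class file and trim() is executed when the class is initialized.
    static final String TRIMMED = "   test   ".trim();

    public static void main(String[] args) {
        // Folded constants are interned, so FOLDED is the same object as the literal.
        System.out.println(FOLDED == "test");
        // The trimmed value has the same characters as the literal.
        System.out.println(TRIMMED.equals("test"));
    }
}
```

Under the proposal, TRIMMED would be treated like FOLDED: only "test" would
reach the class file.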

For String there are quite a few methods that this could be useful for,
that play well with Text Blocks and other String literals - indent, length,
replace, strip, trim and valueOf methods spring to mind. Adding appendLine,
trimLeft, trimRight might prove useful too.

Eg
| var s = """
|     This
|         is
|     a
|     test
| """.appendLine().indent(2).replace("\n", "\r\n");
|
| s.equals("  This\r\n      is\r\n  a\r\n  test\r\n");

This allows the Text Block to consider only embedded indentation and to
avoid the trailing line-separator before the terminating """, which I
believe is usually superfluous. All of the transformations are guaranteed
to be applied by the compiler, so there's no runtime cost (beyond final
storage space in the class file, which might actually be less).
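
As a point of reference, a similar chain can be sketched with text blocks as
they later shipped (Java 15) and String.indent (Java 12). This is only an
approximation of the example above: appendLine() is a proposed method that
does not exist, and here String.indent itself terminates every line with
"\n", so no separate call is needed. The whole chain runs at runtime, which
is exactly the cost the proposal would remove. The class name is made up.

```java
// Illustrative only; TextBlockChainDemo is a hypothetical name. Requires Java 15+.
final class TextBlockChainDemo {
    static String build() {
        // The closing delimiter sits on the last content line, so the block
        // itself has no trailing newline: "This\n    is\na\ntest".
        String block = """
            This
                is
            a
            test""";
        // indent(2) prefixes each line with two spaces, normalizes line
        // terminators to "\n" and ends every line (including the last) with "\n".
        return block.indent(2).replace("\n", "\r\n");
    }

    public static void main(String[] args) {
        System.out.println(build().equals("  This\r\n      is\r\n  a\r\n  test\r\n"));
    }
}
```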

This encodes API detail into the language level to a greater degree than
before, since a method may be a constant function at language level N but
not at N-1 (though this just moves the cost to runtime). This might be seen
as undesirable and potentially surprising.

It's not clear whether there are reasonable situations in which the constant
form might not be desired (e.g. bloating the class file with
"".indent(1_000).replace(" ", "\n").indent(1_000_000)). Would guaranteeing
execution at class-load be acceptable instead? Sometimes, perhaps, but
that's just a different part of runtime.
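
To make the size concern concrete, here is a rough sketch (names are
hypothetical, and the expression is a stand-in chosen for illustration) of
how a very short expression can expand into a result far larger than a
single string constant in the class file can hold. A constant-pool string
entry has a 16-bit length field, so it is limited to 65,535 bytes.

```java
// Illustrative only; FoldedSizeDemo and its members are hypothetical names.
final class FoldedSizeDemo {
    // A CONSTANT_Utf8 entry in the class file has a 16-bit length field.
    static final int MAX_CONSTANT_BYTES = 65_535;

    // A short expression with a large result:
    // 1,000 lines, each prefixed with 1,000 spaces by indent().
    static String expand() {
        return "x\n".repeat(1_000).indent(1_000);
    }

    // Approximation: the class file uses modified UTF-8, which matches
    // standard UTF-8 for the plain ASCII text used here.
    static boolean fitsInOneConstant(String s) {
        int bytes = s.getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
        return bytes <= MAX_CONSTANT_BYTES;
    }

    public static void main(String[] args) {
        String expanded = expand();
        System.out.println("characters: " + expanded.length());
        System.out.println("fits in one constant: " + fitsInOneConstant(expanded));
    }
}
```

A compiler that folded such a call would need some rule for results like
this, for example falling back to runtime evaluation or reporting an error.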

Obviously we still want Text Blocks (and any eventual 'raw string' or
'interpolated string' features) to be useful without the verbosity of the
trailing method calls in most cases, but adding this would give developers
control without requiring the grammar to become a Swiss Army knife.

This feature could similarly be applied to other classes (some
java.lang.Math methods for example, just extending how some constant
expressions are compile-time evaluated).
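
A small sketch of where the line sits today (names are made up for
illustration): arithmetic on literals is a constant expression and can be
used as a switch label, while a java.lang.Math call with the same constant
inputs cannot, because method invocations are never constant expressions.

```java
// Illustrative only; ConstantExpressionDemo and its members are hypothetical names.
final class ConstantExpressionDemo {
    // Folded by javac: the class file holds the value 86400.
    static final int SECONDS_PER_DAY = 60 * 60 * 24;

    static String describe(int seconds) {
        switch (seconds) {
            case SECONDS_PER_DAY:
                return "one day";
            case 60 * 60:
                return "one hour";
            // case Math.multiplyExact(60, 60): is rejected by javac today,
            // because a method invocation is not a constant expression.
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(86_400));
        System.out.println(describe(3_600));
    }
}
```

With constant functions, the commented-out label could become legal for
methods marked as such.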

--
Aaron Scott-Boddendijk


More information about the amber-dev mailing list