RFR - JDK-8210717 String::detab, String::entab (Code Review)
Dan Smith
daniel.smith at oracle.com
Fri Nov 30 19:50:50 UTC 2018
> On Sep 18, 2018, at 11:52 AM, Jim Laskey <james.laskey at oracle.com> wrote:
>
> Please review the code for String::detab and String::entab. Used to expand tabs into spaces, and spaces back to tabs.
>
> webrev: http://cr.openjdk.java.net/~jlaskey/8210717/webrev/index.html
> jbs: https://bugs.openjdk.java.net/browse/JDK-8210717
> csr: https://bugs.openjdk.java.net/browse/JDK-8210718 <https://bugs.openjdk.java.net/browse/JDK-8210718>
For detab: here's an attempt at a brief formal description of the behavior, if it's useful.
"Replace each tab code point (U+0009) in the string with one or more space code points (U+0020). The number of spaces that replace a tab varies between 1 and {@code tabWidth}, whatever is necessary so that the last replacement space is the <em>nth</em> code point of the {@link #lines() line}, where {@code n % tabWidth == 0}."
---
For entab: I noticed that something like the following doesn't behave how you'd probably expect.
" int x = 23;".entab(4)
--> "\tint\tx =\t23;"
For other strings, the intent is ambiguous (are the first and last name two fields, or only one?):
"Dan Smith daniel.smith".entab(4)
--> "Dan\tSmith\tdaniel.smith"
Seems like there are multiple possible approaches:
- Entab all spaces that align with a tab stop (current behavior)
- Only entab sequences of 2 or more spaces (implies 'tabWidth >= 2')
- Only entab prefix spaces
We can support all of these via different overloads or parameters, or we can choose the best one for most use cases. But I don't know that we really have pressing use cases—detab is the one that's easy to envision using. It might be best to just drop the method unless/until compelling use cases arise.
More information about the core-libs-dev
mailing list