<i18n dev> java.util.Formatter precision and surrogate pairs

Jason Mehrens jason_mehrens at hotmail.com
Fri Feb 12 17:24:35 UTC 2021


Hello i18n-dev,

In the documentation for java.util.Formatter, precision is allowed for 's' general conversions but not 'c' conversions.  What is the rationale for this requirement?

When the argument contains surrogate pairs, the general conversion with a precision will tear them, which seems correct given the type of conversion.
However, if you want to limit the precision and not tear surrogate pairs, there is no out-of-the-box way to do this.
My understanding is that you have to implement a Formattable, which is not ideal.
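
For reference, a minimal sketch of the Formattable workaround mentioned above might look like the following. The class name CodePointLimited is hypothetical (it is not part of the JDK), and the sketch only honors the precision; width and flags are ignored for brevity.

```java
import java.util.Formattable;
import java.util.Formatter;

// Hypothetical wrapper: applies the %s precision as a code point limit
// rather than a char (UTF-16 code unit) limit.
final class CodePointLimited implements Formattable {
    private final String value;

    CodePointLimited(String value) {
        this.value = String.valueOf(value);
    }

    @Override
    public void formatTo(Formatter formatter, int flags, int width, int precision) {
        String s = value;
        // A precision of -1 means that none was specified.
        if (precision >= 0 && s.codePointCount(0, s.length()) > precision) {
            // offsetByCodePoints returns the char index that follows
            // 'precision' whole code points, so no pair is split.
            s = s.substring(0, s.offsetByCodePoints(0, precision));
        }
        // Width and flags are deliberately ignored in this sketch.
        formatter.format("%s", s);
    }

    public static void main(String[] args) {
        String nb = "\ud83c\udf09\ud83c\udf09";
        String out = String.format("%.1s", new CodePointLimited(nb));
        System.err.println("len=" + out.length()
            + " count=" + out.codePointCount(0, out.length()));
    }
}
```

Every call site has to wrap its argument this way, which is why this is not ideal for something like a logging format string.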

Reading the docs it looks like 'c' deals with Unicode but doesn't allow setting a precision (substring).
Would it make sense to allow precision for character conversions and enforce a precision limit based on the code point count and not the length?  
I think it would be helpful for java.util.logging.SimpleFormatter and the like, where a user wants to limit the maximum line size but not see garbage characters in the logs.
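
To make the proposed semantics concrete, here is a rough sketch of a limit counted in code points, next to the char-based substring that tears a pair. The class and method names (CodePointTruncate, truncate) are made up for illustration and do not reflect how Formatter is implemented.

```java
// Hypothetical helper illustrating a code-point-based precision limit.
final class CodePointTruncate {

    // Returns at most maxCodePoints code points from the start of s.
    static String truncate(String s, int maxCodePoints) {
        if (maxCodePoints < 0) {
            throw new IllegalArgumentException("maxCodePoints < 0");
        }
        if (s.codePointCount(0, s.length()) <= maxCodePoints) {
            return s;
        }
        return s.substring(0, s.offsetByCodePoints(0, maxCodePoints));
    }

    public static void main(String[] args) {
        String s = "a\ud83c\udf09b"; // 4 chars, 3 code points

        // Limit counted in chars: the cut lands inside the surrogate pair.
        String byChars = s.substring(0, 2);
        // Limit counted in code points: the pair stays intact.
        String byCodePoints = truncate(s, 2);

        System.err.println("chars: endsWithLoneHighSurrogate="
            + Character.isHighSurrogate(byChars.charAt(byChars.length() - 1)));
        System.err.println("code points: len=" + byCodePoints.length()
            + " count=" + byCodePoints.codePointCount(0, byCodePoints.length()));
    }
}
```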

Thanks,

Jason

====
public class FormatterTester {

	public static void main(String[] args) {
		String nb = "\ud83c\udf09\ud83c\udf09";
		// 's', 'S' (general): if arg implements Formattable,
		// then arg.formatTo is invoked.
		// Otherwise, the result is obtained by invoking arg.toString().
		println("s", nb); 
		
		// 'c', 'C' (character): the result is a Unicode character.
		// For general argument types, the precision is the
		// maximum number of characters to be written to the output.
		// For character conversions, if a precision is provided,
		// an exception will be thrown.
		println("c", nb); 
	}
	
	private static void println(String conversion, String arg) {
		System.err.println("len=" + arg.length() 
			+ " count=" + arg.codePointCount(0, arg.length()));
		for (int i = 1, p = arg.length() + 4; i < p; ++i) {
			try {
				System.err.println(String.format("%1$."+ i + conversion, arg));
			} catch (IllegalArgumentException iae) {
				System.err.println("p=" + i + ' ' + iae);
			}
		}
	}
}
====

