[OpenJDK 2D-Dev] Printing to Postscript doesn't support dieresis

Wed Dec 10 23:36:28 UTC 2014

> the character is encoded as <c3a4> (which is correct imho)
> but then mapped to ISOLatin1Encoding.

\u00e4 (a-umlaut) encoded as ISO 8859-1 should just be the single byte "e4".
What you have above is UTF-8, whereas the PS printing path is
definitely expecting 8859-1. I looked and found that when I reviewed this change
I commented it probably should be 8859-1 but didn't make a sufficient point of it :-(
I thought that since we returned latin1 for the charset name we'd get the right encoding
but apparently not, and I imagine what testing was done either didn't cover this range
or the bug was overlooked.
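
To make the difference concrete, here is a minimal standalone sketch (not part of
the JDK sources; the class name UmlautBytes and the hex helper are made up for
illustration) that prints the bytes produced for \u00e4 under each charset:

```java
import java.nio.charset.StandardCharsets;

class UmlautBytes {
    // Hypothetical helper: render a byte array as lowercase hex, e.g. "c3a4".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u00e4"; // a-umlaut

        // UTF-8 needs two bytes for this character: c3a4
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));

        // ISO 8859-1 needs a single byte: e4
        System.out.println(hex(s.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

With ISOLatin1Encoding in effect, a PostScript interpreter reads <c3a4> as two
separate one-byte character codes, which is why the UTF-8 form does not print
as a single umlaut.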

The following is the quick fix I think we need since I think printing and ONLY printing
ever uses this code when we are using fontconfig :-

diff --git a/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java b/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java
--- a/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java
+++ b/src/java.desktop/unix/classes/sun/font/FcFontConfiguration.java
@@ -180,7 +180,7 @@
          String[] componentFaceNames = cfi[idx].getComponentFaceNames();
          FontDescriptor[] ret = new FontDescriptor[componentFaceNames.length];
          for (int i = 0; i < componentFaceNames.length; i++) {
-            ret[i] = new FontDescriptor(componentFaceNames[i], StandardCharsets.UTF_8.newEncoder(), new int[0]);
+            ret[i] = new FontDescriptor(componentFaceNames[i], StandardCharsets.ISO_8859_1.newEncoder(), new int[0]);
          }
  
          return ret;
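
As a rough illustration of what swapping the encoder changes, the sketch below
compares the two encoders in isolation. It does not go through FontDescriptor or
the printing code at all; the class and method names are hypothetical, and it
only uses the public java.nio.charset API:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class Latin1EncoderSketch {
    // Hypothetical helper: number of bytes the encoder emits for the string.
    static int encodedLength(CharsetEncoder enc, String s) {
        try {
            ByteBuffer out = enc.encode(CharBuffer.wrap(s));
            return out.remaining();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("cannot encode: " + s, e);
        }
    }

    public static void main(String[] args) {
        CharsetEncoder utf8 = StandardCharsets.UTF_8.newEncoder();
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();

        // a-umlaut: two bytes with UTF-8, one byte with ISO 8859-1
        System.out.println(encodedLength(utf8, "\u00e4"));
        System.out.println(encodedLength(latin1, "\u00e4"));

        // Characters outside Latin-1 (here the euro sign) cannot be
        // encoded by the ISO 8859-1 encoder at all.
        System.out.println(utf8.canEncode('\u20ac'));
        System.out.println(latin1.canEncode('\u20ac'));
    }
}
```

The last check is the trade-off to keep in mind: an ISO 8859-1 encoder matches
what the PS path expects, but it reports that it cannot encode anything outside
that range.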

-phil.



On 11/07/2014 08:36 AM, Mario Torre wrote:
> Hi all,
>
> I've been working on a strange issue recently; it seems to affect all
> recent versions of OpenJDK as well as Oracle JDK.
>
> The issue appears to be related to this change:
>
> http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/rev/fbe9320339ea
>
> The issue as I could find by debugging OpenJDK is a mix of a couple of
> things.
>
> This change was addressing postscript size explosion, where missing font
> descriptors in versions prior to this fix were causing characters to be
> rendered as paths.
>
> The new code creates an actual descriptor array, so fonts can be
> rendered directly by postscript. However, it seems that the postscript
> code assumes ISO_8859_1 encoding, so if I pass some characters with,
> say, umlaut, like 'ä', instead of creating a path the character is
> encoded as <c3a4> (which is correct imho) but then mapped to
> ISOLatin1Encoding.
>
> This is a snippet of the generated postscript file, the file is
> generated using a modified version of the PrintSE.java test in OpenJDK:
>
> http://cr.openjdk.java.net/~neugens/psDieresisBug/PrintSEUmlauts.java
>
> /ISOF {
>       dup findfont dup length 1 add dict begin {
>               1 index /FID eq {pop pop} {D} ifelse
>       } forall /Encoding ISOLatin1Encoding D
>       currentdict end definefont
> } BD
> /NZ {dup 1 lt {pop 1} if} BD
> /S {
>       moveto 1 index stringwidth pop NZ sub
>       1 index length 1 sub NZ div 0
>       3 2 roll ashow newpath} BD
> 12.0 12 F
> <c3a4> 7.44 100.0 100.0 S
> pgSave restore
>
> I'm not really confident with Postscript at this level, so I would like
> some hints of where to look for an actual fix.
>
> I have a workaround that seems to work, something like:
>
> GlyphVector gv = font.createGlyphVector(frc, "ä");
> g2d.drawGlyphVector(gv, 250, 220);
>
> which basically forces the glyph path again. And of course I could
> revert the original change, but in either case it doesn't seem correct.
>
> My guess is that we should either somehow force ISO_8859_1 when calling
> CharsetString[] makeMultiCharsetString from PSPrinterJob, or have a
> proper fix for the Postscript file.
>
> Any idea or hint is very much appreciated.
>
> Cheers,
> Mario
>
>