[OpenJDK 2D-Dev] Rendering images from PDF files slower in OpenJDK

Laurent Bourgès bourges.laurent at gmail.com
Fri Oct 5 06:45:19 UTC 2018


Phil,

Le ven. 5 oct. 2018 à 08:06, Philip Race <philip.race at oracle.com> a écrit :

> Yep .. I vaguely remembered that we had such a report but would also have
> had to hunt to locate it.
>
> So we have the probable reason but not the solution.
>
> FWIW I think it would be a fun project for a LCMS developer ...
>

I looks like a call to let me have a look. Maybe I could inspect what makes
LCMS slow (lerp?) ... and optimize the C code or at least tune gcc options.
I would not make MT in lcms as it can affect MT in java.
Maybe PDFImage could convert tiles instead of the all image ?

Anyway I will publish my lknux perf report showing time spent in kcms /
lcms asap.

PS: I tested OpenJDK8 and perf is on par with OpenJDK9.

Laurent



> On 10/4/18, 10:47 PM, Daniel Persson wrote:
>
> Hi Phil.
>
> Well it seems like you've been in this discussion before
>
> https://bugs.openjdk.java.net/browse/JDK-8041125
>
> Wasn't aware that PDFBox PDF2Image used the Kcms Provider per default.
> You may close this issue as we have figured out the reason.
>
> Best regards
> Daniel
>
> On Fri, Oct 5, 2018 at 7:27 AM Philip Race <philip.race at oracle.com> wrote:
>
>>
>>
>> On 10/4/18, 10:22 PM, Daniel Persson wrote:
>>
>> Hi Laurent
>>
>> Well that seems like a reasonable assumption.
>>
>> https://github.com/kalaspuffar/ColorConvTest/blob/master/KCMSTest.md
>>
>> The test with a "blank" image has a 1 seconds difference.
>>
>> And the test with an image from the PDF in question have a 52 seconds
>> difference.
>>
>>
>> I tried playing with different image data but I didn't see a sensitivity
>> to that.
>> Maybe I needed to try something more complex.
>>
>>
>> So why don't OpenJDK 9 and forward have KcmsServiceProvider bundled?
>> Does this provider make a worse result on the image?
>>
>> It is not open source. It cannot be part of OpenJDK. Ever.
>> And see my other email for the other reasons.
>> So there is no quick or easy solution.
>>
>> FWIW the #1 reason I left KCMS in Oracle 8 and even 9 was because of the
>> MT performance
>> issue, but as we now converge Oracle JDK & OpenJDK that was a non-starter
>> and it was
>> removed along with other closed source components.
>>
>> -phil.
>>
>> Best regards
>> Daniel
>>
>>
>>
>>
>> On Fri, Oct 5, 2018 at 6:55 AM Laurent Bourgès <bourges.laurent at gmail.com>
>> wrote:
>>
>>> Phil,
>>> I just gg a bit and got the PDFImage source:
>>>
>>> public static void main( String[] args ) throws IOException
>>> 79     {
>>> 80         try
>>> 81         {
>>> 82             // force KCMS (faster than LCMS) if available
>>> 83             Class.forName("sun.java2d.cmm.kcms.KcmsServiceProvider");
>>> 84             System.setProperty("sun.java2d.cmm",
>>> "sun.java2d.cmm.kcms.KcmsServiceProvider");
>>> 85         }
>>> 86         catch (ClassNotFoundException e)
>>> 87         {
>>> 88             LOG.debug("KCMS service not found - using LCMS", e);
>>> 89         }
>>> 90
>>>
>>>
>>> https://svn.apache.org/viewvc/pdfbox/trunk/tools/src/main/java/org/apache/pdfbox/tools/PDFToImage.java?revision=1829374&view=markup
>>>
>>> That's all folks !
>>>
>>> Le ven. 5 oct. 2018 à 01:00, Philip Race <philip.race at oracle.com> a
>>> écrit :
>>>
>>>> Yep. LCMS is the default in 8u.
>>>>
>>>> And although KCMS is a lot faster  on my CConv test ...
>>>>
>>>> ~/jdk8u181/bin/java CConv
>>>> 13289
>>>>
>>>>  ~/jdk8u181/bin/java
>>>> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider CConv
>>>> 5131
>>>>
>>>>
>>>> It makes no difference on the pdf conversion :
>>>>
>>>> ~/jdk8u181/bin/java -jar pdfbox-app-2.0.11.jar PDFToImage  -time
>>>> test.pdf Rendered 1 page in 4985ms
>>>>
>>>> ~/jdk8u181/bin/java
>>>> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider -jar
>>>> pdfbox-app-2.0.11.jar PDFToImage  -time test.pdf
>>>> Rendered 1 page in 4723ms
>>>>
>>>>
>>>> Note: KCMS maybe faster on CConv but it has no support for modern ICC
>>>> profiles
>>>> and I haven't checked if it is even applying the pdfbox one properly.
>>>> But it does have support to split a job into concurrent tasks for
>>>> sub-images
>>>> which can help on the larger images like the one I am using in CConv.
>>>>
>>>> -phil.
>>>>
>>>> On 10/4/18, 2:24 PM, Philip Race wrote:
>>>>
>>>> I might be losing it, but I am 99% sure that LCMS is the color
>>>> conversion engine in 8.
>>>> KCMS was there only for backup. You'd have to know the magic flag to
>>>> get it and
>>>> no one has said anything to the effect that they are using it.
>>>>
>>>> -phil.
>>>>
>>>> On 10/4/18, 11:33 AM, Laurent Bourgès wrote:
>>>>
>>>> Phil,
>>>> I wondered if ang RenderingHint defaults changed since 8...
>>>>
>>>> Moreover I started playing with linux perf + jit agent and it is easy
>>>> than before wigh oprofile + jvmtiagent.
>>>>
>>>> I noticed that OracleJDK8 uses KCMS and OpenJDK11 uses LCMS for color
>>>> conversion as does OpenJDK8, that could explain the performance gap.
>>>>
>>>> Finally PDFImage test is run only once so the overhead may come from
>>>> warmup (jit, g1)...
>>>>
>>>> More later,
>>>> Laurent
>>>>
>>>> Le jeu. 4 oct. 2018 à 20:03, Phil Race <philip.race at oracle.com> a
>>>> écrit :
>>>>
>>>>>
>>>>>
>>>>> On 10/03/2018 11:58 PM, Laurent Bourgès wrote:
>>>>>
>>>>> Hi,
>>>>> I will get the code and add debugging logs: env & system properties
>>>>> and java2d RenderingHints.
>>>>>
>>>>>
>>>>> The code in pdfbox passes null for the hints. So there should be no
>>>>> difference attributable to that.
>>>>>
>>>>> -phil.
>>>>>
>>>>>
>>>>> I suspect these hints are different or have a noticiable impact: color
>>>>> interpolation & rendering quality.
>>>>>
>>>>> I suppose the backend corresponds to software loops but some 2d
>>>>> operations can be accelerated ?
>>>>>
>>>>> Anyway I will push any change in the code.
>>>>>
>>>>> PS: I can run linux perf to profile both java & native code....
>>>>>
>>>>> Cheers,
>>>>> Laurent
>>>>>
>>>>> Le jeu. 4 oct. 2018 à 07:50, Daniel Persson <mailto.woden at gmail.com>
>>>>> a écrit :
>>>>>
>>>>>> Hi Philip and Laurent.
>>>>>>
>>>>>> I've talked with Tilman and Andreas from the PDFBox team and they see
>>>>>> similar connections to the ColorConvertOp filter but wanted to try with one
>>>>>> of the images of the PDF as a raster.
>>>>>>
>>>>>> As we try different things I thought it good for collaboration to
>>>>>> create a repository with the code so all can contribute.
>>>>>>
>>>>>> https://github.com/kalaspuffar/ColorConvTest
>>>>>>
>>>>>> I've run the 3 different tests on my Machine (Thinkpad P51s) with
>>>>>> custom Gentoo installed, if important to the conversation.
>>>>>>
>>>>>> I tried to invite you all as collaborators to this repository if you
>>>>>> think this is a bad Idea let me know.
>>>>>>
>>>>>> Best regards
>>>>>> Daniel
>>>>>>
>>>>>> On Wed, Oct 3, 2018 at 7:51 PM Laurent Bourgès <
>>>>>> bourges.laurent at gmail.com> wrote:
>>>>>>
>>>>>>> Very good job, phil.
>>>>>>>
>>>>>>> I will try your CCONV test on my linux machine to see if it is
>>>>>>> platform dependent ... or hw ?
>>>>>>>
>>>>>>> Laurent
>>>>>>>
>>>>>>> Le mer. 3 oct. 2018 à 19:19, Philip Race <philip.race at oracle.com> a
>>>>>>> écrit :
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/3/18, 1:15 AM, Laurent Bourgès wrote:
>>>>>>>>
>>>>>>>> Phil,
>>>>>>>>
>>>>>>>> If you look at the given pdf file, it has large images that exceed
>>>>>>>> 2k so such ones may be more costly to convert.
>>>>>>>>
>>>>>>>>
>>>>>>>> FWIW the one I profiled was by far the largest at 2577x1540.
>>>>>>>> The rest are more like 100x100, 200x200 or 500x500 - all
>>>>>>>> approximations.
>>>>>>>>
>>>>>>>>
>>>>>>>> As jpeg decoder in openjdk11 is different than oraclejdk8, it may
>>>>>>>> cause more ColorConvertOp filter operations ... if color profiles are
>>>>>>>> different.
>>>>>>>>
>>>>>>>>
>>>>>>>> That doesn't seem likely and in fact since I  instrumented
>>>>>>>> ColorConvertOp in 8 & 11,  I know exactly how many times it was invoked
>>>>>>>> by pdfbox, (11 times in both cases) and that all the image data is
>>>>>>>> the same. SRC and DEST are the same types etc.
>>>>>>>>
>>>>>>>> Also the version of LCMS is the same in 8 and 11 (v2.9).
>>>>>>>>
>>>>>>>> -phil
>>>>>>>>
>>>>>>>>
>>>>>>>> Anyway this performance is not related to Marlin renderer, so I can
>>>>>>>> not help much except in its diagnostic.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Laurent
>>>>>>>>
>>>>>>>> Le mar. 2 oct. 2018 à 23:35, Philip Race <philip.race at oracle.com>
>>>>>>>> a écrit :
>>>>>>>>
>>>>>>>>> I've spent some time examining what pdfbox is passing to
>>>>>>>>> ColorConvertOp
>>>>>>>>> It is called about 10 or 11 times in this test with images
>>>>>>>>> typically 1-2K in each dimension.
>>>>>>>>> The input image is a Custom BufferedImage which uses an
>>>>>>>>> ICC_ColorSpace constructed
>>>>>>>>> from a color profile file that is embedded in pdfbox which is an
>>>>>>>>> open source equivalent
>>>>>>>>> of what Acrobat uses. It has a 4 component raster and is opaque
>>>>>>>>>
>>>>>>>>> This is filtered into a 3 component standard INT_RGB ColorModel.
>>>>>>>>>
>>>>>>>>> I've distilled this down into a small program which has an copy of
>>>>>>>>> the method
>>>>>>>>> that is defined in pdfbox and is invoking the supposedly slow
>>>>>>>>> ColorConvertOp.
>>>>>>>>>
>>>>>>>>> So I believe this is all exactly what is happening in pdfbox.
>>>>>>>>>
>>>>>>>>> What I find is that it is actually much faster on JDK11 than JDK 8.
>>>>>>>>>
>>>>>>>>> prrubuntu:~$ ~/jdk-11/bin/java CConv
>>>>>>>>> 4881
>>>>>>>>> prrubuntu:~$ ~/jdk8u181/bin/java CConv
>>>>>>>>> 12529
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I can't say why that would be but the results are clear.
>>>>>>>>> So I am left to suppose that pdfbox really is doing something
>>>>>>>>> different in 8 vs 11.
>>>>>>>>> Or that this not the real problem. What do others see ?
>>>>>>>>>
>>>>>>>>> I've attached the program. The 1Mb color profile file can be got
>>>>>>>>> from the pdfbox sources.
>>>>>>>>>
>>>>>>>>> -phil.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/2/18, 9:35 AM, Laurent Bourgès wrote:
>>>>>>>>>
>>>>>>>>> Hi Daniel,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Let's not compare apples and oranges. What I can see it takes the
>>>>>>>>>> same route and behave similarly.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  I agree, I did not take enough time to get accurate profiles,
>>>>>>>>> sorry.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> If you look at
>>>>>>>>>> http://uhash.com/java_reg/Call_Tree_java_8.html
>>>>>>>>>> http://uhash.com/java_reg/Call_Tree_java_11.html
>>>>>>>>>>
>>>>>>>>>> You can see that ConvertOp.filter takes 1.5s longer on Java 11.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I confirm: 1.8s vs 300ms.
>>>>>>>>>
>>>>>>>>> Philip, do you know what could have change in this 2d area ?
>>>>>>>>>
>>>>>>>>> I imagine ColorConvertOp delegates to native code so color profile
>>>>>>>>> (ICC) or hidpi support may have an impact here (or just compiler options
>>>>>>>>> may be different) ...
>>>>>>>>>
>>>>>>>>> If needed, I could profile native code using oprofile / perf.
>>>>>>>>>
>>>>>>>>> Laurent
>>>>>>>>>
>>>>>>>>>
>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/2d-dev/attachments/20181005/96b910a4/attachment-0001.html>


More information about the 2d-dev mailing list