[OpenJDK 2D-Dev] Rendering images from PDF files slower in OpenJDK

Daniel Persson mailto.woden at gmail.com
Fri Oct 5 05:47:37 UTC 2018


Hi Phil.

Well, it seems like you've been in this discussion before:

https://bugs.openjdk.java.net/browse/JDK-8041125

I wasn't aware that PDFBox PDFToImage used the KCMS provider by default.
You may close this issue as we have figured out the reason.
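
For anyone who wants to check their own setup, here is a minimal sketch of
the kind of availability check that PDFToImage performs (see the snippet
Laurent quotes below). The class name CmmProviderCheck and its helper
methods are hypothetical names chosen for illustration; only the provider
class name and the sun.java2d.cmm property come from this thread.

```java
public class CmmProviderCheck {
    // Provider class name as it appears in the PDFToImage source quoted in this thread.
    static final String KCMS = "sun.java2d.cmm.kcms.KcmsServiceProvider";

    /** Returns true if the named class can be loaded by this JVM. */
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Describes which CMM a PDFToImage-style start-up would ask for.
     * This only reports the request; it does not inspect the engine
     * that Java 2D actually ends up loading.
     */
    static String describe(boolean kcmsAvailable, String requestedByProperty) {
        if (requestedByProperty != null) {
            return "requested via sun.java2d.cmm: " + requestedByProperty;
        }
        return kcmsAvailable ? "KCMS (would be forced)" : "LCMS (JDK default)";
    }

    public static void main(String[] args) {
        String requested = System.getProperty("sun.java2d.cmm");
        System.out.println(describe(isAvailable(KCMS), requested));
    }
}
```

On a JDK without the closed-source provider, the Class.forName call should
fail and the sketch reports the LCMS default.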

Best regards
Daniel

On Fri, Oct 5, 2018 at 7:27 AM Philip Race <philip.race at oracle.com> wrote:

>
>
> On 10/4/18, 10:22 PM, Daniel Persson wrote:
>
> Hi Laurent
>
> Well that seems like a reasonable assumption.
>
> https://github.com/kalaspuffar/ColorConvTest/blob/master/KCMSTest.md
>
> The test with a "blank" image shows a 1-second difference.
>
> The test with an image from the PDF in question shows a 52-second
> difference.
>
>
> I tried playing with different image data but I didn't see a sensitivity
> to that.
> Maybe I needed to try something more complex.
>
>
> So why doesn't OpenJDK 9 and later bundle KcmsServiceProvider?
> Does this provider produce a worse result on the image?
>
> It is not open source. It cannot be part of OpenJDK. Ever.
> And see my other email for the other reasons.
> So there is no quick or easy solution.
>
> FWIW the #1 reason I left KCMS in Oracle 8 and even 9 was because of the
> MT performance
> issue, but as we now converge Oracle JDK & OpenJDK that was a non-starter
> and it was
> removed along with other closed source components.
>
> -phil.
>
> Best regards
> Daniel
>
>
>
>
> On Fri, Oct 5, 2018 at 6:55 AM Laurent Bourgès <bourges.laurent at gmail.com>
> wrote:
>
>> Phil,
>> I just googled a bit and found the PDFToImage source:
>>
>> public static void main( String[] args ) throws IOException
>> {
>>     try
>>     {
>>         // force KCMS (faster than LCMS) if available
>>         Class.forName("sun.java2d.cmm.kcms.KcmsServiceProvider");
>>         System.setProperty("sun.java2d.cmm",
>>                 "sun.java2d.cmm.kcms.KcmsServiceProvider");
>>     }
>>     catch (ClassNotFoundException e)
>>     {
>>         LOG.debug("KCMS service not found - using LCMS", e);
>>     }
>>
>>
>> https://svn.apache.org/viewvc/pdfbox/trunk/tools/src/main/java/org/apache/pdfbox/tools/PDFToImage.java?revision=1829374&view=markup
>>
>> That's all, folks!
>>
>> On Fri, Oct 5, 2018 at 01:00, Philip Race <philip.race at oracle.com> wrote:
>>
>>> Yep. LCMS is the default in 8u.
>>>
>>> And although KCMS is a lot faster  on my CConv test ...
>>>
>>> ~/jdk8u181/bin/java CConv
>>> 13289
>>>
>>>  ~/jdk8u181/bin/java
>>> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider CConv
>>> 5131
>>>
>>>
>>> It makes no difference to the PDF conversion:
>>>
>>> ~/jdk8u181/bin/java -jar pdfbox-app-2.0.11.jar PDFToImage -time test.pdf
>>> Rendered 1 page in 4985ms
>>>
>>> ~/jdk8u181/bin/java -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider -jar pdfbox-app-2.0.11.jar PDFToImage -time test.pdf
>>> Rendered 1 page in 4723ms
>>>
>>>
>>> Note: KCMS may be faster on CConv, but it has no support for modern ICC
>>> profiles, and I haven't checked whether it is even applying the pdfbox
>>> one properly. It does, however, support splitting a job into concurrent
>>> tasks for sub-images, which can help on larger images like the one I am
>>> using in CConv.
>>>
>>> -phil.
>>>
>>> On 10/4/18, 2:24 PM, Philip Race wrote:
>>>
>>> I might be losing it, but I am 99% sure that LCMS is the color
>>> conversion engine in 8.
>>> KCMS was there only for backup. You'd have to know the magic flag to get
>>> it and
>>> no one has said anything to the effect that they are using it.
>>>
>>> -phil.
>>>
>>> On 10/4/18, 11:33 AM, Laurent Bourgès wrote:
>>>
>>> Phil,
>>> I wondered if any RenderingHint defaults changed since 8...
>>>
>>> Moreover, I started playing with linux perf + jit agent, and it is
>>> easier than before with oprofile + jvmtiagent.
>>>
>>> I noticed that OracleJDK8 uses KCMS and OpenJDK11 uses LCMS for color
>>> conversion as does OpenJDK8, that could explain the performance gap.
>>>
>>> Finally, the PDFToImage test is run only once, so the overhead may come
>>> from warm-up (JIT, G1)...
>>>
>>> More later,
>>> Laurent
>>>
>>> On Thu, Oct 4, 2018 at 20:03, Phil Race <philip.race at oracle.com> wrote:
>>>
>>>>
>>>>
>>>> On 10/03/2018 11:58 PM, Laurent Bourgès wrote:
>>>>
>>>> Hi,
>>>> I will get the code and add debugging logs: env & system properties and
>>>> java2d RenderingHints.
>>>>
>>>>
>>>> The code in pdfbox passes null for the hints. So there should be no
>>>> difference attributable to that.
>>>>
>>>> -phil.
>>>>
>>>>
>>>> I suspect these hints are different or have a noticeable impact: color
>>>> interpolation & rendering quality.
>>>>
>>>> I suppose the backend corresponds to software loops, but some 2D
>>>> operations can be accelerated?
>>>>
>>>> Anyway I will push any change in the code.
>>>>
>>>> PS: I can run linux perf to profile both java & native code....
>>>>
>>>> Cheers,
>>>> Laurent
>>>>
>>>> On Thu, Oct 4, 2018 at 07:50, Daniel Persson <mailto.woden at gmail.com> wrote:
>>>>
>>>>> Hi Philip and Laurent.
>>>>>
>>>>> I've talked with Tilman and Andreas from the PDFBox team and they see
>>>>> similar connections to the ColorConvertOp filter but wanted to try with one
>>>>> of the images of the PDF as a raster.
>>>>>
>>>>> As we try different things I thought it good for collaboration to
>>>>> create a repository with the code so all can contribute.
>>>>>
>>>>> https://github.com/kalaspuffar/ColorConvTest
>>>>>
>>>>> I've run the 3 different tests on my Machine (Thinkpad P51s) with
>>>>> custom Gentoo installed, if important to the conversation.
>>>>>
>>>>> I tried to invite you all as collaborators to this repository. If you
>>>>> think this is a bad idea, let me know.
>>>>>
>>>>> Best regards
>>>>> Daniel
>>>>>
>>>>> On Wed, Oct 3, 2018 at 7:51 PM Laurent Bourgès <
>>>>> bourges.laurent at gmail.com> wrote:
>>>>>
>>>>>> Very good job, phil.
>>>>>>
>>>>>> I will try your CConv test on my Linux machine to see if it is
>>>>>> platform dependent ... or hardware dependent?
>>>>>>
>>>>>> Laurent
>>>>>>
>>>>>> On Wed, Oct 3, 2018 at 19:19, Philip Race <philip.race at oracle.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 10/3/18, 1:15 AM, Laurent Bourgès wrote:
>>>>>>>
>>>>>>> Phil,
>>>>>>>
>>>>>>> If you look at the given pdf file, it has large images that exceed
>>>>>>> 2k so such ones may be more costly to convert.
>>>>>>>
>>>>>>>
>>>>>>> FWIW the one I profiled was by far the largest at 2577x1540.
>>>>>>> The rest are more like 100x100, 200x200 or 500x500 - all
>>>>>>> approximations.
>>>>>>>
>>>>>>>
>>>>>>> As the JPEG decoder in OpenJDK 11 is different from the one in
>>>>>>> Oracle JDK 8, it may cause more ColorConvertOp filter operations ...
>>>>>>> if color profiles are different.
>>>>>>>
>>>>>>>
>>>>>>> That doesn't seem likely. In fact, since I instrumented
>>>>>>> ColorConvertOp in 8 & 11, I know exactly how many times it was invoked
>>>>>>> by pdfbox (11 times in both cases) and that all the image data is
>>>>>>> the same. SRC and DEST are the same types etc.
>>>>>>>
>>>>>>> Also the version of LCMS is the same in 8 and 11 (v2.9).
>>>>>>>
>>>>>>> -phil
>>>>>>>
>>>>>>>
>>>>>>> Anyway, this performance issue is not related to the Marlin renderer,
>>>>>>> so I cannot help much except with diagnosing it.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Laurent
>>>>>>>
>>>>>>> On Tue, Oct 2, 2018 at 23:35, Philip Race <philip.race at oracle.com> wrote:
>>>>>>>
>>>>>>>> I've spent some time examining what pdfbox is passing to
>>>>>>>> ColorConvertOp.
>>>>>>>> It is called about 10 or 11 times in this test with images
>>>>>>>> typically 1-2K in each dimension.
>>>>>>>> The input image is a custom BufferedImage which uses an
>>>>>>>> ICC_ColorSpace constructed from a color profile file that is
>>>>>>>> embedded in pdfbox, which is an open source equivalent
>>>>>>>> of what Acrobat uses. It has a 4 component raster and is opaque.
>>>>>>>>
>>>>>>>> This is filtered into a 3 component standard INT_RGB ColorModel.
>>>>>>>>
>>>>>>>> I've distilled this down into a small program which has a copy of
>>>>>>>> the method that is defined in pdfbox and is invoking the
>>>>>>>> supposedly slow ColorConvertOp.
>>>>>>>>
>>>>>>>> So I believe this is all exactly what is happening in pdfbox.
>>>>>>>>
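For readers without the attachment, a rough sketch of the conversion Phil
describes could look like the following. This is not his CConv program.
It assumes a built-in linear RGB ICC color space as a stand-in for the
4 component pdfbox profile (which is not bundled with the JDK), and the
class and method names are hypothetical.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.WritableRaster;

public class ColorConvSketch {

    /** Filters a custom ICC-based image into a standard INT_RGB image. */
    static BufferedImage toRgb(int width, int height) {
        // Assumption: CS_LINEAR_RGB replaces the CMYK profile shipped in pdfbox,
        // so this source image has 3 components rather than 4.
        ColorSpace cs = ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB);
        ColorModel cm = new ComponentColorModel(
                cs, false, false, Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
        WritableRaster raster = cm.createCompatibleWritableRaster(width, height);
        BufferedImage src = new BufferedImage(cm, raster, false, null);

        BufferedImage dest =
                new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);

        // null hints, matching what the thread says pdfbox passes
        ColorConvertOp op = new ColorConvertOp(null);
        op.filter(src, dest);
        return dest;
    }

    /** Returns the elapsed wall-clock time of one conversion in milliseconds. */
    static long timeMillis(int width, int height) {
        long start = System.nanoTime();
        toRgb(width, height);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        timeMillis(256, 256); // one small warm-up run before measuring
        // Size of the largest image mentioned in this thread
        System.out.println(timeMillis(2577, 1540));
    }
}
```

Because the color space differs from the real profile, absolute timings
from this sketch are not comparable to the numbers reported in this thread.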
>>>>>>>> What I find is that it is actually much faster on JDK11 than JDK 8.
>>>>>>>>
>>>>>>>> prrubuntu:~$ ~/jdk-11/bin/java CConv
>>>>>>>> 4881
>>>>>>>> prrubuntu:~$ ~/jdk8u181/bin/java CConv
>>>>>>>> 12529
>>>>>>>>
>>>>>>>>
>>>>>>>> I can't say why that would be, but the results are clear.
>>>>>>>> So I am left to suppose that pdfbox really is doing something
>>>>>>>> different in 8 vs 11,
>>>>>>>> or that this is not the real problem. What do others see?
>>>>>>>>
>>>>>>>> I've attached the program. The 1Mb color profile file can be got
>>>>>>>> from the pdfbox sources.
>>>>>>>>
>>>>>>>> -phil.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/2/18, 9:35 AM, Laurent Bourgès wrote:
>>>>>>>>
>>>>>>>> Hi Daniel,
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Let's not compare apples and oranges. From what I can see, it
>>>>>>>>> takes the same route and behaves similarly.
>>>>>>>>>
>>>>>>>>
>>>>>>>>  I agree, I did not take enough time to get accurate profiles,
>>>>>>>> sorry.
>>>>>>>>
>>>>>>>>
>>>>>>>>> If you look at
>>>>>>>>> http://uhash.com/java_reg/Call_Tree_java_8.html
>>>>>>>>> http://uhash.com/java_reg/Call_Tree_java_11.html
>>>>>>>>>
>>>>>>>>> You can see that ConvertOp.filter takes 1.5s longer on Java 11.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I confirm: 1.8s vs 300ms.
>>>>>>>>
>>>>>>>> Philip, do you know what could have changed in this 2D area?
>>>>>>>>
>>>>>>>> I imagine ColorConvertOp delegates to native code so color profile
>>>>>>>> (ICC) or hidpi support may have an impact here (or just compiler options
>>>>>>>> may be different) ...
>>>>>>>>
>>>>>>>> If needed, I could profile native code using oprofile / perf.
>>>>>>>>
>>>>>>>> Laurent
>>>>>>>>
>>>>>>>>
>>>>