[OpenJDK Rasterizer] AWT & gcc 4.8 optimization options

Thu Dec 10 09:24:47 UTC 2015

Sergey,

Thanks a lot for your advice; I will definitely try your approach of reading
the 'preprocessed' C code (as I do not much like macros).

I think I will have some time during the winter holidays to implement correct
gamma correction in that C code: apply pow(2.2), blend, then apply pow(1/2.2),
using precomputed tables.

>> If you modify the Maskfill C code, could you explain to me how it works, as
>> I would like to implement correct gamma correction in this software loop
>> in the future?
>>
>
> From the current source code it is not easy to understand how it works.
> The easiest way to study it is to compile the JDK with this option in
> Awt2dLibraries.gmk:
>
> --- a/make/lib/Awt2dLibraries.gmk Tue Dec 08 19:50:14 2015 +0300
> +++ b/make/lib/Awt2dLibraries.gmk Wed Dec 09 17:10:55 2015 +0300
> @@ -242,7 +242,7 @@
> EXCLUDES := $(LIBAWT_EXCLUDES), \
> EXCLUDE_FILES := $(LIBAWT_EXFILES), \
> OPTIMIZATION := LOW, \
> - CFLAGS := $(CFLAGS_JDKLIB) $(LIBAWT_CFLAGS), \
> + CFLAGS := -save-temps $(CFLAGS_JDKLIB) $(LIBAWT_CFLAGS), \
> DISABLED_WARNINGS_gcc := sign-compare unused-result maybe-uninitialized \
> format-nonliteral parentheses, \
> DISABLED_WARNINGS_clang := logical-op-parentheses extern-initializer, \
>
> This will save the preprocessor output. It will also save the assembler
> code, which can be useful for investigating how the compiler optimizes the
> code, especially in the case of vectorization.
>
> When you look at the code after preprocessing, you will be able to
> understand the DSL which is used in AlphaMacros.h for
> "DEFINE_ALPHA_MASKBLIT".
>
>
> There are a bunch of files in
> java.desktop/share/native/libawt/java2d/loops/. Some of them contain
> general code, like LoopMacros.h and AlphaMacros.h; others contain the
> implementation for specific types.
>
>
> For example, take a look at IntRgb.c.
> It has 2 parts:
> - The array IntRgbPrimitives, which contains the list of supported
> operations (it registers the functions which should be called in
> MaskBlit.c for particular types). For example, it contains
> REGISTER_ALPHA_MASKBLIT entries from/to different types.
> - Definitions of the functions, like DEFINE_SRCOVER_MASKBLIT(IntArgb,
> IntRgb, 4ByteArgb); this macro provides a function which supports the
> maskblit IntArgb->IntRgb.
>
> So to understand how it works you need to trace these calls:
> - MaskBlit.java -> MaskBlit(.....)
> - MaskBlit.c -> *pPrim->funcs.maskblit
> - The function which is generated from the DEFINE_SRCOVER_MASKBLIT for a
> particular type.
>
> Note that if for some reason we have no specific implementation from
> DEFINE_SRCOVER_MASKBLIT, the General MaskBlit from MaskBlit.java will be
> used, and it is quite slow.
>
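
The define/register/dispatch scheme described above can be sketched with a
toy macro pair. Everything here (DEFINE_TOY_BLIT, REGISTER_TOY_BLIT,
ToyPrimitive, find_toy_blit) is hypothetical and much simpler than the real
macros in LoopMacros.h and AlphaMacros.h; it only illustrates how one macro
generates a typed loop and another one puts it into a primitives table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins: none of these names exist in the JDK sources. */
typedef void (*BlitFunc)(const uint32_t *src, uint32_t *dst, size_t n);

typedef struct {
    const char *srcType;
    const char *dstType;
    BlitFunc    func;
} ToyPrimitive;

/* Generates one loop per (SRC, DST) pair, named <SRC>To<DST>ToyBlit.
 * EXPR converts the local variable 'pixel' to the destination format. */
#define DEFINE_TOY_BLIT(SRC, DST, EXPR)                               \
    static void SRC##To##DST##ToyBlit(const uint32_t *src,           \
                                      uint32_t *dst, size_t n) {      \
        for (size_t i = 0; i < n; i++) {                              \
            uint32_t pixel = src[i];                                  \
            dst[i] = (EXPR);                                          \
        }                                                             \
    }

/* Produces one table entry that points at the generated loop. */
#define REGISTER_TOY_BLIT(SRC, DST) { #SRC, #DST, SRC##To##DST##ToyBlit }

DEFINE_TOY_BLIT(IntArgb, IntRgb, pixel & 0x00FFFFFFu)
DEFINE_TOY_BLIT(IntRgb, IntArgb, pixel | 0xFF000000u)

static const ToyPrimitive ToyPrimitives[] = {
    REGISTER_TOY_BLIT(IntArgb, IntRgb),
    REGISTER_TOY_BLIT(IntRgb, IntArgb),
};

/* Looks up the specialised loop for a type pair. NULL means that no
 * specific loop was registered and a general (slower) path is needed. */
static BlitFunc find_toy_blit(const char *srcType, const char *dstType) {
    size_t count = sizeof(ToyPrimitives) / sizeof(ToyPrimitives[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(ToyPrimitives[i].srcType, srcType) == 0 &&
            strcmp(ToyPrimitives[i].dstType, dstType) == 0) {
            return ToyPrimitives[i].func;
        }
    }
    return NULL;
}
```

Running such a file through gcc -E (or compiling it with -save-temps) shows
the generated IntArgbToIntRgbToyBlit function as plain C, which is the same
trick as reading the preprocessed JDK loops.
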
> I am still investigating...
>

Excellent! Maybe you should compare the preprocessor outputs between gcc
4.3.2 (JDK8), which was faster, and gcc 4.8.4 (JDK9)!

I guess it is related to loop vectorization (SIMD), which seems slower in
gcc >= 4.4 (a known regression?).

Good luck & keep me informed about your investigations,
Laurent