Project Trinity

Karthik Ganesan karthik.ganesan at
Mon Apr 24 20:26:32 UTC 2017

Hi Ryan,

On 4/24/2017 12:46 PM, LaMothe, Ryan R wrote:
> I think it would be worthwhile to reach out to the Aparapi, Sumatra, etc. folks to find out what hurdles they encountered trying to implement the same capabilities you are proposing, and why those efforts either slowed, stalled or stopped altogether. It turns out, the devil is truly in the details when you finally start down this road, and many of us (myself included), would be more than willing to knowledge share with you to help you out.
We really appreciate this, and I will reach out. Thank you. But I want
to re-emphasize one point here: we have very little overlap with
Sumatra or Aparapi. We only intend to provide a better interface
than what the Stream API provides, which should make it easier for
these existing projects to realize their goals.
> In my personal opinion, what you are proposing does not need a new project but a revival of existing efforts (i.e. Sumatra), as others have pointed out.
Trinity is much more focused than Sumatra. We use APIs to provide the 
information needed for offloads, whereas existing projects like Sumatra 
rely on detecting code patterns and do dynamic code generation. Sumatra 
is completely GPU/APU focused with a significant chunk of the Java 
runtime functioning on the GPU. That is very clear from the project 
description. Trinity focuses on a wider range of accelerators, including
DAX, which is very different from a GPU.

DAX has a fixed set of specific operations that can be composed
together. To give some perspective, example DAX operations include scan,
select, sort, etc. They resemble SQL query operators and form a specific
class of bulk aggregate operations. One can compose such operations
into a pipeline working on a stream of data. Much of this also applies
to efficient offload to SIMD vector units on general-purpose cores,
multithreaded implementations on general-purpose cores, graphics
processors, data processing on FPGAs, etc.
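
To make the composition idea concrete, here is a minimal sketch written
against the standard java.util.stream API as it exists today. It is not
the Trinity API (none has been defined yet) and it does not touch DAX;
the class name, method name and sample data are made up for illustration,
and treating filter() and sorted() as loose analogues of select and sort
is my own assumption.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical illustration only: composes two bulk operations over a column
// of ints using the existing Streams API. Nothing here is offloaded.
public class BulkPipelineSketch {

    // Keep the values that fall in [lo, hi] (a select-like step),
    // then order them ascending (a sort-like step).
    static int[] selectAndSort(int[] column, int lo, int hi) {
        return IntStream.of(column)
                .filter(v -> v >= lo && v <= hi)
                .sorted()
                .toArray();
    }

    public static void main(String[] args) {
        int[] column = {42, 7, 19, 88, 3, 56};
        // prints [7, 19, 42, 56]
        System.out.println(Arrays.toString(selectAndSort(column, 5, 60)));
    }
}
```

The point is only that each stage states what it does (a range predicate,
an ordering) through the API call itself, so a library could in principle
map a stage to an accelerator operation without analyzing bytecode.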
> As for Sumatra, the actual goal was to convert Java code to HSAIL, so a reviving of that effort to additionally output CUDA, VHDL, DAX, as appropriate would be welcome by many of us.
That is clearly not the goal of this project. This project focuses on
the front end and does not target any kind of dynamic code generation,
which is left to the ongoing/existing projects.

I hope this clearly shows how this project is complementary to the
existing offload-related projects without much redundancy. We hope to
work with all the existing/ongoing offload-related projects, as I
strongly believe that these projects leveraging each other can have a
more significant impact than any one of them solving a small part of the
puzzle. I kindly request your support and participation in Trinity.

>   If you could additionally convince the powers-that-be to support contiguous multi-dimensional arrays ([[ ]]) as part of your may even make new best friends :)
> -Ryan
> On 4/24/17, 8:00 AM, "discuss on behalf of Karthik Ganesan" <discuss-bounces at on behalf of karthik.ganesan at> wrote:
>      I would like to thank Paul Sandoz, Christian Thalinger, Doug Simon,
>      Mario Torre and Volker Simonis for their support and the insightful
>      questions.
>      What we are proposing to do as part of this project is complementary to
>      existing efforts that enable offload to GPUs like Sumatra, AparAPI etc.
>      These existing projects provide implementations translating existing
>      Java API via Bytecodes to GPU language. Trinity extends these efforts
>      and takes it one step further by readily providing the building blocks
>      for programmers to construct complex bulk data/stream based algorithms
>      in Java that can be easily offloaded by these existing projects. While
>      having a route to offload to hardware accelerators is useful, making
>      it easier for programmers to leverage will take it one step closer to
>      adoption.
>      Projects like Sumatra and AparAPI use the Stream forEach() method to
>      showcase offloads. Trinity will offer more such methods with richer
>      functionality, making it easier for these existing projects to leverage
>      and deliver hardware capabilities to be readily consumed by programmers.
>      Unlike the existing Streams API, the library for this new API is
>      envisioned to have a stronger focus on performance, a dedicated
>      implementation that will be offload friendly and cover more functions
>      that are relevant to this domain of programmers.
>      Also, please note that Trinity casts a wider net when it comes to
>      accelerators, not just GPUs/APUs. These accelerators can include
>      Analytics accelerators like DAX, SIMD units on general purpose cores,
>      FPGA based accelerators for bulk aggregate operations, GPUs and whatever
>      more the future holds in terms of heterogeneous computing for bulk data
>      processing.
>      Inspired by the existing Streams API that brings succinct functional
>      programming to Java using lambdas, this project will try to retain such
>      rich features, significantly simplifying programming in Java for the
>      performance oriented developers focusing on bulk data processing.
>      Regards,
>      Karthik
>      On 4/24/2017 4:09 AM, Doug Simon wrote:
>      >> On 24 Apr 2017, at 10:50, Volker Simonis <volker.simonis at> wrote:
>      >>
>      >> On Sun, Apr 23, 2017 at 1:39 PM, Doug Simon <doug.simon at> wrote:
>      >>>> On 21 Apr 2017, at 23:54, Christian Thalinger <cthalinger at> wrote:
>      >>>>
>      >>>>
>      >>>>> On Apr 21, 2017, at 11:41 AM, Karthik Ganesan <karthik.ganesan at> wrote:
>      >>>>>
>      >>>>> Hi Christian,
>      >>>>>
>      >>>>> Thanks for your interest. This question was brought up previously in the discussion email thread for this project:
>      >>>>>
>      >>>>> Project Sumatra was aimed at translation of Java byte code to execute on
>      >>>>> GPU, which was an ambitious goal and a challenging task to take up. In this
>      >>>>> project, we aim to come up with APIs targeting the most common Analytics
>      >>>>> operations that can be readily offloaded to accelerators transparently. Most
>      >>>>> of the information needed for offload to the accelerator is expected to be
>      >>>>> readily provided by the API semantics, thereby reducing the need to
>      >>>>> do tedious byte code analysis.
>      >>>> I disagree.  The first paragraph on the Sumatra project page says:
>      >>>>
>      >>>> "This primary goal of this project is to enable Java applications to take advantage of graphics processing units (GPUs) and accelerated processing units (APUs)--whether they are discrete devices or integrated with a CPU--to improve performance.”
>      >>>>
>      >>>> while you state:
>      >>>>
>      >>>> "This Project would explore enhanced execution of bulk
>      >>>> aggregate calculations over Streams through offloading
>      >>>> calculations to hardware accelerators.”
>      >>>>
>      >>>> It’s the same thing.  I just don’t see the need to spin up yet-another OpenJDK project that aims at the same goal.
>      >>> Maybe this is just a discrepancy between the officially stated aims. I understood Sumatra to be about *automatic* offloading of work for existing APIs (such as the Streams API) to a GPU, whereas Trinity seems to be more about designing an explicit API for GPU offloading.
>      >>>
>      >> So if this is about an explicit API for GPU offloading, will this be a
>      >> Java implementation/wrapper for already existing C/C++ APIs like
>      >> CUDA/OpenCL? Designing a completely new, Java-specific API seems to be
>      >> not very promising to me.
>      > I agree.
>      >
>      > Karthik, maybe you could discuss the differences/similarities between Trinity and the Aparapi project (
>      >
>      > -Doug
