Call for Discussion: New Project: Babylon

Juan Fumero juan.fumero at paravox.ai
Wed Oct 11 08:07:34 UTC 2023


Hi Paul,

    This sounds great. We (the TornadoVM team at the University of 
Manchester) would like to collaborate and support this project moving 
forward.

Juan

On 14/09/2023 00:31, Paul Sandoz wrote:
> Hi Juan,
>
>> On Sep 13, 2023, at 10:03 AM, Juan Fumero <juan.fumero at paravox.ai> wrote:
>>
>> Hi Paul,
>>    I think this is a great initiative and much needed in the Java world. 
>> I have a few questions.
>>
>> 1)
>> /> Babylon will ensure that code reflection is fit for purpose by 
>> creating a GPU programming model for Java that leverages code 
>> reflection and is implemented as a Java library./
>>
>> Does this mean that one of the goals of the project is to define how 
>> GPUs should be programmed using the Code Reflection API, or for Java 
>> in general?
>>
>
> The intent is a general approach that depends on the support of code 
> reflection (and Panama FFM).
>
> I think it is up to us, as members of the OpenJDK community, to 
> determine where we head with regards to the GPU programming model, any 
> concrete artifacts that could be produced, and where the dividing 
> lines may be between APIs, implementations, and vendors. Gary can 
> speak more to this than I.
>
>> Is Babylon limited to GPUs? Are you also considering other types of 
>> accelerators (e.g., AI accelerators, RISC-V accelerators, etc.)?
>>
>
> In principle it's not limited. As you have shown with TornadoVM, the 
> same programming model for GPUs can apply to other forms of hardware 
> that are highly parallel processors, like FPGAs, where a program is 
> “printed out” (?) or uniquely arranged in some malleable hardware. In 
> this case, assuming the programming model is applicable, it seems 
> predominantly an area of implementation focus that someone could choose 
> to take on in their own implementation.
>
> I think the more specialized the hardware, the more limited the 
> programming. So in some cases a parallel programming model may not 
> apply, like with hardware that specializes only in multiplying 
> tensors, which in effect reduces to some form of library calls.
>
>> We have other programming models such as TornadoVM [1], which can be 
>> programmed using different styles (e.g., loop-parallel programs and 
>> kernel APIs). How will the new model(s) accommodate existing 
>> solutions? Is this still to be defined?
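>>
>> For reference, the two styles look roughly like this (a simplified 
>> sketch; the @Parallel annotation and KernelContext type are from the 
>> TornadoVM API described in [1], other details are elided and may 
>> differ):
>>
>>     import uk.ac.manchester.tornado.api.KernelContext;
>>     import uk.ac.manchester.tornado.api.annotations.Parallel;
>>
>>     // Loop-parallel style: annotate the loop induction variable and
>>     // let the TornadoVM JIT compiler parallelize the loop.
>>     public static void add(float[] a, float[] b, float[] c) {
>>         for (@Parallel int i = 0; i < c.length; i++) {
>>             c[i] = a[i] + b[i];
>>         }
>>     }
>>
>>     // Kernel-API style: write the body of a single work-item, indexed
>>     // explicitly through a kernel context.
>>     public static void add(KernelContext context, float[] a, float[] b, float[] c) {
>>         int i = context.globalIdx;
>>         c[i] = a[i] + b[i];
>>     }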
>>
>
> Again, Gary can speak more to this, but I suspect the design will focus 
> predominantly on a range-based kernel model (similar to Tornado’s 
> kernel API). But, in principle, I imagine it may be possible to plug in 
> different kernel models (or copy parts of the design) where code 
> reflection could be applied with different and more sophisticated 
> approaches to program analysis and compilation, such as for a 
> loop-based kernel model.
>
> Two key areas of focus I see are:
>
> 1) The extraction of kernel call graphs using code reflection, as 
> discussed in Gary’s JVMLS talk. Thus a developer does not have to 
> explicitly build a task graph (as currently required by TornadoVM); 
> instead, a specialized compiler does that work (see the first sketch 
> below). (Note, this does not render any existing task graph API 
> redundant; it just moves it more into the background as an important 
> lower-level building block that the developer is not required to use.)
>
> 2) The ability to call pre-defined “native” kernels that exist 
> somewhere else, e.g., in a GPU-enabled library, which may also be a 
> solution for leveraging more exotic but constrained hardware (see the 
> second sketch below).
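>
> For illustration, a minimal sketch of the first point (all method and 
> type names here are invented): the developer writes ordinary Java code 
> composing kernels, and a code-reflection-based compiler derives the 
> kernel call graph and data movement that today must be spelled out as 
> an explicit task graph.
>
>     // Hypothetical sketch: plain Java composition of two kernels.
>     // A specialized compiler could use code reflection to extract the
>     // call graph (scale -> add), infer data transfers, and schedule the
>     // work on a GPU, without the developer building a task graph by hand.
>     static void compute(float[] a, float[] b, float[] c) {
>         scale(a, 2.0f);   // kernel 1: a[i] *= 2.0f
>         add(a, b, c);     // kernel 2: c[i] = a[i] + b[i], depends on kernel 1
>     }
>
> And a minimal sketch of the second point, using the Panama FFM API to 
> bind a pre-built native kernel (the library name "fastblas" and the 
> saxpy entry point are hypothetical):
>
>     import java.lang.foreign.*;
>     import java.lang.invoke.MethodHandle;
>
>     // Hypothetical: bind and invoke a pre-built native kernel
>     //   void saxpy(int n, float alpha, float *x, float *y)
>     // exposed by a GPU-enabled library.
>     static void callSaxpy() throws Throwable {
>         try (Arena arena = Arena.ofConfined()) {
>             Linker linker = Linker.nativeLinker();
>             SymbolLookup lib = SymbolLookup.libraryLookup("fastblas", arena);
>             MethodHandle saxpy = linker.downcallHandle(
>                     lib.find("saxpy").orElseThrow(),
>                     FunctionDescriptor.ofVoid(ValueLayout.JAVA_INT, ValueLayout.JAVA_FLOAT,
>                                               ValueLayout.ADDRESS, ValueLayout.ADDRESS));
>             MemorySegment x = arena.allocate(ValueLayout.JAVA_FLOAT, 1024);
>             MemorySegment y = arena.allocate(ValueLayout.JAVA_FLOAT, 1024);
>             saxpy.invokeExact(1024, 2.0f, x, y);
>         }
>     }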
>
>> 2)
>> /> We do not currently plan to deliver the GPU programming model into 
>> the JDK. However, work on that model could identify JDK features and 
>> enhancements of general utility which could be addressed in future work./
>>
>> Does this mean that the GPU programming model will only be used as a 
>> motivation to develop the Code Reflection APIs for different use cases?
>>
>> 3) Is there any intent to support JVM languages with these models 
>> (e.g., R, Scala, etc.), or will this be specific to the Java language?
>>
>
> It’s specific to the Java language and reflection of Java code.
>
>> 4) I believe we also need new types. As we discussed at JVMLS this 
>> year, we will also need NDArray and Tensor types, Vector types, and 
>> Panama-based types for AI and heterogeneous computing. This is 
>> aligned with Gary's talk at JVMLS [2], in which he proposed the HAT 
>> (Heterogeneous Accelerator Toolkit) initiative and Panama-based 
>> types. Will this also be part of the Babylon project?
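>>
>> For example, a Panama-based NDArray-style type might look roughly like 
>> this minimal sketch (names invented for illustration): an off-heap, 
>> FFM-backed buffer with shape information that a heterogeneous runtime 
>> could map directly onto device memory.
>>
>>     import java.lang.foreign.*;
>>
>>     // Hypothetical: a 2-D float array backed by a MemorySegment.
>>     record F32Matrix(MemorySegment segment, int rows, int cols) {
>>
>>         static F32Matrix allocate(Arena arena, int rows, int cols) {
>>             return new F32Matrix(
>>                     arena.allocate(ValueLayout.JAVA_FLOAT, (long) rows * cols),
>>                     rows, cols);
>>         }
>>
>>         float get(int r, int c) {
>>             return segment.getAtIndex(ValueLayout.JAVA_FLOAT, (long) r * cols + c);
>>         }
>>
>>         void set(int r, int c, float v) {
>>             segment.setAtIndex(ValueLayout.JAVA_FLOAT, (long) r * cols + c, v);
>>         }
>>     }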
>>
>
> I think we will inevitably explore some of that, and some of it may be 
> of such “general utility” that we could decide to address it in future 
> work. However, I am wary of overly focusing on imperfections in this 
> effort, especially as in many of these cases there is a tendency to 
> focus on syntax rather than the underlying model, e.g., arrays (which 
> require much deeper and more careful thinking, but the result will be 
> much better for it). It won’t be perfect, and we can feed those 
> imperfections into possible future work.
>
> Paul.
>
>
>> [1] 
>> https://tornadovm.readthedocs.io/en/latest/programming.html#core-programming
>>
>> [2] https://www.youtube.com/watch?v=lbKBu3lTftc
>>
>>
>> Thanks
>> Juan
>>
>>
>> On 13/09/2023 01:37, Paul Sandoz wrote:
>>> Hi Ethan,
>>>
>>> Current/prior work includes Mojo, MLIR, C# LINQ, Julia [1], Swift 
>>> for TensorFlow [2], Haskell [3].
>>>
>>> In the context of lunch and Python, what I had in mind is machine 
>>> learning and all those frameworks, and I was also thinking about 
>>> introspection of Python code, which IIUC is what TorchDynamo [4] does.
>>>
>>> Paul.
>>>
>>> [1] https://arxiv.org/abs/1712.03112
>>>
>>> [2] 
>>> https://llvm.org/devmtg/2018-10/slides/Hong-Lattner-SwiftForTensorFlowGraphProgramExtraction.pdf
>>>
>>> [3] http://conal.net/papers/essence-of-ad/essence-of-ad-icfp.pdf
>>>
>>> [4] https://pytorch.org/docs/stable/dynamo/index.html
>>>
>>>> On Sep 12, 2023, at 12:31 PM, Ethan McCue <ethan at mccue.dev> wrote:
>>>>
>>>> Can you elaborate more on prior work / the state of affairs in 
>>>> other language ecosystems? In the talk you reference Python "eating 
>>>> Java's lunch" - do they have a comparable set of features or some 
>>>> mechanism that serves the same goal (write code in Python, derive 
>>>> GPU kernel/autodiffed/etc. code)?
>>>>
>>>> On Wed, Sep 6, 2023 at 12:44 PM Paul Sandoz 
>>>> <paul.sandoz at oracle.com> wrote:
>>>>
>>>>     I hereby invite discussion of a new Project, Babylon, whose primary
>>>>     goal will be to extend the reach of Java to foreign programming models
>>>>     such as SQL, differentiable programming, machine learning models, and
>>>>     GPUs.
>>>>
>>>>     Focusing on the last example, suppose a Java developer wants to write
>>>>     a GPU kernel in Java and execute it on a GPU. The developer’s Java
>>>>     code must, somehow, be analyzed and transformed into an executable GPU
>>>>     kernel. A Java library could do that, but it requires access to the
>>>>     Java code in symbolic form. Such access is, however, currently limited
>>>>     to the use of non-standard APIs or to conventions at different points
>>>>     in the program’s life cycle (compile time or run time), and the
>>>>     symbolic forms available (abstract syntax trees or bytecodes) are
>>>>     often ill-suited to analysis and transformation.
>>>>
>>>>     Babylon will extend Java's reach to foreign programming models with an
>>>>     enhancement to reflective programming in Java, called code reflection.
>>>>     This will enable standard access, analysis, and transformation of Java
>>>>     code in a suitable form. Support for a foreign programming model can
>>>>     then be more easily implemented as a Java library.
>>>>
>>>>     Babylon will ensure that code reflection is fit for purpose by
>>>>     creating a GPU programming model for Java that leverages code
>>>>     reflection and is implemented as a Java library. To reduce the risk of
>>>>     bias we will also explore, or encourage the exploration of, other
>>>>     programming models such as SQL and differentiable programming, though
>>>>     we may do so less thoroughly.
>>>>
>>>>     Code reflection consists of three parts:
>>>>
>>>>     1) The modeling of Java programs as code models, suitable for access,
>>>>        analysis, and transformation.
>>>>     2) Enhancements to Java reflection, enabling access to code models at
>>>>        compile time and run time.
>>>>     3) APIs to build, analyze, and transform code models.
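>>>>
>>>>     As a purely illustrative sketch (hypothetical names, not a committed
>>>>     API or design), a library might obtain and transform a code model
>>>>     roughly as follows:
>>>>
>>>>         // Hypothetical: a method opted in to code reflection ...
>>>>         @CodeReflection
>>>>         static float mulAdd(float a, float b, float c) {
>>>>             return a * b + c;
>>>>         }
>>>>
>>>>         // ... whose code model is accessed reflectively at run time and
>>>>         // handed to a library that analyzes and transforms it, e.g. to
>>>>         // produce a GPU kernel. All names below are invented.
>>>>         static void example() throws ReflectiveOperationException {
>>>>             Method m = Kernels.class.getDeclaredMethod(
>>>>                     "mulAdd", float.class, float.class, float.class);
>>>>             CodeModel model = CodeModel.of(m);
>>>>             GpuKernel kernel = GpuCompiler.compile(model);
>>>>         }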
>>>>
>>>>     For further details please see the JVM Language Summit 2023
>>>>     presentations entitled "Code Reflection" [1] and "Java and GPU … are
>>>>     we nearly there yet?" [2].
>>>>
>>>>     I propose to lead this Project with an initial set of Reviewers that
>>>>     includes, but is not limited to, Maurizio Cimadamore, Gary Frost, and
>>>>     Sandhya Viswanathan.
>>>>
>>>>     For code reflection this Project will start with a clone of the
>>>>     current JDK main-line release, JDK 22, and track main-line releases
>>>>     going forward. For the GPU programming model this Project will create
>>>>     a separate repository that is dependent on code reflection features as
>>>>     they are developed.
>>>>
>>>>     We expect to deliver Babylon over time, in a series of JEPs that will
>>>>     likely span multiple feature releases. We do not currently plan to
>>>>     deliver the GPU programming model into the JDK. However, work on that
>>>>     model could identify JDK features and enhancements of general utility
>>>>     which could be addressed in future work.
>>>>
>>>>     Comments?
>>>>
>>>>     Paul.
>>>>
>>>>     [1]
>>>>     https://cr.openjdk.org/~psandoz/conferences/2023-JVMLS/Code-Reflection-JVMLS-23-08-07.pdf
>>>>     https://youtu.be/xbk9_6XA_IY
>>>>
>>>>     [2] https://youtu.be/lbKBu3lTftc
>>>>
>>>
>
-- 
CTO, Paravox Ltd

