Reading binary data
Henry Jen
henry.jen at oracle.com
Thu Dec 5 19:43:37 UTC 2019
I am usually too lazy to participate in arguments that don't involve code, but I feel like sharing my understanding.
Regarding safety, I think it’s clear we agree that some limited unsafe access is inevitable. My understanding of what was said is that pure Java code should not be able to crash the VM. Once an unsafe operation is involved, some opt-in mechanism needs to make the unsafe access explicit. And it should be possible for native bindings to encapsulate all unsafe operations, so that users who choose to use a binding know the risk but don’t need to use the unsafe API themselves.
As for the argument about project priority, the project started with a simple goal: enable calling native functions without the hassle of writing JNI code. We took the “binder” approach, and wanted to stabilize the interface via annotations.
jextract is just a tool to get Java bindings with such an interface quickly so we could stabilize that interface, which was basically just annotations with layout descriptors. We thought such a “minimal” interface without involving an API would be easier to deliver. Based on that effort we came up with some APIs necessary for accessing native bindings, such as Pointer, Scope, etc.
That jextract feels like a priority is IMO simply an accident of the process: we expanded our validation over native libraries and found it can be useful for general use.
Maurizio has been doing great work to slice the experience accumulated over the years and present a plan to start delivering into the mainstream. The memory access API is the first step, and the ABI work would be next. Without the foreign-abi branch, that would be the “interface” for native interop we set out to have at the beginning, with VarHandle and MethodHandle, and I believe developers can use their imagination to create tools that make use of that foundation.
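For readers who have not used VarHandle-based access, here is a minimal sketch. It is only an illustration and does not use the Panama memory access API: as an assumption made so that it runs on a stock JDK, it uses the long-standing MethodHandles.byteBufferViewVarHandle from java.lang.invoke over a direct ByteBuffer. The class and method names (VarHandleSketch, readInt, writeInt) are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the Panama memory access API.
final class VarHandleSketch {
    // Views a ByteBuffer as little-endian ints, addressed by byte offset.
    static final VarHandle INT_LE =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Reads a 4-byte int at the given byte offset; bounds are checked.
    public static int readInt(ByteBuffer buf, int byteOffset) {
        return (int) INT_LE.get(buf, byteOffset);
    }

    // Writes a 4-byte int at the given byte offset; bounds are checked.
    public static void writeInt(ByteBuffer buf, int byteOffset, int value) {
        INT_LE.set(buf, byteOffset, value);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(16); // off-heap memory
        writeInt(buf, 4, 42);
        System.out.println(readInt(buf, 4));
        try {
            readInt(buf, 16); // past the end of the 16-byte buffer
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out-of-bounds access rejected");
        }
    }
}
```

The point of the sketch is that the access goes through a bounds-checked handle, so a bad offset raises an exception instead of crashing the VM.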
The proposed foreign-jextract branch can serve as a reference for how jextract is transformed to use that foundation instead of “annotations”. Whether that’s good enough to be distributed as a tool is a different story.
Cheers,
Henry
> On Dec 5, 2019, at 7:52 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>
>
> On 05/12/2019 15:27, Ty Young wrote:
>>
>> On 12/4/19 10:55 AM, Maurizio Cimadamore wrote:
>>>
>>> On 04/12/2019 16:14, Samuel Audet wrote:
>>>> Hi, Maurizio,
>>>>
>>>> It's not a question of whether I like it or not, it's about having the tools that we need. Recently, both Ioannis and Ty have been trying to make you understand the same thing that I've been trying to make you understand about safety, and you're not taking them seriously either. How many people is it going to take for you to understand the importance of this?
>>> I think you are just mixing (as usual) pears with apples. But I will let Ty and Ioannis to speak for themselves, rather than try to put words into their mouths.
>>
>>
>> Ioannis and I's view on the current "safety" mentality seems to be the same: it just doesn't matter from a C/C++ bindings perspective.
>
> I think you are misinterpreting :-)
>
> I believe LWJGL tries very hard to inject bounds on otherwise unchecked pointers to achieve some form of safety for clients. So, while I agree that, in the most general case possible, 100% safety is a siren song when it comes to native libraries, there _have_ to be ways to take an unsafe address and put some structure to it.
>
>> You say the point of "safety" is to not crash the JVM, but the JVM runs on the system and C/C++ is often used for sensitive system level programming. Even if you did everything possible to ensure the integrity of the way the data is communicated and worked with between Java and C/C++, it literally does nothing once that data is processed in C/C++ land. You crash the system from C/C++ and the JVM will go down with it.
>>
>>
>> That said, I don't think "safety" is pointless and I recognize that there are many use cases where "safety" is very much welcomed. However, by not offering functionality because of "safety", you are shafting those that want to get "down and dirty" using Project Panama... something Java doesn't really have an official API for.
> I don't think we have ever said anywhere that 'unsafe' functionalities will _not_ be provided - just that there will be ways to opt-in into the unsafety (which is very different).
>>
>>
>>>>
>>>> We absolutely need unsafe access to memory regardless of the opinions or desires of anyone. Preventing us from having more usable APIs in that sense will continue to hamper us in our work, and will only weaken the Java platform vs other platforms when it comes to software that depends on, for example, accelerators like GPUs. If that's not where OpenJDK intends to take the Java platform **in priority**, that's perfectly fine, but please make it clear so that we can stop wasting time arguing like this.
>>>
>>> Who is *We*, *us* ? As for wasting time arguing - I'd very much like to stop (as you do). So - maybe stop arguing?
>>>
>>> You seem to be determined in having me admitting that some of the important use cases *you* (again, don't know who *you* is) have in mind is not on our radar - and you infer that from the fact that we'd like to have a clear separation between safe parts of the API and less safe parts of the API. That is simply an incorrect inference, which is more similar to FUD than to a real argument.
>>>
>>
>> IIRC some parts are still "unsafe" though, are they not?
> Yes - but they will still be present (see above)
>>
>>
>>>
>>>
>>>> and actually slower for AOT compilers. Moreover, they are always changing and because of this lack of stability, most of the community is not able to use them extensively enough to provide (more) feedback. It's not possible to do anything substantial with an API that is so unstable! Although you're not willing to talk about what you might have to do to support C++, I can imagine that it's going to end up making even more aggressive changes, but maybe not? Not knowing what your intentions are isn't helping.
>>>
>>> I think these is a big ball of excuses as to why _you_ are not doing anything with the API. Luckily, in this very mailing list, I've seen plenty of people who, unlike you, were willing to get their hands dirty.
>>>
>>> As for stability, the memory access API did not have any significant API change in months - and now it's a JEP. I'm also not aware of any breaking changes in the foreign API which is being distributed through EA. There's not a single piece of concrete evidence in anything you say. This fact alone is a big red herring.
>>
>>
>> In fairness, if one was to read this mailing list, it does sound like the API is still being hammered out. It's important to keep in mind that the people participating in the mailing lists aren't always reading the code or changesets and it isn't like there is a monthly progress report on these Java projects either. That's why those conferences that you OpenJDK/Oracle developers present at are so important - it lets people know what the current status, plans, and thought process are for things being worked on in Java.
> I think there are different milestones, at different level of readiness - the ABI support is still work in progress - but the memory access API isn't. Saying that one isn't ready because the step after that isn't is not a fair characterization either? Again, there are many use cases (tensors, persistent memory) which are looking forward to use the memory access API and are not necessarily concerned with native interop.
>>
>>
>>>>
>>>>
>>>>>> I admit my attitude isn't ideal, but I'm still trying to make you understand my point in this. Let's try to work on features that are most useful to everyone, first. We all have limited resources, so let's try to make the most of them, and that means working with the community! In any case, I'll keep trying to do my best and be as least annoying as possible. :)
>>>>>
>>>>> The real elephant in the room here is that we've been arguing for over an year _without you having written a single line of code_ using whatever we're doing. In reality, I don't see why there couldn't be an experimental version of JavaCPP targeting memory access VH and MH instead of JNI - then it would be good to have a discussion about what worked and what didn't. I think until you do something along those lines, not many people will really take you seriously around here, sorry.
>>>>
>>>> We've talked about this before. Once you're able to provide a viable alternative to JNI, that runs faster for both JIT and AOT compilers, whose API is somewhat stable, that allows full unsafe access somehow, and that is slated to become part of Java SE, even just as an initial draft of a JEP, then I will start to look at it.
>>>
>>> Well, (I start to sound like a broken record) the memory access API is a JEP now; any plans to replace JavaCPP JNI-based struct access with that? Surely that should be much faster?
>>>
>>> But in general your attitude of "I will touch it only when it's fully done" makes me think that you are essentially looking at the wrong mailing list. This is for people who likes to makes their hands dirty and play a bit with what's available - if nobody did that (and followed your approach), the feedback we'd get would be exactly zero, nada, nil - would you care to formulate how such an attitude is going to contribute to make whatever we are doing _better_ ?
>>
>>
>> It's kind of understandable that someone might not spend time converting when the thing they are potentially converting to doesn't appear to be hammered out yet.
> Sure - I'm not forcing _anyone_ - but it's also understandable if I don't take the "feedback" from someone who has never tried anything as seriously as the one from someone who has. In other words, every project in the world goes through several iterations to get things right - here it seems like both you and Samuel are complaining because some parts of the project are still iterating... maybe the solution, if you are really not prepared to live on the bleeding edge, is to wait few months and check back again?
>>
>>
>> ...and Project Panama is made up of various pieces that can't be easily assembled yet. Again, for one of the things I'm using Project Panama for personally, I'm using jextract primarily and eventually memory-access to fill in the missing parts. If there was a branch that included everything then people could checkout the various parts of Project Panama more easily and in more realistic situations.
>>
>>
>> Like, is everyone *really* going to use memory access to create bindings when they have the complete headers and can use jextract?
>
> jextract is going to use memory access and method handles underneath so... yes - although you might not realize that you are depending on those pieces.
>
> We have plans (sent an email earlier today) on how to move forward with EA so that a more representative set of the Panama features is delivered.
>
> Maurizio
>
>>
>>
>> Likewise, building the JDK from source takes a lot of processing power to do. About 7 minutes on my Ryzen 1800x, in fact. Github Actions could probably build the JDK every commit if someone was to set it up. Don't know how long it'd take...
>>
>>
>>>> Samuel