Reading binary data

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Nov 7 16:06:35 UTC 2019


>
> Going by the Properties class, it doesn't look like all the JDK 
> developers got the memo on that.
>
>
>>
>>>
>>>
>>> Regarding the topic of safety, it feels like important context is 
>>> being forgotten... or maybe I'm missing something. C/C++ is 
>>> primarily used for sensitive, low level, and hardware level 
>>> programming. Any bad or incorrect API usage (integer overflows, for 
>>> example) could cause the entire system, including the JVM, to 
>>> explode and crash. The "safe" Pointer API, even with all of its 
>>> safety, isn't going to prevent system crashes from C/C++ code.
>>
>>> Heck, even within Java itself without Panama, it's possible to cause 
>>> system chaos because *all roads eventually lead to Rome*. Anything 
>>> that affects the system can affect the JVM and code executed in the 
>>> JVM can negatively affect the system.
>>
>> If you have examples of something that doesn't do any JNI whatsoever, 
>> and yet leads to a VM crash, we'd like to know. Do you have any 
>> examples in mind? (We typically treat these cases as VM or JDK bugs.)
>
>
> I never said crash specifically, only that it can cause chaos. The 
> only thing I'm aware of off the top of my head is attempting to create 
> an insane Integer.MAX_VALUE number of threads and silently bugging the 
> entire system. Never heard of anyone doing it but I know it's possible.


I honestly don't know what to make of these comments. Again, if you are 
aware of _specific_ cases where using a 100% Java API results in hard VM 
crashes (not an exception, something like a segfault) please point to 
these specific examples.

Avoiding _crashes_ is what I'm concerned about with this API.


>
>
>>
>>>
>>>
>>> Which is why I was confused by both the presentation and 
>>> now (regarding your newest email) with regard to Panama preventing 
>>> crashing the VM directly or by extension. How is such a thing even 
>>> conceivably possible, any more than preventing an ordinary bug from 
>>> within pure Java code and crashing the system that way? Ignoring the 
>>> context of Panama, literally any buggy code from any library at any 
>>> point could cause huge failure.
>> The memory access API (leave aside the Pointer API for now) is an API 
>> for accessing off-heap memory. I think both you and Samuel are making 
>> the inference that off-heap means "native". But that doesn't have to 
>> be the case. There are plenty of use cases in the wild where 
>> developers just want to be able to deserialize an object graph 
>> off-heap, map it onto a file, etc. These use cases have nothing 
>> inherently unsafe about them. And there are a lot of them, only you 
>> don't see them because right now they are catered for by the ByteBuffer 
>> API (which does a so-so job at that), or by some ad-hoc APIs such as 
>> Netty's ByteBuf or ChronicleBytes.
>>>
>>>
>>> Not saying "safety" doesn't matter, from a high level use case it 
>>> absolutely does and is important, just that hiding API functionality 
>>> that people want/need for this specific reason doesn't make a whole 
>>> lot of sense. Panama can't conceivably be a one API army that 
>>> somehow either fixes everyone's buggy code, prevents them from 
>>> writing it to begin with, and/or smooths over buggy code. Nothing 
>>> really can.
>>
>> (I'll skip over the "functionality that people want/need" - these 
>> comments are baseless, unless you claim to have some telepathic 
>> connection with all of the varied Java ecosystem - I sometimes wish I 
>> had that!)
>
>
> Going by Samuel's remark on the API not being finished it seems so.


Again, as I asked Samuel, I ask you the same - why?

Is it because the API doesn't let you do what you want to do, the _way_ 
you want to do it (e.g. w/o making you even think about the fact that you 
have just inserted something unsafe in your program)? Is that your 
definition of 'completeness'? I'm afraid here we disagree wildly, and I'm 
sure there will be others who (again, as I've already said) see this 
argument from a completely different perspective than yours (e.g. if I 
can shoot myself in the foot, then it's not finished).

There's really nothing new to see here - as I said, it is simply a fact 
of life for an API that is part of the Java SE API; 'normal' APIs 
can't cause hard VM crashes. That's why we have Unsafe, and that's why 
we're trying to limit and reduce the use of Unsafe over time, or provide 
safe variants (where possible).
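
To make the "exception, not a crash" distinction concrete, here is a 
minimal sketch using the existing ByteBuffer API (not the new memory 
access API). The class and method names are made up for illustration. An 
out-of-bounds read on off-heap memory is rejected with an ordinary Java 
exception instead of touching memory outside the buffer:

```java
import java.nio.ByteBuffer;

public class SafeOffHeapAccess {
    // Returns true if an out-of-bounds read on a direct (off-heap) buffer
    // is rejected with an exception rather than reading stray memory.
    static boolean outOfBoundsIsRejected() {
        ByteBuffer buf = ByteBuffer.allocateDirect(16); // 16 bytes off-heap
        buf.putInt(0, 42);                              // in bounds: fine
        try {
            buf.getInt(16); // would need bytes 16..19, but the limit is 16
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;    // bounds check fired; the VM is still healthy
        }
    }

    public static void main(String[] args) {
        System.out.println(outOfBoundsIsRejected());
    }
}
```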

The fact that JNI code can happily work w/o any restriction (well, in 
reality if you work with a security manager, JNI is essentially disabled 
by default since you cannot load any libraries with the default profile) 
is more a _bug_ than a _feature_. So suggesting that we should repeat 
the choice made for JNI for the sake of arguable short-term gains (such 
as avoiding an extra command line flag) is questionable.

>
>
>>
>> So, are you saying that adding an extra compiler flag when compiling 
>> a program that requires a privileged operation is 'too much'?
>
>
> I'm not entirely sure what point is trying to be made. If it's 
> something that can be done without then it's best to avoid it, of 
> course. If it isn't... well, necessary evil.

Again, there is really only one operation we're talking about here: being 
able to 'trust' random pointers, so that they can be dereferenced like 
ordinary pointers coming from the memory access API itself.

All the code that doesn't do native interop (and there's a lot) 
_couldn't care less_ about this use case. Then there's the code that 
wants to do native interop, call a function that returns a pointer to a 
struct, and be able to dereference it. This can _still_ (as I've shown 
in the original email) be done. But such functionality is put in a 
different package, and to get access to it you need to opt in.

I wrote real code with the API, I've ported 2-3 real world libraries 
with it, and, honestly, adding the extra flag to javac/VM was a trivial 
matter (in fact the IDE suggested it for me).


>
>
>> Honestly this discussion seems to be going in circles.
>
>
> Your two points about the recurring theme were in themselves a circle. 
> Hiding functionality that someone might want is a burden on that 
> person. Ideas were thrown around that may or may not, given context, 
> make sense but at least it's *something*.
Well, positions like "it's incomplete because it doesn't 100% do what I 
want the way I want it" and "I don't accept any extra burden" (held 
without having even tried what we're talking about) are what give me the 
impression that this discussion is not being very productive.
>
>
>> The memory access API has been out there for almost 6 months now - 
>> how much code have you written with it (same for Samuel) ?
>
>
> I had an idea of what I wanted to do with it but given that I need 
> both Pointer and memoryaccess APIs I kind of can't nor am I entirely 
> sure that it provides what I need. Is it possible to get an exported 
> function like getProcAddress does and refer to that address as a 
> function? Is there a source branch that contains both?

The foreign-abi branch contains both the ABI and the memory access part. 
There are some tests which show how it can be used to call some simple 
standard libraries:

http://hg.openjdk.java.net/panama/dev/file/19d9362b1cec/test/jdk/java/foreign/StdLibTest.java

>
>
>> Are you sure that what you now describe as a major problem isn't, in 
>> reality, something that will only be required by people requiring low 
>> level access to write frameworks like JavaCPP itself and JNR?
>
>
> Memory access is in itself a low level API, so yes? I get the point 
> being made but it's still a use case...
>
>
>> If so, are you sure that _everybody_ will care, as you claim they 
>> will? Or are you mistaking the 10% case for the 90% case?
>
>
> Never said such a thing.  And why divide things up like that when you 
> could cover 100% of the use cases? Panama has an opportunity to 
> provide functionality that Java hasn't ever been able to offer in any 
> easily accessible fashion. Why not offer more if possible for those 
> that need it?

I have explained to death why what you wish for is not possible: 
the Java SE API takes safety (and safety essentially means the inability 
to put the VM in a bad state) very seriously. This is exactly the reason 
why the ByteBuffer API made the choices it had to make - e.g. avoiding 
an explicit 'close' method (which many people are asking for, as you do 
now, albeit for a different feature).
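
As a rough illustration of that ByteBuffer choice, the sketch below maps 
a temporary file and keeps using the mapping after the channel is closed. 
The class and method names are hypothetical; the calls themselves are the 
standard java.nio ones. Note that the buffer offers no 'close' or 'unmap' 
method: the mapping stays valid until the buffer is garbage-collected, so 
no access can ever land on memory that has already been released.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedFileRoundTrip {
    // Writes an int into a file-backed mapping and reads it back.
    static int writeAndReadBack(int value) {
        try {
            Path file = Files.createTempFile("mapped", ".bin");
            file.toFile().deleteOnExit(); // best-effort cleanup at VM exit

            MappedByteBuffer buf;
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping READ_WRITE grows the empty file to 4 bytes.
                buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, Integer.BYTES);
            }

            // The channel is closed here, but the mapping is independent
            // of it and remains usable.
            buf.putInt(0, value);
            return buf.getInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeAndReadBack(42));
    }
}
```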

We still aim to support 100% of the use cases (otherwise we wouldn't 
have put the method in, would we?) - we just think that users of certain 
methods need to be a little bit more aware of what the consequences 
could be, not just for themselves - but for all the clients they might 
acquire over time.

As I said previously, a real world Java system is typically composed of 
many, many jars on the classpath (when I say a hundred I'm not joking). 
If any of those jars could start misbehaving and bring the whole thing 
crashing down (as in a real crash, not just "chaos" as you say, which 
can be prevented), that would not be a very desirable thing to have. I 
appreciate that you come from a different perspective; this is the 
perspective we come from, for better or worse.

Maurizio

>
>
>>
>> Suggestion: write a lot of code using the memory access API, report 
>> back on your experience, and the chances that your feedback will be 
>> taken more seriously will increase significantly (cross my heart!).
>>
>> Maurizio
>>
>>>
>>>
>>>>
>>>> I don't think the Pointer API has anything to create a Pointer from 
>>>> a long (unless you refer to the trick of writing a long in memory, 
>>>> and then reading it back as a Pointer).
>>>>
>>>> Maurizio
>>>>

