Reading binary data
Brian Goetz
brian.goetz at oracle.com
Sat Nov 16 10:18:47 UTC 2019
The underlying argument you are hinting at here is simply flawed. Adding functionality has value; improving platform security also has value. The argument that “it was insecure before and I didn’t die, so improving security here is unacceptable” is misguided.
I get that improving security here may be inconvenient for some users; security often is. If what you are saying is “hey, be aware there are consequences for users”, rest assured we are well aware of the trade-offs here.
In any case, I don’t think there’s much more to discuss on this right now; your point is noted. When there is a concrete proposal there may be more to discuss.
Sent from my MacBook Wheel
> On Nov 16, 2019, at 1:19 AM, Samuel Audet <samuel.audet at gmail.com> wrote:
>
> Maurizio,
>
> I think providing unrestricted access in some way is pretty fundamental. Could you elaborate on why these mechanisms to "gate" that functionality have not been worked on until now?
>
> I mean, we could have been using mechanisms like that since forever. It would be useful in the same way when some random dependency we were not aware of starts using JNI, and crashes. So why hasn't this been done before now? What are the reasons why this has not been a top priority?
>
> Samuel
>
>> On 11/8/19 9:52 PM, Maurizio Cimadamore wrote:
>>> On 08/11/2019 12:38, Samuel Audet wrote:
>>> Maurizio,
>>>
>>> Are you saying that ForeignUnsafe is going to be part of the Java SE specification API and that it will be accessible with a compiler flag? That sounds great, but I think you were talking about the --add-exports flag that we can use to access internal APIs? That's no better than sun.misc.Unsafe, so yeah, if that's your intention, people will just keep using the good old sun.misc.Unsafe, simply because it's already "standard". If I misunderstood, please point me in the right direction! This is pretty much the main reason why I'm not currently interested in that API.
>> Currently --add-exports flag is a crude way to get there. As you say, it prevents us from even adding the API to javadoc, which isn't great. We are discussing other mechanisms which leverage the module systems, and which, hopefully with a similarly simple (or even simpler) flag, will effectively give you the same effect.
>> For now, the best analogy which comes to mind is the --enable-preview flag that we have started to adopt in order to promote adoption of new features which are in the Java SE language/VM/API but are not yet in a stable form (I'm not suggesting we reuse the flag as is, but that we do something similar). With --enable-preview, you can also mark which parts of the Java SE API are preview APIs, and that will even show up in the Javadoc.
>> While this exploration is not directly blocking the memory access API work (as pointed out, there are many legitimate cases for it which don't require native access), we will have to get it sorted before we can deliver the second stage, the foreign-abi work - so, _that_ part of the work, is not complete.
>>>
>>> Your considering as a "bug" JNI that works with no restrictions says a lot though. With that kind of a mindset, you'll never be able to get people to use Panama instead of JNI and sun.misc.Unsafe. Maybe the users you know don't care, but anything that can benefit from GPUs and such needs unrestricted hardware access exactly because CPUs are too slow. For example, we can allocate pinned memory with the CUDA driver, and that uses a 64-bit integer (actually two), and there are no safe alternatives at all, none, zilch, nothing, take it or leave it. Whether you like it or not, there's nothing you or anyone at OpenJDK can do about it. (Well, we could add support for CUDA internally to the JDK, but you're not going to do that, because this is the "10% case", right?) In those cases, adding safety means paying a heavy price in performance, and that's completely and totally unacceptable, and I would argue that there are a lot of such applications becoming important even for enterprises and not just for scientists anymore. If OpenJDK is unwilling to accommodate those new users, someone else will, and you might not like the result, for example, in the case of mobile applications, something like the platform with a name that starts with an A.
>> I understand all the use cases you are talking about. And I understand the need for clients to access functionality that is only available in native code. You, on the other hand, need to understand that a random client of a library that is using (maybe by accident) something that offloads to CUDA, and then maybe crashes, might want to be informed about the fact that the jar or module listed as a dependency has some less transparent sides to it. It's a delicate balance, with no rights and wrongs. So I'm not saying developers won't be able to access all the features they want - all I'm saying is that some of these features might be *gated* (mechanism TBD, as said before, probably something similar to --enable-preview, roughly speaking).
>> Maurizio
>>>
>>> Samuel
>>>
>>> On 11/8/19 1:06 AM, Maurizio Cimadamore wrote:
>>>>
>>>>>
>>>>> Going by the Properties class, it doesn't look like all the JDK developers got the memo on that.
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Regarding the topic of safety, it feels like important context is being forgotten... or maybe I'm missing something. C/C++ is primarily used for sensitive, low level, and hardware level programming. Any bad or incorrect API usage (integer overflows, for example) could cause the entire system, including the JVM, to explode and crash. The "safe" Pointer API, even with all of its safety, isn't going to prevent system crashes from C/C++ code.
>>>>>>
>>>>>>> Heck, even within Java itself without Panama, it's possible to cause system chaos because *all roads eventually lead to Rome*. Anything that affects the system can affect the JVM and code executed in the JVM can negatively affect the system.
>>>>>>
>>>>>> If you have examples of something that doesn't do any JNI whatsoever, and yet leads to a VM crash we'd like to know, do you have any examples in mind? (we typically treat these cases as VM or JDK bugs).
>>>>>
>>>>>
>>>>> I never said crash specifically, only that it can cause chaos. The only thing I'm aware of off the top of my head is attempting to create an insane Integer.MAX_VALUE number of threads and silently bugging the entire system. Never heard of anyone doing it but I know it's possible.
>>>>
>>>>
>>>> I honestly don't know what to make of these comments. Again, if you are aware of _specific_ cases where using a 100% Java API results in hard VM crashes (not an exception, something like a segfault) please point to these specific examples.
>>>>
>>>> Avoiding _crashes_ is what I'm concerned about with this API.
>>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Which is why I was confused by both the presentation and now (regarding your newest email) in regards to Panama preventing crashing the VM directly or by extension. How is such a thing even conceivably possible any more than preventing an ordinary bug from within pure Java code and crashing the system that way? Ignoring the context of Panama, literally any buggy code from any library at any point could cause huge failure.
>>>>>> The memory access API (leave aside the Pointer API for now) is an API for accessing off-heap memory. I think both you and Samuel are making the inference that off-heap means "native". But that doesn't have to be the case. There are plenty of use cases in the wild where developers just want to be able to deserialize an object graph off-heap, map it onto a file, etc. These use cases have nothing inherently unsafe to them. And there's a lot of them, only you don't see them because right now they are catered for by the ByteBuffer API (which does a so-so job at that), or some ad-hoc APIs such as Netty's ByteBuf or Chronicle Bytes.
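As an illustration of the "off-heap but not native" use case described above, here is a minimal sketch that uses only the long-standing ByteBuffer API (not the Panama memory access API, whose shape is not shown in this thread). The class and method names (OffHeapSketch, roundTrip, outOfBoundsIsCaught) are hypothetical. It maps a temporary file, writes and reads a long, and shows that an out-of-bounds access surfaces as an ordinary exception rather than a VM crash.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OffHeapSketch {

    // Writes a long into a memory-mapped temporary file and reads it back.
    static long roundTrip(long value) {
        try {
            Path file = Files.createTempFile("offheap-sketch", ".bin");
            file.toFile().deleteOnExit(); // best-effort cleanup of the temp file
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // The mapping lives outside the Java heap; no JNI is written by the user.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
                buf.order(ByteOrder.LITTLE_ENDIAN);
                buf.putLong(0, value);
                return buf.getLong(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reading past the end of a direct buffer is bounds-checked by the API.
    static boolean outOfBoundsIsCaught() {
        ByteBuffer buf = ByteBuffer.allocateDirect(Long.BYTES);
        try {
            buf.getLong(4); // needs bytes 4..11, but the buffer only has 0..7
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("round trip: " + roundTrip(42L));
        System.out.println("out-of-bounds caught: " + outOfBoundsIsCaught());
    }
}
```

Nothing here requires a special flag, which is the distinction drawn in the message: plain off-heap access stays checked, and only trusting arbitrary addresses needs an opt-in.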
>>>>>>>
>>>>>>>
>>>>>>> Not saying "safety" doesn't matter, from a high level use case it absolutely does and is important, just that hiding API functionality that people want/need for this specific reason doesn't make a whole lot of sense. Panama can't conceivably be a one-API army that somehow either fixes everyone's buggy code, prevents them from writing it to begin with, and/or smooths over buggy code. Nothing really can.
>>>>>>
>>>>>> (I'll skip over the "functionality that people want/need" - these comments are baseless, unless you claim to have some telepathic connection with all of the varied Java ecosystem - I sometimes wish I had that!)
>>>>>
>>>>>
>>>>> Going by Samuel's remark on the API not being finished it seems so.
>>>>
>>>>
>>>> Again, as I asked Samuel, I ask you the same - why?
>>>>
>>>> Because the API doesn't let you do what you want to do, the _way_ you want to do it (e.g. w/o making you even think about the fact that you have just inserted something unsafe in your program) is your definition of 'completeness' ? I'm afraid here we disagree wildly, and I'm sure there will be others who might well see (again as I've already said) this argument in a completely different perspective than yours (e.g. if I can shoot myself in the foot, then it's not finished).
>>>>
>>>> There's really nothing new to see here - as I said, it is simply a fact of life of being part of the Java SE API; 'normal' APIs can't cause hard VM crashes. That's why we have Unsafe, and that's why we're trying to limit and reduce the use of Unsafe over time, or provide safe variants (where possible).
>>>>
>>>> The fact that JNI code can happily work w/o any restriction (well, in reality if you work with a security manager, JNI is essentially disabled by default since you cannot load any libraries with the default profile) is more a _bug_ than a _feature_. So suggesting that we should repeat whatever choice was made for JNI for the sake of arguable short-term gains (such as avoiding adding a command line flag) is questionable.
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> So, are you saying that adding an extra compiler flag when compiling your program that requires privileged operation is 'too much' ?
>>>>>
>>>>>
>>>>> I'm not entirely sure what point is trying to be made. If it's something that can be done without then it's best to avoid it, of course. If it isn't... well, necessary evil.
>>>>
>>>> Again, the operation we're really talking about here is one: being able to 'trust' random pointers, so that they can be dereferenced as ordinary pointers coming from the memory access API itself.
>>>>
>>>> All the code that doesn't do native interop (and there's a lot) _couldn't care less_ about this use case. Then there's the code that wants to do native interop, call a function that returns a pointer to a struct, and wants to be able to dereference it. This can _still_ (as I've shown in the original email) be done. But such a functionality is put on a different package, and to get access to that you need to opt-in.
>>>>
>>>> I wrote real code with the API, I've ported 2-3 real world libraries with it, and, honestly, adding the extra flag to javac/VM was something silly (in fact the IDE suggested it for me).
>>>>
>>>>
>>>>>
>>>>>
>>>>>> Honestly this discussion seems to be going in circles.
>>>>>
>>>>>
>>>>> Your two points about the recurring theme were in themselves a circle. Hiding functionality that someone might want is a burden on that person. Ideas were thrown around that may or may not, given context, make sense but at least it's *something*.
>>>> Well, the combination of "it's incomplete because it doesn't 100% do what I want the way I want it" and "I don't accept any extra burden" (without having even tried what we're talking about) is what gives me the impression that this discussion is not being very productive.
>>>>>
>>>>>
>>>>>> The memory access API has been out there for almost 6 months now - how much code have you written with it (same for Samuel) ?
>>>>>
>>>>>
>>>>> I had an idea of what I wanted to do with it but given that I need both Pointer and memoryaccess APIs I kind of can't nor am I entirely sure that it provides what I need. Is it possible to get an exported function like getProcAddress does and refer to that address as a function? Is there a source branch that contains both?
>>>>
>>>> The foreign-abi branch contains both the ABI and the memory access part. There are some tests which show how it can be used to call some simple standard libraries:
>>>>
>>>> http://hg.openjdk.java.net/panama/dev/file/19d9362b1cec/test/jdk/java/foreign/StdLibTest.java
>>>>
>>>>>
>>>>>
>>>>>> Are you sure that what you now describe as a major problem isn't, in reality, something that will only be required by people requiring low level access to write frameworks like JavaCPP itself and JNR?
>>>>>
>>>>>
>>>>> Memory access is in itself a low level API, so yes? I get the point being made but it's still a use case...
>>>>>
>>>>>
>>>>>> If so, are you sure that _everybody_ will care, as you claim they will? Or are you mistaking the 10% case with the 90% case?
>>>>>
>>>>>
>>>>> Never said such a thing. And why divide things up like that when you could cover 100% of the use cases? Panama has an opportunity to provide functionality that Java hasn't ever been able to offer in any easily accessible fashion. Why not offer more if possible for those that need it?
>>>>
>>>> I have explained to death why what you wish for is not possible: the Java SE API takes safety (and safety means essentially inability to put the VM in a bad state) very seriously. This is exactly the reason why the ByteBuffer API made the choices it had to make - e.g. avoiding an explicit 'close' method (which many people are asking for as you do now, albeit for a different feature).
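To make the ByteBuffer design choice mentioned above concrete, here is a small sketch (class and method names are hypothetical). It checks by reflection that ByteBuffer exposes no public close method, and shows why that matters: a view created with duplicate() keeps working after the original reference goes out of scope, because the backing memory is released only once no buffer referring to it is reachable. With an explicit close, such a view could be left pointing at freed memory.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class NoCloseSketch {

    // True if the type (or a supertype) declares a public method with this name.
    static boolean hasPublicMethod(Class<?> type, String name) {
        return Arrays.stream(type.getMethods())
                .anyMatch(m -> m.getName().equals(name));
    }

    // Reads through a duplicate after the original reference is out of scope.
    static int readThroughView() {
        ByteBuffer view;
        {
            ByteBuffer owner = ByteBuffer.allocateDirect(16); // off-heap allocation
            owner.putInt(0, 7);
            view = owner.duplicate(); // shares the same backing memory
        }
        // 'owner' is no longer in scope, but the memory stays valid
        // for as long as 'view' is reachable.
        return view.getInt(0);
    }

    public static void main(String[] args) {
        System.out.println("ByteBuffer has close(): "
                + hasPublicMethod(ByteBuffer.class, "close"));
        System.out.println("value read through view: " + readThroughView());
    }
}
```

The cost of this choice is the one users complain about: there is no deterministic way to release the memory through the public API.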
>>>>
>>>> We still aim to support 100% of use cases (otherwise we wouldn't have put the method in, would we?) - we just think that users of certain methods need to be a little bit more aware of what the consequences could be, not just for themselves - but for all the clients they might acquire over time.
>>>>
>>>> As I said previously, a real world Java system is typically composed of many, many jars on the classpath (when I say a hundred I'm not joking). If any of those jars could start misbehaving and bring the world crashing down (as in a real crash - not just "chaos" as you say, which can be prevented), this would not be a very desirable thing to have. I appreciate that you come from a different perspective; this is the perspective where we come from, for better or worse.
>>>>
>>>> Maurizio
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Suggestion: write a lot of code using the memory access API, report back your experience, and the chances that your feedback will be taken more seriously will increase significantly (cross my heart!).
>>>>>>
>>>>>> Maurizio
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> I don't think the Pointer API has anything to create a Pointer from a long (unless you refer to the trick of writing a long in memory, and then reading it back as a Pointer).
>>>>>>>>
>>>>>>>> Maurizio
>>>>>>>>
>>>
>