[foreign-memaccess/abi] where do unsafe operations live

Tue Oct 22 12:45:04 UTC 2019

On 22/10/2019 14:14, Maurizio Cimadamore wrote:
>
> On 22/10/2019 12:49, Jorn Vernee wrote:
>> This sounds like a decent idea, but I'm wondering if having the class 
>> be named ForeignUnsafe isn't already enough of a deterrent? The 
>> barrier of having a single `requires` in a module-info file doesn't 
>> seem much higher? I think if we just take away the need for 
>> --add-exports, then the barrier is already pretty low, and we need to 
>> make sure users understand if they are interacting with a safe or 
>> unsafe API. Maybe we could do something with an @UnsafeAPI annotation 
>> here (which IDEs/javac could warn about)? Or some javadoc note like 
>> "This api is _unsafe_. It's use could result in putting the VM in an 
>> invalid state, leading possibly to VM crashes".
>
> I agree, but I think that moving the unsafe stuff in its own module is 
> a precondition to what you describe. Note that a module will have its 
> own javadoc, where we can document that users can depend on this at 
> their own risk (and if the module is marked with some annotation, we 
> could issue a mandatory warning when importing it).
>
> While ForeignUnsafe might be a scary name as it is, I believe that 
> users should have an option to decide on which module they want to 
> depend on. There has to be a way to say "give me only the safe stuff, 
> please" and if I accidentally refer to the unsafe stuff, a compiler 
> error should be issued. I think that's valuable to have.
Wanting to get a warning is a fair use case. But, maybe a module for 4 
methods (currently) is a bit overkill... I think to get a warning an 
annotation would have the same effect? You'd get a warning unless you 
suppress it. Plus you'd be able to put the methods "where they belong", 
e.g. ofNull seems more at home on MemoryAddress itself. But, having such 
an annotation might need a JEP of it's own?
>
>>
>> Also, it seems a given that calling native code is unsafe. After all, 
>> anything can happen in native code. So, would all of SystemABI be in 
>> the unsafe module as well? 
> True - but calling JNI code can be considered equally unsafe.
>> I think it would be more descriptive to have a separate 
>> jdk.incubator.foreign.abi module in that case, which is documented to 
>> contain unsafe APIs. Basically, I'm wondering if there exists a more 
>> useful/descriptive classification than just "safe" <-> "unsafe" for 
>> that particular case.
>
> I think there are two things happening here, for which the term unsafe 
> might be used:
>
> - calling native code can result in pretty much anything. That is 
> already the case with JNI and I don't think SystemABI is any worse 
> than that. Since calling JNI methods doesn't require any particular 
> dance, I'd be concerned if, to create a native method handle, we 
> forced users through some unsafe conundrum (also note, if a client  
> really want to protect against this, a security manager could guard 
> library lookups and SystemABI calls)
>
> - creating a pointer out of thin air _without_ even invoking a native 
> method is something fundamentally new. And it is also unsafe, since 
> you can just use the memory access API to dereference it and crash 
> bang the VM.
>
> So I don't think the two cases are created equal - at least not in the 
> measure of how Java has always dealt with JNI.
For _calling_ a `native` method, you don't need to do anything extra, 
but the same applies with SystemABI, where you can just call the 
returned MethodHandle.

JNI requires a whole lot more ceremony to get going. You actually have 
to go write and compile a C/C++ glue-code library. You have to leave the 
safety of Java to get it working! For SystemABI we don't have to leave 
Java, so maybe we should have some warning for users that start linking 
native methods (e.g. it is possible to get the descriptors wrong. Since 
that metadata is not typically available in a DLL, we can not check if 
what the user passes is correct).

I just think it's weird to have a line drawn between the safe | unsafe, 
and have SystemABI seemingly be on the "safe" side of that line. That's 
why I think that safe| unsafe might not be the best classification for 
grouping APIs together. It seems that APIs should be grouped into 
modules/packages based on their functionality. And the safe/unsafe bits 
should be marked through some other means.

Jorn

>
>>
>> Jorn
>>
>> On 18/10/2019 19:44, Maurizio Cimadamore wrote:
>>> Hi,
>>> As I was using foreign-memaccess and foreign-abi in anger, it 
>>> occurred to me that the current layout of the incubating module is a 
>>> bit limiting in case the user wants to perform some unsafe operations.
>>>
>>> Right now, unsafe operations are defined in the class 
>>> 'ForeignUnsafe' which lives in the (non-exported) 
>>> jdk.incubating.foreign.unsafe package.
>>>
>>> Since the package is not exported, users that want to access unsafe 
>>> stuff need to add a manual --add-exports option.
>>>
>>> This is fine, and, in fact, it was intended to have something that 
>>> required an explicit opt-in, given that these are unsafe operations.
>>>
>>> That said, let's look at the situation from the perspective of a 
>>> library writer: if I write a Java bindings, it is very likely that I 
>>> might want to create MemoryAddress from longs, and the likes. But if 
>>> I do that, I need to break into the jdk.incubating.foreign module, 
>>> and _require that all my clients do the same_!
>>>
>>> I think this is excessively punitive. It's a good idea to have a 
>>> sharp separation between "safe" and "unsafe" parts of the API - and 
>>> it is also good if the "unsafe" parts of the API live in modules 
>>> that are non-resolvable by default (e.g. you need to "require" them 
>>> explicitly, or they won't be available).
>>>
>>> But if somebody is packaging a library in a module, it is kind of 
>>> sad that there's no way for the library writer to depend on the 
>>> 'unsafe' module, if he so wishes (and free the library clients from 
>>> having to make this explicitly).
>>>
>>> All things considered, I think that a better structure would be to 
>>> have _two_ modules:
>>>
>>> * jdk.incubating.foreign - usual, safe portion of the API
>>> * jdk.incubating.foreign.unsafe - unsafe portion of the foreign API 
>>> (requires jdk.incubating.foreign)
>>>
>>> This way, a library developer can just say "requires 
>>> jdk.incubating.foreign.unsafe" and be done. The library clients will 
>>> have to do nothing, assuming their module path is set up correctly.
>>>
>>> Thoughts?
>>>
>>> Maurizio
>>>