java.sql2 DynamicMultiOperation with interlaced exceptions

Douglas Surber douglas.surber at oracle.com
Fri Oct 13 15:55:40 UTC 2017


Some bind values could reasonably be streamed. It makes no sense to stream a set of bind parameters: there is a fixed number of parameters, and the large majority of them have small values. Supporting streams for those value types where it makes sense is a trivial extension to the API. 
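To illustrate the distinction (a sketch with hypothetical names, not the proposed java.sql2 API): the set of parameters is fixed and map-like, while an individual large value such as a LOB could accept a Flow.Publisher as a trivial overload.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Flow;

// Hypothetical sketch: binds are a fixed-size map keyed by parameter
// marker; only a single large *value* is ever streamed.
public class ParamSketch {
    private final Map<String, Object> binds = new LinkedHashMap<>();

    // ordinary scalar bind -- one entry per parameter marker
    public ParamSketch set(String id, Object value) {
        binds.put(id, value);
        return this;
    }

    // streamed bind: the parameter count is still fixed; only this
    // one value arrives incrementally
    public ParamSketch set(String id, Flow.Publisher<byte[]> chunks) {
        binds.put(id, chunks);
        return this;
    }

    public int parameterCount() { return binds.size(); }

    public static void main(String[] args) {
        ParamSketch op = new ParamSketch()
            .set("id", 42)
            .set("name", "Lukas");
        System.out.println(op.parameterCount()); // 2
    }
}
```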

Streaming batch statements makes no sense. To the best of my knowledge all vendors need to know the full scope of the batch before executing it. That is, the entire batch must be queued for internal reasons, so there is no benefit to using reactive streams.

Oracle Database does not support batching multiple statements. I don’t know about other vendors. Assuming they don’t either, there is no benefit to streaming statements into a batch. But in either case the operationProcessor mechanism I proposed in another thread, or a simple extension of it, would handle this.

The point is, just because something could be published to a stream doesn’t mean it should. The Reactive Streams spec (thanks Rossen) says

> The main goal of Reactive Streams is to govern the exchange of stream data across an asynchronous boundary – think passing elements on to another thread or thread-pool — while ensuring that the receiving side is not forced to buffer arbitrary amounts of data. In other words, backpressure is an integral part of this model in order to allow the queues which mediate between threads to be bounded. 

So shoehorning something into the reactive stream model just because you can works against the explicit goals of the project. Per the quote above, “backpressure is an integral part of this model”. Using a reactive stream where backpressure is of no value is bad design.
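The backpressure the quoted passage describes looks like this in Java 9's java.util.concurrent.Flow (an illustrative sketch of the pattern, not part of any proposed java.sql2 API): the subscriber pulls one row at a time, so the producer can never force it to buffer arbitrary amounts of data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// Illustrative sketch: a subscriber that requests rows one at a time.
public class RowBackpressure {
    public static List<String> consume(List<String> rows) throws Exception {
        List<String> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        try (SubmissionPublisher<String> pub = new SubmissionPublisher<>()) {
            pub.subscribe(new Flow.Subscriber<String>() {
                private Flow.Subscription sub;
                public void onSubscribe(Flow.Subscription s) {
                    sub = s;
                    s.request(1);        // ask for a single row only
                }
                public void onNext(String row) {
                    received.add(row);
                    sub.request(1);      // pull the next row when ready
                }
                public void onError(Throwable t) { done.countDown(); }
                public void onComplete()         { done.countDown(); }
            });
            rows.forEach(pub::submit);
        }                                // close() signals onComplete
        done.await();
        return received;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(consume(List.of("r1", "r2", "r3"))); // [r1, r2, r3]
    }
}
```

The machinery above is exactly what is wasted when the element count is small and fixed, which is Douglas's point about bind parameters.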

Douglas


> On Oct 13, 2017, at 12:33 AM, Lukas Eder <lukas.eder at gmail.com> wrote:
> 
> Douglas,
> 
> 2017-10-13 0:34 GMT+02:00 Douglas Surber <douglas.surber at oracle.com>:
> Lukas,
> 
> Thanks for your comments. My reaction however is that this is an example of trying to shoehorn the database access problem into a reactive stream solution. For example bind variables are absolutely not a stream of any kind. They are best considered a map.
> 
> What about:
> 
> - Lobs, arrays
> - Table valued parameters
> - Cursor binds to stored procedures (e.g. in Oracle: SYS_REFCURSOR IN parameters)
> - XML or JSON document streams
> - Batch statements with bind variable batches
> - Statement batches (not batch statements) where a batch of statements, each with individual bind values, is sent to the server (although it would be fair to say that bind variables wouldn't be supported in this case)
> 
> I could definitely see those as being "published" to a stream. Drivers / servers could decide on their own if they want to run a statement with all the binds complete, or separate the sending of binds from the statement execution. Specifically when loading large amounts of data into the database using a batched INSERT statement, this could be valuable.
> 
> I remember the numerous times in previous jobs where I had to jump through hoops to send PDF lobs into the database from some client, making sure the client doesn't completely shut down because it is blocking on the database (through all the dozens of layers in between). That stuff is really hard to get right, and I was really hoping the new API would solve this as well, reactive or "simply" asynchronous...
> 
> Whether managing backpressure and asynchronicity of data sent to the server is valuable *enough* is one question. But I don't think you can say that bind variables are *absolutely not* a stream of any kind. Large LOBs were sent to servers through a synchronous java.io.InputStream
>  
> However, rather than require construction of a Map, the API allows passing each key/value pair separately. This is acceptable as the number of key/value pairs, the number of parameters, is part of the code (at least for the use cases we are targeting). Representing bind values as a stream is inappropriate. Sure, lots of these concepts can be represented as publishing and subscribing, but that doesn’t make it the best idea.
> 
> Sure, perhaps you're right. I don't know the concrete use-cases you're targeting. What I'm suggesting, though, is that a driver spec could cover all use-cases:
> 
> - Fully reactive
> - "simply" asynchronous
> 
> The lowest API level would be fully reactive. On top of that, there could be a convenience API that delegates to the reactive API, simplifying the API interaction for the user through CompletionStage / CompletableFuture.
> 
> The two places where back pressure is needed are controlling the rate at which the client submits Operations and controlling the rate at which the implementation produces rows. In a large majority of cases neither of these matter.
> 
> Yes of course. The "classic" JDBC API will be sufficient for most applications anyway.
>  
> Only in a very, very few cases will a client produce enough Operations fast enough that controlling the rate is of benefit. I would be perfectly happy if the API did not include any support for back pressure for Operation submission. A very few of the target use cases might have a query or two that produce enough rows that back pressure would be valuable. But even in those cases the overwhelming majority of queries would produce such a limited number of rows that there would be no benefit to back pressure. So only a small fraction of the queries in the target use cases would benefit from back pressure.
> 
> Multi-result SQL could be represented as a stream of results. This is a SQL use case that I have no experience with. I would guess that the number of results is generally small. While any one result might produce a lot of rows (the second case in the previous paragraph) the number of results is probably small and the benefit of back pressure on the results (not the rows) is minimal.
> 
> You're making a lot of assumptions here :) I just hope you're right, and I'm a bit afraid you might not be, in case of which the API you're designing and which will be in the JDK for the next 20 years might be a lot less useful than it could be. The discussions on this list have only just started after years of quasi silence on the topic. I'm sure there are going to be many opinions that have not been covered yet.
> 
> I'd love to learn more about your use-cases and their sources to better understand the context of this work, though. Surely, I'm missing something that is already obvious to you.
> 
> I’m no fan of CompletableFuture. In all honesty I dislike it quite a bit for reasons that would distract from this discussion. The Class Library Team strongly encouraged us to use CompletableFuture. I recall there was a reason not to use CompletionStage but I don’t remember what it was. I’ll review that decision. On the face of it I’d prefer CompletionStage for the reasons you give and others.
> 
> Interesting, thanks for sharing. Would be really interesting to know that reason not to use CompletionStage.
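The layering Lukas proposes above — a reactive primitive with a CompletionStage convenience layer — could be sketched as follows (hypothetical names of my own; java.util.concurrent.Flow stands in for whatever publisher type the spec adopts). It also shows one concrete reason, not necessarily the Class Library Team's, to prefer CompletionStage in the signature: the read-only interface keeps callers from completing or obtruding the result themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// Hypothetical sketch: fold a reactive row stream into a CompletionStage
// for callers who do not need backpressure.
public class Convenience {
    public static <T> CompletionStage<List<T>> toStage(Flow.Publisher<T> rows) {
        CompletableFuture<List<T>> cf = new CompletableFuture<>();
        rows.subscribe(new Flow.Subscriber<T>() {
            private final List<T> all = new ArrayList<>();
            public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
            public void onNext(T item)       { all.add(item); }
            public void onError(Throwable t) { cf.completeExceptionally(t); }
            public void onComplete()         { cf.complete(all); }
        });
        return cf.minimalCompletionStage(); // read-only view of the future
    }

    public static List<Integer> demo() throws Exception {
        SubmissionPublisher<Integer> pub = new SubmissionPublisher<>();
        CompletionStage<List<Integer>> stage = toStage(pub);
        pub.submit(1);
        pub.submit(2);
        pub.close();                        // completes the stage
        return stage.toCompletableFuture().get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // [1, 2]
    }
}
```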



More information about the jdbc-spec-discuss mailing list