JDBC improvements and convenience method suggestions

Thu Nov 7 17:24:01 UTC 2019

I pulled up Lukas's original email so I could write brief responses to each of the ideas. See inline below.

In general I believe that most database users don't actually write JDBC code directly, instead working through some framework. Adding convenience methods would not have a big impact on frameworks or their users. It's not obvious to me that they are all that valuable to the overall user community. On the other hand if added as a default methods, they don't cost anything.

> On Jul 12, 2018, at 8:43 AM, Lukas Eder <lukas.eder at gmail.com> wrote:
> 
> 
> 1. Add java.sql.Freeable and let Array, Blob, Clob, and SQLXML extend it
> https://github.com/lukaseder/openjdk/commit/acc17baa91a05525951f27e227731ac1206ee9e0
> 
Oracle defined open and close on LOBs long before Java existed. Oracle Database JDBC defined LOB.open() and LOB.close() before the JDBC spec included LOBs. It would have been nice if our crystal ball had been working that day and we had chosen different method names, but we didn't. When the need arose for some kind of "close" on the standard LOBs it would have been nice to use the name "close". That would have caused a backward compatibility issue for Oracle as the meaning of Oracle's LOB.close() is not the same as the standard LOB.free(). I wish things had be different as much as anyone else.

I don't like the idea of adding more kruft on top of the existing kruft. Rather than add Freeable and try to make it pretend to be Closeable I'd prefer to just accept that LOB.free() is krufty and leave it at that.

> 2. Add PreparedStatement.set(Object...)
> https://github.com/lukaseder/openjdk/commit/2bf968dca742989309379c8d2550979baf07c0d5
> 
I have no objection to this convenience method.

It should require the number of values be exactly the same as the number of parameters, else throw SQLException. Reusing the tail parameters is such a special case that it's not worth the unintended errors that would be caught by requiring all parameters be set.

Adding nullValue() to SQLType would be a solution to the NULL problem. Another would be to just specify that passing a SQLType means setting a NULL value of that type. Agreed neither is totally obvious, but putting it in the JavaDoc for the set(Object ...) method should be enough.
> 
> 3. Add Connection.executeXYZ() methods
> https://github.com/lukaseder/openjdk/commit/e77db3e28b24d24e220a215c396c10bf8adf2d30
> 
I have no objection to this convenience method.
> 
> 
> 4. Add Result.get():Object[]
> https://github.com/lukaseder/openjdk/commit/03a63116b935977b229b8bee4d9b526f71923eee
> 

I have no objection to this convenience method.

I'd prefer a Row type. As someone pointed out ADBA had a Row type with just the two generic getObject methods. I'd prefer something like

	<T> T get(Function<Row, T> f) throws SQLException;

The Row argument to f is valid only for the duration of the call to f. This means that the Row doesn't have to copy all the column values. If the user wants to actually get a persistent Row they would write

	Row r = result.get(Row::clone);

This would make clear that the columns are being copied. If the user doesn't want a Row but instead something else they could pass some other factory method, eg

	Employee e = result.get(Employee::fromRow);

This would avoid copying unused columns.

If absolutely necessary we could add

	default Row get() throws SQLException {
		return this.get(Row::clone);
	}

Though I prefer to make the copy explicit in the user code.
> 
> 5. Let ResultSet extends Iterable<Object[]>
> https://github.com/lukaseder/openjdk/commit/7daefde1134099baa1923c3f120789069c5f6917
> 
This one I have a problem with. There may be something here, but the right thing is not completely obvious.

A ResultSet is not a Collection of any kind. It is a dynamic object. The row movement methods, in particular next(), change the state of the ResultSet in non-trivial ways. Adding methods that create some kind of Collection would be marginally ok, but I think they would violate the principle of least surprise. An obvious case would be 

	List<Row> toList() throws SQLException;

By definition this would copy the entire content of the ResultSet. Oracle JDBC works very hard at minimizing the amount of memory allocated to handle a ResultSet. The only place memory is allocated is to return the result of a getXXX call. The toList method would call getXXX on every column of every row. Users might find that surprising, or at least might find the sudden jump in footprint surprising. Our experience is that users frequently select many more rows and columns than they actually use.

So I would accept adding collection creation methods to ResultSet, but not making ResultSet itself any kind of collection including Iterable.

> 
> 6. Add SQLInput.getConnection() and SQLOutput.getConnection()
> https://github.com/lukaseder/openjdk/commit/39aeb8c0dbf86b1fe0a4df847bca22127047667c
> 
No opinion. The default implementation would be to throw so to function the vendors would have to implement it. It's not hard to implement.
> 
> 7. Add Connection.transaction() overloads for functional transaction usage
> https://github.com/lukaseder/openjdk/commit/12c9591ca9066e676001fd6e67e18b7d444d0a32
> 
No objection to the general idea. Same as other convenience methods.

I agree with the comments about the method name and use of save points, but I think there is a version of this idea that's just fine.

> 8. Add DataSource.connect() methods
> https://github.com/lukaseder/openjdk/commit/9a2eebea7869a3e9e81347a965bd9e65b41bc1bd
> 
I have no objection to this convenience method.

> On Nov 7, 2019, at 6:25 AM, Lance Andersen <lance.andersen at oracle.com> wrote:
> 
> Hi Lukas,
> 
> I will be looking at an MR  update shortly.  I do have your previous email and need to go back through it to refresh myself
> 
>> On Nov 7, 2019, at 8:51 AM, Lukas Eder <lukas.eder at gmail.com> wrote:
>> 
>> Dear group,
>> 
>> I'm trying my luck again with this, given that 1.5 years later:
>> 
>> 1. ADBA has been stopped (so, less effort in that area)
>> 2. Text blocks have shipped (so, more value in using JDBC directly)
>> 
>> I think that text blocks are a killer feature for direct JDBC usage, if
>> only JDBC had a bit more convenience as I had suggested in my original
>> mail. Is there any planned effort around such an improvement of JDBC in the
>> near future?
>> 
>> Thanks,
>> Lukas
>> 
>> On Thu, Jul 12, 2018 at 5:43 PM Lukas Eder <lukas.eder at gmail.com> wrote:
>> 
>>> Hello,
>>> 
>>> Some time has passed since I had suggested some JDBC improvements and
>>> Lance was so kind to offer looking into concrete suggestions that I might
>>> have:
>>> 
>>> http://mail.openjdk.java.net/pipermail/jdbc-spec-discuss/2017-November/000243.html
>>> 
>>> I have finally found some time to actually do that. I have forked an
>>> inofficial git clone of the OpenJDK repository and committed my suggestions
>>> directly there:
>>> 
>>> https://github.com/lukaseder/openjdk/tree/jdk/jdk/src/java.sql/share/classes
>>> 
>>> There are 8 suggestions in total:
>>> 
>>> 
>>> 1. Add java.sql.Freeable and let Array, Blob, Clob, and SQLXML extend it
>>> 
>>> https://github.com/lukaseder/openjdk/commit/acc17baa91a05525951f27e227731ac1206ee9e0
>>> 
>>> 
>>> One thing that has been bothering me a bit in the past is the fact that
>>> these 4 types Array, Blob, Clob, and SQLXML have received new free()
>>> methods in Java 6 / JDBC 4.0, which I believe should have been called
>>> close() instead, and then in Java 7, the types should have extended
>>> AutoCloseable.
>>> 
>>> However, it is not too late to retrofit this by adding a new Freeable API
>>> (this is optional, prevents repeating the fix) and letting Freeable
>>> delegate to AutoCloseable. On the user side, this would be possible:
>>> 
>>> try (Clob clob = connection.createClob()) {
>>>   ...
>>> }
>>> 
>>> I guess this doesn't need any further explanation.
>>> 
>>> 
>>> 2. Add PreparedStatement.set(Object...)
>>> 
>>> https://github.com/lukaseder/openjdk/commit/2bf968dca742989309379c8d2550979baf07c0d5
>>> 
>>> 
>>> One of the bigger pain points in JDBC is the fact that indexed parameters
>>> are 1 based and have to be incremented manually on a per bind parameter
>>> basis. It would be *much* more convenient in many many cases, if there was
>>> a method that allows for passing a vararg of arbitrary values, simply
>>> delegating to setObject(), i.e.
>>> 
>>> try (PreparedStatement s = connection.prepareStatement(
>>>   "INSERT INTO t VALUES (?, ?, ?)")) {
>>>   s.set(1, 2, 3);
>>>   s.executeUpdate();
>>> }
>>> 
>>> For most types, this works just fine and if someone needs a special data
>>> type that would require one of the other setObject() overloads, they can
>>> still revert to the status quo for that particular statement.
>>> 
>>> I think this would greatly help!
>>> 
>>> 
>>> 3. Add Connection.executeXYZ() methods
>>> 
>>> https://github.com/lukaseder/openjdk/commit/e77db3e28b24d24e220a215c396c10bf8adf2d30
>>> 
>>> 
>>> In many cases, there is not really a need for keeping around intermediary
>>> Statement or PreparedStatement references. In a recent non-representative
>>> poll I've done on twitter (182 answers), I've had about equal numbers of
>>> answers among people:
>>> 
>>> - caching and reusing prepared statements
>>> - creating new statements all the time (this includes people who don't use
>>> JDBC directly, most ORMs don't keep around open statements)
>>> 
>>> Poll here:
>>> https://twitter.com/lukaseder/status/974229231076077568
>>> 
>>> For those people who do not cache prepared statements, I think being able
>>> to execute SQL directly on the JDBC connection would be very helpful. E.g.
>>> 
>>> // Static statement
>>> connection.executeUpdate("INSERT INTO t VALUES (1, 2, 3)");
>>> 
>>> Instead of
>>> 
>>> try (Statement s = connection.createStatement()) {
>>>   s.executeUpdate("INSERT INTO t VALUES (1, 2, 3)");
>>> }
>>> 
>>> Or this
>>> 
>>> // PreparedStatement
>>> connection.executeUpdate("INSERT INTO t VALUES (?, ?, ?)", 1, 2, 3);
>>> 
>>> Instead of
>>> 
>>> try (Statement s = connection.prepareStatement(
>>>   "INSERT INTO t VALUES (?, ?, ?)")) {
>>> 
>>>   // New suggested method in section 2
>>>   s.set(1, 2, 3);
>>> 
>>>   // Today's approach
>>>   s.setInt(1, 1);
>>>   s.setInt(2, 2);
>>>   s.setInt(3, 3);
>>>   s.executeUpdate();
>>> }
>>> 
>>> I think the value in reduced boiler plate is very obvious. A hint in the
>>> Javadoc indicating that this does not cache the statements should suffice
>>> for those who are more performance aware / who can actually profit from
>>> caching the statements.
>>> 
>>> 
>>> 4. Add Result.get():Object[]
>>> 
>>> https://github.com/lukaseder/openjdk/commit/03a63116b935977b229b8bee4d9b526f71923eee
>>> 
>>> 
>>> 
>>> Similar to PreparedStatement.set(Object[]), being able to fetch an entire
>>> row from a ResultSet is very useful to tools that process result sets
>>> generically. The simplest possible type is Object[], but a new java.sql.Row
>>> type would be reasonable as well (ADBA is going to offer such a type).
>>> 
>>> I think this is self explanatory.
>>> 
>>> 
>>> 5. Let ResultSet extends Iterable<Object[]>
>>> 
>>> https://github.com/lukaseder/openjdk/commit/7daefde1134099baa1923c3f120789069c5f6917
>>> 
>>> 
>>> Once ResultSet exposes entire rows as types, that type could also be
>>> offered through ResultSet extending Iterable<ThatType>. In my example, it's
>>> Iterable<Object[]>, but a new java.sql.Row type would be even better. I'm
>>> suggesting Iterable here instead of Iterator for 2 reasons:
>>> 
>>> - It's the more appropriate type for rewindable / scroll sensitive result
>>> sets. For instance, every time ResultSet.iterator() is called, this could
>>> go back to ResultSet.beforeFirst(). To be defined.
>>> - ResultSet.next() already returns boolean, so ResultSet cannot extend
>>> Iterator, whose next() method returns T
>>> 
>>> The fact that JDBC's ResultSet is so disconnected from the java.util
>>> collection types is a big issue for many JDBC users. Making it Iterable<T>
>>> or even List<T> (because technically, it *is* a list) would greatly help
>>> working with JDBC. I'm not suggesting List<T> because that would imply that:
>>> 
>>> - Wasteful operations could be done accidentally, e.g. jumping around the
>>> list randomly
>>> - JDBC drivers would have to buffer the results
>>> 
>>> But Iterable is definitely useful. Imagine, with the previous improvements
>>> to do this:
>>> 
>>> try (ResultSet rs = connection.executeQuery("SELECT * FROM t")) {
>>>   for (Object[] row : rs) {
>>>     System.out.println(row[0] + ":" + row[1]);
>>>   }
>>> }
>>> 
>>> Makes me wish for a for-with-resources as well that iterates over
>>> (Iterable<T> & AutoCloseable) and closes the object after (un)successful
>>> iteration :-)
>>> 
>>> 
>>> 6. Add SQLInput.getConnection() and SQLOutput.getConnection()
>>> 
>>> https://github.com/lukaseder/openjdk/commit/39aeb8c0dbf86b1fe0a4df847bca22127047667c
>>> 
>>> 
>>> This is something I've been missing every time when I worked with Oracle
>>> OBJECT types. Binding them through JDBC directly is non trivial, especially
>>> when nesting OBJECT and TABLE types. Imagine writing an OBJECT type to
>>> SQLOutput and having to create java.sql.Array types from within SQLData.
>>> There is no way to access the JDBC Connection which is needed to create
>>> arrays either through standard JDBC
>>> 
>>> - Connection.createArray(...)
>>> 
>>> Or through ojdbc
>>> 
>>> - OracleConnection.createARRAY(...)
>>> 
>>> The workaround is nasty: Store the JDBC Connection in some ThreadLocal or
>>> other utility that is statically accessible from a SQLData instance, and
>>> access it from there. It would be much better if SQLInput and SQLOutput
>>> would offer these two methods.
>>> 
>>> 
>>> 7. Add Connection.transaction() overloads for functional transaction usage
>>> 
>>> https://github.com/lukaseder/openjdk/commit/12c9591ca9066e676001fd6e67e18b7d444d0a32
>>> 
>>> 
>>> This is something that people have come to appreciate a lot in jOOQ: A
>>> functional transaction API that builds on top of the existing procedural
>>> one. I think other libraries have similar utilities.
>>> 
>>> The user code could look like this:
>>> 
>>> connection.transaction(() -> {
>>>   connection.executeUpdate("INSERT INTO .."); // Using suggested methods
>>>   connection.executeUpdate("INSERT INTO ..");
>>> });
>>> 
>>> How does it work?
>>> 
>>> - If the above lambda completes normally: Commit
>>> - If the above lambda throws an exception: Roll back
>>> 
>>> This can go a step further: Nested transactions with implicit breakpoints
>>> 
>>> // Implicit transaction start here
>>> connection.transaction(() -> {
>>>   connection.executeUpdate("INSERT INTO .."); // Using suggested methods
>>> 
>>>   try {
>>> 
>>>     // Implicit savepoint here
>>>     connection.transaction(() -> {
>>>       connection.executeUpdate("INSERT INTO ..");
>>>     });
>>>   }
>>> 
>>>   // In case of exception implicit rollback to savepoint
>>>   catch (SQLException ignore) {}
>>> });
>>> 
>>> // In case of normal completion, the first insert will be committed,
>>> // the second one might have been rolled back
>>> 
>>> My implementation suggestion on GitHub includes:
>>> 
>>> - 4 new functional interfaces that allow for passing different types of
>>> lambdas to the transaction() method:
>>> o () -> {} // void compatible
>>> o (Connection c) -> {}
>>> o () -> someResult
>>> o (Connection c) -> someResult
>>> - A draft implementation that uses a hack to store the savepoints in some
>>> connection state. A more thorough solution would be needed of course, but I
>>> think it nicely illustrates the use-case.
>>> 
>>> 
>>> 8. Add DataSource.connect() methods
>>> 
>>> https://github.com/lukaseder/openjdk/commit/9a2eebea7869a3e9e81347a965bd9e65b41bc1bd
>>> 
>>> 
>>> Another place where lambdas could be useful is the DataSource type. When
>>> obtaining a DataSource from somewhere, the procedural approach of getting a
>>> connection and closing it feels a bit weird, even years after using it. I
>>> think this would be more intuitive:
>>> 
>>> datasource.connect(c -> {
>>>   c.executeUpdate("INSERT INTO t VALUES (?, ?, ?)", 1, 2, 3);
>>>   c.executeUpdate("INSERT INTO u VALUES (?, ?)", "a", "b");
>>> });
>>> 
>>> This is much shorter than today's:
>>> 
>>> try (Connection c = datasource.getConnection();
>>>     PreparedStatement s1 = c.prepareStatement(
>>>     "INSERT INTO t VALUES (?, ?, ?)");
>>>     PreparedStatement s2 = c.prepareStatement(
>>>     "INSERT INTO u VALUES (?, ?)")) {
>>>   s1.setInt(1, 1);
>>>   s1.setInt(2, 2);
>>>   s1.setInt(3, 3);
>>>   s1.executeUpdate();
>>>   s2.setString(1, "a");
>>>   s2.setString(2, "b");
>>>   s1.executeUpdate();
>>> }
>>> 
>>> I think that's an improvement, isn't it?
>>> 
>>> 
>>> This is all I can think of right now. Looking forward to your feedback,
>>> Lukas
>>> 
> 
> <http://oracle.com/us/design/oracle-email-sig-198324.gif>
> <http://oracle.com/us/design/oracle-email-sig-198324.gif> <http://oracle.com/us/design/oracle-email-sig-198324.gif>
> <http://oracle.com/us/design/oracle-email-sig-198324.gif>Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
> Oracle Java Engineering 
> 1 Network Drive 
> Burlington, MA 01803
> Lance.Andersen at oracle.com <mailto:Lance.Andersen at oracle.com>
> 
> 
>