ADBA feedback

Douglas Surber douglas.surber at oracle.com
Thu Apr 26 20:44:56 UTC 2018


Lukas,

Thanks ever so much for your feedback. It’s greatly appreciated.

Comments in line below.

Douglas

> On Apr 26, 2018, at 8:06 AM, Lukas Eder <lukas.eder at gmail.com> wrote:
> 
> Hello,
> 
> I finally got around to playing with ADBA and AoJ and I'd like to provide
> some feedback. I will play around with it a bit more in the near future,
> perhaps waiting for new iterations. At this point, I'd like to thank you
> very much (again) for your open development approach. I find this way of
> looking into your work and being able to provide feedback (and your
> openness to said feedback) very constructive!
> 
> AOJ:
> =========================
> 
> Setup:
> -------------------------
> 
> I found it a bit difficult to set up - it took me around 1/2h to fix all
> the dependency issues / project setup, both with Eclipse and IntelliJ.
> Haven't tried NetBeans. I got it working, but unfortunately, when I try a
> future version of AOJ / ADBA, I'm afraid I'll have to go through that setup
> hassle again, which isn't too motivating :-)
> 
> Is there any chance to provide an out-of-the-box working AOJ project that
> is built with e.g. Maven and that doesn't need any tweaking but should work
> out of the box with Maven and popular IDEs? I think this would help people
> provide even more feedback.

Two things constrain what we can do with AoJ. First, we have very limited resources to expend on AoJ; to an extent, we have already spent all the time on it we can. Second, we can’t do much more without involving Oracle lawyers, which would bring progress to a halt.

We believe the best way to address both of these is for the community to fork AoJ. All of the problems you have identified, things that we can’t address for one reason or another, can be fixed in a community-supported fork. We will absolutely cooperate with a community project, though we can’t promise to contribute.

I’m not at all surprised by the bugs you list below, and I’m quite sure there are more.
> 
> Bugs:
> -------------------------
> 
> I have also provided PRs on GitHub, but haven't had time to read through
> the OCA yet, so I'll just document the bugs here as well, in case you want
> to reject the PRs for legal reasons.
> 
> 1) https://github.com/oracle/oracle-db-examples/pull/37
> 
> The RowOperation.getIdentifiers() method is implemented incorrectly with an
> off-by-one error. JDBC is 1-based, the implementation here is 0-based.
> 
> 2) https://github.com/oracle/oracle-db-examples/pull/36
> 
> There are quite a few compilation errors (see also setup problems above)
> due to the fact that the linked ADBA snapshot version seems to be out of
> date or too far advanced for the AoJ code dump. The PR illustrates just one
> example, but there are more.
> 
> 3) (no github issue)
> 
> Resource handling within AOJ could be better, e.g.
> RowOperation.completeQuery() doesn't correctly close the resources. I'm not
> sure if this matters, but the code doesn't expose best JDBC practices, at
> times :-)
> 
> Questions about code usage:
> -------------------------
> 
> 1) What are all these trailing getCompletionStage() calls for? Are they
> really needed? Calling submit() as the last operation seems sufficient, no?
> Probably from earlier versions where the CompletionStage was assigned to
> some variable?

Looking at the sample code, there are two uses of getCompletionStage, and each serves a different purpose.

The first use is to assign a value to idF. That variable is a CompletionStage that is completed with the employee id for CLARK, and it is used as a parameter to the next operation, the update. The second use provides a place to hang the lambda that logs the number of rows that were updated. This happens after the Operation is complete. (See my response to your question about onError below for more insight.)
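Roughly, the two uses look like this. Treat it as a sketch in the spirit of the AoJ sample: the SQL, the Collector, apply/getCount, and the AdbaType constants are from memory of that snapshot and may not match it exactly.

  CompletionStage<Integer> idF = conn.<Integer>rowOperation(
          "select empno from emp where ename = :target")
      .set("target", "CLARK", AdbaType.VARCHAR)
      .collect(Collectors.summingInt((Result.Row r) -> r.get("empno", Integer.class)))
      .submit()
      .getCompletionStage();              // use 1: capture the result to feed a later operation

  conn.<Long>rowCountOperation("update emp set deptno = :dept where empno = :id")
      .set("dept", 40, AdbaType.INTEGER)
      .set("id", idF, AdbaType.INTEGER)   // the CompletionStage itself is the parameter value
      .apply(c -> c.getCount())           // assumed: maps the Count result to the row count
      .submit()
      .getCompletionStage()               // use 2: a place to hang a post-completion lambda
      .thenAccept(n -> System.out.println("updated rows: " + n));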
> 
> 
> ADBA:
> =========================
> 
> Bikeshedding on naming:
> -------------------------
> 
> (The least important thing first)
> 
> These things have bothered me less before I started playing with AoJ but
> now they become a bit more apparent:
> 
> 1) collect() feels weird. Stream.collect() is a synchronous, terminal op.
> RowOperation.collect() is a lazy operation. Perhaps collecting() is better,
> here? I don't know, just wanted to provide feedback that the term collect
> itself has confused me at first.

I chose collect because it is what Stream uses. collecting would also be fine. I’m not going to change it until and unless there is a consensus that something else is better. But I have no problem changing it given such a consensus (or even a strong majority).

> 
> 2) Submission.getCompletionStage() breaks the fluent API for me.
> Submission.cancel() makes much more sense, for instance. All through the
> call chain, we have nice verbs like set(), collect(), submit(), cancel()
> and then this weird getCompletionStage(). But I don't have a better name
> proposal.

I agree with you. I’ve never been happy with getCompletionStage but haven’t come up with anything I like better. asCompletionStage is the only alternative that springs to mind and it isn’t better.

> 
> 3) CountOperation seems wrong. Why not UpdateOperation (like JDBC's
> executeUpdate()), or, even better, RowCountOperation? Because
>  o The term "count" implies to me that a SELECT COUNT(*) query was run
>  o not all "Count" operations actually return a meaningful rowcount, e.g.
> DDL statements

RowCountOperation is definitely a better name. I’ll make that change.

> 
> Design concerns:
> -------------------------
> 
> 1) Result.Row should have better methods akin to what JDBC's
> ResultSetMetaData (RSMD) has. For instance:
>  o It's currently not possible to discover individual column types
>  o The getIdentifiers() array corresponds to the RSMD.getColumnName()
> values, but what about qualification of columns? What about
> RSMD.getColumnLabel()?
>  o Nullability flags may be cool too. Not sure about the other info in RSMD

This is an intentional design choice for version 1. ADBA is targeted at high-throughput apps, and metadata is slow. We intentionally omitted almost all metadata, at least from V1, as we did not think it was an absolute requirement for the target use cases. If there are target use cases where metadata is critical, we can revisit that decision, but we would prefer to defer it to version 2.

> 
> 2) A Result.Row.get(int, ...) overload to Result.Row.get(String, ...)
> should be added in my opinion. E.g. what if I run this query? SELECT 'x' AS
> "5", 'y' AS "1" FROM dual, and now I'm accessing Row.get("1"). Which value
> should be returned? The column with index 1, or the column with name "1"?
> I'm not happy with the existing "lazy" API design approach that prevents
> this overload. The overload would remove any ambiguity.

The intent is that the argument be a column name, so your example should return ‘y’. I’ll try to clarify the JavaDoc. A vendor could choose to support indexes by adding get(int).
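With the current AoJ code that would look roughly like the sketch below; Result.Row.get(String, Class) and the Collector shape are assumptions based on that snapshot.

  // "1" is treated as a column name, so this yields 'y', not the value at index 1.
  conn.<String>rowOperation("SELECT 'x' AS \"5\", 'y' AS \"1\" FROM dual")
      .collect(Collector.of(
          () -> new String[1],
          (String[] a, Result.Row row) -> a[0] = row.get("1", String.class),
          (l, r) -> l,
          a -> a[0]))
      .submit()
      .getCompletionStage()
      .thenAccept(System.out::println);   // prints y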
> 
> 3) I find the onError() vs submit().getCompletionStage().exceptionally()
> duality a bit confusing. Both seem to work, so is onError() really needed?
> If so, what's the difference?

The difference is timing. onError runs before the Operation is completed; submit().getCompletionStage().exceptionally() runs after the Operation is complete. exceptionally comes as part of CompletionStage, so we can’t eliminate it. onError is required to ensure error handling occurs before the next Operation is executed. The canonical example is transaction.setRollbackOnly, which must be called before the next Operation, as the next Operation may be endTransactionOperation(transaction).
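A sketch of the difference, assuming onError takes a Consumer<Throwable> as in the current drafts (the SQL and the set() variant are illustrative):

  Transaction tx = conn.transaction();

  conn.<Long>rowCountOperation("update emp set sal = sal * 2 where empno = :id")
      .set("id", 7934)
      .onError(t -> tx.setRollbackOnly())   // runs before the Operation completes,
                                            // so before any following endTransactionOperation
      .submit()
      .getCompletionStage()
      .exceptionally(t -> {                 // runs only after the Operation is complete;
          System.out.println("update failed: " + t);   // too late to influence the commit
          return null;
      });

  conn.catchErrors();
  conn.commitMaybeRollback(tx);             // submits an endTransactionOperation(tx)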
> 
> 4) I think the Result type should have a reference to its "parent" /
> producing type, e.g. Connection. The example where the countOperation()
> rolls back a transaction by referencing it from the lambda's outer scope
> shows a flaw (IMO an important one) of the current design. The transaction
> (and connection) should be available from the lambda's Result.Count
> argument as a contextual parameter, for a much more functional design, not
> in a globalish-variable kind of style. This is especially important in an
> asynchronous API, where the statement submission may happen entirely
> elsewhere than the rollback decision.

I played with that approach, and it adds a lot of parameters to the lambdas. I don’t disagree with your criticism, but following the path you suggest seemed to add a lot of additional code without increasing clarity; in fact, it seemed to reduce clarity. We considered a lot of different transaction models, and this is the one that best expressed the timing dependencies.

The challenge is that the app thread creates an entire unit of work up through Connection.close() before the Connection has even authenticated with the database. That unit of work is static at that point and has to encapsulate what the app wants to happen. Going back and mucking with the Connection in a handler, after Connection.close(), leads to confusion.
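The shape of a unit of work is roughly the following, given a DataSource ds that has already been built; this borrows the try-with-resources style of the AoJ samples, and the SQL and set() variant are illustrative. Everything, including the close, is queued before any of it has run.

  try (Connection conn = ds.getConnection()) {
      Transaction tx = conn.transaction();
      conn.<Long>rowCountOperation("delete from emp where deptno = :d")
          .set("d", 50)
          .onError(t -> tx.setRollbackOnly())
          .submit();
      conn.catchErrors();
      conn.commitMaybeRollback(tx);
  }   // close() queues the unskippable closeOperation; execution happens asynchronously later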

I have written this many times before and I’ll keep writing it: if you think something can be improved, by all means offer an alternative.

> 
> 5) I have many doubts about the Transaction design:
>  o What is the meaning of the word "Only"? Why not just call it
> Transaction.rollback()

rollback is a verb and seems to imply that the transaction is ended (with a rollback) at that point. rollbackOnly is an attempt to suggest a change in state of the Transaction object without implying that the transaction (lower case) is ended.

>  o What is the utility of the getter "isRollbackOnly()”?

Subsequent code may branch depending on whether the transaction will commit or roll back. It seemed like a good idea. I have no objection to removing it, but I also don’t see any benefit in removing it.
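For example, a post-completion handler might branch on it. This is purely illustrative; tx is the Transaction from earlier in the chain.

  conn.<Long>rowCountOperation("update emp set sal = sal * 1.1")
      .onError(t -> tx.setRollbackOnly())
      .submit()
      .getCompletionStage()
      .thenRun(() -> System.out.println(tx.isRollbackOnly()
          ? "raise will be rolled back"
          : "raise will be committed"));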

>  o The API seems difficult to use correctly (probably even harder to
> implement correctly). Why does Connection.catchErrors() have to be called?

If an Operation completes exceptionally, all subsequent Operations, up to the next catchOperation or closeOperation, are skipped and completed with SqlSkippedException. This includes endTransactionOperation. So if the app wants to follow a failed Operation with an explicit rollback, it has to stop the skipping somehow. Making endTransactionOperation unskippable like closeOperation seemed too restrictive; closeOperation itself must be unskippable to ensure that the Connection isn’t accidentally left open.

It would be reasonable to put a catchOperation in the default implementation of commit and commitMaybeRollback. I’ll do that.
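So the pattern for an explicit rollback after a failure looks roughly like this (a sketch; the SQL and parameters are illustrative):

  Transaction tx = conn.transaction();
  conn.<Long>rowCountOperation("insert into emp (empno, ename) values (:id, :name)")
      .set("id", 7934)
      .set("name", "MILLER")
      .onError(t -> tx.setRollbackOnly())   // mark the transaction before anything else runs
      .submit();                            // suppose this Operation fails when executed
  conn.catchErrors();                       // stop the skipping caused by the failure
  conn.endTransactionOperation(tx)          // rolls back because tx is rollback-only
      .submit();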

>  o What happens if Connection.commitMaybeRollback(Transaction) is
> forgotten? Will the transaction be committed / rolled back implicitly when
> the Connection is closed?

That depends on the database. ADBA does not try to abstract the database; it provides access to it. The database does what the database does and ADBA stays out of the way.

>  o What about savepoints, which are essential for the implementation of
> nested transactions? Are they omitted deliberately or is their absence an
> oversight?

An intentional omission. It’s something we think can be deferred to version 2. If the consensus is that it should be in v1 there is no reason it can’t be added.

>  o Is there any real utility for this Transaction type (in the absence of
> a Savepoint), given that the commit and commitMaybeRollback methods are on
> the Connection only

OperationGroup.commitMaybeRollback also exists.

Absolutely; its utility is controlling the dependencies.

commit and commitMaybeRollback are convenience methods. The method that actually does the work is OperationGroup.endTransactionOperation(Transaction), so think about what happens when an endTransactionOperation is submitted.

>  o What happens when a Transaction.setRollbackOnly() call is made, but
> then Connection.commit() "overrides" it?

Look at the default implementation of commit(). It creates a new Transaction and submits an endTransactionOperation with that new Transaction. The Transaction on which setRollbackOnly was called is simply unused. There is no hidden connection between the various methods; the connection is explicit: the Transaction.
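In effect (a paraphrase of that behavior, not the actual default implementation):

  Transaction fresh = conn.transaction();        // commit() works with a brand-new Transaction
  conn.endTransactionOperation(fresh).submit();  // commits, since nothing marked fresh rollback-only
  // The Transaction you called setRollbackOnly() on is never passed to an
  // endTransactionOperation, so it has no effect.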

>  o What happens when Connection.transaction() is called twice? Will we get
> two times the same transaction reference, or will two separate,
> "autonomous" transactions be spawned?

Two separate Transaction objects are created. This has nothing to do with spawning autonomous (or otherwise) transactions.

Consider a unit of work that does two transactions. Both must be submitted in advance of any knowledge of whether the transactions will be committed or rolled back. So the app must get two separate Transactions to control the two separate database transactions (lower case).
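Sketched out, with each database transaction driven by its own Transaction object (the SQL is illustrative):

  Transaction tx1 = conn.transaction();
  conn.<Long>rowCountOperation("update emp set sal = sal + 100 where deptno = 10")
      .onError(t -> tx1.setRollbackOnly())
      .submit();
  conn.catchErrors();
  conn.commitMaybeRollback(tx1);        // ends the first database transaction

  Transaction tx2 = conn.transaction();
  conn.<Long>rowCountOperation("update dept set loc = 'BOSTON' where deptno = 20")
      .onError(t -> tx2.setRollbackOnly())
      .submit();
  conn.catchErrors();
  conn.commitMaybeRollback(tx2);        // ends the second, independently of the first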
> 
> 6) Why is Row.rowNumber() zero based? That's very confusing. E.g. Row.get()
> is still 1-based, and all SQL databases that support either ROWNUM or
> ROW_NUMBER() OVER () functions will produce 1-based row numbers

Because everything else in Java is zero-based. If the consensus is to change this, that’s fine with me.
> 
> 7) I have to try my luck again on those type witnesses that are required on
> each and every statement call. They are very annoying (and hard to get
> right or be properly understood by beginner developers who struggle with
> Java generics already) and I don't think they are necessary. I think the
> collect() method should define the type of the result, and in the absence
> of a collect() method, the result type should be a wildcard ?, or
> java.lang.Void. Just like a Collector<T, A, R> passed to
> Stream<T>.collect() will return R, and not T, I think a call to
> RowOperation<?>.collect() should eventually produce the R type.

You aren’t the first to complain about the type witnesses. They are due to a limitation in Java’s type inference, and there is no expectation that it will change anytime soon. Eliminating them adds a lot of weight to the API. I expect that writing the type witnesses will become an idiom. Yes, it is confusing the first time you see it or write it, but if you read and write enough ADBA code it becomes automatic. As with everything else, feel free to propose an alternative.

One of the advantages of the existing API is that it makes vendor extensions all but transparent. If the Connection is typed as a vendor-specific type, then all of the vendor extensions are exposed without any additional casts.

  VendorConnection vc = (VendorConnection)ds.getConnection();
  vc.<Integer>rowCountOperation(sql)
    .vendorSpecificMethod()
    .submit();

I consider this an important feature of the API. Consider the case where the vendor wants to add set(int, Object, SqlType). With the current design this works transparently. Eliminating the type witness without breaking this is challenging. 

The current API allows the Operation config methods to be specified in any order. Restricting the order requires adding a bunch of new types which I believe add no clarity.

It is possible to omit the type witness if you assign the Operation to a typed variable:

  RowOperation<Integer> ro = conn.rowOperation(sql);
  ro.set("foo", . . . )
     .set("bar", . . .)
     .etc();

This has been discussed extensively. As yet there is no agreement on a better approach. I believe that it is something that ADBA programmers will quickly adapt to. The rule is consistent: write a type witness anytime you create an Operation.
> 
> If I remember correctly, the current design was there to simplify the API
> as it is now. But I really think the price to pay for this is very very
> high. There should be no type witness needed.

I hope there is a better alternative, but we haven’t found it. Having written enough ADBA code, which isn’t all that much by the way, I find that putting in the type witnesses becomes pretty automatic. It is an initial barrier, but one I think is readily overcome.
> 
> 8) While I still like the idea of using a Collector to retrieve the results
> (and I don't want to engage in the discussion about the possibility of
> using Flow style streams instead), I think that there should be either:
> 
>  o Some convenience overloads like this
> 
>    public default RowOperation<T> consume(Consumer<? super Result.Row> c) {
>        return collect(Collector.of(() -> null, (a, r) -> c.accept(r), (x, y) -> null));
>    }
> 
>  o Some public "common Collector" utility like this:
> 
>    Collector<Result.Row, Void, Void> consume(Consumer<Result.Row> consumer) {
>        return Collector.of(() -> null, (a, r) -> consumer.accept(r), (x, y) -> null);
>    }
> 
> The example code shown in AoJ is a bit scary. If for the most simple
> queries, we need to write those Collectors with all the types and dummy
> suppliers and combiners, that'll scare off people. Besides, there's no need
> to reinvent the wheel as a user, so these libraries should be provided.
> 
> Another convenience tool could be a toList() collector, or default method
> on RowOperation:
> 
>    public default <S> RowOperation<List<S>> toList(
>            Function<? super Result.Row, ? extends S> f) {
>        return collect(Collector.of(
>            () -> new ArrayList<S>(),
>            (List<S> a, Result.Row r) -> { a.add(f.apply(r)); },
>            (l, r) -> { l.addAll(r); return l; }
>        ));
>    }
> 
> There might be many more.

Absolutely, there can and should be many more convenience methods. We will certainly add more after we have achieved some level of consensus on the underlying mechanism. I’m happy to add any you care to propose, and I’ll add the above suggestions if you’d like.

> 
> Thanks,
> Lukas


