[Records] Transparency and effects on collections
Daniel Latrémolière
daniel.latremoliere at gmail.com
Sun Mar 29 16:40:39 UTC 2020
On 17/03/2020 at 10:57, Remi Forax wrote:
> Hi Daniel,
> Pooling of Java objects in memory usually does more harm than good,
> because you artificially change the liveness of the pooled objects,
> which doesn't work well with modern GC algorithms.
> Obviously, if objects are stored in a database, it can still be a win
> but usually the pool is more or less coupled with the ORM and use
> off-heap memory.
Hi Rémi,
Sorry for using database terminology. I should rather have used
"compound map keys" [1, cf. Use cases] to avoid confusion. I am talking
only about in-memory (heap) data (not off-heap or in an external database).
Pooling short-lived objects inside a long-lived object is
usually bad. But when the objects are already long-lived, pooling them
can be a good thing.
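To make "pooling" concrete, here is a minimal sketch of a programmer-defined pool that deduplicates equal keys, in the spirit of the PS of the original message. The `KeyPool` class and the `Person`, `Place` and `PersonPlace` records are hypothetical names chosen for illustration, not an existing or proposed JDK API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example records used as compound keys.
record Person(String name) {}
record Place(String city) {}
record PersonPlace(Person person, Place place) {}

// Hypothetical pool: keeps one canonical instance per distinct key value.
final class KeyPool<K> {
    private final Map<K, K> canonical = new HashMap<>();

    // Returns the already-pooled instance equal to 'key', or pools 'key' itself.
    K pooled(K key) {
        K existing = canonical.putIfAbsent(key, key);
        return existing != null ? existing : key;
    }

    int size() {
        return canonical.size();
    }
}

class KeyPoolDemo {
    public static void main(String[] args) {
        KeyPool<PersonPlace> pool = new KeyPool<>();
        PersonPlace a = pool.pooled(new PersonPlace(new Person("Ann"), new Place("Paris")));
        PersonPlace b = pool.pooled(new PersonPlace(new Person("Ann"), new Place("Paris")));
        // Both references now point at the same pooled instance.
        System.out.println(a == b);
    }
}
```

Once keys are pooled like this, later equality checks between pooled keys can often be settled by the identity comparison alone.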
> EnumSet/EnumMap are specialized because Enum.ordinal() is a perfect
> hashcode function, so you can implement fast and compact set and map.
> It's not clear to me why there is a need for a specialized version of
> set/map for records given that, as you said, a record is a class. Are we
> missing something?
Using the example from [1]: |RecordMap<PersonPlace, LocalDateTime>| could be
implemented as:
* |HashMap<PersonPlace, LocalDateTime>|, which is better for get/set
operations.
* |HashMap<Person, HashMap<Place, LocalDateTime>>|, which is better for
operations on mappings most frequently filtered by Person:
|map.entrySet().stream().filter(filter, Map.Entry.KEY, PersonPlace.PERSON).forEach(...)|
* |HashMap<Place, HashMap<Person, LocalDateTime>>|, which is better for
operations on mappings most frequently filtered by Place:
|map.entrySet().stream().filter(filter, Map.Entry.KEY, PersonPlace.PLACE).forEach(...)|
The best implementation will usually depend on how the developer uses
the data in the records, so the constructor of RecordMap needs to take the
preferred order of fields (provided by the developer) in order to choose the
corresponding implementation.
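As an illustration of the Person-first layout from the list above, here is a minimal hand-written sketch. `PersonFirstMap` and the record definitions are hypothetical; a real RecordMap would presumably choose such a layout from the field order passed to its constructor rather than being written by hand.

```java
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example records.
record Person(String name) {}
record Place(String city) {}
record PersonPlace(Person person, Place place) {}

// Hypothetical nested layout: outer map keyed by Person, inner map keyed by Place.
final class PersonFirstMap {
    private final Map<Person, Map<Place, LocalDateTime>> byPerson = new HashMap<>();

    // Stores the value under (person, place); returns the previous value or null.
    LocalDateTime put(PersonPlace key, LocalDateTime value) {
        return byPerson.computeIfAbsent(key.person(), p -> new HashMap<>())
                       .put(key.place(), value);
    }

    // Exact lookup: two hash lookups instead of one for the flat layout.
    LocalDateTime get(PersonPlace key) {
        Map<Place, LocalDateTime> inner = byPerson.get(key.person());
        return inner == null ? null : inner.get(key.place());
    }

    // Partial query: every mapping for one person, without visiting other persons.
    Map<Place, LocalDateTime> forPerson(Person person) {
        return byPerson.getOrDefault(person, Map.of());
    }
}

class PersonFirstMapDemo {
    public static void main(String[] args) {
        PersonFirstMap map = new PersonFirstMap();
        Person ann = new Person("Ann");
        map.put(new PersonPlace(ann, new Place("Paris")), LocalDateTime.of(2020, 3, 11, 7, 45));
        map.put(new PersonPlace(ann, new Place("Lyon")), LocalDateTime.of(2020, 3, 17, 10, 57));
        System.out.println(map.forPerson(ann).keySet());
    }
}
```

The trade-off is the one described above: exact get/put costs an extra lookup, while a query restricted to one Person touches a single inner map.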
Creating a typesafe representation allows queries on RecordSet/RecordMap
to be rewritten. When you have multiple nested levels of HashMap,
as in the previous answer, you can filter on the keys of the upper-level
HashMap first, and then iterate only over the selected inner-level HashMaps
(the only HashMap instances containing mappings that match the full query).
It is an optimisation that relies on the laziness of the Stream API (which
does nothing visible before a terminal operation).
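The rewritten query can be sketched by hand as follows, assuming a Person-first nested map. The `valuesFor` helper and the records are hypothetical names; the point is only that the filter runs on the outer keys, so the inner maps of non-matching persons are never traversed, and nothing runs until the terminal operation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Stream;

// Hypothetical example records.
record Person(String name) {}
record Place(String city) {}

final class NestedQuery {
    // Builds a lazy stream of the values whose outer key matches 'personFilter'.
    // Only the inner maps of matching persons are ever iterated.
    static <V> Stream<V> valuesFor(Map<Person, Map<Place, V>> nested,
                                   Predicate<? super Person> personFilter) {
        return nested.entrySet().stream()
                .filter(e -> personFilter.test(e.getKey()))
                .flatMap(e -> e.getValue().values().stream());
    }

    public static void main(String[] args) {
        Map<Person, Map<Place, String>> nested = new HashMap<>();
        nested.computeIfAbsent(new Person("Ann"), p -> new HashMap<>())
              .put(new Place("Paris"), "ann-paris");
        nested.computeIfAbsent(new Person("Bob"), p -> new HashMap<>())
              .put(new Place("Paris"), "bob-paris");

        // Building the stream does no work yet; forEach is the terminal operation.
        Stream<String> query = valuesFor(nested, p -> p.name().startsWith("A"));
        query.forEach(System.out::println);
    }
}
```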
I can understand that this optimisation may seem too specialized
to belong in JDK code, but when I see that a record is a transparent class,
I expect the transparency of its fields to apply everywhere (read and write,
but also search).
> About filtering fields value on stream, adding a method isEqual that
> takes a mapping function and a value should be enough,
> Predicate<Point> filter = Predicate.isEqual(Company::name, "Apple");
> I remember the lambda EG discussing that method (and its primitive
> variations); I don't remember why it was not added.
That offers nothing more than what a developer can already do by hand by
adding a static method to the record (and it is not useful for rewriting
queries):
|static final Predicate<PersonPlace> person(final Predicate<Person> filter) {|
|    return record -> filter.test(record.person);|
|}|
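For comparison, the two-argument `isEqual` suggested in the quoted message is not part of `java.util.function.Predicate` (the JDK only has the one-argument `Predicate.isEqual(Object)`), but it can be sketched as a user-level helper. `FieldPredicates`, `on` and the `Company` record below are hypothetical names.

```java
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical example record.
record Company(String name, String country) {}

// Hypothetical helpers; not an existing JDK API.
final class FieldPredicates {
    // Matches when the value extracted by 'mapper' equals 'value' (null-safe).
    static <T, F> Predicate<T> isEqual(Function<? super T, ? extends F> mapper, F value) {
        return t -> Objects.equals(mapper.apply(t), value);
    }

    // Lifts a predicate on one component to a predicate on the whole record.
    static <T, F> Predicate<T> on(Function<? super T, ? extends F> mapper,
                                  Predicate<? super F> filter) {
        return t -> filter.test(mapper.apply(t));
    }
}

class FieldPredicatesDemo {
    public static void main(String[] args) {
        Predicate<Company> isApple = FieldPredicates.isEqual(Company::name, "Apple");
        System.out.println(isApple.test(new Company("Apple", "US")));
    }
}
```

The resulting predicate is an opaque lambda: a collection receiving it cannot tell which component is being tested, which is why this form does not help with rewriting queries.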
> About introducing a typesafe representation of fields, we currently
> provide a non-typesafe representation, j.l.r.RecordComponent. There is
> no typesafe representation because what :: means on a field is still
> an open question.
I am only interested in a typesafe representation of record fields, because
a record is not simply a normal class: it is a *transparent* class
[1]. Its contract with users allows them to see the data in each field
separately, and therefore to search on each field separately.
If this typesafe representation is added by defining "::" on a field,
that is perfect for me (and it can be added in a future iteration of
records).
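To show the difference between the two representations, here is a minimal sketch. The reflective part uses the existing `java.lang.reflect.RecordComponent` API (final since Java 16); `RecordField`, its `matches` method and the hand-written `PERSON`/`PLACE` constants are hypothetical stand-ins for what the proposal would have the compiler generate.

```java
import java.lang.reflect.RecordComponent;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical typed descriptor of one component of record type R.
record RecordField<R extends Record, T>(String name, Class<T> type, Function<R, T> accessor) {
    // Typed predicate on R that also remembers which field it targets.
    Predicate<R> matches(Predicate<? super T> filter) {
        return r -> filter.test(accessor.apply(r));
    }
}

record Person(String name) {}
record Place(String city) {}
record PersonPlace(Person person, Place place) {
    // Written by hand here; in the proposal these would be generated.
    static final RecordField<PersonPlace, Person> PERSON =
            new RecordField<>("person", Person.class, PersonPlace::person);
    static final RecordField<PersonPlace, Place> PLACE =
            new RecordField<>("place", Place.class, PersonPlace::place);
}

final class FieldAccessDemo {
    // Non-typesafe access: the component is found by name and comes back as Object.
    static Object readReflectively(Record r, String componentName) {
        for (RecordComponent c : r.getClass().getRecordComponents()) {
            if (c.getName().equals(componentName)) {
                try {
                    return c.getAccessor().invoke(r);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no component " + componentName);
    }

    public static void main(String[] args) {
        PersonPlace key = new PersonPlace(new Person("Ann"), new Place("Paris"));
        Object untyped = readReflectively(key, "person");        // needs a cast to be useful
        Person typed = PersonPlace.PERSON.accessor().apply(key); // checked by the compiler
        System.out.println(untyped.equals(typed));
    }
}
```

Because a `RecordField` is a value carrying the component name and type, a collection could in principle inspect it to decide which nested level a filter applies to, which a plain lambda does not allow.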
> And if we go that way, a static final field is not the best
> representation in terms of classfile because it is initialized too
> early. An ldc constantdynamic + a static method or something along those
> lines is a better idea.
That is an implementation question. I will not go into that aspect of the
problem (I am only concerned with the API).
One side question for the developer documentation of records: will Map.Entry
be retrofitted as a record, and if not, why? (I did not find an answer in the
amber-dev archive.)
My understanding of records,
Bye,
Daniel.
[1]: https://cr.openjdk.java.net/~briangoetz/amber/datum.html
> *From: *"Brian Goetz" <brian.goetz at oracle.com>
>
> *To: *"amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> *Sent: *Monday 16 March 2020 21:24:04
> *Subject: *Fwd: [Records] Transparency and effects on collections
>
> Received on the -comments list.
>
>
> -------- Forwarded Message --------
> Subject: [Records] Transparency and effects on collections
> Date: Wed, 11 Mar 2020 07:45:32 -0700 (PDT)
> From: Daniel Latrémolière <daniel.latremoliere at gmail.com>
> To: amber-spec-comments at openjdk.java.net
>
>
>
> I understand that records are transparent and have correct
> equals/hashCode, so they are useful as keys in collections. When
> looking for classes to convert to records, I found classes
> with more or less the same in-memory use cases as a
> multi-column primary key would have in SQL.
> ------------------------------------------------------------------------
> Records are sugar on top of classes, like enums but for another use
> case. Is it planned to have more specialized collections (like enums
> have with EnumSet/EnumMap)? Given that records are explicitly
> transparent, an API for pooling records would need to use this
> explicit transparency to allow partial queries and not only the
> exact Set/Map operations.
>
> If this is the case, it would probably need some specialised
> subtype of Set, like a new RecordSet (similar to a single table
> without joins, unlike SQL). Java's current Stream API would
> probably be perfect with some small JPA-like enhancements (on a
> subtype RecordStream<R>) that allow referring directly to the field
> to be filtered.
>
> In this case, the compiler would need to generate, for each record,
> one static field per instance field of the record to allow typed
> queries, as in the following example.
>
> |record R(String foo, ...) {|
> |    ...|
> |}|
>
> desugared more completely into:
>
> |class R {|
> |    public static final RecordField<R, String> FOO;|
> |    private String foo;|
> |    ...|
> |}|
>
> It would allow some typed code for partial querying, like:
>
> |RecordSet<R> keyPool;|
> |Predicate<String> fooFilter;|
> |keyPool.stream().filter(R.FOO, fooFilter).forEach(...);|
>
>
> Thanks for your attention,
> Daniel.
> ------------------------------------------------------------------------
> NB: in the desugaring, I used standard static fields, as JPA does,
> and not an enum containing all meta-fields (which would probably
> be more correct and efficient). This is due to the lack of JEP 301
> (needed for typed constants in an enum). If that were allowed, the
> desugared record would become something like:
>
> |class R {|
> |    public static enum META<X> implements RecordField<R, X> {|
> |        FOO<String>(String.class);|
> |    }|
> |    private String foo;|
> |    ...|
> |}|
>
> In a query, it would be used as R.META.FOO to filter on the field
> "foo" of the record:
>
> |keyPool.stream().filter(R.META.FOO, fooFilter).forEach(...)|
> ------------------------------------------------------------------------
> PS: I am not interested in interning records, but in using them in
> pools defined by the programmer. Pooling would reduce memory use and
> improve performance by deduplicating records, because equals would more
> frequently succeed at the identity test without going on to the real
> equality test (field by field). Specialized collection implementations
> that use the fields of records in the order given by the user would
> probably perform better than a simple Set/Map: structures like a
> hierarchical Map of Map of ..., field by field, can be more efficient
> if partial queries are frequent or if the pool is big.
>