Proposal for field representation for shared libraries
Ramón García
ramon.garcia.f+java at gmail.com
Thu Oct 23 03:36:04 PDT 2008
As I previously commented, I want to implement shared JAR libraries
for Java. I want to share with you my thoughts about the
implementation.
What we need is to store the classes of a library in a representation
that lets the VM mmap the library and access the class definitions
directly. So we need a representation of classes that is mostly
readonly. Other objects, such as method objects, are stored in this
shared library as well. Therefore, we need a general way of storing
readonly oops.
But oops are not completely readonly: they must be marked during GC,
and they have writable fields for several purposes. The simple solution
would be to use a writable (private, copy-on-write) file mapping. This
would reduce the time used for class loading, but not the memory
consumption. Note that the granularity of memory is a page, so if a
process writes a single byte in such a mapping, the complete page
becomes private to that process and is no longer shared. The solution
used by shared libraries is to concentrate writable content in a few
pages, and to replace writable fields by pointers (actually, relative
offsets) to the writable area. This is the solution that I propose here.
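To make the page argument concrete, here is a small sketch that counts how many pages become private under each layout. The 4096-byte page size, the object size and the function names are assumptions chosen for illustration, not measurements from HotSpot.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Assumed page size for the illustration; real systems report their own.
constexpr std::size_t kPageSize = 4096;

// Count the distinct pages touched by writes at the given byte offsets.
inline std::size_t dirtyPages(const std::vector<std::size_t>& writeOffsets) {
    std::set<std::size_t> pages;
    for (std::size_t off : writeOffsets) {
        pages.insert(off / kPageSize);
    }
    return pages.size();
}

// Layout A: every object carries its writable word inline, at its start.
inline std::vector<std::size_t> inlineLayout(std::size_t objectCount,
                                             std::size_t objectSize) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < objectCount; ++i) {
        offsets.push_back(i * objectSize);
    }
    return offsets;
}

// Layout B: the writable words are packed back to back in a separate area.
inline std::vector<std::size_t> packedLayout(std::size_t objectCount,
                                             std::size_t wordSize) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < objectCount; ++i) {
        offsets.push_back(i * wordSize);
    }
    return offsets;
}
```

With 1000 objects of 256 bytes each, writing one word per object dirties 63 pages in the inline layout, but only 2 pages when the 8-byte words are packed together.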
So in oop objects we replace each writable field by an offset (relative
to this) to the actual field. That offset will point to an area of
memory right next to the mapped shared library. The loading of the
shared library will manage the creation of that writable area. That is,
the xxxOopDesc, in the version stored inside a shared library, will
replace the writable field
    xxx foo;
by a field more or less like
    int32_t foo_offset;
(this is just the concept, not the actual implementation) and when the
field is accessed, instead of
    object->foo
the operation used to access it will be
    *(xxx*) ((char*) object + object->foo_offset)
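As a rough illustration of the access expression above, the following sketch puts a read-only part and its writable field in one struct, so that the offset can be computed without a real mapping. SharedPart, Image, fooOf and demoOffsetAccess are hypothetical names, and long stands in for xxx.

```cpp
#include <cstdint>

// Hypothetical read-only part of an object, as it would sit in the mapped file.
struct SharedPart {
    int32_t id;          // example of read-only data
    int32_t foo_offset;  // byte offset from this object to its writable 'foo'
};

// Hypothetical layout: the read-only part followed by the writable area.
struct Image {
    SharedPart shared;   // would live in the shared, read-only mapping
    long       foo;      // would live in the private, writable area
};

// Reach 'foo' through the relative offset, as in the expression above.
inline long& fooOf(const SharedPart* object) {
    return *(long*) ((char*) object + object->foo_offset);
}

// Do what a loader might do: compute the offset once, then use it for access.
inline long demoOffsetAccess() {
    Image img{};
    img.shared.id = 7;
    img.shared.foo_offset = static_cast<int32_t>(
        reinterpret_cast<char*>(&img.foo) -
        reinterpret_cast<char*>(&img.shared));
    fooOf(&img.shared) = 42;   // the write lands in img.foo, not in img.shared
    return img.foo;
}
```

Because the offset is relative to the object, the same bytes stay valid wherever the library happens to be mapped.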
Now let us see how to implement this in C++ with the least possible
changes to the current source.
We cannot use virtual methods. There is no portable way of storing
objects with virtual methods in readonly storage in a position
independent way. That approach would require hacking each compiler so
that it provides an extension for defining objects with virtual methods
where the location of the vtable (or whatever is used) is stored as an
offset relative to the object. So we have to use an explicit marker
(an enumeration or a flag) to record whether an oop is stored in
readonly storage.
So we introduce a new field in the oopDesc structure:
    bool is_readonly_storage;
For the storage of each field, I would use the same datatype whether
the storage is normal or in a shared library. Otherwise we would need
a different type for each possible oop, and since we cannot have
virtual methods, it would be difficult to write code that works
transparently with oops in normal storage and oops in a shared library.
The alternative is to use a union for each writable field. In order to
make it as easy to use as possible, I propose the following template
for defining a writable field:
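    template<typename T>
    class WritableField {
        union {
            T       field;   // normal storage: the value itself
            int32_t offset;  // shared library: signed offset from the
                             // container to the value, as foo_offset above
        } u;
    public:
        // Returns the field, whether it is stored inline or in the
        // writable area next to the shared library.
        T& getField(oopDesc* container) {
            if (!container->is_readonly_storage) {
                return u.field;
            } else {
                return *(T*) ((char*) container + u.offset);
            }
        }
    };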
Now, in order to access the actual field, one would use:
    derivedOopDesc* myoop;
    .....
    T myfield = myoop->myfield.getField(myoop);
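To check that the idea holds together, here is a self-contained sketch that exercises the template in both storage modes. The oopDesc below is a simplified stand-in with only the proposed flag, and counterOopDesc, SharedImage, initInline, initOffset and the demo functions are hypothetical names added for the example.

```cpp
#include <cstdint>

// Simplified stand-in for oopDesc: only the flag from the proposal.
struct oopDesc {
    bool is_readonly_storage;
};

template <typename T>
class WritableField {
    union {
        T       field;   // normal storage: the value itself
        int32_t offset;  // shared library: offset from the container
    } u;
public:
    void initInline(const T& v)  { u.field = v; }     // hypothetical helper
    void initOffset(int32_t off) { u.offset = off; }  // hypothetical helper
    T& getField(oopDesc* container) {
        if (!container->is_readonly_storage) return u.field;
        return *(T*) ((char*) container + u.offset);
    }
};

// Hypothetical oop type with one writable field.
struct counterOopDesc : oopDesc {
    WritableField<long> count;
};

// Hypothetical image: the object followed by its writable slot.
struct SharedImage {
    counterOopDesc obj;             // would be in the read-only mapping
    long           writable_count;  // would be in the writable area
};

inline long demoNormal() {
    counterOopDesc obj{};
    obj.is_readonly_storage = false;
    obj.count.initInline(5);
    counterOopDesc* myoop = &obj;
    myoop->count.getField(myoop) += 1;    // updates the inline value
    return myoop->count.getField(myoop);
}

inline long demoShared() {
    SharedImage img{};
    oopDesc* base = &img.obj;             // offsets are relative to the oopDesc
    img.obj.is_readonly_storage = true;
    img.obj.count.initOffset(static_cast<int32_t>(
        (char*) &img.writable_count - (char*) base));
    counterOopDesc* myoop = &img.obj;
    myoop->count.getField(myoop) = 10;    // lands in the writable slot
    return img.writable_count;
}
```

In the shared case the object itself is never written after loading; only the slot in the writable area changes.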
In order to make it somewhat easier to use we can add a helper
function in oopDesc. This does not gain much, but it removes the need
to name the oop object twice in every access, which may be error prone.
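    class oopDesc {
        ....
        // D is the class that declares the field: oopDesc itself or a
        // class derived from it. The cast is needed because a pointer
        // to a member of D cannot be applied to an oopDesc* directly.
        template<typename T, typename D>
        T& getField(WritableField<T> D::* field) {
            return (static_cast<D*>(this)->*field).getField(this);
        }
        ...
    };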
Note the use of pointers to members. A declaration such as
X oopDesc::* declares a pointer to an X field stored in oopDesc. It
stores the offset of the field within the structure. Therefore one
needs an instance to dereference the pointer, so instead of
dereferencing field with (*field) one uses (this->*field).
So the field is accessed like this:
    derivedOopDesc* myoop;
    .....
    T myfield = myoop->getField(&derivedOopDesc::myfield);
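The helper can be tried out in a small self-contained sketch. SketchOop, SketchField and SketchDerived are hypothetical stand-ins for oopDesc, WritableField and a derived oop type. The second template parameter D is my assumption: a pointer to a member of a derived class cannot be applied to a base-class this without a cast, so the helper needs to know the declaring class.

```cpp
#include <cstdint>

struct SketchOop;  // stand-in for oopDesc, defined below

template <typename T>
class SketchField {
    union { T field; int32_t offset; } u;
public:
    void set(const T& v) { u.field = v; }   // hypothetical initialiser
    T& getField(SketchOop* container);
};

struct SketchOop {
    bool is_readonly_storage = false;

    // T is the field's value type, D the class that declares the field.
    template <typename T, typename D>
    T& getField(SketchField<T> D::* field) {
        // Apply the member pointer to this object, then resolve the field.
        return (static_cast<D*>(this)->*field).getField(this);
    }
};

template <typename T>
T& SketchField<T>::getField(SketchOop* container) {
    if (!container->is_readonly_storage) return u.field;
    return *(T*) ((char*) container + u.offset);
}

// Hypothetical derived oop with one writable field.
struct SketchDerived : SketchOop {
    SketchField<int> myfield;
};

inline int demoHelper() {
    SketchDerived obj;
    obj.myfield.set(3);
    SketchDerived* myoop = &obj;
    myoop->getField(&SketchDerived::myfield) += 4;    // object named once
    return myoop->getField(&SketchDerived::myfield);
}
```

Both T and D are deduced from the argument, so the call site stays as short as in the usage shown above.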
Well, this is what I have come up with so far. I hope you like it.
More information about the hotspot-runtime-dev
mailing list