Module file parse API

Chris Hegarty chris.hegarty at oracle.com
Thu Jun 21 03:43:26 PDT 2012



On 21/06/2012 11:36, Paul Sandoz wrote:
> On Jun 21, 2012, at 12:03 PM, Chris Hegarty wrote:
>>
>>>> Now that Signer can invoke the parser directly rather than creating a
>>>> Reader/Installer we can actually simplify the Reader/Installer so that
>>>> it always extracts the content. I found this to be a more flexible and
>>>> cleaner solution.
>>>>
>>>> Sorry if I've missed the point of your question. Was this what you were
>>>> asking?
>>>
>>> Signer is an unusual case in that it is reading the module file without
>>> extracting its contents in order to calculate the hashes that it will
>>> then sign.
>>
>> Right, but it is a very good use-case. The new parser API makes the code cleaner and more readable.
>>
>
> There is a shared layer that can be reused by the signer and installer (see below).

Thanks Paul, you did mention this in a previous mail and I will include 
such a layer in the next update.

>>> I think my concern is that if the hashes inside the module file don't
>>> actually match what is being read, you would want to know about that
>>> regardless of whether you are extracting the contents.
>>
>> Right, but this API is positioned at a low level, and it felt very restrictive to force this requirement there. I agree that most tools will want to fail if the hashes don't match, but I didn't want to preclude the possibility of building a non-validating parser. It seems overly restrictive for the API to specify this.
>>
>
> It's trivial to layer a hash accumulation and hash validation parser/reader on top of the current parser/reader.
>
> The signer can use this to skip through the file and, if the hashes validate, obtain the accumulated hash. That sits alongside its own layer, which checks whether the module is already signed and obtains a length so it can do its shuffle to stuff in the signed section. The shuffle should be refactored to use a writer, so that the signer never needs to know about the bits and bytes of the format.
>
>
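A rough sketch of the layering Paul describes above: a thin reader that accumulates a hash over a section's bytes without extracting them, with validation as an optional step on top. The class and method names are hypothetical and are not part of the Jigsaw parser API. SHA-256 and the unchecked exceptions are assumptions made only to keep the example short.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical layer for illustration; not the Jigsaw module file API.
final class ValidatingSectionReader {
    private final InputStream in;

    ValidatingSectionReader(InputStream in) {
        this.in = in;
    }

    // Reads exactly 'length' bytes of one section, discards the content,
    // and returns the SHA-256 digest accumulated over those bytes.
    byte[] accumulate(int length) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        DigestInputStream din = new DigestInputStream(in, md);
        byte[] buf = new byte[8192];
        int remaining = length;
        try {
            while (remaining > 0) {
                int n = din.read(buf, 0, Math.min(buf.length, remaining));
                if (n < 0) {
                    throw new EOFException("section truncated");
                }
                remaining -= n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return md.digest();
    }

    // Same as accumulate, but fails if the digest differs from the stored one.
    byte[] validate(int length, byte[] expected) {
        byte[] actual = accumulate(length);
        if (!MessageDigest.isEqual(actual, expected)) {
            throw new IllegalStateException("section hash mismatch");
        }
        return actual;
    }
}
```

In this sketch a signer would call validate to get the accumulated hash only when it matches, while a non-validating tool could call accumulate, or bypass the layer entirely.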
>>> Having said that, I've been somewhat dubious on the overall
>>> value/purpose of the hashes inside the module file. They don't provide
>>> any security without something additional such as a signature, but when
>>> generating the signature, it has to recalculate all of the hashes again
>>> to be sure they are still correct since the module file was created. So
>>> their only value is as a checksum, but my experience with checksums is
>>> that they are usually stored separately from what they are computed over.
>>
>> I also view the hashes in an unsigned module file as a checksum, but I think they are very useful. For example, if a tool is only interested in the classes of a module file, it can skip to the classes section, extract/process it, validate the hash, and exit without having to finish reading the remainder of the module file. This is nice, especially if the module file is being read from a remote stream, and is only possible with per-section hashes.
>>
>
> Also one does not have to upload/download separate data to/from a repository for the content and the hash of the content. This also has the advantage that the repository can verify the hashes before accepting deployment of a module (instead of say uploading using multipart MIME).

Yes, this makes the sharing/deployment environment much cleaner.
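
To make the per-section checksum point concrete, here is a minimal sketch of a reader that skips to one section, checks its hash, and stops without touching the rest of the stream. The layout used (one type byte, a four-byte length, a 32-byte SHA-256 hash, then the content) is invented for the example and is not the actual module file format. All names are hypothetical.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical section layout for illustration only:
// [type: 1 byte][length: 4 bytes][SHA-256: 32 bytes][content]
final class SectionExtractor {

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encodes one section in the assumed layout (big-endian length).
    static byte[] section(int type, byte[] content) {
        return ByteBuffer.allocate(1 + 4 + 32 + content.length)
                .put((byte) type)
                .putInt(content.length)
                .put(sha256(content))
                .put(content)
                .array();
    }

    // Returns the content of the first section of the wanted type, or null
    // if the stream ends first. Nothing after that section is read.
    static byte[] extract(InputStream raw, int wantedType) {
        try {
            DataInputStream in = new DataInputStream(raw);
            byte[] buf = new byte[8192];
            while (true) {
                int type = in.read();
                if (type < 0) {
                    return null; // no such section
                }
                int length = in.readInt();
                byte[] storedHash = new byte[32];
                in.readFully(storedHash);
                if (type != wantedType) {
                    // Discard this section's content and move on.
                    int left = length;
                    while (left > 0) {
                        int n = in.read(buf, 0, Math.min(buf.length, left));
                        if (n < 0) {
                            throw new EOFException("section truncated");
                        }
                        left -= n;
                    }
                    continue;
                }
                byte[] content = new byte[length];
                in.readFully(content);
                if (!MessageDigest.isEqual(sha256(content), storedHash)) {
                    throw new IllegalStateException("section hash mismatch");
                }
                return content;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a single whole-file hash the reader would have to consume every byte before it could trust any section; here the check covers only the section that was asked for.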

-Chris.

>
> Paul.



More information about the jigsaw-dev mailing list