BIGV file format

Tue Jun 13 15:29:49 UTC 2017

Hi Alex,

Thanks for the background. Sounds like an interesting tool you are developing.

One suggestion is to use Graal as an analysis framework instead of relying on graph dumping output. We do this ourselves to check certain invariants in the Graal code base with the CheckgraalInvariants[1] unit test. This way you don't need to force the code you're interesting in analyzing to be compiled Graal but simply feed the relevant jars to the tool.

-Doug

[1] https://github.com/graalvm/graal/blob/1c5e3a5e544335e85dd271853161bceb635ac376/compiler/src/org.graalvm.compiler.core.test/src/org/graalvm/compiler/core/test/CheckGraalInvariants.java

> On 13 Jun 2017, at 16:46, Alexander Senier <alexander.senier at tu-dresden.de> wrote:
> 
> Hi Doug,
> 
> sure, I'm happy to do that. We develop a toolset to partition security
> protocol implementations into interacting components. Based on the
> security guarantees (confidentiality, integrity) required by the
> elements of the protocol we divide it into partitions. The idea is that
> some partitions are untrusted and require no guarantees (e.g.
> networking) while others do (encryption). If components are isolated and
> interact over well-defined  interfaces only, errors in untrusted
> partitions don't affect security critical ones. If we could do this for
> large software automatically, this would greatly decrease the trusted
> code size.
> 
> This is where Graal comes into play. So far, our models are derived
> manually from a specification, which is tedious, error-prone and doesn't
> scale. The idea is that we use an existing implementation of a security
> protocol and build a sample app around it. For this application we
> perform a data flow analysis between subprograms. We may, however, need
> to analyze the control flow to identify security-critical branches  in
> some cases  (e.g. when checking whether a signature was correct).
> 
> As for the our needs wrt. to the output format, the information I see in
> IGV right now looks sufficient. I don't think we require any graph from
> the compile steps after parsing for our purpose, so a dump level 0 with
> only one graph per method would be nice. From the output, we need be
> able  to unambiguously identify the methods invoked so that we can
> annotate them (e.g. to mark the bounds of our model or map library
> functions to known primitives). In the future, we may also want to map
> back the log output to the source code to be able to create source code
> for the components we  have identified.
> 
> A standard binary format for which libraries exist would be great. I
> have no strong opinion on that - there are plenty of potential options
> (CBOR, BSON, XDR...). What is slightly annoying for integration into an
> automatic toolchain is the non-deterministic names of the log files. One
> single   file with a fixed name would be nice here. But I guess that's a
> multithreading issue and it's not super critical.
> 
> Hopefully I could make our intention clear. Thanks for asking! I'm
> interested in feedback if you see a better way to obtain the information
> we need.
> 
> Cheers,
> Alex
> 
> On 06/12/2017 02:50 PM, Doug Simon wrote:
>> Hi Alex,
>> 
>> Are you able to provide more information about the tool you are developing/using that reads the BIGV format? This may help guide us in terms of what's important.
>> 
>> -Doug
>> 
> 
> -- 
> Dipl.-Inf. Alexander Senier
> Scientific Assistant
> 
> TU Dresden
> Faculty of Computer Science
> Institute of System Architecture
> Chair of Privacy and Data Security
> 01062 Dresden
> 
> Tel.: +49 351 463-38719
> Fax : +49 351 463-38255