best practice for processing large binary file

Alan Bateman Alan.Bateman at oracle.com
Tue Sep 27 06:05:25 PDT 2011


Andrew M wrote:
> I have large binary files (up to 1GB) of 56 byte records of int, float 
> and char primitives.  I want to read these files and process the 
> records.  The JVM can have several GB of available heap if necessary.
>
> In the bad old days I would have a Thread that reads the file and 
> put()s a byte[] in a LinkedBlockingQueue<byte[]> while a consumer 
> Thread take()s and processes the byte[].
>
> Now I'm wondering if nio/nio2 and jdk7 allow something faster and also 
> more elegant.  Should I use a SeekableByteChannel as shown here?
>
>   http://download.oracle.com/javase/tutorial/essential/io/file.html
>
> should I use a direct byte buffer for extra speed?  Should I be using 
> a memory mapped file?  Files.newBufferedReader?  Files.newInputStream?
Are you looking to access these 56-byte records sequentially? If so then 
SeekableByteChannel might not be interesting, as it's just a ByteChannel 
that maintains a position, so it is more useful when the access isn't 
sequential. Also, the newBufferedReader method returns a BufferedReader, 
which is for processing text files, and it sounds like this is binary 
data. If you have lots of memory then by all means try mapping the file 
into memory. A MappedByteBuffer (actually a ByteBuffer) defines methods 
for accessing binary data, which I think is what you are looking for. You 
could start a thread that invokes the MappedByteBuffer load method to 
force the file to be loaded into memory. That said, depending on the 
usage, a simple DataInputStream that wraps a buffered stream might be 
sufficient for your needs.
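A minimal sketch of the mapped-file approach might look like the following. 
The class name and the record layout (a leading int field) are made up for 
illustration; the actual field order and types depend on Andrew's format. 
Note also that a single mapping is limited to Integer.MAX_VALUE bytes, which 
is fine for files up to 1GB:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedRecordReader {
    static final int RECORD_SIZE = 56;

    /** Sums the leading int field of every 56-byte record (hypothetical layout). */
    static long sumFirstField(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Map the whole file; for a 1GB file this fits in one mapping.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.load();  // fault the pages in up front; could be done from another thread
            long sum = 0;
            long records = ch.size() / RECORD_SIZE;
            for (long i = 0; i < records; i++) {
                int pos = (int) (i * RECORD_SIZE);
                sum += buf.getInt(pos);  // first field of the record
                // buf.getFloat(pos + 4), buf.getChar(pos + 8), ... as the layout dictates
            }
            return sum;
        }
    }
}
```

Absolute gets (getInt(pos) rather than the relative form) keep each record 
access independent of the buffer's position, which also makes it easy to 
split the work across consumer threads later.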

-Alan.


More information about the nio-discuss mailing list