[OpenJDK 2D-Dev] Review Request for JDK-8152672 : Exception while getting second image properties for JPEG with embedded thumbnail
Jayathirth D V
jayathirth.d.v at oracle.com
Tue Jun 21 05:43:38 UTC 2016
Hi Phil,
Gentle remainder for review.
Thanks,
Jay
-----Original Message-----
From: Jim Graham
Sent: Wednesday, June 15, 2016 10:01 AM
To: Jayathirth D V; Phil Race
Cc: 2d-dev at openjdk.java.net
Subject: Re: [OpenJDK 2D-Dev] Review Request for JDK-8152672 : Exception while getting second image properties for JPEG with embedded thumbnail
Hi Jay,
Thanks for explaining all of the details. It seems that constantly being in scan mode simplifies things because we have to be in scan mode for entropy data anyway, but it still allows errant bytes outside of the entropy data to be accepted by this parser. I'm not sure if that is a problem in the long run, but it is not a new issue (since the existing code already was doing that) so we can live with it for purposes of fixing this particular bug.
One simplification of what you say below, it appears that among all of those cases none of them really impact the fact that un-sized entropy encoded data only appears inside the SOS marker and the only markers found inside the entropy data itself are the RSTn markers. Ideally we could limit our scanning to just the data following the SOS marker and its sized header and only allow RSTn markers to appear inside that manual scan, reverting to a non-scanning mode when we reach another marker (DNL it looks like). But, that would be the subject of a different bug fix.
It looks OK to me as long as Phil is OK with the types of exceptions that you are throwing in the various new exceptional cases, and with the change to now assume that skipImage always being called at an SOI marker. Phil?
...jim
On 6/14/16 9:36 AM, Jayathirth D V wrote:
> Hi Jim,
>
> I have updated the code to add check for EOF even in case of reading length. Also I have updated the code such that in all the cases where we find EOF before EOI we are throwing IndexOutOfBoundsException and in other failed cases we are throwing IOException.
>
> More analysis :
> All the length markers that we are checking in our case have valid length data except SOS marker in which always the length value will be 12 and it is the value only for the length of Scan header.
> After Scan header(SOS) we have compressed data for which we have no parameter which gives the length so that we can skip it completely.
> Once we get to the initial SOS header it can take 3 paths based on how the data follows:
> a) If we don't have Restart enabled and if we scan continuously through compressed data we will find EOI otherwise we will find RST markers and then EOI marker.
> b) If we have multiple scan headers(Multi-scan scenario) we have to follow point 'a' in loop with each scan header separated by DNL and other miscellaneous tables.
> c) If we have multiple frame headers(Multi-frame scenario) we have to follow point 'b' in loop with different markers coming in between.
> Above information is taken from page 52 of
> https://www.w3.org/Graphics/JPEG/itu-t81.pdf
>
> Since we can't rely on length specified in SOS header and there is possibility of having multiple SOS headers, we need to search for FF(foundFF). It means even if we jump for the length specified is SOS header the next byte is not promised to be '0xff'.
> For all the remaining markers we will get proper information related to the length and we will skip all data part(which might contain data similar to EOI/SOI/or any other marker). While propagating through the 'for' loop we are following the same approach.
>
> Please find updated webrev for review :
> http://cr.openjdk.java.net/~jdv/8152672/webrev.02/
>
> Thanks,
> Jay
>
> -----Original Message-----
> From: Jim Graham
> Sent: Tuesday, June 14, 2016 12:44 AM
> To: Jayathirth D V
> Cc: 2d-dev at openjdk.java.net; Philip Race
> Subject: Re: [OpenJDK 2D-Dev] Review Request for JDK-8152672 :
> Exception while getting second image properties for JPEG with embedded
> thumbnail
>
> Hi Jay,
>
> You still don't check the read() calls in the length case to see if you reached EOF (-1). The potential for an infinite loop is still there.
>
> Also, you still search for an FF, even if you require the function to start at an SOI marker - all subsequent markers are still subject to a search rather than a deterministic requirement that we encounter markers with no gaps between them.
>
> Why do we have the foundFF variable in the first place? If we just saw an SOI marker then the next byte *must be* a 0xff (shouldn't it? Am I missing something?). We shouldn't read a byte and check if it is a 0xff and then try again, we should expect a single 0xff byte followed by a marker type byte, as in:
>
> while (true) {
> int byteval == iis.read();
> // if (byteval < 0) then what?
> if (byteval != 0xff) { exception }
> byteval = iis.read();
> switch (byteval) {
> }
> }
>
> The only question is if we get a -1 on the first read if we treat that as an implicit EOI as we do now, or if we treat it as an exception.
> Note that if we get a -1 in the second read, then we have a half-formed tag and that should fall into the default and be declared a bad file.
>
> ...jim
>
> On 6/13/2016 10:00 AM, Jayathirth D V wrote:
>> Hi Jim,
>>
>> Thanks for your valuable inputs.
>> I have updated the code with your inputs:
>> 1) We should check for complete SOI marker and not just "FF" at start of skipImage().
>> 2) There is no need of iis.read() which was happening in default case. iis.read() present in for loop check will take care of checking EOF.
>> 3) I have added case condition for all the markers having length and added default case where we get invalid marker starting with FF.
>>
>> Apart from above changes, after going more through https://www.w3.org/Graphics/JPEG/itu-t81.pdf got to know following things:
>>
>> 1) TEM is also one more marker without length so added case for that.
>> 2) Since we have all unique conditions checked, we should not find any SOI marker after the initial SOI marker before we find EOI. Made changes to throw IOException in this case.
>>
>> Please find updated webrev for review:
>> http://cr.openjdk.java.net/~jdv/8152672/webrev.01/
>>
>> Thanks,
>> Jay
>>
>> -----Original Message-----
>> From: Jim Graham
>> Sent: Saturday, June 11, 2016 3:07 AM
>> To: Jayathirth D V; Philip Race
>> Cc: 2d-dev at openjdk.java.net
>> Subject: Re: [OpenJDK 2D-Dev] Review Request for JDK-8152672 :
>> Exception while getting second image properties for JPEG with
>> embedded thumbnail
>>
>> Thanks for the response Jay, I think I was misreading some of the code as now that I look back at it, it's mostly written as I was suggesting with respect to skipping over data sections, however it is still doing some scanning to find the initial 0xFF. Some thoughts:
>>
>> - If we can be sure that we are located at where a tag should be, then shouldn't we just read a byte and assert that it is 0xFF and report the file as invalid if it isn't? The current code will just ignore the byte and continue reading until it finds a 0xFF, but the fact that the first byte we read isn't 0xFF means we have wandered into data that isn't following the file format (or, possibly, that this was called from a case where we hadn't completely read a section of data, but that should be fixed as we could be in the middle of a section that isn't entropy encoded and our search for 0xFF might have invalid assumptions).
>>
>> - The bytes read in the default section to get the length and the tag for the next block aren't tested for EOF (-1). This may even get us into an infinite loop if we hit EOF at the right time (just after a sized block tag) as the size bytes will read and combine into a -1 size, back up three bytes, and then reread the same tag and try to compute a length again which will send us back 3 bytes, etc.
>>
>> - default assumes that all other markers that are created will be sized, but can we assert that? Shouldn't we specifically list all known sized markers and have the default case reject the file as not being of a format that we recognize?
>>
>> ...jim
>>
>> On 6/9/2016 11:21 PM, Jayathirth D V wrote:
>>> Hi Jim,
>>>
>>> I think the harmless byte that you are referring to will be applied only for image data(Between SOS(Start of Scan) marker and EOI). For example, any "FF" data present in Jpeg image will be represented as "FF 00". But I think application headers or comments section can contain data which will be similar to EOI,SOI or other markers(FF XX).
>>>
>>> Thanks,
>>> Jay
>>>
>>> -----Original Message-----
>>> From: Jim Graham
>>> Sent: Friday, June 10, 2016 5:28 AM
>>> To: Jayathirth D V; Philip Race
>>> Cc: 2d-dev at openjdk.java.net
>>> Subject: Re: [OpenJDK 2D-Dev] Review Request for JDK-8152672 :
>>> Exception while getting second image properties for JPEG with
>>> embedded thumbnail
>>>
>>> It looks like JPEG files have protection for scanning for an FF and assuming it is a marker by making sure that all FF bytes that appear in data are followed by a harmless byte, so a brute force search is probably fine. But it still seems wasteful when we know we are at a tag and we know the sizes of all of the tags, we should be able to skip around the file looking for the SOI directly...
>>>
>>> ...jim
>>>
>>> On 6/2/2016 5:10 AM, Jayathirth D V wrote:
>>>> Fixed typo.
>>>>
>>>>
>>>>
>>>> *From:*Jayathirth D V
>>>> *Sent:* Thursday, June 02, 2016 5:08 PM
>>>> *To:* Philip Race
>>>> *Cc:* Jim Graham; 2d-dev at openjdk.java.net
>>>> *Subject:* RE: Review Request for JDK-8152672 : Exception while
>>>> getting second image properties for JPEG with embedded thumbnail
>>>>
>>>>
>>>>
>>>> Hi Phil,
>>>>
>>>>
>>>>
>>>> We have two kind of images with which we are able to reproduce the issue:
>>>>
>>>> 1) sample.jpg present in JBS bug(We can't use this image because it
>>>> is licensed ).
>>>>
>>>> 2) JpegEmbedThumbnail.jpg taken using Prasanta's camera and used in
>>>> webrev.
>>>>
>>>>
>>>>
>>>> _ _
>>>>
>>>> _sample.jpg : _
>>>>
>>>> _ _
>>>>
>>>> If we do getNumImages() it will return 2. getNumImages() follows
>>>> the same logic of skipping markers with length and registering SOI
>>>> to get number of images.
>>>>
>>>> sample.jpg has markers as follows :
>>>>
>>>> SOI -> APP1 - > SOI -> EOI -> APP1 End -> EOI -> SOI -> EOI
>>>>
>>>>
>>>>
>>>> I have dumped first image its SOI is first one in the above list
>>>> and for second image it is third one in the list. getNumImages()
>>>> counts first and third SOI for number of images. But in case of
>>>> skipImage() we are getting inside APP1 marker and considering its SOI.
>>>>
>>>>
>>>>
>>>> _JpegEmbedThumbnail.jpg :_
>>>>
>>>> _ _
>>>>
>>>> If we do getNumImages() it will return 1.
>>>>
>>>> JpegEmbedThumbnail.jpg has markers as follows :
>>>>
>>>> SOI -> APP1 -> SOI -> EOI -> APP1 End -> APP2 -> SOI -> APP2 End ->
>>>> APP2
>>>> -> EOI -> APP2 End -> EOI
>>>>
>>>>
>>>>
>>>> getNumImages() counts only first SOI for number of images. But in
>>>> case of skipImage() we will are getting inside APP1 and APP2 markers also.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Jay
>>>>
>>>> *From:*Phil Race
>>>> *Sent:* Thursday, June 02, 2016 4:05 AM
>>>> *To:* Jayathirth D V
>>>> *Cc:* Jim Graham; 2d-dev at openjdk.java.net
>>>> <mailto:2d-dev at openjdk.java.net>
>>>> *Subject:* Re: Review Request for JDK-8152672 : Exception while
>>>> getting second image properties for JPEG with embedded thumbnail
>>>>
>>>>
>>>>
>>>> I am bit doubtful about this. Are you sure we are correct in
>>>> reporting two images to begin with ?
>>>> Thumbnails should not get counted ..
>>>>
>>>>
>>>> -phil.
>>>>
>>>> On 06/01/2016 01:06 AM, Jayathirth D V wrote:
>>>>
>>>> Updated bug title in JBS as it was misleading.
>>>>
>>>>
>>>>
>>>> *From:* Jayathirth D V
>>>> *Sent:* Wednesday, June 01, 2016 12:48 PM
>>>> *To:* Philip Race; Jim Graham
>>>> *Cc:* 2d-dev at openjdk.java.net <mailto:2d-dev at openjdk.java.net>
>>>> *Subject:* Review Request for JDK-8152672 : Exception getting
>>>> thumbnail size for JPEG with embedded thumbnail
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> _Please review the following fix in JDK9:_
>>>>
>>>>
>>>>
>>>> Bug : https://bugs.openjdk.java.net/browse/JDK-8152672
>>>>
>>>>
>>>>
>>>> Webrev : http://cr.openjdk.java.net/~jdv/8152672/webrev.00/
>>>> <http://cr.openjdk.java.net/%7Ejdv/8152672/webrev.00/>
>>>>
>>>>
>>>>
>>>> Issue : When we are trying to get properties related to second image
>>>> in JPEG file we are getting IIOException mentioning that it is not a
>>>> JPEG file.
>>>>
>>>>
>>>>
>>>> Root cause : When we are skipping first image to reach second image
>>>> header, we are just trying to find next available EOI marker. But if
>>>> first image has embedded thumbnail in APP1 marker, we will reach to
>>>> EOI of this thumbnail and not EOI of first image. So after we reach
>>>> EOI of embedded thumbnail we try to access second image SOI marker
>>>> which will fail.
>>>>
>>>>
>>>>
>>>> Solution : We have to change the logic of how we skip to consecutive
>>>> images in JPEG file. We know that application markers, comments or
>>>> other markers can contain data same as SOI & EOI. Instead of just
>>>> checking for EOI marker serially, we should read length of these
>>>> markers and skip them.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Jay
>>>>
>>>>
>>>>
>>>>
>>>>
More information about the 2d-dev
mailing list