RFR 8224789 : Parsing repetition count in regex does not detect numeric overflow
Joe Darcy
joe.darcy at oracle.com
Wed May 29 20:58:16 UTC 2019
As an academic note, note that non-overflowing strings can be recognized
with a regex :-)
For signed numbers in base 10:
# MAX_VALUE in base 10: 2147483647
((-)?0*((\p{Digit}){1,9})
|([0-1]\p{Digit}{9})
|(20\p{Digit}{8})
|(21[0-3]\p{Digit}{7})
|(214[0-6]\p{Digit}{6})
|(2147[0-3]\p{Digit}{5})
|(21474[0-7]\p{Digit}{4})
|(214748[0-2]\p{Digit}{3})
|(2147483[0-5]\p{Digit}{2})
|(21474836[0-3]\p{Digit}{1})
|(214748364[0-7])
)|
(-0*2147483648)
-Joe
On 5/29/2019 1:14 PM, Brent Christian wrote:
> Looks fine. Thanks, Ivan.
>
> -Brent
>
> On 5/29/19 7:27 AM, Roger Riggs wrote:
>> Thanks Ivan, Looks good.
>>
>>
>> On 05/28/2019 10:08 PM, Ivan Gerasimov wrote:
>>> Hi Brent!
>>>
>>>
>>> On 5/28/19 4:06 PM, Brent Christian wrote:
>>>> Hi, Ivan
>>>>
>>>> I agree with Roger that there are more test cases than necessary.
>>>> Otherwise I think it looks pretty good.
>>>>
>>> Okay. Then let's make the list of invalid ranges shorter, but add a
>>> randomly generated value!
>>>
>>> Please find the updated webrev here:
>>> http://cr.openjdk.java.net/~igerasim/8224789/01/webrev/
>>>
>>> With kind regards,
>>> Ivan
>>>
>>>> I find the addExact/multiplyExact code less readable. I'm not sure
>>>> what could be done about it - maybe some different indentation:
>>>>
>>>> cmin = Math.addExact(
>>>> Math.multiplyExact(cmin, 10),
>>>> ch - '0');
>>>>
>>>> though that makes for some long lines. Just a thought.
>>>>
>>>> Thanks,
>>>> -Brent
>>>>
>>>> On 5/24/19 11:28 PM, Ivan Gerasimov wrote:
>>>>> Hello!
>>>>>
>>>>> When Pattern.compile() parses the repetition count in the
>>>>> expressions like '.{100}', '.{1,2}' or '.{3,}' it fails to detect
>>>>> numeric overflow if the result is still non-negative.
>>>>>
>>>>> Could you please help review the patch?
>>>>>
>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8224789
>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8224789/00/webrev/
>>>>>
>>>>> Also, reading a char at line 3274 is done with skip(), so the
>>>>> exception thrown at 3315 displays the position of the error more
>>>>> accurately.
>>>>>
>>>>
>>>
>>
More information about the core-libs-dev
mailing list