<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><meta content="text/html;charset=UTF-8" http-equiv="Content-Type"></head><body ><div style="font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 10pt;"><div>Hello sir/madam,<br></div><div><br></div><div>Section 3.9 (Keywords)<br></div><div><br></div><div>states that "51 character sequences, formed from ASCII characters, are reserved for use as keywords and cannot be used as<br></div><div>identifiers. Another 17 character sequences, also formed from ASCII characters, may be interpreted as keywords or as other<br></div><div>tokens, depending on the context in which they appear."<br></div><div><br></div><div>This fails to mention that these character sequences are matched only after ignorable characters have been ignored.<br></div><div><br></div><div>For example:<br></div><div><br></div><div>public is equivalent to pu\u00adblic (\u00ad is the soft hyphen, so the source would be rendered as public and look the same).<br></div><div>The soft hyphen is an ignorable character for identifiers, as described in Section 3.8 (Identifiers) by the statement:<br></div><div><br></div><div>"Two identifiers are the same only if, after ignoring characters that are<br></div><div>ignorable, the identifiers have the same Unicode character for each letter<br></div><div>or digit. 
An ignorable character is a character for which the method<br></div><div>Character.isIdentifierIgnorable(int) returns true."<br></div><div><br></div><div>The same holds for all keywords.<br></div><div>In short, every identifier-ignorable character is a valid identifier part, but none is a valid identifier start.<br></div><div><br></div><div>    IntStream.rangeClosed(0,0x10ffff).filter(Character::isIdentifierIgnorable).allMatch(Character::isJavaIdentifierPart)<br></div><div>    returns true<br></div><div><br></div><div>    IntStream.rangeClosed(0,0x10ffff).filter(Character::isIdentifierIgnorable).anyMatch(Character::isJavaIdentifierStart)<br></div><div>    returns false<br></div><div><br></div><div>This allows someone to embed these characters in an identifier without changing which identifier it is equivalent to.<br></div><div>Interestingly, the same is true for keywords.<br></div><div>There is one exception: the contextual keyword non-sealed, which is made up of three tokens. In this case an<br></div><div>identifier-ignorable character can be embedded anywhere except at the beginning of the third token, sealed,<br></div><div>i.e. non-\u00adsealed is not a valid keyword,<br></div><div>but non\u00ad-sealed is a valid keyword.<br></div><div><br></div><div>I request a clarification that a keyword is any character sequence that equals one of the listed ASCII sequences after ignorable characters have been ignored.<br></div><div><br></div><div>Thanks and regards,<br></div><div>Pravin<br></div><div><br></div></div><br></body></html>