PROPOSAL: Binary Literals

james lowden jl0235 at yahoo.com
Wed Mar 25 12:07:22 PDT 2009


Actually, that's a good idea in general for long numeric constants.  9_000_000_000_000_000L is easier to parse than 9000000000000000L.
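
A minimal sketch of that point, assuming a compiler that accepts underscores between digits (javac has done so since Java 7). The class and field names are made up for illustration:

```java
// Hypothetical demo class; assumes underscores are permitted between digits.
class UnderscoreDemo {
    static final long WITH_SEPARATORS = 9_000_000_000_000_000L;
    static final long WITHOUT_SEPARATORS = 9000000000000000L;

    public static void main(String[] args) {
        // The underscores are purely visual and do not change the value.
        System.out.println(WITH_SEPARATORS == WITHOUT_SEPARATORS);
    }
}
```

The separators only help the reader; both constants compile to the same value.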


--- On Wed, 3/25/09, Stephen Colebourne <scolebourne at joda.org> wrote:

> From: Stephen Colebourne <scolebourne at joda.org>
> Subject: Re: PROPOSAL: Binary Literals
> To: coin-dev at openjdk.java.net
> Date: Wednesday, March 25, 2009, 1:50 PM
> See
> http://www.jroller.com/scolebourne/entry/changing_java_adding_simpler_primitive
> for my take on this from long ago.
> 
> In particular, I'd suggest allowing a character to
> separate long binary strings:
> 
> int anInt1 = 0b10100001_01000101_10100001_01000101;
> 
> much more readable.
> 
> Stephen
> 
> 2009/3/25 Derek Foster <vapor1 at teleport.com>:
> > Hmm. Second try at sending to the list. Let's see if this works. (In the
> > meantime, I noticed that Bruce Chapman has mentioned something similar in
> > another proposal of his, so I think we are in agreement on this. This
> > proposal should not be taken as competing with his similar proposal: I'd
> > quite like to see type suffixes for bytes, shorts, etc. added to Java, in
> > addition to binary literals.) Anyway...
> >
> >
> >
> >
> > Add binary literals to Java.
> >
> > AUTHOR(S): Derek Foster
> >
> > OVERVIEW
> >
> > In some programming domains, use of binary numbers (typically as bitmasks,
> > bit-shifts, etc.) is very common. However, Java code, due to its C heritage,
> > has traditionally forced programmers to represent numbers in only decimal,
> > octal, or hexadecimal. (In practice, octal is rarely used, and is present
> > mostly for backwards compatibility with C.)
> >
> > When the data being dealt with is fundamentally bit-oriented, however, using
> > hexadecimal to represent ranges of bits requires an extra degree of
> > translation for the programmer, and this can often become a source of errors.
> > For instance, if a technical specification lists specific values of interest
> > in binary (for example, in a compression encoding algorithm or in the
> > specifications for a network protocol, or for communicating with a bitmapped
> > hardware device) then a programmer coding to that specification must
> > translate each such value from its binary representation into hexadecimal.
> > Checking to see if this translation has been done correctly is accomplished
> > by back-translating the numbers. In most cases, programmers do these
> > translations in their heads, and HOPEFULLY get them right. However, errors
> > can easily creep in, and re-verifying the results is not straightforward
> > enough to be done frequently.
> >
> > Furthermore, in many cases, the binary representation of a number makes it
> > much clearer what is actually intended than the hexadecimal one does. For
> > instance, this:
> >
> > private static final int BITMASK = 0x1E;
> >
> > does not immediately make it clear that the bitmask being declared comprises
> > a single contiguous range of four bits.
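
To make the comparison concrete, here is a small sketch that writes the same mask both ways and asks the library for its bits. It assumes a compiler that accepts the proposed 0b syntax; the class and field names are invented for the example:

```java
// Illustration only; assumes a compiler that accepts the proposed 0b syntax.
class BitmaskDemo {
    static final int HEX_MASK = 0x1E;
    static final int BIN_MASK = 0b00011110;

    public static void main(String[] args) {
        // Both spellings denote the same value.
        System.out.println(HEX_MASK == BIN_MASK);
        // The bit pattern that the hexadecimal spelling hides.
        System.out.println(Integer.toBinaryString(HEX_MASK));
        // Number of set bits in the mask.
        System.out.println(Integer.bitCount(HEX_MASK));
    }
}
```

In the binary spelling the run of four adjacent ones is visible without any mental conversion.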
> >
> > In many cases, it would be more natural for the programmer to be able to
> > write the numbers in binary in the source code, eliminating the need for
> > manual translation to hexadecimal entirely.
> >
> >
> > FEATURE SUMMARY:
> >
> > In addition to the existing "1" (decimal), "01" (octal) and "0x1"
> > (hexadecimal) forms of specifying numeric literals, a new form "0b1" (binary)
> > would be added.
> >
> > Note that this is the same syntax as has been used as an extension by the GCC
> > C/C++ compilers for many years, and also is used in the Ruby language, as
> > well as in the Python language.
> >
> >
> > MAJOR ADVANTAGE:
> >
> > It is no longer necessary for programmers to translate binary numbers to and
> > from hexadecimal in order to use them in Java programs.
> >
> >
> > MAJOR BENEFIT:
> >
> > Code using bitwise operations is more readable and easier to verify against
> > technical specifications that use binary numbers to specify constants.
> >
> > Routines that are bit-oriented are easier to understand when an artificial
> > translation to hexadecimal is not required in order to fulfill the
> > constraints of the language.
> >
> > MAJOR DISADVANTAGE:
> >
> > Someone might incorrectly think that "0b1" represented the same value as
> > hexadecimal number "0xB1". However, note that this problem has existed for
> > octal/decimal for many years (confusion between "050" and "50") and does not
> > seem to be a major issue.
> >
> >
> > ALTERNATIVES:
> >
> > Users could continue to write the numbers as decimal, octal, or hexadecimal,
> > and would continue to have the problems observed in this document.
> >
> > Another alternative would be for code to translate at runtime from binary
> > strings, such as:
> >
> >   int BITMASK = Integer.parseInt("00001110", 2);
> >
> > Besides the obvious extra verbosity, there are several problems with this:
> >
> > * Calling a method such as Integer.parseInt at runtime will typically make it
> > impossible for the compiler to inline the value of this constant, since its
> > value has been taken from a runtime method call. Inlining is important,
> > because code that does bitwise parsing is often very low-level code in tight
> > loops that must execute quickly. (This is particularly the case for mobile
> > applications and other applications that run on severely resource-constrained
> > environments, which is one of the cases where binary numbers would be most
> > valuable, since talking to low-level hardware is one of the primary use cases
> > for this feature.)
> >
> > * Constants such as the above cannot be used as selectors in 'switch'
> > statements.
> >
> > * Any errors in the string to be parsed (for instance, an extra space) will
> > result in runtime exceptions, rather than compile-time errors as would have
> > occurred in normal parsing. If such a value is declared 'static', this will
> > result in some very ugly exceptions at runtime.
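
The three problems listed above can be seen in a short sketch. It assumes a compiler that accepts the proposed 0b syntax; the class, field, and method names are hypothetical:

```java
// Illustration only; names are invented and the 0b literal assumes the proposed syntax.
class ParseIntDemo {
    // Not a compile-time constant: computed when the class is initialized.
    static final int RUNTIME_MASK = Integer.parseInt("00001110", 2);

    // A compile-time constant, so it may be used as a 'case' label.
    static final int LITERAL_MASK = 0b00001110;

    static String classify(int value) {
        switch (value) {
            case LITERAL_MASK:
                return "mask";
            // 'case RUNTIME_MASK:' would be rejected by the compiler,
            // because a case label must be a constant expression.
            default:
                return "other";
        }
    }

    // Returns true if parsing the digits as binary fails at runtime.
    static boolean failsAtRuntime(String digits) {
        try {
            Integer.parseInt(digits, 2);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(RUNTIME_MASK));
        // A stray space is only detected when the code runs.
        System.out.println(failsAtRuntime("0000 1110"));
    }
}
```

If the malformed string were used in a static initializer, the failure would surface as an error during class initialization rather than as a compiler diagnostic.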
> >
> >
> > EXAMPLES:
> >
> > // An 8-bit 'byte' literal.
> > byte aByte = (byte)0b00100001;
> >
> > // A 16-bit 'short' literal.
> > short aShort = (short)0b1010000101000101;
> >
> > // Some 32-bit 'int' literals.
> > int anInt1 = 0b10100001010001011010000101000101;
> > int anInt2 = 0b101;
> > int anInt3 = 0B101; // The B can be upper or lower case as per the x in "0x45".
> >
> > // A 64-bit 'long' literal. Note the "L" suffix, as would also be used
> > // for a long in decimal, hexadecimal, or octal.
> > long aLong =
> >     0b01010000101000101101000010100010110100001010001011010000101000101L;
> >
> > SIMPLE EXAMPLE:
> >
> > class Foo {
> >   public static void main(String[] args) {
> >     System.out.println("The value 10100001 in decimal is " + 0b10100001);
> >   }
> > }
> >
> >
> > ADVANCED EXAMPLE:
> >
> > // Binary constants could be used in code that needs to be
> > // easily checkable against a specifications document, such
> > // as this simulator for a hypothetical 8-bit microprocessor:
> >
> > public State decodeInstruction(int instruction, State state) {
> >  if ((instruction & 0b11100000) == 0b00000000) {
> >    final int register = instruction & 0b00001111;
> >    switch (instruction & 0b11110000) {
> >      case 0b00000000: return state.nop();
> >      case 0b00010000: return state.copyAccumTo(register);
> >      case 0b00100000: return state.addToAccum(register);
> >      case 0b00110000: return state.subFromAccum(register);
> >      case 0b01000000: return state.multiplyAccumBy(register);
> >      case 0b01010000: return state.divideAccumBy(register);
> >      case 0b01100000: return state.setAccumFrom(register);
> >      case 0b01110000: return state.returnFromCall();
> >      default: throw new IllegalArgumentException();
> >    }
> >  } else {
> >    final int address = instruction & 0b00011111;
> >    switch (instruction & 0b11100000) {
> >      case 0b00100000: return state.jumpTo(address);
> >      case 0b01000000: return state.jumpIfAccumZeroTo(address);
> >      case 0b01100000: return state.jumpIfAccumNonzeroTo(address);
> >      case 0b10000000: return state.setAccumFromMemory(address);
> >      case 0b10100000: return state.writeAccumToMemory(address);
> >      case 0b11000000: return state.callTo(address);
> >      default: throw new IllegalArgumentException();
> >    }
> >  }
> > }
> >
> > // Binary literals can be used to make a bitmap more readable:
> >
> > public static final short[] HAPPY_FACE = {
> >   (short)0b0000011111100000,
> >   (short)0b0000100000010000,
> >   (short)0b0001000000001000,
> >   (short)0b0010000000000100,
> >   (short)0b0100000000000010,
> >   (short)0b1000011001100001,
> >   (short)0b1000011001100001,
> >   (short)0b1000000000000001,
> >   (short)0b1000000000000001,
> >   (short)0b1001000000001001,
> >   (short)0b1000100000010001,
> >   (short)0b0100011111100010,
> >   (short)0b0010000000000100,
> >   (short)0b0001000000001000,
> >   (short)0b0000100000010000,
> >   (short)0b0000011111100000,
> > };
> >
> > // Binary literals can make relationships
> > // among data more apparent than they would
> > // be in hex or octal.
> > //
> > // For instance, what does the following
> > // array contain? In hexadecimal, it's hard to tell:
> > public static final int[] PHASES = {
> >    0x31, 0x62, 0xC4, 0x89, 0x13, 0x26, 0x4C, 0x98
> > };
> >
> > // In binary, it's obvious that a number is being
> > // rotated left one bit at a time.
> > public static final int[] PHASES = {
> >    0b00110001,
> >    0b01100010,
> >    0b11000100,
> >    0b10001001,
> >    0b00010011,
> >    0b00100110,
> >    0b01001100,
> >    0b10011000,
> > };
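
The rotation claim can be checked mechanically. The sketch below takes the hexadecimal values from the array above and confirms that each one is the previous value rotated left by one bit within 8 bits. The class and helper names are invented for the example:

```java
// Illustration only; helper names are hypothetical.
class PhasesDemo {
    static final int[] PHASES = {0x31, 0x62, 0xC4, 0x89, 0x13, 0x26, 0x4C, 0x98};

    // Rotate the low 8 bits of 'value' left by one position.
    // Assumes 'value' is in the range 0..255.
    static int rotateLeft8(int value) {
        return ((value << 1) | (value >>> 7)) & 0xFF;
    }

    // True if every element equals the previous one rotated left by one bit.
    static boolean isRotationSequence(int[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i] != rotateLeft8(values[i - 1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isRotationSequence(PHASES));
    }
}
```

With the binary spelling this relationship is visible by eye; with the hexadecimal spelling it takes code like the above, or a mental conversion, to confirm.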
> >
> >
> > DETAILS
> >
> > SPECIFICATION:
> >
> > Section 3.10.1 ("Integer Literals") of the JLS3 should be changed to add the
> > following:
> >
> > IntegerLiteral:
> >        DecimalIntegerLiteral
> >        HexIntegerLiteral
> >        OctalIntegerLiteral
> >        BinaryIntegerLiteral         // Added
> >
> > BinaryIntegerLiteral:
> >        BinaryNumeral IntegerTypeSuffix_opt
> >
> > BinaryNumeral:
> >        0 b BinaryDigits
> >        0 B BinaryDigits
> >
> > BinaryDigits:
> >        BinaryDigit
> >        BinaryDigit BinaryDigits
> >
> > BinaryDigit: one of
> >        0 1
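
As a rough cross-check of the grammar above, the productions can be collapsed into one regular expression. This is only a sketch: the class and method names are invented, and IntegerTypeSuffix is taken to be the existing 'l' or 'L' suffix.

```java
import java.util.regex.Pattern;

// Illustration only; a recognizer for the BinaryIntegerLiteral production as proposed.
class BinaryLiteralGrammar {
    // "0 b BinaryDigits" or "0 B BinaryDigits", then an optional l/L suffix.
    private static final Pattern BINARY_LITERAL = Pattern.compile("0[bB][01]+[lL]?");

    static boolean matches(String token) {
        return BINARY_LITERAL.matcher(token).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("0b101"));   // at least one digit after the prefix
        System.out.println(matches("0B101L"));  // upper-case prefix with a long suffix
        System.out.println(matches("0b"));      // no digits, so not a literal
        System.out.println(matches("0b102"));   // '2' is not a BinaryDigit
    }
}
```

Note that the grammar as written here has no digit separator, so a token such as 0b1010_0001 is not matched by this sketch.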
> >
> > COMPILATION:
> >
> > Binary literals would be compiled to class files in the same fashion as
> > existing decimal, hexadecimal, and octal literals are. No special support or
> > changes to the class file format are needed.
> >
> > TESTING:
> >
> > The feature can be tested in the same way as existing decimal, hexadecimal,
> > and octal literals are: create a set of constants in source code, including
> > the maximum and minimum positive and negative values for integer and long
> > types, and verify at runtime that they have the correct values.
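
A sketch of such a boundary test follows. It makes two assumptions beyond the proposal text: that a binary literal with the top bit set is read as a two's-complement value, as hexadecimal literals already are, and that the compiler accepts underscores as digit separators (used here only so the digit counts can be checked by eye). The class and field names are invented.

```java
// Illustration only; assumes two's-complement treatment of binary literals
// (as for hex) and underscore digit separators.
class BoundaryValuesDemo {
    static final int INT_MAX = 0b01111111_11111111_11111111_11111111;
    static final int INT_MIN = 0b10000000_00000000_00000000_00000000;
    static final int INT_MINUS_ONE = 0b11111111_11111111_11111111_11111111;

    static final long LONG_MAX =
        0b01111111_11111111_11111111_11111111_11111111_11111111_11111111_11111111L;
    static final long LONG_MIN =
        0b10000000_00000000_00000000_00000000_00000000_00000000_00000000_00000000L;

    public static void main(String[] args) {
        System.out.println(INT_MAX == Integer.MAX_VALUE);
        System.out.println(INT_MIN == Integer.MIN_VALUE);
        System.out.println(INT_MINUS_ONE == -1);
        System.out.println(LONG_MAX == Long.MAX_VALUE);
        System.out.println(LONG_MIN == Long.MIN_VALUE);
    }
}
```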
> >
> >
> > LIBRARY SUPPORT:
> >
> > The methods Integer.decode(String) and Long.decode(String) should be modified
> > to parse binary numbers (as specified above) in addition to their existing
> > support for decimal, hexadecimal, and octal numbers.
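
For reference, Integer.decode as shipped treats a leading "0" as an octal prefix, so "0b101" is rejected with a NumberFormatException. The sketch below shows one way the requested behaviour could look from the caller's side. It is a hypothetical wrapper, not a proposed patch to the JDK, and the class and method names are invented.

```java
// Illustration only; a hypothetical wrapper around Integer.decode.
class BinaryDecode {
    // Accepts an optional sign followed by "0b"/"0B" and binary digits.
    // Any other input is handed to Integer.decode unchanged.
    static int decode(String text) {
        String body = text;
        boolean negative = false;
        if (body.startsWith("-")) {
            negative = true;
            body = body.substring(1);
        } else if (body.startsWith("+")) {
            body = body.substring(1);
        }
        if (body.startsWith("0b") || body.startsWith("0B")) {
            String digits = body.substring(2);
            if (!digits.matches("[01]+")) {
                throw new NumberFormatException("Not a binary number: " + text);
            }
            // Parse as long first so that the int range check is explicit.
            long magnitude = Long.parseLong(digits, 2);
            long value = negative ? -magnitude : magnitude;
            if (value < Integer.MIN_VALUE || value > Integer.MAX_VALUE) {
                throw new NumberFormatException("Out of int range: " + text);
            }
            return (int) value;
        }
        return Integer.decode(text);
    }

    public static void main(String[] args) {
        System.out.println(decode("0b101"));
        System.out.println(decode("-0B11"));
        System.out.println(decode("0x1F"));
    }
}
```

Like Integer.decode for hexadecimal input, this sketch treats the digits as a magnitude with an optional sign rather than as a two's-complement bit pattern.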
> >
> >
> > REFLECTIVE APIS:
> >
> > No updates to the reflection APIs are needed.
> >
> >
> > OTHER CHANGES:
> >
> > No other changes are needed.
> >
> >
> > MIGRATION:
> >
> > Individual decimal, hexadecimal, or octal constants in existing code can be
> > updated to binary as a programmer desires.
> >
> >
> > COMPATIBILITY
> >
> >
> > BREAKING CHANGES:
> >
> > This feature would not break any existing programs, since the suggested
> > syntax is currently considered to be a compile-time error.
> >
> >
> > EXISTING PROGRAMS:
> >
> > Class file format does not change, so existing programs can use class files
> > compiled with the new feature without problems.
> >
> >
> > REFERENCES:
> >
> > The GCC/G++ compiler, which already supports this syntax (as of version 4.3)
> > as an extension to standard C/C++.
> > http://gcc.gnu.org/gcc-4.3/changes.html
> >
> > The Ruby language, which supports binary literals:
> > http://wordaligned.org/articles/binary-literals
> >
> > The Python language added binary literals in version 2.6:
> > http://docs.python.org/dev/whatsnew/2.6.html#pep-3127-integer-literal-support-and-syntax
> >
> > EXISTING BUGS:
> >
> > "Language support for literal numbers in binary and other bases"
> > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5025288
> >
> > URL FOR PROTOTYPE (optional):
> >
> > None.
> >
> >


      


