PROPOSAL: Unsigned Integer Widening Operator

Reinier Zwitserloot reinier at zwitserloot.com
Wed Mar 25 04:39:26 PDT 2009


Noel, Bruce:

Perhaps Noel has a point here. Imagine:

public class ByteUtils {
     // Must be static so that it can be statically imported.
     public static int b2i(byte b) {
         return b & 0xFF;
     }
}

import static ByteUtils.b2i;

result = (result << 8) | b2i(contents[i]);

vs:

result = (result << 8) | (+)contents[i];


The second version is hardly better, and it suffers from cast ambiguity
if you add more fluff to the expression. Once you also add parens to
localize the cast like so:

result = (result << 8) | ((+)contents[i]);

The static import starts to win, in my book, and that's before  
considering the impact of a language change.
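
For concreteness, here is a minimal sketch of what the utility-class
route might look like. The names are illustrative assumptions only:
readBigEndian is a hypothetical helper, not part of any existing library.

```java
// Hypothetical helper class, sketched for illustration; not a JDK API.
final class ByteUtils {
    private ByteUtils() {}

    /** Widens a byte to int without sign extension (result is 0..255). */
    static int b2i(byte b) {
        return b & 0xFF;
    }

    /**
     * Reads `length` bytes (at most 4) starting at `idx` and combines
     * them into one int, most significant byte first.
     */
    static int readBigEndian(byte[] contents, int idx, int length) {
        int result = 0;
        for (int i = idx; i < idx + length; i++) {
            result = (result << 8) | b2i(contents[i]);
        }
        return result;
    }
}
```

With a static import of b2i, the call site stays about as short as the
(+) form, and no parentheses are needed to localize it.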



Did you not read Bruce's introduction to his three proposals? It
provides some useful background information.

If you've ever worked with bytes in Java, you may remember that the
main issue is crippling wordiness. A librar
  --Reinier Zwitserloot
Like it? Tip it!
http://tipit.to



On Mar 25, 2009, at 11:30, Noel Grandin wrote:

>
> I would think that a utility class that performed widening operations,
> and maybe some other useful stuff would be preferable to extending the
> language.
>
> Perhaps a BinaryMath class to live alongside java.lang.Math?
>
> Regards, Noel.
>
> Bruce Chapman wrote:
>> Title
>> Unsigned Integer Widening Operator
>>
>> latest html version at http://docs.google.com/Doc?id=dcvp3mkv_2k39wt5gf&hl
>>
>> AUTHOR(S): Bruce Chapman
>>
>> OVERVIEW
>>
>>
>>
>> FEATURE SUMMARY:
>> Add an unsigned widening operator to convert bytes (in particular),
>> shorts, and chars (for completeness) to int while avoiding sign extension.
>>
>>
>> MAJOR ADVANTAGE:
>>
>> Byte manipulation code becomes littered with (b & 0xFF) expressions in
>> order to reverse the sign extension that occurs when a byte field or
>> variable or array access appears on either side of an operator and is
>> thus subject to widening conversion with its implicit sign extension.
>> This masking with 0xFF can detract from the clarity of the code by
>> obscuring the actual algorithm. It is the Java language's rules, not
>> the algorithm itself, that demand this masking operation, which can
>> appear redundant to the uninitiated.
>>
>>
>> It is highly intentional that the new operator (+) can be read as a
>> cast to a positive.
>>
>>
>>
>> MAJOR DISADVANTAGE:
>> A new operator.
>>
>>
>> ALTERNATIVES:
>>
>> Explicit masking. If java.nio.ByteBuffer were extensible (it isn't),
>> unsigned get and set methods could be added to hide the masking in an
>> API. Extension methods could be employed to that end if they were
>> implemented.
>>
>> EXAMPLES
>> SIMPLE EXAMPLE:
>>
>>            byte[] buffer = ...; int idx=...; int length=...;
>>
>>            int value=0;
>>
>>            for(int i = idx; i < idx + length; i++) {
>>                value = (value << 8) | (buffer[i] & 0xff);
>>            }
>>
>> can be recoded as
>>
>>
>>            for(int i = idx; i < idx + length; i++) {
>>                value = (value << 8) | (+)buffer[i];
>>            }
>>
>>
>> ADVANCED EXAMPLE:
>>
>>    private int getBerValueLength(byte[] contents, int idx) {
>>        if((contents[idx] & 0x80) == 0) return contents[idx];
>>        int lenlen = (+)contents[idx] ^ 0x80;  // invert high bit which = 1
>>        int result=0;
>>        for(int i = idx+1; i < idx + 1 + lenlen; i++ ) {
>>            result = (result << 8) | (+)contents[i];
>>        }
>>        return result;
>>    }
>>
>> DETAILS
>> SPECIFICATION:
>>
>> amend  15.15
>>
>> The unary operators include +, -, ++, --, ~, !, unsigned integer
>> widening operator and cast operators.
>>
>>
>>
>> add the following to the grammars in 15.15
>>
>>
>>
>> The following productions from §new section are repeated here for
>> convenience:
>>
>>
>> UnsignedWideningExpression:
>>
>>    UnsignedIntegerWideningOperator UnaryExpression
>>
>>
>> UnsignedIntegerWideningOperator:
>>
>>        ( + )
>>
>>
>>
>> Add a new section to 15 - between "15.15 Unary Expressions" and
>> "15.16 Cast Expressions" would seem ideal in terms of context and
>> precedence level.
>>
>>
>> The unsigned integer widening operator is a unary operator which may be
>> applied to expressions of type byte, short and char. It is a
>> compile-time error to apply this operator to other types.
>>
>>
>> UnsignedWideningExpression:
>>
>>    UnsignedIntegerWideningOperator UnaryExpression
>>
>>
>> UnsignedIntegerWideningOperator:
>>
>>        ( + )
>>
>>
>> The unsigned integer widening operator converts its operand to type int.
>> Unary numeric promotion (§) is NOT performed on the operand. For a byte
>> operand, the low-order 8 bits of the result have the same values as
>> in the operand. For short and char operands, the result's low-order
>> 16 bits have the same value as the operand's. The remaining high-order
>> bits are set to zero. This is effectively a zero-extending widening
>> conversion and is equivalent to the following expression for a byte
>> operand x,
>>
>>    x & 0xFF
>>
>> and equivalent to the following for a short or char operand y
>>
>>    y & 0xFFFF
>>
>>
>> Other sections have lists of operators for which various things apply.
>> Add to these as appropriate - yet to be determined.
>>
>> Note the specification above could also be amended to allow the
>> operator to zero extend an int to a long; however, the utility value of
>> this is uncertain.
>>
>> COMPILATION:
>>
>> Compilation may be equivalent to the masking operation above. HotSpot
>> could detect the redundant sign extension followed by masking out the
>> sign-extended bits and remove both. If that were the case, the operator
>> could be applied to every access of a byte field, variable or array to
>> indicate treatment as unsigned byte, with no cost.
>>
>>
>> For a char, the operator is equivalent to a widening conversion to int.
>> The new operator is permitted on a char expression because there is no
>> reason to disallow it. However it would be equally effective if it did
>> not apply to char.
>>
>>
>> TESTING:
>>
>> There are no gnarly use cases, so testing is straightforward. It could
>> be as simple as compiling and executing a main class with a handful of
>> asserts.
>>
>> LIBRARY SUPPORT:
>>
>> No library support required.
>>
>> REFLECTIVE APIS:
>>
>> None
>>
>> OTHER CHANGES:
>>
>> none foreseen
>>
>> MIGRATION:
>>
>> COMPATIBILITY
>> BREAKING CHANGES:
>>
>> None
>>
>> EXISTING PROGRAMS:
>>
>> Tools could detect the specific masking operations used to zero extend a
>> previously sign extended byte or short, and replace that with the new
>> operator.
>>
>> REFERENCES
>> EXISTING BUGS:
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4186775
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4879804
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4504839
>>
>>
>> URL FOR PROTOTYPE (optional):
>> None
>>
>>
>>
>>
>
>
> Disclaimer: http://www.peralex.com/disclaimer.html
>
>
>
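
As a rough illustration of the semantics described in the quoted
SPECIFICATION section, the proposed (+) operator can be emulated in
current Java with explicit masks. This is only a sketch: ZeroExtend and
widen are hypothetical names chosen here, and the overloads stand in for
the operator's three permitted operand types.

```java
// Hypothetical emulation of the proposed (+) operator using masks.
final class ZeroExtend {
    private ZeroExtend() {}

    /** byte operand: keep the low 8 bits, zero the rest (x & 0xFF). */
    static int widen(byte x) {
        return x & 0xFF;
    }

    /** short operand: keep the low 16 bits, zero the rest (y & 0xFFFF). */
    static int widen(short y) {
        return y & 0xFFFF;
    }

    /** char operand: char is already unsigned, so plain widening suffices. */
    static int widen(char c) {
        return c;
    }
}
```

By contrast, an ordinary widening of (byte) -1 to int sign-extends and
yields -1, which is exactly the behavior the masks undo.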



