Java prefix and unary operators together


I was working on Java prefix operators and came across this behavior

i = +--j; // does not give an error
i = -++j; // does not give an error

i = ---j; // gives an error
i = +++j; // gives an error

Why is this happening?


1
On

The compiler's tokenizer selects tokens greedily from left to right. When it sees +--j, the longest sequence that forms a valid token is +, since +- is not a valid token, so + becomes the first token. It then looks for the longest token starting at the next character, which is --, and finally reads the identifier j. So the result is + -- j, i.e. +(--j).

For ---j it takes -- as the longest valid first token, then - as the next token, then j. That gives -- - j, i.e. --(-j), which, as @Mureinik pointed out, is not valid: -- needs a variable as its operand, and -j is a value, not a variable.
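The greedy selection described above can be sketched with a toy tokenizer. This is only an illustration, not how javac is implemented: the class name GreedyLexerSketch, the tokenize method, and the tiny operator table are all hypothetical, and only identifiers plus a handful of operators are recognized.

```java
import java.util.ArrayList;
import java.util.List;

public class GreedyLexerSketch {
    // Hypothetical, deliberately tiny operator table.
    // Longer spellings come first so the longest match wins.
    private static final String[] OPERATORS = {"++", "--", "+", "-", "="};

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < src.length()) {
            char c = src.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++; // whitespace only separates tokens
                continue;
            }
            if (Character.isJavaIdentifierStart(c)) {
                int end = pos + 1;
                while (end < src.length()
                        && Character.isJavaIdentifierPart(src.charAt(end))) {
                    end++;
                }
                tokens.add(src.substring(pos, end));
                pos = end;
                continue;
            }
            // Take the first (and therefore longest) operator that matches here.
            String matched = null;
            for (String op : OPERATORS) {
                if (src.startsWith(op, pos)) {
                    matched = op;
                    break;
                }
            }
            if (matched == null) {
                throw new IllegalArgumentException("Unexpected character at " + pos);
            }
            tokens.add(matched);
            pos += matched.length();
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"+--j", "-++j", "---j", "+++j"}) {
            System.out.println(s + " -> " + tokenize(s));
        }
        // +--j -> [+, --, j]
        // -++j -> [-, ++, j]
        // ---j -> [--, -, j]
        // +++j -> [++, +, j]
    }
}
```

The last two lines of output show the problem: the tokenizer has already committed to -- (or ++) before the remaining - (or +) is seen, so the operand of the increment or decrement ends up being an expression rather than a variable.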

0
On

Before the compiler even gets to the point where it knows which operators are present, it must split the characters into tokens. I can see 3 possible ways to read ---j:

  • - - -j // 3 unary - operators
  • -- -j // predecrement -- followed by unary -
  • - --j // unary - followed by predecrement --

The case with +++j is equivalent, with preincrement ++ and unary + substituted.
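To make the three readings listed above concrete, here is a small sketch in which whitespace forces each tokenization explicitly. The class and method names are made up for illustration; the second reading is left as a comment because it does not compile.

```java
public class ThreeReadings {
    // Reading 1: three unary minus operators.
    static int threeUnaryMinus(int j) {
        return - - -j; // same as -(-(-j)), i.e. -j
    }

    // Reading 3: unary minus applied to a predecrement.
    static int minusThenPredecrement(int j) {
        return - --j; // same as -(--j): j is decremented first, then negated
    }

    // Reading 2: predecrement applied to a unary minus.
    // static int predecrementThenMinus(int j) {
    //     return -- -j; // does not compile: -j is a value, not a variable
    // }

    public static void main(String[] args) {
        System.out.println(threeUnaryMinus(5));       // prints -5
        System.out.println(minusThenPredecrement(5)); // prints -4
    }
}
```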

Why does Java interpret it as -- followed by -, the second case, which is the only one that does not compile? Because tokenization is greedy. Section 3.2 of the JLS states:

A raw Unicode character stream is translated into a sequence of tokens, using the following three lexical translation steps, which are applied in turn:

...

The longest possible translation is used at each step, even if the result does not ultimately make a correct program while another lexical translation would. There is one exception: if lexical translation occurs in a type context (§4.11) and the input stream has two or more consecutive > characters that are followed by a non-> character, then each > character must be translated to the token for the numerical comparison operator >.

(bold emphasis mine)

The compiler greedily sees two - characters and immediately declares it to be the -- token, without considering the third - coming next.

This has nothing to do with operator associativity or even operator precedence and everything to do with lexical analysis, i.e. how the characters are split into tokens before any parsing happens.

As has already been mentioned by @Mureinik, placing clarifying parentheses properly will compel the compiler to parse it correctly. But it accomplishes this by breaking up the characters into different tokens, not by changing the precedence of operations.

The expression -++j is unaffected by the greediness of the compiler; -+ isn't a valid token, so it is parsed correctly as the token - followed by the token ++, and similarly for the expression +--j.
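As a quick check of the two unaffected expressions, the following sketch evaluates both. The class and method names are hypothetical and the input value 5 is arbitrary.

```java
public class UnambiguousPrefixes {
    static int plusThenPredecrement(int j) {
        int i = +--j; // tokens: +, --, j  =>  +(--j)
        return i;
    }

    static int minusThenPreincrement(int j) {
        int i = -++j; // tokens: -, ++, j  =>  -(++j)
        return i;
    }

    public static void main(String[] args) {
        System.out.println(plusThenPredecrement(5));  // prints 4
        System.out.println(minusThenPreincrement(5)); // prints -6
    }
}
```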

1
On

Because the tokenizer takes the longest match first, +++j is read as ++(+j) (and likewise ---j as --(-j)). Since ++ can only be applied to an l-value (i.e., a variable) and +j is a value rather than a variable, you get a compilation error.

You could use parentheses to fix this, though: i = +(++j);.