I just saw a cool optimization by a C compiler. This is assembly for the Freescale HCS08 series.
TST var
BEQ loc_52
CLRA
CPHX #$a601
loc_52:
STA var
One thing sticks out: the seemingly unused result of the CPHX
(status flags are overwritten later in the code, before any
conditionals). And after fixing my custom HCS08 disassembler, I
noticed loc_52
is actually in the middle of the CPHX
instruction. I double checked the instruction decoding, instruction
lengths, branch offsets. All looked good. But that 0xA601 immediate
value is interesting. Decoded, it would mean LDA #1
, which makes
sense here, since it’s about to be stored in var
. But no, the CLRA
and CPHX
machine codes are correct…
Fast forward an hour or two, and it dawned on me how clever this is. This code inverts a boolean flag. So in C, it would be
if (var) var = 0;
else var = 1;
The straight-forward assembly code translation of this would be
TST var
BEQ loc_53
CLRA
BRA loc_55
loc_53:
LDA #1
loc_55:
STA var
Imagine we go the CLRA
execution path. Then we want to skip the
LDA
instruction. HC08 doesn’t have a conditional skip or conditional
load. But what if we could make the LDA
instruction not doing
anything? E.g. by turning it into a comparison operation instead of a
load. That means one byte overhead instead of the branche’s two. Saved
a byte, in every !a
expression.
So CPHX
is used for two-byte instructions. I don’t think HC08 has
instructions with three bytes arguments, so the trick wouldn’t work
for longer instructions. In the code I’m reverse engineering, I
haven’t found any one-byte instructions being disabled using the same
technique.
Cute.