I just saw a cool optimization by a C compiler. This is assembly for the Freescale HCS08 series.
TST var
BEQ loc_52
CLRA
CPHX #$a601
loc_52:
STA var
One thing sticks out: the seemingly unused result of the CPHX
(status flags are overwritten later in the code, before any
conditionals). And after fixing my custom HCS08 disassembler, I
noticed loc_52 is actually in the middle of the CPHX
instruction. I double checked the instruction decoding, instruction
lengths, branch offsets. All looked good. But that 0xA601 immediate
value is interesting. Decoded as an instruction, those two bytes
are LDA #1, which makes sense here, since the result is about to be
stored in var. But no, the CLRA and CPHX machine codes are correct…
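To make the two readings of the same bytes concrete, here is a minimal sketch in C of a decoder for just the handful of opcodes involved. It is not my actual disassembler. Apart from 0xA6 (LDA immediate), the opcode values are the ones I believe the HCS08 uses and should be checked against the reference manual. VAR_ADDR is a made-up direct-page address standing in for var, and disassemble_from and insn_len are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define VAR_ADDR 0x80 /* assumption: the real address of var is unknown */

/* Byte image of the compiler's snippet (opcodes assumed, see above). */
static const uint8_t code[] = {
    0x3D, VAR_ADDR,   /* offset 0: TST var                     */
    0x27, 0x02,       /* offset 2: BEQ +2 -> offset 4 + 2 = 6  */
    0x4F,             /* offset 4: CLRA                        */
    0x65, 0xA6, 0x01, /* offset 5: CPHX #$a601                 */
    0xB7, VAR_ADDR    /* offset 8: STA var                     */
};

/* Instruction length for the opcodes used here, 0 if unknown. */
static size_t insn_len(uint8_t op) {
    switch (op) {
    case 0x4F: return 1;                                  /* CLRA */
    case 0x3D: case 0x27: case 0xA6: case 0xB7: return 2; /* TST, BEQ, LDA, STA */
    case 0x65: return 3;                                  /* CPHX #imm16 */
    default:   return 0;
    }
}

/* Linear disassembly of code[start..] into out, one instruction per line. */
static void disassemble_from(size_t start, char *out, size_t outsz) {
    size_t pos = start, used = 0;
    assert(out != NULL && outsz > 0);
    out[0] = '\0';
    while (pos < sizeof code) {
        const uint8_t *p = &code[pos];
        size_t len = insn_len(p[0]);
        int n;
        if (len == 0 || pos + len > sizeof code) break; /* unknown or truncated */
        switch (p[0]) {
        case 0x3D: n = snprintf(out + used, outsz - used, "TST $%02x\n", (unsigned)p[1]); break;
        case 0x27: n = snprintf(out + used, outsz - used, "BEQ +%d\n", (int)(int8_t)p[1]); break;
        case 0x4F: n = snprintf(out + used, outsz - used, "CLRA\n"); break;
        case 0x65: n = snprintf(out + used, outsz - used, "CPHX #$%02x%02x\n",
                                (unsigned)p[1], (unsigned)p[2]); break;
        case 0xA6: n = snprintf(out + used, outsz - used, "LDA #%u\n", (unsigned)p[1]); break;
        default:   n = snprintf(out + used, outsz - used, "STA $%02x\n", (unsigned)p[1]); break;
        }
        if (n < 0 || (size_t)n >= outsz - used) break;  /* output buffer full */
        used += (size_t)n;
        pos += len;
    }
}
```

Calling disassemble_from(0, ...) walks the bytes the way a linear disassembler does and reports the CPHX. Calling disassemble_from(6, ...) starts at the BEQ target instead, and the same bytes come out as LDA #1 followed by STA.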
Fast forward an hour or two, and it dawned on me how clever this is. This code inverts a boolean flag. So in C, it would be
if (var) var = 0;
else var = 1;
The straightforward assembly translation of this would be:
TST var
BEQ loc_53
CLRA
BRA loc_55
loc_53:
LDA #1
loc_55:
STA var
Imagine we go down the CLRA execution path. Then we want to skip the
LDA instruction. HCS08 doesn’t have a conditional skip or conditional
load. But what if we could make the LDA instruction do nothing?
E.g. by turning it into the operand of a comparison instead of a
load: put a CPHX opcode in front of it, and the two LDA bytes become
the 16-bit immediate that H:X is compared against, which only touches
the flags. That means one byte of overhead instead of the branch’s two.
One byte saved in every !a expression.
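As a sanity check, both versions can be run through a toy interpreter. The sketch below, in C, models only the Z flag and the seven opcodes that appear here. The opcode values are my assumption of the HCS08 encodings (only 0xA6 for LDA immediate is confirmed above), VAR_ADDR is a made-up address for var, and run, tricky and plain are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VAR_ADDR 0x80 /* assumption: the real address of var is unknown */

/* Compiler's version: CPHX swallows the LDA on the fall-through path. */
static const uint8_t tricky[] = {
    0x3D, VAR_ADDR, 0x27, 0x02, 0x4F, 0x65, 0xA6, 0x01, 0xB7, VAR_ADDR
};
/* Straightforward version: BRA (assumed opcode 0x20) jumps over the LDA. */
static const uint8_t plain[] = {
    0x3D, VAR_ADDR, 0x27, 0x03, 0x4F, 0x20, 0x02, 0xA6, 0x01, 0xB7, VAR_ADDR
};

/* Execute prog until the PC runs off its end and return the final value
 * of var, or -1 on an unknown or truncated instruction. */
static int run(const uint8_t *prog, size_t len, uint8_t var_init) {
    uint8_t mem[256] = {0};
    uint8_t a = 0xEE;   /* junk, so a path that never loads A would show up */
    uint16_t hx = 0;    /* H:X is never written here; its value is irrelevant */
    int z = 0;          /* only the Z flag is modelled */
    size_t pc = 0;

    mem[VAR_ADDR] = var_init;
    while (pc < len) {
        uint8_t op = prog[pc];
        size_t n = (op == 0x4F) ? 1 : (op == 0x65) ? 3 : 2;
        if (pc + n > len) return -1;
        switch (op) {
        case 0x3D: z = (mem[prog[pc + 1]] == 0); pc += n; break;       /* TST  */
        case 0x27:                                                     /* BEQ  */
            pc += n;
            if (z) pc = (size_t)((ptrdiff_t)pc + (int8_t)prog[pc - 1]);
            break;
        case 0x20:                                                     /* BRA  */
            pc += n;
            pc = (size_t)((ptrdiff_t)pc + (int8_t)prog[pc - 1]);
            break;
        case 0x4F: a = 0; z = 1; pc += n; break;                       /* CLRA */
        case 0x65:                                                     /* CPHX */
            z = (hx == (uint16_t)((prog[pc + 1] << 8) | prog[pc + 2]));
            pc += n;
            break;
        case 0xA6: a = prog[pc + 1]; z = (a == 0); pc += n; break;     /* LDA  */
        case 0xB7: mem[prog[pc + 1]] = a; z = (a == 0); pc += n; break; /* STA */
        default: return -1;
        }
    }
    return mem[VAR_ADDR];
}
```

With these encodings, tricky is 10 bytes and plain is 11. Calling run on either array with var_init set to 0 and then to a nonzero value should give the same result for both, which is the inverted flag.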
So CPHX is used to hide two-byte instructions. I don’t think HCS08 has
instructions with three-byte operands, so the trick wouldn’t work
for longer instructions. In the code I’m reverse engineering, I
haven’t found any one-byte instructions being disabled using the same
technique.
Cute.