sphaleron: September 2012

Caveat: Contents certainly platform-dependent -- these results are from GCC 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3). Your mileage may vary. If you get different results on other platforms, I'd love to hear about it in the comments!

Today I came across a wonderfully devious C++ gotcha, which tops my "evil C++" charts so far: if you fail to return a value at the end of a non-void member function, what you get back on this platform is the address of the instance (i.e. this), cast to the return type of the function.

Of course, one should always return something from a non-void function; flowing off the end of one is undefined behavior. But it's an easy mistake to make, due to a typo or a misconception.

Consider, for example:

Quite surprisingly, this compiles with no errors or warnings by default!

Enabling -Wreturn-type (which comes with -Wall) does get us this:

$ g++ -o null_member null_member.cpp -g -O0 -Wall
null_member.cpp: In member function ‘int foo::thinger()’:
null_member.cpp:5:24: warning: no return statement in function returning non-void [-Wreturn-type]

It seems like this should always be an error... there is never a time that this is a good idea.

In any case, the program outputs:

$ ./null_member
f @ 0x7fffc30e54cf
f.thinger() = c30e54cf

Clearly, nothing in the C++ code is doing this.

But having studied Apple IIe assembly for half a semester back in high school, I thought I'd try my luck with the GDB disassembler.

$ gdb ./null_member
Reading symbols from null_member...done.
(gdb) disassemble main
Dump of assembler code for function main(int, char**):
0x00000000004004f4 <+0>: push %rbp
0x00000000004004f5 <+1>: mov %rsp,%rbp
0x00000000004004f8 <+4>: sub $0x20,%rsp
0x00000000004004fc <+8>: mov %edi,-0x14(%rbp)
0x00000000004004ff <+11>: mov %rsi,-0x20(%rbp)
0x0000000000400503 <+15>: lea -0x1(%rbp),%rax
0x0000000000400507 <+19>: mov %rax,%rdi
0x000000000040050a <+22>: callq 0x400544 <foo::thinger()>
0x000000000040050f <+27>: mov %eax,-0x8(%rbp)
0x0000000000400512 <+30>: lea -0x1(%rbp),%rax
0x0000000000400516 <+34>: mov %rax,%rsi
0x0000000000400519 <+37>: mov $0x40063c,%edi
0x000000000040051e <+42>: mov $0x0,%eax
0x0000000000400523 <+47>: callq 0x4003f0 <printf@plt>
0x0000000000400528 <+52>: mov -0x8(%rbp),%eax
0x000000000040052b <+55>: mov %eax,%esi
0x000000000040052d <+57>: mov $0x400644,%edi
0x0000000000400532 <+62>: mov $0x0,%eax
0x0000000000400537 <+67>: callq 0x4003f0 <printf@plt>
0x000000000040053c <+72>: mov $0x0,%eax
0x0000000000400541 <+77>: leaveq
0x0000000000400542 <+78>: retq
End of assembler dump.
(gdb) disassemble foo::thinger
Dump of assembler code for function foo::thinger():
0x0000000000400544 <+0>: push %rbp
0x0000000000400545 <+1>: mov %rsp,%rbp
0x0000000000400548 <+4>: mov %rdi,-0x8(%rbp)
0x000000000040054c <+8>: pop %rbp
0x000000000040054d <+9>: retq
End of assembler dump.

Compare the last bit to the assembly for a thinger that returns 42:

Dump of assembler code for function foo::thinger():
0x0000000000400544 <+0>: push %rbp
0x0000000000400545 <+1>: mov %rsp,%rbp
0x0000000000400548 <+4>: mov %rdi,-0x8(%rbp)
0x000000000040054c <+8>: mov $0x2a,%eax
0x0000000000400551 <+13>: pop %rbp
0x0000000000400552 <+14>: retq
End of assembler dump.

In main(), at instruction 0x40050f, register %eax is copied into -0x8(%rbp), which is the location of variable i; the 32-bit accumulator register %eax is used to store the return value. This is the platform's calling convention (on x86-64 Linux, integer return values come back in %rax/%eax); writing to a known register is quicker than pushing a return value onto the stack.

In the latter thinger, 0x2a (42) is written to %eax at instruction 0x40054c, but in the former, we don't do anything. %eax is just whatever it happened to be... essentially an uninitialized variable.

In the case of a normal (static) function call, this is true: %eax is just some leftover junk. But for a method call, GCC generates the following:

0x00000000004004d2 <+30>: lea -0x1(%rbp),%rax
0x00000000004004d6 <+34>: mov %rax,%rdi
0x00000000004004d9 <+37>: callq 0x4004e6 <foo::thinger()>
0x00000000004004de <+42>: mov %eax,-0x8(%rbp)

lea (load effective address) copies the address of f (here, one byte before the base pointer %rbp) into %rax for temporary storage, before it is moved into %rdi as the this argument. Since %rax is a 64-bit wide register of which %eax comprises the lower half, %eax is left with the low 32 bits of the address of f, and that's what gets interpreted as the return value. The call to foo::thinger was expected to modify it, but didn't.

It's a wonderful piece of evil. It's a plausible typo bug which compiles without error and causes functions to return corrupt data. It depends on compiler- and machine-specific, assembly-level implementation details, invisible in the source code. Bug reports will vary by platform. And I have no evidence of this, but "no return means return NULL-ish" sure sounds like a common C++ misconception.

I caught it because I happened to be returning a pointer, which caused a segfault on dereference. But a float could easily go unnoticed, as could a pointer to the same class as this.

Happy coding, and beware of C++!

sphaleron

Thursday, September 27, 2012

Evil C++ 6: Default Method Return Values!?
