Why do we need extern "C" { #include <foo.h> } in C++?


Question

Why do we need to use:

extern "C" {
#include <foo.h>
}

Specifically:

  • When should we use it?

  • What is happening at the compiler/linker level that requires us to use it?

  • How in terms of compilation/linking does this solve the problems which require us to use it?

3/8/2019 8:56:54 PM

Accepted Answer

C and C++ are superficially similar, but they compile to object code that follows different conventions. When you include a header file with a C++ compiler, the compiler assumes every declaration in it is C++ and expects the matching definitions to follow the C++ 'ABI', or 'Application Binary Interface'. If the header actually describes code built by a C compiler, those definitions follow the C conventions instead, so the symbols do not match and the linker chokes. A link error is preferable to silently passing C++ data to a function expecting C data.

(To get into the really nitty-gritty, a C++ ABI generally 'mangles' the names of functions and methods. If you call printf() without flagging the prototype as a C function, the C++ compiler will generate code that calls a mangled symbol instead of printf; under the Itanium ABI used by GCC and Clang that symbol would be _Z6printfPKcz.)

So: use extern "C" {...} when including a C header that does not already handle C++; it's that simple. Otherwise, your object code and the compiled library will disagree about symbol names, and the linker will choke. For most headers, however, you won't even need the extern "C" block, because most system C headers already account for the fact that they might be included by C++ code and wrap their own declarations in extern "C".
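To illustrate why C++ mangles names at all, here is a minimal single-file sketch. The function names (scale, c_scale, check_linkage_demo) are made up for this example and are not part of any library. Overloads only work with C++ linkage because each one needs its own symbol; a function with C linkage keeps the bare name, so it cannot be overloaded.

```cpp
#include <cassert>

// C++ linkage: both overloads can coexist because each gets its own
// mangled symbol that encodes the parameter types.
int scale(int x)       { return x * 2; }
double scale(double x) { return x * 2.5; }

// C linkage: the symbol is the plain name "c_scale", with no room for
// parameter types, so only one function may use that name.
extern "C" int c_scale(int x) { return x * 2; }
// extern "C" double c_scale(double x);  // rejected: conflicts with the one above

// Hypothetical helper that exercises the functions above.
void check_linkage_demo() {
    assert(scale(4) == 8);       // picks scale(int)
    assert(scale(4.0) == 10.0);  // picks scale(double)
    assert(c_scale(4) == 8);
}
```

Calling check_linkage_demo() from main() runs the checks; running nm on the object file should show two mangled scale symbols and one plain c_scale.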

4/24/2019 11:03:47 PM

extern "C" determines how symbols in the generated object file should be named. If a function is declared without extern "C", the symbol name in the object file will use C++ name mangling. Here's an example.

Given test.C like so:

void foo() { }

Compiling and listing symbols in the object file gives:

$ g++ -c test.C
$ nm test.o
0000000000000000 T _Z3foov
                 U __gxx_personality_v0

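The foo function is actually called "_Z3foov". This string encodes the function's name and its parameter types, among other things: "3foo" is the length-prefixed name and the trailing "v" means it takes no parameters. (For an ordinary, non-template function like this one, the return type is not part of the mangled name.) If you instead write test.C like this: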

extern "C" {
    void foo() { }
}

Then compile and look at symbols:

$ g++ -c test.C
$ nm test.o
                 U __gxx_personality_v0
0000000000000000 T foo

You get C linkage. The name of the "foo" function in the object file is just "foo", and it doesn't have all the fancy type info that comes from name mangling.
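If you want to decode a mangled name instead of reading it by eye, GCC and Clang ship a demangler in <cxxabi.h>. The sketch below assumes one of those toolchains; demangle and check_demangle_demo are hypothetical helper names chosen for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>  // GCC/Clang specific; not available with MSVC
#include <string>

// Hypothetical helper: returns the human-readable form of an Itanium-ABI
// symbol, or the input unchanged if demangling fails.
std::string demangle(const char* symbol) {
    int status = 0;
    char* text = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return symbol;
    }
    std::string result(text);
    std::free(text);  // the returned buffer is allocated with malloc
    return result;
}

// Hypothetical helper that checks the symbol from the nm listing above.
void check_demangle_demo() {
    assert(demangle("_Z3foov") == "foo()");
    assert(demangle("_Z3fooi") == "foo(int)");
}
```

On the command line, piping the nm output through c++filt (or running nm -C) does the same job.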

You generally include a header within extern "C" {} if the code that goes with it was compiled with a C compiler but you're trying to call it from C++. When you do this, you're telling the compiler that all the declarations in the header will use C linkage. When you link your code, your .o files will contain references to "foo", not "_Z3fooblah", and "foo" is the name a library built with a C compiler will actually export.

Most modern libraries will put guards around such headers so that symbols are declared with the right linkage. For example, in a lot of the standard headers you'll find:

#ifdef __cplusplus
extern "C" {
#endif

... declarations ...

#ifdef __cplusplus
}
#endif

This makes sure that when C++ code includes the header, the symbols in your object file match what's in the C library. You should only have to put extern "C" {} around your C header if it's old and doesn't have these guards already.
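As a rough sketch of how such a guarded header behaves, here is a single-file version. The header name legacy_math.h and the function legacy_add are invented for illustration, and the header's contents are pasted inline so the example builds as one file; in a real project the definition would sit in a .c file built by a C compiler.

```cpp
#include <cassert>

/* ---- contents of a hypothetical C header, legacy_math.h ---- */
#ifndef LEGACY_MATH_H
#define LEGACY_MATH_H

#ifdef __cplusplus
extern "C" {   /* only seen by C++ compilers */
#endif

int legacy_add(int a, int b);

#ifdef __cplusplus
}
#endif

#endif /* LEGACY_MATH_H */
/* ---- end of header ---- */

// Stand-in for the C implementation. The earlier declaration already gave
// legacy_add C linkage, so this definition is emitted as plain "legacy_add".
int legacy_add(int a, int b) { return a + b; }

// Hypothetical helper that calls the function through the guarded declaration.
void check_guarded_header_demo() {
    assert(legacy_add(2, 3) == 5);
}
```

Because __cplusplus is not defined when a C compiler reads the header, the same file works unchanged from both languages.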


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow