Application Binary Interface (ABI)

An ABI is the interface between two binary program modules; usually one of them is a library and the other is the user program. It is the set of assumptions the compiler makes about the exact bit-for-bit representation of data and the use of the computer's actual hardware resources whenever it has to negotiate something that lies outside a single routine. This includes:

  • the position, order, and layout in memory of the members of a struct or class
  • the argument types and return type of a function (for C++, this applies to every relevant overload too)
  • the special member functions of a class (C++ only)
  • the hierarchy and ordering of virtual functions (C++ only)
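
The first item in the list can be made concrete with a small sketch. The two struct types below are hypothetical (they do not come from any real library) and hold the same members in a different order. Because padding depends on member order and on the platform's alignment rules, their sizes and member offsets can differ, and that is exactly what two separately compiled modules have to agree on.

```c
#include <stddef.h>

/* Hypothetical types: same members, different order. */
struct record_v1 {
  char tag;
  long long value;
  char flag;
};

struct record_v2 {
  long long value;
  char tag;
  char flag;
};

/* Guaranteed by the C standard: the first member sits at offset 0. */
_Static_assert(offsetof(struct record_v1, tag) == 0, "first member at offset 0");
_Static_assert(offsetof(struct record_v2, value) == 0, "first member at offset 0");

/* Offset at which each layout stores `value`; the result is ABI-specific. */
size_t value_offset_v1(void) { return offsetof(struct record_v1, value); }
size_t value_offset_v2(void) { return offsetof(struct record_v2, value); }

/* Total size of each layout, including any padding the ABI inserts. */
size_t size_v1(void) { return sizeof(struct record_v1); }
size_t size_v2(void) { return sizeof(struct record_v2); }
```

On a typical 64-bit platform the first layout comes out larger than the second, but the exact numbers are decided by the platform ABI rather than by the C language, so a module compiled against one layout cannot safely read data produced with the other.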

Every platform has its own ABI standard, and different compilers or vendors may implement the ABI differently.

Because C does not mangle assembly symbols (that is, it does not change the function name used in the assembly code), ABI mismatches are easy to create. If two libraries or implementation files define a function with the same name and the same number of arguments but with different return types or argument types, there is no guarantee about which function the linker will pick. The following example shows how such an incident can happen:

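// do_stuff.c
// __int128_t is a GCC/Clang extension, not standard C.
__int128_t do_stuff(__int128_t value) {
  if (value < 0) {
    return 0;
  }
  return value;
}

// main.c
#include <limits.h>

// This declaration disagrees with the definition in do_stuff.c, but the two
// files are compiled separately, so the compiler does not warn about it.
extern long long do_stuff(long long value);

int main(void) {
  // Undefined behavior: the linker resolves do_stuff to the 128-bit version,
  // which does not match the 64-bit argument and return value used here.
  long long x = do_stuff(-LLONG_MAX);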
  /* wow cool stuff with x ! */
  if (x != 0)
    return 1;
  return 0;
}

This is why many features in C and C++ are introduced with the prefix _ or __, which the committee has specifically declared as reserved for identifiers. In the book Modern C, the author recommends using the library name as the prefix of functions and variables, as Pthreads does with pthread_. The aim is to avoid ABI violations that cannot be detected by compilers and linkers, each of which has its own way of implementing the standard.
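
A minimal sketch of the prefixing convention, assuming two hypothetical libraries named liba and libb that both want a function called do_stuff. Giving each function its library's name as a prefix keeps the two symbols distinct, so the linker has nothing to confuse even though the types differ.

```c
/* Hypothetical library "liba": clamps negative 64-bit integers to zero. */
long long liba_do_stuff(long long value) {
  return value < 0 ? 0 : value;
}

/* Hypothetical library "libb": same idea, different argument and return type. */
double libb_do_stuff(double value) {
  return value < 0.0 ? 0.0 : value;
}
```

The prefix only avoids clashes between libraries; it does not help when a single library changes the types of one of its own functions.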

An old solution offered by compilers is to give the programmer control over what the function is named in the assembly code, i.e., to change the function's assembly label (sometimes called the binary symbol).
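
As a sketch of this approach, GCC and Clang accept an asm label after a declarator as a language extension. The symbol name do_stuff_i64 below is a hypothetical choice made for illustration; the point is only that the source-level name stays do_stuff while the object file exports a different, type-specific symbol.

```c
/* GCC/Clang extension: the asm label sets the symbol name emitted for this
 * function. "do_stuff_i64" is a hypothetical name chosen for this example. */
long long do_stuff(long long value) __asm__("do_stuff_i64");

/* Callers still write do_stuff(...); the object file defines do_stuff_i64. */
long long do_stuff(long long value) {
  if (value < 0) {
    return 0;
  }
  return value;
}
```

Inspecting the resulting object file with a tool such as nm should list do_stuff_i64 rather than do_stuff, so a 128-bit variant could be exported under its own label without the two colliding.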

Another solution, proposed by Meneide 1, is to introduce transparent aliases with the keyword _Alias 2. Programmers can change what a function is called at the source level without changing the underlying assembly label of that function. That said, this could increase the binary size of the program, because functions end up duplicated.

The shortcoming of these solutions, however, is that they do not allow linking against an old version of the library if the program was built with a newer version of it. This is because the old version of the library does not have the symbols that the newer version provides. In other words, the library is not forward compatible.

Footnotes
1. To Save C, We Must Save ABI, Meneide J., March 13, 2022.
2. N2970 Transparent Aliases, Meneide J., May 05, 2022.