ELF (Executable and Linkable Format) is the most common object and executable file format on Unix-like systems.

Most people don’t need to care about ELF until production bites:

  • ldd looks fine, but the binary still fails with “not found”
  • you hit “undefined symbol” after a library upgrade
  • you get “wrong ELF class” when a 64-bit binary tries to load a 32-bit library (or vice versa)
  • you’re handed a random binary and need to answer: static or dynamic? which loader? which segments?

Once you understand the ELF structure, the chain of compile -> link -> load -> run becomes debuggable.

A 3-minute triage path for dynamic linking issues

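Rule of thumb: ldd shows how dependencies would resolve under the current environment; it is not a guarantee of runtime behavior. Some versions of ldd may also execute the binary to collect this information, so avoid running it on untrusted files.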

When you see “library not found / wrong version loaded”, walk this path:

  1. Confirm platform/arch:
file ./a.out
readelf -h ./a.out | head
  2. Check what the binary declares as dependencies (NEEDED):
readelf -d ./a.out | grep NEEDED
  3. Inspect runtime search hints (RPATH/RUNPATH):
readelf -d ./a.out | grep -E 'RPATH|RUNPATH'
  4. Ask the loader to explain itself:
LD_DEBUG=libs,files ./a.out 2>&1 | head -n 80
  5. Compare with system cache and env:
ldconfig -p | grep <libname>
echo "$LD_LIBRARY_PATH"
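
Step 1 can also be done without external tools, which helps on minimal containers where file and readelf are missing. The following is a minimal sketch, not a replacement for readelf: describe_elf_header and the lookup tables are hypothetical names introduced here, and the machine table deliberately lists only three values.

```python
import struct

ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}
ELF_TYPES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
# Only a few e_machine values; extend as needed.
ELF_MACHINES = {3: "Intel 80386", 62: "x86-64", 183: "AArch64"}


def describe_elf_header(data: bytes) -> dict:
    """Decode the identification bytes plus e_type and e_machine."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(byte_order + "HH", data, 16)
    return {
        "class": ELF_CLASSES.get(ei_class, "unknown"),
        "data": ELF_DATA.get(ei_data, "unknown"),
        "type": ELF_TYPES.get(e_type, "other"),
        "machine": ELF_MACHINES.get(e_machine, f"machine {e_machine}"),
    }


# Synthetic 20-byte prefix of a 64-bit little-endian x86-64 shared object.
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 62)
print(describe_elf_header(sample))
```

For a real file, pass the first 20 bytes read in binary mode. A class mismatch between the binary and a library it loads is what produces the “wrong ELF class” error.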

The rest of the post explains why these knobs exist and how sections/segments relate to what the loader does.

The two views of ELF

ELF contains two parallel descriptions of the same binary:

  • Sections (linker view): used during compilation and static linking — code, data, symbols, relocations.
  • Segments (program headers, loader view): what the kernel’s ELF loader maps into memory with permissions (R/W/X).

A simplified mapping:

Sections (linker)              Segments (loader)
.text                          LOAD (R-X)
.data + .bss                   LOAD (RW-)
.symtab / .strtab / .rel.*     not loaded (only for linkers/debuggers)

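ELF files come in four types: REL (relocatable .o), EXEC (executable linked at a fixed address), DYN (shared object, which also covers position-independent executables), and CORE (core dump). The distinction matters at load time: the kernel maps an EXEC at the addresses recorded in its program headers, while a DYN can be placed at any base address. Whether the dynamic linker is involved depends on the presence of a PT_INTERP segment, not on the type: a dynamically linked EXEC needs it too, and a static PIE is a DYN that runs without it.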

Relocatable file example: ELF Header + Sections

A relocatable object (.o) header excerpt from readelf -h:

ELF Header:
  Class:                             ELF64
  Data:                              2's complement, little endian
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Entry point address:               0x0
  Start of section headers:          0x378 (bytes into file)
  Number of section headers:         12

A readelf -S excerpt (key columns only):

[Nr] Name       Type      Addr   Off    Size   ES Flg Lk Inf Al
[ 1] .text      PROGBITS  0000   0040   0038   00 AX  0  0  16
[ 2] .rela.text RELA      0000   0210   0018   18     5  1  8
[ 3] .data      PROGBITS  0000   0080   0020   00 WA  0  0  8
[ 4] .bss       NOBITS    0000   00a0   0010   00 WA  0  0  8
[ 5] .symtab    SYMTAB    0000   0240   00f0   18     6  8  8
[ 6] .strtab    STRTAB    0000   0330   0048   00     0  0  1

Notes:

  • Addr: Virtual address of the section at run time (0 in relocatable files; the linker assigns it).
  • Off/Size: File offset and size in bytes (hexadecimal). .bss is NOBITS, so its Size counts memory only, not file bytes.
  • Flg: Section attributes (A=occupies memory at run time, X=executable, W=writable).
  • Lk/Inf: For a relocation section, Lk is the index of its symbol table and Inf is the index of the section it patches.
  • x86-64 uses RELA relocations (explicit addends), so the section is named .rela.text; 32-bit x86 uses REL and .rel.text.

Object file layout (simplified)

0x0000  ELF Header
0x0040  .text
0x0080  .data
0x00a0  .bss (no file bytes)
0x0210  .rela.text
0x0240  .symtab
0x0330  .strtab
0x0378  Section Header Table
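
The Flg column in the section table above is readelf's rendering of the sh_flags bit field. As a rough illustration, the sketch below decodes the three bits used in this post; section_flag_letters is a hypothetical helper name, and the many other flag bits defined by the ELF specification are ignored.

```python
SHF_WRITE = 0x1      # W: writable at run time
SHF_ALLOC = 0x2      # A: occupies memory in the loaded image
SHF_EXECINSTR = 0x4  # X: contains machine instructions


def section_flag_letters(sh_flags: int) -> str:
    """Render sh_flags the way the Flg column does (W, A, X only)."""
    letters = ""
    if sh_flags & SHF_WRITE:
        letters += "W"
    if sh_flags & SHF_ALLOC:
        letters += "A"
    if sh_flags & SHF_EXECINSTR:
        letters += "X"
    return letters


print(section_flag_letters(SHF_ALLOC | SHF_EXECINSTR))  # .text
print(section_flag_letters(SHF_WRITE | SHF_ALLOC))      # .data and .bss
print(repr(section_flag_letters(0)))                    # .symtab, .strtab
```

Sections without the A bit, such as .symtab and .strtab, are the ones that never reach a LOAD segment.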

Executable example: Program Headers and Segments

After linking, an executable includes Program Headers:

Program Headers:
  Type   Offset  VirtAddr  FileSiz MemSiz  Flg Align
  LOAD   0x0000  0x400000  0x0800  0x0800  R E 0x1000
  LOAD   0x1000  0x601000  0x0200  0x0300  RW  0x1000

Explanation:

  • First LOAD: Contains .text, permissions R-X.
  • Second LOAD: Contains .data + .bss, permissions RW-.
  • MemSiz > FileSiz means the tail of the segment has no bytes in the file: the loader zero-fills the extra 0x100 bytes, which is where .bss lives.

Use readelf -l to view Section to Segment mapping and see how Sections are merged into Segments.
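
To make the FileSiz/MemSiz relationship concrete, here is a small model of what mapping one LOAD segment amounts to. It is a sketch under simplifying assumptions: load_segment is a hypothetical name, the "file" is an in-memory byte string, and page alignment and permissions are ignored.

```python
def load_segment(file_bytes: bytes, offset: int, filesz: int, memsz: int) -> bytearray:
    """Return the in-memory image of one LOAD segment."""
    if memsz < filesz:
        raise ValueError("MemSiz must not be smaller than FileSiz")
    chunk = file_bytes[offset:offset + filesz]
    if len(chunk) != filesz:
        raise ValueError("segment extends past the end of the file")
    image = bytearray(memsz)   # every byte starts as zero
    image[:filesz] = chunk     # file-backed part (.data)
    return image               # the untouched tail is the zeroed .bss


# Second LOAD from the table above: Offset 0x1000, FileSiz 0x200, MemSiz 0x300.
fake_file = bytes(0x1000) + b"\xaa" * 0x200
image = load_segment(fake_file, 0x1000, 0x200, 0x300)
print(len(image), image[0x1ff], image[0x200])
```

The last 0x100 bytes of the image never existed in the file, which is why a large zero-initialized array costs memory but not disk space.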

Relocation: turning placeholders into real addresses

Object files often contain placeholder addresses that the linker fixes using .rel.* entries.

A simplified 32-bit x86 example (pseudo-assembly):

mov    data_items(,%edi,4), %eax   ; access a global array

In the object file the encoding may contain placeholders:

8b 04 bd 00 00 00 00

After linking it becomes a real address:

8b 04 bd a0 90 04 08

Corresponding relocation entry (excerpt):

Relocation section '.rel.text' contains 1 entry:
  Offset  Info   Type       Sym.Name
  0x0008  ...    R_386_32   data_items

Key idea: the linker patches specific offsets based on relocation tables.

A concrete walkthrough from object file to executable

Below is an example using a real 32-bit assembly program that finds the maximum of a set of integers. The goal is to show exactly how readelf/hexdump/objdump output changes from .o to executable.

Relocatable file (.o)

The object file has no Program Headers and its section addresses are all 0 (placeholders):

$ readelf -h max.o
ELF Header:
  Type:                              REL (Relocatable file)
  Entry point address:               0x0
  Start of section headers:          200 (bytes into file)
  Number of section headers:         8

$ readelf -S max.o
[Nr] Name      Type      Addr     Off    Size   ES Flg
[ 1] .text     PROGBITS  00000000 000034 00002a 00 AX
[ 2] .data     PROGBITS  00000000 000060 000038 00 WA
[ 3] .bss      NOBITS    00000000 000098 000000 00 WA
[ 4] .symtab   SYMTAB    00000000 000208 000080 10
[ 5] .strtab   STRTAB    00000000 000288 000028 00
[ 6] .rel.text REL       00000000 0002b0 000010 08

The .rel.text section tells the linker which offsets to patch:

Relocation section '.rel.text' at offset 0x2b0 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000008  00000201 R_386_32          00000000   .data
00000017  00000201 R_386_32          00000000   .data

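These two offsets (0x08 and 0x17) point at the 4-byte address fields inside the two mov instructions that reference the data array; for example, the mov at 0x05 starts with the three bytes 8b 04 bd, so its address field begins at 0x08. The linker will write the absolute address of .data into those fields, added to whatever value is already stored there (0 in this case).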
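
The Info column packs two values into one word. As an illustration, the sketch below decodes raw Elf32_Rel entries (two 32-bit little-endian words each) the way readelf -r does; parse_rel32 is a hypothetical name and the type table covers only two relocation types.

```python
import struct

R_386_NAMES = {1: "R_386_32", 2: "R_386_PC32"}  # partial table


def parse_rel32(section: bytes) -> list:
    """Decode Elf32_Rel entries: r_offset, then r_info = (symbol index << 8) | type."""
    entries = []
    for r_offset, r_info in struct.iter_unpack("<II", section):
        rel_type = r_info & 0xFF
        entries.append({
            "offset": r_offset,
            "sym": r_info >> 8,
            "type": R_386_NAMES.get(rel_type, f"type {rel_type}"),
        })
    return entries


# The two entries shown above, re-encoded as the 16 raw bytes of .rel.text.
raw = struct.pack("<IIII", 0x08, 0x0201, 0x17, 0x0201)
for entry in parse_rel32(raw):
    print(entry)
```

Both entries decode to symbol index 2 with type R_386_32, matching the readelf output: symbol 2 is the section symbol for .data.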

The disassembly shows placeholders (0x00000000) where addresses will go:

$ objdump -d max.o
00000000 <_start>:
   0:   bf 00 00 00 00          mov    $0x0,%edi
   5:   8b 04 bd 00 00 00 00    mov    0x0(,%edi,4),%eax
   c:   89 c3                   mov    %eax,%ebx

After linking (executable)

After linking, the type becomes EXEC, addresses are filled in, and Program Headers appear:

$ readelf -h max
  Type:                              EXEC (Executable file)
  Entry point address:               0x8048074
  Start of program headers:          52 (bytes into file)
  Start of section headers:          256 (bytes into file)

Program Headers:
  Type   Offset  VirtAddr   PhysAddr   FileSiz MemSiz Flg Align
  LOAD   0x0000  0x08048000 0x08048000 0x0009e 0x0009e R E 0x1000
  LOAD   0x00a0  0x080490a0 0x080490a0 0x00038 0x00038 RW  0x1000

The two LOAD segments map .text as R-X and .data as RW-. Notice the page-aligned base address (0x08048000) and the 4K alignment (0x1000).
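
The section-to-segment mapping that readelf -l prints is just an address range check. The sketch below reproduces it for this example. The Segment and Section types and segment_for are hypothetical names, and the section addresses are assumptions derived from the output above: .text is taken to start at the entry point 0x08048074 with the size 0x2a seen in max.o, and .data at the second segment's VirtAddr with size 0x38.

```python
from typing import NamedTuple, Optional, Sequence


class Segment(NamedTuple):
    vaddr: int
    memsz: int
    flags: str


class Section(NamedTuple):
    name: str
    addr: int
    size: int


def segment_for(section: Section, segments: Sequence[Segment]) -> Optional[Segment]:
    """Return the segment whose address range fully contains the section."""
    for seg in segments:
        if seg.vaddr <= section.addr and section.addr + section.size <= seg.vaddr + seg.memsz:
            return seg
    return None


segments = [
    Segment(0x08048000, 0x9E, "R E"),
    Segment(0x080490A0, 0x38, "RW"),
]
sections = [
    Section(".text", 0x08048074, 0x2A),  # assumed address, see lead-in
    Section(".data", 0x080490A0, 0x38),  # assumed address, see lead-in
]
for sec in sections:
    seg = segment_for(sec, segments)
    print(sec.name, "->", seg.flags if seg else "not loaded")
```

Note that .text ends exactly at 0x0804809e, the end of the first segment, so the first LOAD also covers the ELF header and program headers that precede the code in the file.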

The linked disassembly now has real addresses:

$ objdump -d max
08048074 <_start>:
 8048074: bf 00 00 00 00       mov    $0x0,%edi
 8048079: 8b 04 bd a0 90 04 08 mov    0x80490a0(,%edi,4),%eax
 8048080: 89 c3                mov    %eax,%ebx

The placeholder 0x00000000 in the mov instruction at offset 0x05 has been replaced with 0x080490a0 (the actual address of .data). The symbol table also changes from section-relative offsets to absolute virtual addresses:

Symbol table '.symtab':
   Num:    Value      Ndx Name
     4:   0x080490a0   2  data_items
     5:   0x08048082   1  start_loop
     7:   0x08048074   1  _start

What the linker changed (and what it didn’t)

Relative jump instructions (like je, jle, jmp) did not need patching because they encode a displacement relative to the current instruction pointer rather than an absolute address. Only the two mov instructions with absolute address operands were patched, as directed by the .rel.text entries.
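
The patching step itself is small. Below is a minimal sketch of how an R_386_32 relocation is applied, using the bytes from the disassembly above; apply_r_386_32 is a hypothetical name, and a real linker handles many more relocation types plus symbol lookup.

```python
import struct


def apply_r_386_32(code: bytearray, offset: int, symbol_value: int) -> None:
    """Patch a 32-bit little-endian field in place with S + A."""
    # A: with REL-style relocations the addend is the value already in the field.
    (addend,) = struct.unpack_from("<I", code, offset)
    struct.pack_into("<I", code, offset, (symbol_value + addend) & 0xFFFFFFFF)


# First three instructions of _start as they appear in max.o.
text = bytearray.fromhex("bf 00 00 00 00 8b 04 bd 00 00 00 00 89 c3")
apply_r_386_32(text, 0x08, 0x080490A0)  # relocation offset and final .data address
print(text.hex(" "))
```

After the call, bytes 0x05 to 0x0b read 8b 04 bd a0 90 04 08, the same encoding objdump shows for the linked executable. The first and last instructions are untouched because no relocation entry refers to them.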

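The x86 segment registers (CS, DS, SS) are unrelated to ELF segments. Linux on x86 uses a flat memory model in which the code, data and stack segment bases are 0 (FS and GS are the exception; they are used for thread-local storage), so an address such as 0x80490a0 in the instruction above is the linear address directly.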

Shared libraries and PIC / GOT / PLT

Shared objects must load at arbitrary addresses, so they use PIC (position-independent code):

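  • GOT (Global Offset Table): A table of addresses of variables/functions, filled in by the dynamic linker at load time or on first call; code reaches them indirectly through it.
  • PLT (Procedure Linkage Table): A small jump stub per imported function, used for lazy binding.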

A typical PLT entry on x86:

push@plt:
  jmp    *GOT[push]        ; 1st time: jumps to next instruction
  push   $reloc_index      ; identify which function to resolve
  jmp    plt0              ; call dynamic linker

On x86-64 the pattern is similar but uses RIP-relative addressing.

Tracing lazy binding with gdb

Until a shared library function has been called once, its GOT entry points back into the PLT stub itself (at the instruction right after the first jmp). Here is a simplified gdb trace of push@plt being called from main:

(gdb) start
Temporary breakpoint 1 at main
(gdb) si                          ; step into call push@plt
0x080483d8 in push@plt ()
(gdb) x/wx 0x804a008              ; examine GOT entry for push
0x804a008: 0x080483de             ; points to next PLT instruction!
(gdb) si                          ; jmp *GOT[push] -> falls through
0x080483de in push@plt ()
(gdb) si                          ; push reloc_index
0x080483e3 in push@plt ()
(gdb) si                          ; jmp plt0 -> dynamic linker
0x080483a8 in ?? ()
(gdb) si
0xb806a080 in ?? () from /lib/ld-linux.so.2
(gdb) finish                      ; let dynamic linker resolve push
Run till exit from ld-linux.so.2
main () at main.c:8
(gdb) x/wx 0x804a008              ; examine GOT entry again
0x804a008: 0xb803f47c             ; now points to real push()!
(gdb) x/5i 0xb803f47c
0xb803f47c <push>: push  %ebp     ; the actual function body

After resolution, subsequent calls to push@plt jump directly to the real push() function without involving the dynamic linker again.
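
The control flow in that trace can be modelled in a few lines. This is a toy model, not how ld.so is implemented: LazyBinder, call_plt and the dictionary standing in for the shared library are all hypothetical. It only shows the key property that each GOT slot starts as a resolver stub and is overwritten on first use.

```python
class LazyBinder:
    def __init__(self, library: dict):
        self.library = library      # stands in for the shared object's exports
        self.resolver_calls = 0
        # Every GOT slot initially points at a stub that invokes the resolver.
        self.got = {name: self._make_stub(name) for name in library}

    def _make_stub(self, name: str):
        def stub(*args):
            self.resolver_calls += 1
            real = self.library[name]   # symbol lookup
            self.got[name] = real       # overwrite the GOT slot
            return real(*args)          # then run the real function
        return stub

    def call_plt(self, name: str, *args):
        """Model of `call name@plt`: always an indirect jump through the GOT."""
        return self.got[name](*args)


stack = []

def push(item):
    stack.append(item)

binder = LazyBinder({"push": push})
binder.call_plt("push", 1)   # first call goes through the resolver
binder.call_plt("push", 2)   # second call goes straight to push
print(stack, binder.resolver_calls)
```

Setting LD_BIND_NOW=1 corresponds to resolving every slot up front, which trades startup time for predictable call latency and makes missing symbols fail at load time rather than at first call.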

One more real-world gotcha: “I installed the library, why can’t it be found?”

Three common causes:

  1. The file exists but isn’t in default search paths (for example /usr/local/lib without ldconfig / ld.so.conf.d).
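  2. Multiple copies of the same SONAME: the loader takes the first match, searching DT_RPATH (only if there is no DT_RUNPATH), then LD_LIBRARY_PATH, then DT_RUNPATH, then the ldconfig cache, then the default directories, so it may load a different copy than you expect.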
  3. ABI/arch mismatch: 64-bit vs 32-bit (wrong ELF class).

If you just want the truth fast, LD_DEBUG=libs is usually quicker than guessing.
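
When reading LD_DEBUG output it helps to know the order in which the loader tries directories. The sketch below encodes the glibc order as documented in the ld.so manual page. It is a simplification: library_search_dirs is a hypothetical name, the default directories vary by distribution and architecture, the cache is represented by a placeholder string, and details such as $ORIGIN expansion are left out.

```python
def library_search_dirs(rpath=(), runpath=(), ld_library_path="", secure=False,
                        default_dirs=("/lib", "/usr/lib")):
    """Return the directory search order for one DT_NEEDED lookup."""
    dirs = []
    if rpath and not runpath:
        dirs.extend(rpath)              # DT_RPATH is ignored when DT_RUNPATH exists
    if ld_library_path and not secure:  # ignored in secure-execution mode (setuid etc.)
        dirs.extend(d for d in ld_library_path.split(":") if d)
    dirs.extend(runpath)                # applies to direct dependencies only
    dirs.append("<ld.so.cache>")        # placeholder for the ldconfig cache lookup
    dirs.extend(default_dirs)
    return dirs


print(library_search_dirs(runpath=["/opt/app/lib"], ld_library_path="/tmp/override"))
print(library_search_dirs(rpath=["/opt/app/lib"], ld_library_path="/tmp/override"))
```

The two calls show the practical difference: with RUNPATH, LD_LIBRARY_PATH wins over the path baked into the binary; with RPATH, the baked-in path wins.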

x86 addressing formula

Memory operands in x86 (AT&T syntax) follow this general format, in which every part is optional and MULTIPLIER must be 1, 2, 4, or 8:

ADDRESS_OR_OFFSET(%BASE_OR_OFFSET,%INDEX,MULTIPLIER)

The resulting address:

FINAL = ADDRESS_OR_OFFSET + BASE_OR_OFFSET + MULTIPLIER × INDEX

On Linux, the bases of the code, data and stack segments are 0 (FS and GS, used for thread-local storage, are the exception), so this computed address is the linear address directly (before page translation).
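
As a worked instance of the formula, the sketch below evaluates the operand 0x80490a0(,%edi,4) from the linked disassembly. effective_address is a hypothetical helper, and the result is truncated to 32 bits on the assumption of a 32-bit address size.

```python
def effective_address(displacement: int = 0, base: int = 0,
                      index: int = 0, scale: int = 1) -> int:
    """FINAL = ADDRESS_OR_OFFSET + BASE_OR_OFFSET + MULTIPLIER * INDEX (32-bit)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (displacement + base + index * scale) & 0xFFFFFFFF


# mov 0x80490a0(,%edi,4),%eax with %edi = 3: no base register, 4-byte elements.
print(hex(effective_address(displacement=0x80490A0, index=3, scale=4)))
```

With %edi = 3 the operand addresses 0x80490ac, the fourth 4-byte element of data_items.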

References