pwnlib.elf.elf — ELF Files

Exposes functionality for manipulating ELF files.

Stop hard-coding things! Look them up at runtime with pwnlib.elf.

Example Usage

>>> e = ELF('/bin/cat')
>>> print hex(e.address) 
0x400000
>>> print hex(e.symbols['write']) 
0x401680
>>> print hex(e.got['write']) 
0x60b070
>>> print hex(e.plt['write']) 
0x401680

You can even patch and save the files.

>>> e = ELF('/bin/cat')
>>> e.read(e.address+1, 3)
'ELF'
>>> e.asm(e.address, 'ret')
>>> e.save('/tmp/quiet-cat')
>>> disasm(file('/tmp/quiet-cat','rb').read(1))
'   0:   c3                      ret'

Module Members

class pwnlib.elf.elf.ELF(path, checksec=True)[source]

Bases: elftools.elf.elffile.ELFFile

Encapsulates information about an ELF file.

Example

>>> bash = ELF(which('bash'))
>>> print hex(bash.symbols['read'])
0x41dac0
>>> print hex(bash.plt['read'])
0x41dac0
>>> print hex(u32(bash.read(bash.got['read'], 4)))
0x41dac6
>>> print bash.disasm(bash.plt.read, 16)
0:   ff 25 1a 18 2d 00       jmp    QWORD PTR [rip+0x2d181a]        # 0x2d1820
6:   68 59 00 00 00          push   0x59
b:   e9 50 fa ff ff          jmp    0xfffffffffffffa60
asm(address, assembly)[source]

Assembles the specified instructions and inserts them into the ELF at the specified address.

This modifies the ELF in-place. The resulting binary can be saved with ELF.save().

bss(offset=0) → int[source]
Returns:Address of the .bss section, plus the specified offset.
checksec(banner=True)[source]

Prints out information in the binary, similar to checksec.sh.

Parameters:banner (bool) – Whether to print the path to the ELF binary.
debug(argv=[], *a, **kw) → tube[source]

Debug the ELF with gdb.debug().

Parameters:
  • argv (list) – List of arguments to the binary
  • *args – Extra arguments to gdb.debug()
  • **kwargs – Extra arguments to gdb.debug()
Returns:

tube – See gdb.debug()

disable_nx()[source]

Disables NX for the ELF.

Zeroes out the PT_GNU_STACK program header p_type field.
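
As a rough illustration of what zeroing p_type means at the byte level, the sketch below patches a synthetic 64-bit little-endian ELF image using only the Python standard library. The helper names (build_fake_elf, program_header_types, zero_gnu_stack) and the header-only image are assumptions made for this example; this is not pwnlib's implementation.

```python
import struct

PT_NULL = 0
PT_LOAD = 1
PT_GNU_STACK = 0x6474E551
PF_X, PF_W, PF_R = 1, 2, 4

EHDR = struct.Struct('<16sHHIQQQIHHHHHH')  # ELF64 file header, 64 bytes
PHDR = struct.Struct('<IIQQQQQQ')          # ELF64 program header, 56 bytes


def build_fake_elf(phdrs):
    """Hypothetical helper: header-only ELF64 image from (p_type, p_flags) pairs."""
    ident = b'\x7fELF' + bytes([2, 1, 1]) + bytes(9)  # 64-bit, little-endian
    ehdr = EHDR.pack(ident, 2, 62, 1, 0, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, len(phdrs), 0, 0, 0)
    body = b''.join(PHDR.pack(t, f, 0, 0, 0, 0, 0, 0) for t, f in phdrs)
    return bytearray(ehdr + body)


def _phdr_offsets(image):
    """Yield the file offset of each program header entry."""
    fields = EHDR.unpack_from(image, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        yield e_phoff + i * e_phentsize


def program_header_types(image):
    """Return the p_type of every program header, in order."""
    return [struct.unpack_from('<I', image, off)[0]
            for off in _phdr_offsets(image)]


def zero_gnu_stack(image):
    """Overwrite p_type of each PT_GNU_STACK header with PT_NULL, in place."""
    patched = 0
    for off in _phdr_offsets(image):
        if struct.unpack_from('<I', image, off)[0] == PT_GNU_STACK:
            struct.pack_into('<I', image, off, PT_NULL)
            patched += 1
    return patched
```

Once p_type is zero, a loader no longer recognizes the entry as PT_GNU_STACK, so the binary behaves as if the header were absent.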

disasm(address, n_bytes) → str[source]

Returns a string of disassembled instructions at the specified virtual memory address

dynamic_by_tag(tag) → tag[source]
Parameters:tag (str) – Named DT_XXX tag (e.g. 'DT_STRTAB').
Returns:elftools.elf.dynamic.DynamicTag
dynamic_string(offset) → bytes[source]

Fetches an enumerated string from the DT_STRTAB table.

Parameters:offset (int) – String index
Returns:str – String from the table as raw bytes.
dynamic_value_by_tag(tag) → int[source]

Retrieve the value from a dynamic tag a la DT_XXX.

If the tag is missing, returns None.

fit(address, *a, **kw)[source]

Writes fitted data into the specified address.

See: packing.fit()

flat(address, *a, **kw)[source]

Writes a full array of values to the specified address.

See: packing.flat()

static from_assembly(assembly) → ELF[source]

Given an assembly listing, return a fully loaded ELF object which contains that assembly at its entry point.

Parameters:
  • assembly (str) – Assembly language listing
  • vma (int) – Address of the entry point and the module’s base address.

Example

>>> e = ELF.from_assembly('nop; foo: int 0x80', vma = 0x400000)
>>> e.symbols['foo'] == 0x400001
True
>>> e.disasm(e.entry, 1)
'  400000:       90                      nop'
>>> e.disasm(e.symbols['foo'], 2)
'  400001:       cd 80                   int    0x80'
static from_bytes(bytes) → ELF[source]

Given a sequence of bytes, return a fully loaded ELF object which contains those bytes at its entry point.

Parameters:
  • bytes (str) – Shellcode byte string
  • vma (int) – Desired base address for the ELF.

Example

>>> e = ELF.from_bytes('\x90\xcd\x80', vma=0xc000)
>>> print(e.disasm(e.entry, 3))
    c000:       90                      nop
    c001:       cd 80                   int    0x80
get_machine_arch()[source]

Return the machine architecture, as detected from the ELF header. Not all architectures are supported at the moment.

get_section_by_name(name)[source]

Get a section from the file, by name. Return None if no such section exists.

iter_segments_by_type(t)[source]
Yields:Segments matching the specified type.
num_sections()[source]

Number of sections in the file

num_segments()[source]

Number of segments in the file

offset_to_vaddr(offset) → int[source]

Translates the specified offset to a virtual address.

Parameters:offset (int) – Offset to translate
Returns:int – Virtual address which corresponds to the file offset, or None.

Examples

This example shows that regardless of changes to the virtual address layout by modifying ELF.address, the offset for any given address doesn’t change.

>>> bash = ELF('/bin/bash')
>>> bash.address == bash.offset_to_vaddr(0)
True
>>> bash.address += 0x123456
>>> bash.address == bash.offset_to_vaddr(0)
True
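
The translation between file offsets and virtual addresses can be sketched with a plain list of segments. The Segment tuple and the delta parameter (the difference between the user-selected base address and the one recorded in the file) are assumptions made for illustration, not pwnlib's internal data structures.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for a loadable program header.
Segment = namedtuple('Segment', 'offset vaddr filesz')


def offset_to_vaddr(segments, offset, delta=0):
    """Map a file offset to a virtual address, or None if no segment holds it."""
    for seg in segments:
        if seg.offset <= offset < seg.offset + seg.filesz:
            return seg.vaddr + (offset - seg.offset) + delta
    return None


def vaddr_to_offset(segments, vaddr, delta=0):
    """Inverse mapping: virtual address to file offset, or None."""
    unrebased = vaddr - delta  # undo the user-selected relocation first
    for seg in segments:
        if seg.vaddr <= unrebased < seg.vaddr + seg.filesz:
            return seg.offset + (unrebased - seg.vaddr)
    return None


segments = [Segment(0x0000, 0x400000, 0x1000),
            Segment(0x1000, 0x601000, 0x0200)]
delta = 0x123456
rebased = offset_to_vaddr(segments, 0, delta)
round_trip = vaddr_to_offset(segments, rebased, delta)  # back to offset 0
```

Because the relocation is applied as a single delta, changing the base address moves every virtual address by the same amount and leaves file offsets untouched.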
p16(address, data, *a, **kw)[source]

Writes data as a 16-bit integer to the specified address.

p32(address, data, *a, **kw)[source]

Writes data as a 32-bit integer to the specified address.

p64(address, data, *a, **kw)[source]

Writes data as a 64-bit integer to the specified address.

p8(address, data, *a, **kw)[source]

Writes data as an 8-bit integer to the specified address.

pack(address, data, *a, **kw)[source]

Writes data as a packed integer to the specified address.

process(argv=[], *a, **kw) → process[source]

Execute the binary with process. Note that argv is a list of arguments, and should not include argv[0].

Parameters:
  • argv (list) – List of arguments to the binary
  • *args – Extra arguments to process
  • **kwargs – Extra arguments to process
Returns:

process

read(address, count) → bytes[source]

Read data from the specified virtual address

Parameters:
  • address (int) – Virtual address to read
  • count (int) – Number of bytes to read
Returns:

A str object, or None.

Examples

The simplest example is just to read the ELF header.

>>> bash = ELF(which('bash'))
>>> bash.read(bash.address, 4)
'\x7fELF'

ELF segments do not have to contain all of the data on-disk that gets loaded into memory.

First, let’s create an ELF file that has some code in two sections.

>>> assembly = '''
... .section .A,"awx"
... .global A
... A: nop
... .section .B,"awx"
... .global B
... B: int3
... '''
>>> e = ELF.from_assembly(assembly, vma=False)

By default, these come right after each other in memory.

>>> e.read(e.symbols.A, 2)
'\x90\xcc'
>>> e.symbols.B - e.symbols.A
1

Let’s move the sections so that B is a little bit further away.

>>> objcopy = pwnlib.asm._objcopy()
>>> objcopy += [
...     '--change-section-vma', '.B+5',
...     '--change-section-lma', '.B+5',
...     e.path
... ]
>>> subprocess.check_call(objcopy)
0

Now let’s re-load the ELF and check again.

>>> e = ELF(e.path)
>>> e.symbols.B - e.symbols.A
6
>>> e.read(e.symbols.A, 2)
'\x90\x00'
>>> e.read(e.symbols.A, 7)
'\x90\x00\x00\x00\x00\x00\xcc'
>>> e.read(e.symbols.A, 10)
'\x90\x00\x00\x00\x00\x00\xcc\x00\x00\x00'

Everything is relative to the user-selected base address, so moving things around keeps everything working.

>>> e.address += 0x1000
>>> e.read(e.symbols.A, 10)
'\x90\x00\x00\x00\x00\x00\xcc\x00\x00\x00'
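
One common reason a read can return bytes that are not stored in the file is a segment whose in-memory size is larger than its on-disk size; the remainder is zero-filled when loaded. The sketch below models that with hypothetical (offset, vaddr, filesz, memsz) tuples. It is a simplified illustration and not necessarily how pwnlib resolves the example above.

```python
def read_virtual(file_data, segments, address, count):
    """Read count bytes starting at a virtual address.

    Bytes inside a segment's memory range but past its file-backed part
    read as zero. Returns None if any byte lies outside every segment.
    """
    out = bytearray()
    for addr in range(address, address + count):
        for offset, vaddr, filesz, memsz in segments:
            if vaddr <= addr < vaddr + filesz:
                out.append(file_data[offset + (addr - vaddr)])
                break
            if vaddr <= addr < vaddr + memsz:
                out.append(0)  # loaded, but not present on disk
                break
        else:
            return None  # address is not mapped by any segment
    return bytes(out)


# Two one-byte segments: a nop (0x90), then an int3 (0xcc) six bytes later.
file_data = b'\x90\xcc'
segments = [(0, 0x1000, 1, 6), (1, 0x1006, 1, 4)]
data = read_virtual(file_data, segments, 0x1000, 10)
```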
save(path=None)[source]

Save the ELF to a file

>>> bash = ELF(which('bash'))
>>> bash.save('/tmp/bash_copy')
>>> copy = file('/tmp/bash_copy')
>>> bash = file(which('bash'))
>>> bash.read() == copy.read()
True
search(needle, writable = False) → generator[source]

Search the ELF’s virtual address space for the specified string.

Notes

Does not search empty space between segments, or uninitialized data. This will only return data that actually exists in the ELF file. Searching for a long string of NULL bytes probably won’t work.

Parameters:
  • needle (str) – String to search for.
  • writable (bool) – Search only writable sections.
Yields:

An iterator for each virtual address that matches.

Examples

An ELF header starts with the bytes \x7fELF, so we should be able to find it easily.

>>> bash = ELF('/bin/bash')
>>> bash.address + 1 == next(bash.search('ELF'))
True

We can also search for strings in the binary.

>>> len(list(bash.search('GNU bash'))) > 0
True
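
A generator with the behaviour described in the notes above (only file-backed data is searched, one segment at a time) might look like the following sketch. The (offset, vaddr, filesz, is_writable) tuples and the function name are assumptions made for illustration.

```python
def search_segments(file_data, segments, needle, writable=False):
    """Yield each virtual address where needle occurs in file-backed data."""
    for offset, vaddr, filesz, is_writable in segments:
        if writable and not is_writable:
            continue
        data = file_data[offset:offset + filesz]
        pos = data.find(needle)
        while pos != -1:
            yield vaddr + pos
            pos = data.find(needle, pos + 1)


file_data = b'\x7fELF\x00GNU bash\x00ELF'
segments = [(0, 0x400000, 5, False),   # read-only segment holding the header
            (5, 0x600000, 12, True)]   # writable segment
hits = list(search_segments(file_data, segments, b'ELF'))
writable_hits = list(search_segments(file_data, segments, b'ELF', writable=True))
```

Since each segment is searched on its own, a needle that straddles two segments, or that lies in the gap between them, is never reported.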
section(name) → bytes[source]

Gets data for the named section

Parameters:name (str) – Name of the section
Returns:str – String containing the bytes for that section
string(address) → str[source]

Reads a null-terminated string from the specified address

Returns:A str with the string contents (NUL terminator is omitted), or an empty string if no NUL terminator could be found.
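
The return-value rule above can be expressed in a few lines over a raw byte buffer. This is a minimal sketch that works on buffer indexes rather than virtual addresses; the function name is hypothetical.

```python
def read_c_string(data, start):
    """Return the bytes from start up to, but excluding, the next NUL.

    Returns an empty byte string when no NUL terminator follows start.
    """
    end = data.find(b'\x00', start)
    if end == -1:
        return b''
    return data[start:end]
```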
u16(address, *a, **kw)[source]

Unpacks an integer from the specified address.

u32(address, *a, **kw)[source]

Unpacks an integer from the specified address.

u64(address, *a, **kw)[source]

Unpacks an integer from the specified address.

u8(address, *a, **kw)[source]

Unpacks an integer from the specified address.

unpack(address, *a, **kw)[source]

Unpacks an integer from the specified address.
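
The pack and unpack helpers pair naturally: one writes an integer at an address, the other reads it back. The sketch below imitates that on a bytearray with the standard struct module. The base parameter and the explicit format strings are simplifying assumptions; the real methods take their width and endianness from the method name and extra arguments.

```python
import struct


def pack_at(image, base, address, value, fmt='<I'):
    """Write value at a virtual address, in place (default: 32-bit little-endian)."""
    struct.pack_into(fmt, image, address - base, value)


def unpack_at(image, base, address, fmt='<I'):
    """Read an integer back from a virtual address."""
    return struct.unpack_from(fmt, image, address - base)[0]


image = bytearray(16)
base = 0x400000
pack_at(image, base, base + 4, 0xDEADBEEF)     # like p32
pack_at(image, base, base, 0x1234, fmt='<H')   # like p16
value = unpack_at(image, base, base + 4)       # like u32
```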

vaddr_to_offset(address) → int[source]

Translates the specified virtual address to a file offset

Parameters:address (int) – Virtual address to translate
Returns:int – Offset within the ELF file which corresponds to the address, or None.

Examples

>>> bash = ELF(which('bash'))
>>> bash.vaddr_to_offset(bash.address)
0
>>> bash.address += 0x123456
>>> bash.vaddr_to_offset(bash.address)
0
>>> bash.vaddr_to_offset(0) is None
True
write(address, data)[source]

Writes data to the specified virtual address

Parameters:
  • address (int) – Virtual address to write
  • data (str) – Bytes to write

Note

This routine does not check the bounds on the write to ensure that it stays in the same segment.

Examples

>>> bash = ELF(which('bash'))
>>> bash.read(bash.address+1, 3)
'ELF'
>>> bash.write(bash.address, "HELO")
>>> bash.read(bash.address, 4)
'HELO'
address[source]

Address of the lowest segment loaded in the ELF.

When updated, the addresses of the following fields are also updated:

  • ELF.symbols
  • ELF.got
  • ELF.plt
  • ELF.functions

However, the following fields are NOT updated:

  • ELF.segments
  • ELF.sections

Example

>>> bash = ELF('/bin/bash')
>>> read = bash.symbols['read']
>>> text = bash.get_section_by_name('.text').header.sh_addr
>>> bash.address += 0x1000
>>> read + 0x1000 == bash.symbols['read']
True
>>> text == bash.get_section_by_name('.text').header.sh_addr
True
Type:int
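
The rebasing behaviour shown in the example can be modelled with a property setter that shifts derived addresses by the change in base, while leaving header-derived values alone. The class below is a hypothetical stand-in written for illustration; it is not the ELF class.

```python
class RebasableImage:
    """Toy model: symbols follow the base address, section headers do not."""

    def __init__(self, address, symbols, section_addrs):
        self._address = address
        self.symbols = dict(symbols)
        self.section_addrs = dict(section_addrs)  # as recorded in the file

    @property
    def address(self):
        return self._address

    @address.setter
    def address(self, new_address):
        delta = new_address - self._address
        # Shift every symbol by the same amount as the base address.
        self.symbols = {name: addr + delta
                        for name, addr in self.symbols.items()}
        self._address = new_address


img = RebasableImage(0x400000, {'read': 0x41DAC0}, {'.text': 0x401000})
img.address += 0x1000
```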
arch = None[source]

Architecture of the file (e.g. 'i386', 'arm').

See: ContextType.arch

Type:str
asan[source]

Whether the current binary was built with Address Sanitizer (ASAN).

Type:bool
aslr[source]

Whether the current binary is position-independent.

Type:bool
bits = 32[source]

Bit-ness of the file

Type:int
build = None[source]

Linux kernel build commit, if this is a Linux kernel image

Type:str
buildid[source]

GNU Build ID embedded into the binary

Type:str
bytes = 4[source]

Pointer width, in bytes

Type:int
canary[source]

Whether the current binary uses stack canaries.

Type:bool
config = None[source]

Linux kernel configuration, if this is a Linux kernel image

Type:dict
data[source]

Raw data of the ELF file.

See:
get_data()
Type:str
dwarf[source]

DWARF info for the elf

elftype[source]

ELF type (EXEC, DYN, etc)

Type:str
endian = 'little'[source]

Endianness of the file (e.g. 'big', 'little')

Type:str
entry[source]

Address of the entry point for the ELF

Type:int
entrypoint[source]

Address of the entry point for the ELF

Type:int
execstack[source]

Whether the current binary uses an executable stack.

This is based on whether a PT_GNU_STACK program header is present and, if so, on its flags.

PT_GNU_STACK

The p_flags member specifies the permissions on the segment containing the stack and is used to indicate whether the stack should be executable. The absence of this header indicates that the stack will be executable.

In particular, if the header is missing the stack is executable. If the header is present, it may explicitly mark that the stack is executable.

This is only somewhat accurate. The GNU dynamic linker (glibc) uses DEFAULT_STACK_PERMS to decide whether a missing PT_GNU_STACK header should mark the stack as executable:

/* On most platforms presume that PT_GNU_STACK is absent and the stack is
 * executable.  Other platforms default to a nonexecutable stack and don't
 * need PT_GNU_STACK to do so.  */
uint_fast16_t stack_flags = DEFAULT_STACK_PERMS;

By searching the source for DEFAULT_STACK_PERMS, we can see which architectures have which settings.

$ git grep '#define DEFAULT_STACK_PERMS' | grep -v PF_X
sysdeps/aarch64/stackinfo.h:31:#define DEFAULT_STACK_PERMS (PF_R|PF_W)
sysdeps/nios2/stackinfo.h:31:#define DEFAULT_STACK_PERMS (PF_R|PF_W)
sysdeps/tile/stackinfo.h:31:#define DEFAULT_STACK_PERMS (PF_R|PF_W)
Type:bool
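
The decision described above reduces to a small function over the program headers. This is a sketch under stated assumptions: headers are given as (p_type, p_flags) pairs, and default_executable stands in for the platform's DEFAULT_STACK_PERMS (False would model the architectures listed in the grep output). It is not the property's actual implementation.

```python
PT_GNU_STACK = 0x6474E551
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4


def stack_is_executable(program_headers, default_executable=True):
    """Decide stack executability from (p_type, p_flags) pairs."""
    for p_type, p_flags in program_headers:
        if p_type == PT_GNU_STACK:
            # Header present: its PF_X bit is authoritative.
            return bool(p_flags & PF_X)
    # Header missing: fall back to the platform default.
    return default_executable


hardened = [(PT_LOAD, PF_R | PF_X), (PT_GNU_STACK, PF_R | PF_W)]
explicit_exec = [(PT_GNU_STACK, PF_R | PF_W | PF_X)]
no_header = [(PT_LOAD, PF_R | PF_X)]
```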
executable = None[source]

True if the ELF is an executable

executable_segments[source]

List of all segments which are executable.

See:
ELF.segments
Type:list
file = None[source]

Open handle to the ELF file on disk

Type:file
fortify[source]

Whether the current binary was built with Fortify Source (-DFORTIFY).

Type:bool
functions = {}[source]

dotdict of name to Function for each function in the ELF

got = {}[source]

dotdict of name to address for all Global Offset Table (GOT) entries

libc[source]

If this ELF imports any libraries which contain 'libc[.-]', and we can determine the appropriate path to it on the local system, returns a new ELF object pertaining to that library.

If not found, the value will be None.

Type:ELF
library = None[source]

True if the ELF is a shared library

linker = None[source]

Path to the linker for the ELF

memory = None[source]

IntervalTree which maps all of the loaded memory segments

mmap = None[source]

Memory-mapped copy of the ELF file on disk

Type:mmap.mmap
msan[source]

Whether the current binary was built with Memory Sanitizer (MSAN).

Type:bool
native = None[source]

Whether this ELF should be able to run natively

non_writable_segments[source]

List of all segments which are NOT writeable.

See:
ELF.segments
Type:list
nx[source]

Whether the current binary uses NX protections.

Specifically, we are checking for READ_IMPLIES_EXEC being set by the kernel, as a result of honoring PT_GNU_STACK in the kernel.

The Linux kernel directly honors PT_GNU_STACK to mark the stack as executable.

case PT_GNU_STACK:
    if (elf_ppnt->p_flags & PF_X)
        executable_stack = EXSTACK_ENABLE_X;
    else
        executable_stack = EXSTACK_DISABLE_X;
    break;

Additionally, it then sets read_implies_exec, so that all readable pages are executable.

if (elf_read_implies_exec(loc->elf_ex, executable_stack))
    current->personality |= READ_IMPLIES_EXEC;
Type:bool
os = None[source]

Operating system of the ELF

packed[source]

Whether the current binary is packed with UPX.

Type:bool
path = '/path/to/the/file'[source]

Path to the file

Type:str
pie[source]

Whether the current binary is position-independent.

Type:bool
plt = {}[source]

dotdict of name to address for all Procedure Linkage Table (PLT) entries

relro[source]

Whether the current binary uses RELRO protections.

This requires both presence of the dynamic tag DT_BIND_NOW, and a GNU_RELRO program header.

The ELF Specification describes how the linker should resolve symbols immediately, as soon as a binary is loaded. This can be emulated with the LD_BIND_NOW=1 environment variable.

DT_BIND_NOW

If present in a shared object or executable, this entry instructs the dynamic linker to process all relocations for the object containing this entry before transferring control to the program. The presence of this entry takes precedence over a directive to use lazy binding for this object when specified through the environment or via dlopen(BA_LIB).

(page 81)

Separately, an extension to the GNU linker allows a binary to specify a PT_GNU_RELRO program header, which describes the region of memory which is to be made read-only after relocations are complete.

Finally, a new-ish extension which doesn’t seem to have a canonical source of documentation is DF_BIND_NOW, which has supposedly superseded DT_BIND_NOW.

DF_BIND_NOW

If set in a shared object or executable, this flag instructs the dynamic linker to process all relocations for the object containing this entry before transferring control to the program. The presence of this entry takes precedence over a directive to use lazy binding for this object when specified through the environment or via dlopen(BA_LIB).

>>> path = pwnlib.data.elf.relro.path
>>> for test in glob(os.path.join(path, 'test-*')):
...     e = ELF(test)
...     expected = os.path.basename(test).split('-')[2]
...     actual = str(e.relro).lower()
...     assert actual == expected
Type:bool
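
The two conditions above can be combined as in the sketch below. It takes the segment types and a mapping of dynamic tags to values as plain Python data. Treating DF_BIND_NOW in DT_FLAGS as equivalent to DT_BIND_NOW is an assumption made for this example, and the function name is hypothetical.

```python
PT_GNU_RELRO = 0x6474E552
DT_BIND_NOW = 24
DT_FLAGS = 30
DF_BIND_NOW = 0x8


def has_relro(segment_types, dynamic):
    """Return True if both a RELRO segment and immediate binding are present.

    segment_types: iterable of p_type values.
    dynamic: dict mapping d_tag to d_val.
    """
    if PT_GNU_RELRO not in set(segment_types):
        return False
    if DT_BIND_NOW in dynamic:
        return True
    # Assumption: the newer flag form counts as well.
    return bool(dynamic.get(DT_FLAGS, 0) & DF_BIND_NOW)
```

A binary with PT_GNU_RELRO but lazy binding returns False here, which matches the requirement that both conditions hold.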
rpath[source]

Whether the current binary has an RPATH.

Type:bool
runpath[source]

Whether the current binary has a RUNPATH.

Type:bool
rwx_segments[source]

List of all segments which are writeable and executable.

See:
ELF.segments
Type:list
sections[source]

A list of elftools.elf.sections.Section objects for the sections in the ELF.

Type:list
segments[source]

A list of elftools.elf.segments.Segment objects for the segments in the ELF.

Type:list
start[source]

Address of the entry point for the ELF

Type:int
statically_linked = None[source]

True if the ELF is statically linked

sym[source]

Alias for ELF.symbols

Type:dotdict
symbols = {}[source]

dotdict of name to address for all symbols in the ELF

ubsan[source]

Whether the current binary was built with Undefined Behavior Sanitizer (UBSAN).

Type:bool
version = None[source]

Linux kernel version, if this is a Linux kernel image

Type:tuple
writable_segments[source]

List of all segments which are writeable.

See:
ELF.segments
Type:list
class pwnlib.elf.elf.Function(name, address, size, elf=None)[source]

Encapsulates information about a function in an ELF binary.

Parameters:
  • name (str) – Name of the function
  • address (int) – Address of the function
  • size (int) – Size of the function, in bytes
  • elf (ELF) – Encapsulating ELF object
address = None[source]

Address of the function in the encapsulating ELF

elf = None[source]

Encapsulating ELF object

name = None[source]

Name of the function

size = None[source]

Size of the function, in bytes

class pwnlib.elf.elf.dotdict[source]

Wrapper to allow dotted access to dictionary elements.

Is a real dict object, but also serves up keys as attributes when reading attributes.

Supports recursive instantiation for keys which contain dots.

Example

>>> x = pwnlib.elf.elf.dotdict()
>>> isinstance(x, dict)
True
>>> x['foo'] = 3
>>> x.foo
3
>>> x['bar.baz'] = 4
>>> x.bar.baz
4
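
One way to get this behaviour is a dict subclass that overrides __getattr__. The class below is a minimal sketch named DotDict to keep it distinct from the library class; it is written for illustration and may differ from the actual implementation.

```python
class DotDict(dict):
    """dict whose keys can also be read as attributes, including dotted keys."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as keys() and items() keep working.
        if name in self:
            return self[name]
        prefix = name + '.'
        nested = {key[len(prefix):]: value
                  for key, value in self.items()
                  if isinstance(key, str) and key.startswith(prefix)}
        if nested:
            return DotDict(nested)  # allows x.bar.baz for key 'bar.baz'
        raise AttributeError(name)


x = DotDict()
x['foo'] = 3
x['bar.baz'] = 4
```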