Memory corruption vulnerabilities

The topic for this meeting was memory corruption vulnerabilities.

What is a memory corruption bug?

To quote Wikipedia:

Memory corruption occurs in a computer program when the contents of a memory location are modified due to programmatic behavior that exceeds the intention of the original programmer or program/language constructs; this is termed violating memory safety. The most likely cause of memory corruption is programming error.

It is the part “programmatic behavior that exceeds the intention of the original programmer” that is intresting to us.

Causes of memory corruption

There are several causes, but we focused on buffer overflows. The initial target for our discussion was the challenge “bof” at https://pwnable.kr/play.php and the goal for this session was to solve that challenge.

Example

Here is a small example to illustrate what a buffer overflow can look like. Usually we have no control over how variables are placed in memory, but we can explicitly control their layout using a struct.

#include <stdio.h>

int main() {

	struct {
		char a[8];
		int b;
		char extra[200];
	} locals;

	locals.b = 0x1337;

	while(!feof(stdin)) {
		gets(locals.a);
		printf("Value of b: %x\n", locals.b);
	}
	return 0;
}

Can you make the program

  • print something else than 1337?
  • print deadbeef ?
  • crash?

Hints: Look at the manpages of all functions in this program. Is there anything that indicates a risky behaviour?

Endianess

As a byte only can hold values between 0 and 255 inclusive, we need more bytes to describe larger numbers. Much like we use several digits to describe numbers larger than nine. There are several conventions on in wich order the bytes should be written to memory, this is called endianess.

Little endian means that the least significant byte is stored first in memory. The number 0x11223344 is described by the byte sequence [0x44, 0x33, 0x22, 0x11]. Most processors, including x86, use little endian internally.

In big endian, the most significant byte is written first, much like we humans treat decimal numbers. The number 0x11223344 is described by the byte sequence [0x11, 0x22, 0x33, 0x44]. Big endian is used in network communication protocols, and in a few processors.

Automating input

Non printable characters

We briefly discussed how to create non-printable characters using command line tools. The trick is to create an escaped sequence and let a program interpret that sequence.

The most common ways are to use bash or echo to interpret escape character.

In bash, words of the form $‘string’ expands backslash-escaped characters.

The -e flag tells echo to expand escape characters.

Bash is useful as an interpreter when you are giving data as an argument, like

./a.out $'\x43\x44\x45'

Whereas echo is useful for standard input.

echo -e "\x43\x44\x45" | ./a.out

Giving more data after generated data

A common trick to give user input after generated data is to use a pipeline.

(echo -e '\x41\x42\x43' ; cat )  | ./a.out

First the echo command will run and write to the pipe, and then after it has terminated the cat command will read data from standard input (the terminal) and write to the pipe.

Concluding actions

We ended the session by writing an actual exploit for the bof level at https://pwnables.kr and make it nice using the python framework pwntools.