Buffer
A Buffer is temporary storage, usually in physical memory, used to hold data.
Consider the most useless program ever made, shown in the image on the left, which defines a character buffer of length 5. Out of a large block of memory, a small region of 5 bytes is assigned to the buffer, as shown in the image on the right.
Buffer Overflow
A Buffer Overflow occurs when more data is written to a fixed-length region of memory than it can hold, so that adjacent memory addresses are overwritten.
DEMO(Controlling Local Variables):
Let’s take an example of a basic authentication app which asks for a password and returns Authenticated! if the password is correct.
Without really knowing how the app works, let’s enter a random password.
It says Authentication declined! since the password wasn’t correct. To test for buffer overflow, we need to enter a large amount of random data.
You must be wondering why it got authenticated and why there is a Segmentation fault. Let’s see a more detailed version of the app.
As you can see, there are three variables: auth, sys_pass, and usr_pass.
The auth variable records whether the user is authenticated (it is initially 0). The usr_pass variable stores the password that the user enters, and the sys_pass variable holds the correct password.
The app works like this: if usr_pass is equal to sys_pass, then auth becomes 1. If auth is not 0, the user is authenticated.
You may also see how the variables are stored in memory. The addresses are in hexadecimal, and the addresses of consecutive variables differ by 0x10, which is 16 in decimal. Therefore, the usr_pass and sys_pass variables are buffers of length 16.
To test for Buffer Overflow, a long password is entered as shown.
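As you can see, the password entered in the usr_pass variable overflows into the sys_pass variable and then into the auth variable. Because auth is now non-zero, the user is authenticated even though the password was wrong.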
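Note: C functions like strcpy(), strcat(), and scanf() with an unbounded %s do not check the size of the destination buffer and can write past its end into later memory addresses, which is precisely what a buffer overflow is. (strcmp() does not write anything, but it also performs no length check and keeps reading until it finds a terminating null byte.)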
Refer to the code below for better understanding.
#include <stdio.h>
#include <string.h>   /* strcmp */

int main(void) {
    int auth = 0;                  /* becomes non-zero once authenticated */
    char sys_pass[16] = "Secret";  /* the correct password */
    char usr_pass[16];             /* the password the user enters */

    printf("Enter password: ");
    /* Deliberately unsafe: %s has no width limit, so long input overflows usr_pass. */
    scanf("%s", usr_pass);

    if (strcmp(sys_pass, usr_pass) == 0) {
        auth = 1;
    }

    printf("usr_pass: %s\n", usr_pass);
    printf("sys_pass: %s\n", sys_pass);
    printf("auth: %d\n", auth);
    printf("sys_pass addr: %p\n", (void *)sys_pass);
    printf("auth addr: %p\n", (void *)&auth);

    if (auth) {
        printf("Authenticated!\n");
    } else {
        printf("Authentication declined!\n");
    }
    return 0;
}
Note: This is a deliberately unrealistic example, meant only for understanding. You are unlikely to see such a situation in real life.
Let’s dive a little deeper into the concepts now.
Important Concepts
Division of Memory for a Running Process
This is what the memory assigned to a process looks like. There are various sections, such as the stack, the heap, and uninitialized data, each used for a different purpose.
You may read more about the memory layout here: Memory Layout of a Process.
This blog focuses on Buffer Overflow in the Stack, so let’s look at that.
- Stack: A LIFO data structure used extensively by computers, for example in memory management.
- There are a bunch of registers present in the CPU, amongst which we shall only be concerned with EIP, EBP, and ESP.
- EBP: It’s the base pointer, which points to the base of the current stack frame.
- ESP: It’s a stack pointer which points to the top of the stack.
- EIP: It contains the address of the next instruction to be executed.
Stack Layout
The above image shows what a stack looks like. It might look intimidating but trust me, it isn’t. Let’s see some important points related to the stack:
- A stack is filled from higher memory to lower memory.
- In a program, every function call has its own stack frame.
- Within a frame, all the variables are accessed relative to the EBP register. Source: Link
- Above the EBP, function parameters are stored. For example, in void foo(int a, int b, int c){ /* function body */ }, a, b and c, being the function parameters, are stored above the EBP.
- All the local variables of a function are stored below the EBP. For example, in void foo(int a, int b, int c){ int x; int y; int z; }, x, y and z, being local variables of the function, are stored below the EBP.
- The Old %ebp is the value of the EBP of the previous function. Since execution has to return to the calling function after a function finishes, the values of both the old EBP and the old EIP have to be stored.
- The ESP register stores the address of the top of the stack. Because the stack grows towards lower memory, that is the lowest address currently in use.
Exploiting Buffer Overflow
It’s time to get into Buffer Overflow exploitation using stack.
Before that, let’s try to understand how a stack is built for any function.
Taking an example below:
The stack on the right is that of the function foo shown in the image on the left.
- Since a, b and c are parameters passed to the function, they are stored above the EBP. Also, because the stack is filled from higher to lower memory and parameters are pushed from right to left, c is written to memory first, followed by b and a.
- x, y and z, being the local variables, are stored below the EBP.
- The Old EIP and Old EBP of the function main also have to be stored in the stack, so that the program knows where to return after the function executes.
Now, as shown in the previous demo, you could see how Buffer Overflow took place using the local variables.
Imagine a situation where you overflow the variables x, y and z in such a way that the Old EIP is modified and stores the address of the memory where the malicious code is placed.
Refer to the below image for better understanding.
Assume a buffer of length 500 defined in a function. Now it is overflowed in such a way that it holds some random data, followed by the shellcode (malicious code), and then the Return address, which is overwritten so that it points to the shellcode.
So when the function returns, execution jumps to the instruction that the Return address points to, and this is how our shellcode gets executed.
This is pretty much how Buffer Overflow happens.
You must watch this video Buffer Overflow Attack - Computerphile to get a more realistic idea of Buffer Overflow.
The code used in the above video is available here.
Security Measures
- Use programming languages like Python, Java, or Ruby, in which Dynamic Memory Allocation takes place and the language itself manages the memory for you.
- In languages like C and C++, perform all the relevant checks and input validation before writing data to a buffer.
- Before using any external libraries, check them for security vulnerabilities.
- Use source code analysis tools for static analysis against vulnerabilities.
- Use a Non-executable Stack: even if machine code is injected into the stack, it cannot be executed, since that particular region of memory is non-executable. This is done by setting the NX bit.
Note: Even after these measures are taken it might be possible to exploit Buffer Overflow. Therefore, these are just layers of security which can help to prevent exploitation of Buffer Overflow.
References
- Smashing The Stack For Fun And Profit
- Buffer Overflow Exploits and Countermeasures
- Buffer-Overflow Vulnerabilities and Attacks
Acknowledgement
This blog is based upon a talk given by me at the null monthly meet on 22nd June 2019. A big thanks to Riddhi Shree for helping me out with the talk and providing me with appropriate resources. Without her, that talk might not have been as good as it was.