Buffer overflows: Reasons to apply mundane-sounding software fixes

Topher Kessler MacFixIt Editor

Topher, an avid Mac user for the past 15 years, has been a contributing author to MacFixIt since the spring of 2008. One of his passions is troubleshooting Mac problems and making the best use of Macs and Apple hardware at home and in the workplace.

Oct. 14, 2010 11:00 a.m. PT

When updates or security patches appear for the OS or applications, a common reason given is that an attacker could make the program execute arbitrary code, usually by tricking a user into opening a "maliciously crafted" file, be it a document, a movie, or an elaborate Web page containing scripts and images. This wording has come to sound mundane and uninformative to many users, who usually just install the available patches without knowing what is going on.

Most OS X programs are written in Objective-C, though C and C++ are also used. As with any programming language, these have quirks that allow inadvertent faults to appear if the programmer is not careful.

One of these faults is the "buffer overflow," in which a program writes outside of a designated allocation of memory. The result can be odd behavior, including crashes and, in some circumstances, the execution of an attacker's code.

Buffer overflows are very easy to introduce in some programming languages. For instance, take a look at the following short program in C:

#include <stdio.h>

int main (int argc, const char * argv[]) {
    char word[10];                    /* 10-byte buffer */
    scanf("%s", word);                /* no length limit: long input overflows word */
    printf("The word is: %s", word);
    return 0;
}

This is a very simple C program that runs in the Terminal and performs four tasks: it creates a character storage variable named "word," waits for you to enter a value for this variable, prints the value to the Terminal window, and then returns a "0" value to the Terminal. It is crudely similar to the concept of the clipboard, in that there is a place to store the data, a way to enter it, and an output for the data.

char word[10];

This line declares a character array that we've named "word," which is 10 bytes long (one character is one byte). Because C marks the end of a string with an extra zero byte, the array can safely hold a word of up to nine characters. Each of the 10 bytes has its own memory address that will be used to hold the values stored in the "word" variable. Allocating it in this way allows the memory addresses for this variable to be easily tracked, and as such forms a 10-byte "buffer" for the storage variable.

scanf("%s",word);

This line tells the program to "scan" for input, so you can enter some characters to store in the "word" variable. If you keep the number of characters under 10 there will be no problem, but if you enter 10 or more characters, the input plus its terminating zero byte exceeds the allocated 10 bytes, and undesired behavior may result when the program tries to save the additional characters to memory.

Entering a large string of characters for the program caused it to unexpectedly quit, and invoke the OS X crash reporter.

For instance, entering "ThisWorks" (nine characters) lets the program continue to function normally, but if you enter "ThisIsMoreThanTenCharacters" or a similarly lengthy string of characters, the program may abort with an error and prompt the crash reporter to generate a crash log for submission.

What happened here is that the program tried to write more characters to the "word" variable than can fit in the allocated buffer, so we ran into a "buffer overflow." In some instances this happens without any visible problem, but other times it results in odd behavior, including crashes and hangs.

When the large string of characters was entered, the program did not complete and instead received an abort signal from the system.

Besides crashes and hangs, buffer overflows can be dangerous because in some circumstances they give an attacker the opportunity to inject foreign code into memory and have it executed.

One way this can happen is via the "return address," which records where the program should resume once the current function finishes. It is stored very near the memory used for the function's other variables, so its contents can potentially be overwritten by a buffer overflow.

If you look at the whole program above, the last line is a "return" line, which passes the designated value back to the Terminal or other program that launched it.

A classic example of where this is useful is a function that adds two numbers (a and b): it can ask for "a" and ask for "b," and then return "a+b" directly--simple addition.

When such a function runs in C, the memory it uses is organized in a "stack." On typical systems the local variables (such as "a" and "b," or "word" in the program above) are located at the top of the stack and are used by the function, while the bookkeeping data needed to get back to the parent function (e.g., the "return address") sits at the bottom, still located near the local variables.

When a function finishes running, its local memory is released, and the processor then reads the return address to find out where in the parent function to resume. The return address therefore has to stay intact for the entire time the function runs. Unlike the local variables, which are simply discarded, its contents decide what the program executes next.

When a buffer overflow occurs, in some cases the overflow data can overwrite the function's "return address," so that when the function finishes, the program jumps to whatever location the overflow data specifies instead of returning to the parent function. In more elaborate programs that have larger buffers, the overflow data can be large enough to include, or at least point to, executable code in memory. An attacker who knows how to take advantage of this may be able to get the injected code to execute.

One way to prevent this type of problem is to implement bounds checking in a C program, which ensures that a buffer is used properly and prevents data from being written to it if the data is too large. Unfortunately, the C language does not check array bounds automatically, so it is up to the programmer to implement these checks and ensure that the code works properly.

When standalone programs from Microsoft, Adobe, and Apple, as well as Mac OS components like QuickTime, are updated and the release notes mention "arbitrary code execution" as the reason, this is the type of problem being rooted out and fixed. Granted, it will usually take user input and trickery by an attacker for these vulnerabilities to cause any harm, but patching them is the safest way to go.

Questions? Comments? Have a fix? Post them below or e-mail us!
Be sure to check us out on Twitter and the CNET Mac forums.
