Simple rules to avoid Memory Leaks in C


Below are several short tips which will help you survive C programming. I’ll be adding and amending this from time to time.

The most terrifying aspect of the C programming language is closely related to its core strength. Programming in C is all about using pointers – variables that hold memory addresses. Thus, in C, when you need a variable to store numbers and manipulate them, you don’t have to deal with the variable directly – you can use its address instead. This is, of course, a really stupid thing to do when you just have to use a single variable to store a single number. But it is an enormously useful strategy when sending that variable off to a distant procedure, or when dealing with large arrays.

The great strength of the C programming language is in this use of pointers. You can add to them, subtract them, compare them, and control them. They are in your hands. But with great power comes great responsibility. The trouble with the C programming language is that it is all too easy to misuse the power endowed upon you by the compiler. A stray pointer might point to the wrong place. Sometimes it points at a very critical place in your computer, and carelessly manipulating the data where it points might lead to crashes, data corruption, and worse. There is also the danger of memory leaks. This happens when the programmer asks the system to hand him a block of computer memory, reserved for his use, untouchable by other processes. The system reserves the block and returns a pointer to its beginning. This is done using the C function “malloc”. The programmer needs to hand that pointer back to the system, so that the memory block can be returned to the general memory pool. This is done via the appropriately named “free” function. Often enough, the programmer forgets to call “free”. Or, if he does call it, he passes the wrong address.

These problems are so scary for programmers and system administrators that most modern-day programming languages no longer allow you to roam freely in the world of memory addresses. These days, when you write C# or Java, the pointers are hidden by the compiler, so that the programmer does not use them directly. This seems to save the programmer some amount of headache. But I personally feel, each time I use either of these modern languages, like a person confined to a wheelchair. Sure, I can go places. But it is cumbersome and inconvenient.

So, assuming that you wish, like I do, to write good old C code without all the needless bickering, but you also wish to avoid memory leaks, here are some simple rules which will help you save face:

Always remember:

This is C. No one is going to clean up after you. Don’t make a mess. And if you do, make sure to clean it up.

Rule 1: Always write “free” just after “malloc”

This one should be trivial.

Suppose you want to manipulate an array of integers. You need to allocate them. So you write “malloc”. Don’t wait. Jump to the next line and free the memory.

int *p = (int*) malloc ( sizeof(int) * n );

free (p);

Now go in between and write your code:

int *p = (int*) malloc ( sizeof(int) * n );

// ... do stuff

free(p);

That way you never forget to free memory.
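Here is a slightly fuller sketch of the same habit, with the one addition real code usually needs: a check that malloc actually succeeded. The function name sum_of_squares is made up for illustration, and the early return happens before anything was allocated, so the malloc/free pair still stays intact.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example: returns 0*0 + 1*1 + ... + (n-1)*(n-1),
   or -1 if the allocation fails. */
long sum_of_squares(size_t n)
{
    assert(n > 0);                    /* malloc(0) is allowed to return NULL */

    int *p = malloc(n * sizeof *p);   /* written first ... */
    if (p == NULL) {
        return -1;                    /* nothing was allocated, nothing to free */
    }

    long total = 0;
    for (size_t i = 0; i < n; i++) {  /* the work goes in between */
        p[i] = (int)(i * i);
        total += p[i];
    }

    free(p);                          /* ... written second, before the loop existed */
    return total;
}
```

If you later add more early returns between the two lines, each one needs its own free, which is a good reason to keep such returns rare.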

Rule 2: Never, ever, work with the allocated pointer. Use a copy!

Do not use the allocated pointer for doing stuff. Do not even use it to access memory. Always copy it first, and use only the copy in your code. And I’m speaking about copying the pointer not the array!

int *p_allocated = (int*) malloc ( sizeof(int) * n );
int *p_copy = p_allocated;
// do your stuff with p_copy, not with p_allocated!
// e.g.:
while (n--) { *p_copy++ = n; }
...
free (p_allocated);

Thus you are safe from [accidentally] changing the pointer and handing the wrong address over to free. You can freely play with the p_copy pointer, like in the example above.

Trust me, this small tip will save you endless trouble.
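The same idea as a complete function, in case you want something to compile and poke at. This is only a sketch: the name make_countdown is hypothetical, and here the caller, not the function, is the one who eventually calls free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n ints filled with n-1, n-2, ..., 0.
   Returns NULL on failure; the caller owns the result and must free() it. */
int *make_countdown(size_t n)
{
    assert(n > 0);

    int *p_allocated = malloc(n * sizeof *p_allocated);
    if (p_allocated == NULL) {
        return NULL;
    }

    int *p_copy = p_allocated;          /* copies the address, not the array */
    for (size_t left = n; left > 0; left--) {
        *p_copy++ = (int)(left - 1);    /* only the copy ever moves */
    }

    /* p_copy now points one past the end, while p_allocated
       still holds the address that malloc returned */
    assert(p_copy == p_allocated + n);
    return p_allocated;
}
```

Had the loop advanced p_allocated itself, the function would have returned a pointer one past the end of the block, and the caller’s free would have received an address malloc never handed out.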

Rule 3: Don’t be parsimonious. Use more memory.

Always start by allocating more memory than you need. After you finish debugging, go back and cut down on memory use. If you need an array 1000 integers long, allocate 2000, and only after you make sure everything else is OK – only then go back and cut it down to 1000.
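One way to make the later trimming a one-line change is to keep the extra room in a single place. A minimal sketch, where DEBUG_SLACK and alloc_ints are names I made up for the example:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical knob: 2 while debugging, back to 1 once the code is trusted. */
#define DEBUG_SLACK 2

/* Allocates room for n * DEBUG_SLACK ints; returns NULL on failure.
   Release the result with free(). */
int *alloc_ints(size_t n)
{
    assert(n > 0);
    return malloc(n * DEBUG_SLACK * sizeof(int));
}
```

Keep in mind that the slack only keeps an off-by-one from trampling a neighbouring block while you hunt for it. It does not fix the bug, so do cut it back once the code is right.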

Rule 4: Always carry array length along with you

Wherever your array goes, its length should go with it. A nice trick is to allocate an array sized n+1, and save n in its 0 place:

int *p_allocated = (int*) malloc ( sizeof(int) * (n+1) );
int *p_copy = p_allocated+1;
p_copy[-1] = n;
// do your stuff with p_copy, not with p_allocated!
free (p_allocated);

This way the array length is kept just before the first element of the array. This is somewhat dangerous, but it can make things easier. As reader Lucid Strap commented, this only works when the array length fits in the element type. In the example above, if int is 32 bits, it can hold lengths of up to about 2 billion elements (if you’re doing arrays larger than that, then you’re no novice). But if the elements were mere bytes, then an array of more than 255 chars would be the end of you. In that case, do this:

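#define Max_Arr_Len sizeof(long long int)   // bytes reserved in front of the data for the length

char *p_allocated = (char*) malloc ( Max_Arr_Len + sizeof(_type) * n );
_type *p_copy = (_type*) (p_allocated + Max_Arr_Len);
*(long long int*) p_allocated = n;
// do your stuff with p_copy, not with p_allocated!
// array length sits Max_Arr_Len bytes before p_copy: *(long long int*) ((char*) p_copy - Max_Arr_Len)
// (assumes _type needs no stricter alignment than long long int)
free (p_allocated);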

There are simpler ways to do it, and some ways could be more complicated. I wouldn’t go there. In any case you should remember that the size of a pointer could vary – there are different memory models for legacy systems. And the maximum memory size changes slowly through the decades. What I suggest above would be easy to manage, but still not entirely architecture-independent. But, really, if you come to a point where you should bother yourself with maintaining architecture-independent code (if you’re writing a Linux kernel patch), then you’re already a pro and you don’t need my advice anymore…
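For what it’s worth, one of the simpler ways is to let the compiler lay out the length and the data for you with a C99 flexible array member. This is my own sketch, not the trick described above: the names int_array, int_array_new and int_array_sum are hypothetical, and it assumes a compiler that supports C99.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: the length lives in front of the data,
   and the compiler takes care of size and alignment. */
struct int_array {
    size_t len;      /* number of elements in data[] */
    int    data[];   /* the elements themselves (flexible array member) */
};

/* Returns a zero-filled array of n ints, or NULL on failure.
   Release it with a single free(). */
struct int_array *int_array_new(size_t n)
{
    assert(n > 0);
    struct int_array *a = calloc(1, sizeof *a + n * sizeof a->data[0]);
    if (a == NULL) {
        return NULL;
    }
    a->len = n;
    return a;
}

/* The length travels with the pointer, so no separate argument is needed. */
long int_array_sum(const struct int_array *a)
{
    long total = 0;
    for (size_t i = 0; i < a->len; i++) {
        total += a->data[i];
    }
    return total;
}
```

Because len is a size_t no matter what the element type is, the problem of a length that does not fit in a char goes away, at the cost of a few bytes per array.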

Rule 5: Be consistent. And write comments

The most important thing is to be consistent and to write down what you do. I am always amazed at how many programmers seem to think that comments are a waste of time. They are imperative. Without comments, you probably won’t remember what you did. Imagine returning to your code a year after you wrote it, and spending countless hours trying to recall what that index does. Better to spend a couple of seconds writing it down.

Also, if you are consistent, you will not fail often. Always use the same mechanism for passing arrays and pointers. Don’t change the way you do things lightly. If you decide to use my previous trick, use it everywhere, or you might find yourself referring back to a nonexistent place because you forgot what type of reference you chose.

Valgrind

I once worked at a start-up company where we maintained our own memory-management utilities. Don’t. There are marvelous tools for profiling and memory checking, written by people who know how to maintain this sort of code. Use them. Valgrind is one such. Use it. Note that you’ll get the most readable reports if you compile with debug information (-g) and without optimization flags. Add those later, after you have made sure the code is correct.
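If you have never run Valgrind, a deliberately broken toy is a good first target. The sketch below is hypothetical (the file name leak.c and both function names are mine): one function forgets its free, the other follows Rule 1.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical file leak.c. To try it, add a main() that calls both
 * functions, then build and run:
 *   gcc -g -O0 leak.c -o leak
 *   valgrind --leak-check=full ./leak
 */

/* Bug on purpose: the buffer is never freed. */
int leaky_last(size_t n)
{
    assert(n > 0);
    int *buf = malloc(n * sizeof *buf);
    if (buf == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        buf[i] = (int)i;
    }
    return buf[n - 1];      /* the only pointer to the block is lost here */
}

/* Same work, with the free that Rule 1 asks for. */
int clean_last(size_t n)
{
    assert(n > 0);
    int *buf = malloc(n * sizeof *buf);
    if (buf == NULL) {
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        buf[i] = (int)i;
    }
    int last = buf[n - 1];  /* read what we need before releasing the block */
    free(buf);
    return last;
}
```

I would expect the leak summary to list the block allocated in leaky_last as definitely lost, with a stack trace pointing at its malloc line thanks to -g, and to have nothing to say about clean_last.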


11 thoughts on “Simple rules to avoid Memory Leaks in C”

  1. kinda

    hi,
    the rule 2 seems wrong.

    int *p_copy = p_allocated; //only copy the first element in the array.

  2. mousomer

    No. Note the difference:

    int p = val;

    defines p to be an integer, and inserts (integer) val into p.

    int p = *adrs;

    defines p to be an integer, and inserts the (integer) value from the address adrs points to, into p.

    When you write

    int* p

    then p is no longer an integer. It is a pointer. So:

    int p_copy = p_allocated; // define p_copy as int, and convert the address held in p_allocated into an integer – not what you want, and the compiler will complain.

    int p_copy = *p_allocated; // define p_copy as int and copy the first element from p_allocated into it.

    int * p_copy = p_allocated; // define p_copy as pointer to integer, and copy p_allocated (an address) into it – this is the correct use.

    int * p_copy = *p_allocated; // a dangerous error: put the value of p_allocated[0] as an address into pointer p_copy.

  3. Gidi

    All is good and well BUT there is a fundamental error in your code in tip #2…

    1. In the “malloc” call, the function’s return value needs to be cast to the receiving pointer type, i.e. there should be an asterisk in the cast.

    int *p_allocated = (int*) malloc ( sizeof(int) * n ); // Mind the difference: an asterisk after the int in the cast parentheses.

    2. The assigning of ” int *p_copy = p_allocated; ” simply assigns the RHS value in the LHS, thus actually STILL working on the original array and will modify it (what you were trying to avoid). To copy the entire array try something like:

    int *p_allocated = (int*) malloc ( sizeof(int) * n );
    int *p_copy = (int*) malloc ( sizeof(int) * n );

    for(int i = 0; i < n; i++)
    {
    *(p_copy + i) = *(p_allocated + i);
    }

    which will actually copy the whole of the array and enable you to modify the copy without worry….

    1. mousomer Post author

      1. Got me there… amended. Thanks.
      2. You missed the point. Is it written better now? The whole idea is that you have good reasons to play with the pointer to an array. Sometimes you might change it by mistake (like writing p=something when you wanted *p=something). By working with a copy of the pointer, you save the original memory address no matter what manipulation you did on the way.

  4. Lucid Strap

    I do like rule 4, But I have two problems with it.

    1) It assumes that the variable type of your array is suitable to store the length of the array.

    For example, if you have an array of signed chars, their range is -128 to 127. Suppose that array is 1000 elements long… uh oh!

    2) If memory is tight, I mean really tight, you might end up wasting a few bytes.

    For example, if you have an array of long doubles which, let’s say, are 12 bytes each on the target platform. That array might be 200 elements long. So you’re storing a value which can be covered by an unsigned char (0-255), which is just 1 byte….. wasting 11 bytes!

    I think it’s an interesting approach, and should be known and used where appropriate, but C beginners/intermediates like myself might just read it and go ahead and implement it without thinking of possible consequences, which is bad practice. That’s their responsibility at the end of the day, but I think you should note those points in your article.

    Thanks for a great read though!

    1. Omer Moussaffi

      thanks.

      1. That’s true, of course. The correct way to do it is to save sizeof(_type) bytes, and use a cast whenever you need the size. But that would probably be more cumbersome than what you’d probably need. If you really want to do it right, you also need to divide the size of pointer by the size of the current type – and then you get to the shady area of knowing your architecture – because on some systems, pointer length is not constant. And don’t even get me started on function pointers! I’ll add a simple workaround.

      2. You can’t really help but waste some of that space. How do you refrain from saving the array length? How do you run a FOR loop without knowing when to stop? You need to put the length number somewhere. The method I describe is basically what people are doing when they write embedded systems / drivers. I do suggest, strongly, that if you can spend an extra 12 bytes, use rule 2 and work with a pointer copy. It’s usually not those 12 bytes which will kill your app. It’s them 1MB per loop from forgetting to free a buffer.

  5. Pingback: Memory Leak and Valgrind | LEFTBRIDGE

  6. Roland Harrison

    “But I personally feel, each time I use either of these modern languages, like a person confined to a wheelchair. Sure, I can go places. But it is cumbersome and inconvenient.”

    I do not agree at all, as I feel hamstrung when I have to write in C. A better way to write that sentence: C#/Java is like a wheelchair with jumpjet rockets and a wingsuit attached. Sure, you can’t walk, but you can fly!

    1. mousomer Post author

      I understand the sentiment. It is a personal thing, although when writing web apps, people don’t use C. And when I do string logic, I go to C++ streams.

      But, when I need to walk – I don’t want to waste my time with jetpacks. When I’m doing a back-end job implementing algorithms on data files, there is no good reason to run away from low-level C.

      It’s both a matter of personal taste and a question of what application you’re doing.

