Pointer Decay in C++

“Arrays are really just pointers”.

This is one of the biggest “gotchas” out there. It’s one that, at the end of the day, really isn’t even that big of a deal. You could live happily for the rest of your life convinced that arrays and pointers are the exact same thing (though I wouldn’t recommend it). But arrays aren’t pointers. They decay into pointers. Plus you can’t change where they point to, but that really isn’t the point here.

Before I go on, I want to pick this very obvious spot to point out that everything I’m about to explain was pulled almost entirely from this thread on gamedev, including most of the program below (with other, clearer versions in the thread). Reading through that post will probably benefit you much more than reading through this one, especially the parts on the extra indirection after a function call towards the end, which won’t be mentioned here.

Anyway, take the following code:

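#include <cstddef>
#include <iostream>
#include <string>

static const size_t ARRAY_SIZE=5;

// sizeof yields a std::size_t, so take the size as one instead of narrowing to int
void display_array_info(std::size_t size, const std::string &title)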
{
	std::cout << title << std::endl;
	std::cout << "Array size: " << size << std::endl;
}

template <typename T, size_t U>
void reference(const T (&some_array)[U])
{
	display_array_info(sizeof(some_array), "By Reference:");
}
template <typename T, size_t U>
void value(const T some_array[])
{
	display_array_info(sizeof(some_array), "By \"Value\":");
}

template <typename U>
void pointer(const U* const some_array)
{
	display_array_info(sizeof(some_array), "By Pointer:");
}

int main(int argc, char *argv[]) {
	int integer_array[ARRAY_SIZE] = { 2, 4, 6, 8, 10 };

	std::cout << "From Main:" << std::endl;
	std::cout << "Array size: " << sizeof(integer_array) << std::endl;

	pointer<int>(integer_array);
	value<int, ARRAY_SIZE>(integer_array);
	reference<int, ARRAY_SIZE>(integer_array);

	std::cin.get();

	return 0;
}

Which, for me, will output:

From Main:
Array size: 20
By Pointer:
Array size: 4
By "Value":
Array size: 4
By Reference:
Array size: 20

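(The array takes up 20 bytes in total: five ints at 4 bytes each. The 4 is the size of a pointer on my 32-bit build; a 64-bit build would typically print 8 there.)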

So what exactly is going on here?

First, the code is a mess, so we’ll review it and pick out the important parts.

The main purpose of the program is to declare an array and run it through the sizeof operator. Next, we’ll pass the array to a few different functions: one taking a pointer, one using array notation, and one taking a reference to the array. Each of these functions prints the sizeof its parameter as well.

Our function declarations look like this:

void reference(const T (&some_array)[U])
void value(const T some_array[])
void pointer(const U* const some_array)

If you don’t like templates, just replace T with some type (int, in this case) and U with the array size (5, in this case). The one exception is the pointer function, where U is the element type rather than a size. No big deal.

Each of these functions is named after the way the array is passed in. The reference function takes a reference to the array, the pointer function takes a pointer, and the ‘value’ function uses the array syntax.

So why, then, does the reference version report the correct size information, but not the other two?

The sizeof operator

You’re probably familiar with the sizeof operator. The sizes of many data types in C++ vary from compiler to compiler and platform to platform. An int might be 32 bits on your machine, but it might be 64 bits on your neighbor’s. sizeof(int) will let you know for sure.

What we’re interested in is figuring out how sizeof gets this information. The C/C++ crowds love their “don’t pay for what you don’t use” mantra, so that pretty much rules out any ideas relating to size information being stored for a particular variable. Really, all the compiler does is look back at the declaration of a variable, figure out how big that variable’s type is, and throw that value at you, all at compile time.

So for this array:

char some_array[5];

sizeof(some_array) will cause the compiler to look at how some_array was declared. some_array was declared as a char[5], so sizeof(some_array) becomes sizeof(char[5]), and since a char is a single byte, sizeof(char[5]) returns 5.
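To make that concrete, here’s a minimal sketch of my own (not from the thread), assuming a C++11 compiler for static_assert. The raw_copy name is made up for illustration. If the file compiles, every claim in it holds:

```cpp
char some_array[5];
int integer_array[5];

// sizeof is worked out at compile time from the declared type,
// so these checks cost nothing at run time.
static_assert(sizeof(some_array) == 5, "five chars, one byte each");
static_assert(sizeof(some_array) == sizeof(char[5]), "same answer as asking about the type");
static_assert(sizeof(integer_array) == 5 * sizeof(int), "element count times element size");

// Because the result is a constant expression, it can even size another array.
// (raw_copy is a hypothetical name, not part of the original program.)
unsigned char raw_copy[sizeof(integer_array)];
```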

“Okay you dumbass”, you think to yourself. “Why would you make me read all of this crap to explain something so obvious?”

Pointer Decay

When we pass an array through a pointer, sizeof loses the ability to figure out this size information:

void value(const T some_array[])
void pointer(const U* const some_array)

Given just these two function declarations and no other code, could YOU figure out how big the passed in array was? In fact, in the case of the “pointer” function, could you even figure out for sure that an array was being passed? What happens, then, is that the array decays into a pointer. Calling sizeof on some_array will simply return the sizeof that pointer. sizeof(some_array) turns into sizeof(const T*).

Also, passing using the some_array[] syntax is exactly the same as the pointer syntax: the compiler quietly rewrites an array parameter into a pointer parameter. It was included mostly for completeness.
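If you’d rather have the compiler confirm this than take my word for it, here’s a small sketch using <type_traits>, assuming C++11. The two declarations are non-template stand-ins for the functions above, with T swapped for int:

```cpp
#include <type_traits>

void value(const int some_array[]);    // array syntax
void pointer(const int* some_array);   // pointer syntax

// An array parameter is adjusted to a pointer parameter, so both
// declarations end up with exactly the same function type.
static_assert(std::is_same<decltype(value), decltype(pointer)>::value,
              "array syntax and pointer syntax declare the same function type");
static_assert(std::is_same<decltype(value), void(const int*)>::value,
              "the [] in a parameter list really means pointer");

// Outside a parameter list, an array type and a pointer type are different...
static_assert(!std::is_same<int[5], int*>::value, "arrays are not pointers");
// ...and std::decay applies the same conversion that passing by value does.
static_assert(std::is_same<std::decay<int[5]>::type, int*>::value,
              "int[5] decays to int*");
```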

When we pass by reference, though, the array size is part of the parameter’s type, so it has to come along too (here as the template parameter U). This makes our function declaration:

void reference(const T (&some_array)[U])

Using this information, sizeof is able to figure out the actual size of the array (U * sizeof(T)).
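One thing the program above doesn’t show is that T and U don’t have to be spelled out at the call site; the compiler can deduce both from the argument. Here’s a rough sketch, assuming C++11. byte_size is a name I made up, and it returns the size instead of printing it so the result can be checked at compile time:

```cpp
#include <cstddef>

// Hypothetical helper: same parameter as reference() above, but it
// returns what sizeof sees instead of printing it.
template <typename T, std::size_t U>
constexpr std::size_t byte_size(const T (&some_array)[U])
{
	return sizeof(some_array);   // U * sizeof(T), since the type is still T[U]
}

constexpr int integer_array[5] = { 2, 4, 6, 8, 10 };

// No <int, 5> needed: T = int and U = 5 are deduced from the argument.
static_assert(byte_size(integer_array) == 5 * sizeof(int),
              "the reference keeps the full array type");
```

Passing a plain int* to byte_size won’t compile, because there’s no U for the compiler to deduce from a pointer.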

Seriously, who cares?

What are the advantages, then, of this new method of passing crap around? Really, it’s hard to say. Usually when you pass an array to a function, you also pass along the size of the array so that the function will know where it ends. When passing an array by reference, people start getting the idea that they no longer need to do that: they can just throw in a few sizeof operators, maybe a division, and be good to go.
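For comparison, here are the two styles side by side. This is only a sketch under my own assumptions: sum is a made-up function, and the assert is just a sanity check I added, not something from either thread.

```cpp
#include <cassert>
#include <cstddef>

// The usual way: the element count travels alongside the pointer.
int sum(const int* values, std::size_t count)
{
	assert(values != nullptr || count == 0);
	int total = 0;
	for (std::size_t i = 0; i < count; ++i)
		total += values[i];
	return total;
}

// The by-reference way: the length U is deduced, and the
// "sizeof plus a division" trick gives the same number.
template <std::size_t U>
int sum(const int (&values)[U])
{
	static_assert(sizeof(values) / sizeof(values[0]) == U,
	              "the division recovers the element count");
	return sum(values, U);   // hand off to the pointer-and-count version
}
```

With int integer_array[5] = { 2, 4, 6, 8, 10 }; a call like sum(integer_array) picks the template, while sum(integer_array, 2) picks the pointer version and only looks at the first two elements. The catch is that the template only accepts real arrays, so anything that has already decayed to a pointer still needs the count passed by hand.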

Useful? Debatable. This is one of those areas where you sort of have to make your own call. So, to help you out with that, I offer up another gamedev thread.

Good luck!