Why aren't variable-length arrays part of the C++ standard?


Question

I haven't used C very much in the last few years. When I read this question today I came across some C syntax which I wasn't familiar with.

Apparently in C99 the following syntax is valid:

void foo(int n) {
    int values[n]; // Declare a variable length array
}

This seems like a pretty useful feature. Was there ever a discussion about adding it to the C++ standard, and if so, why was it omitted?

Some potential reasons:

  • Hairy for compiler vendors to implement
  • Incompatible with some other part of the standard
  • Functionality can be emulated with other C++ constructs

The C++ standard states that array size must be a constant expression (8.3.4.1).

Yes, of course I realize that in the toy example one could use std::vector<int> values(n);, but this allocates memory from the heap and not the stack. And if I want a multidimensional array like:

void foo(int x, int y, int z) {
    int values[x][y][z]; // Declare a variable length array
}

the vector version becomes pretty clumsy:

void foo(int x, int y, int z) {
    vector< vector< vector<int> > > values( /* Really painful expression here. */);
}

The slices, rows and columns will also potentially be spread all over memory.
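One common way to avoid both the nesting and the scattered storage is to keep a single contiguous buffer and compute the index by hand. The following is only a minimal sketch: Grid3D and its members are hypothetical names, and the storage still comes from the heap, so it addresses the layout concern but not the heap-versus-stack one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a fixed-size 3D grid backed by one contiguous buffer.
class Grid3D {
public:
    Grid3D(std::size_t x, std::size_t y, std::size_t z)
        : x_(x), y_(y), z_(z), data_(x * y * z) {}  // one allocation, zero-initialized

    // Row-major indexing: the last index varies fastest, as in int values[x][y][z].
    int& at(std::size_t i, std::size_t j, std::size_t k) {
        assert(i < x_ && j < y_ && k < z_);
        return data_[(i * y_ + j) * z_ + k];
    }

    std::size_t size() const { return data_.size(); }
    const int* data() const { return data_.data(); }

private:
    std::size_t x_, y_, z_;
    std::vector<int> data_;
};
```

With this, Grid3D values(x, y, z); plays the role of the declaration, and values.at(i, j, k) replaces values[i][j][k].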

Looking at the discussion at comp.std.c++ it's clear that this question is pretty controversial with some very heavyweight names on both sides of the argument. It's certainly not obvious that a std::vector is always a better solution.

1
297
5/23/2017 12:18:22 PM

Accepted Answer

A discussion about this was recently kicked off on Usenet: Why no VLAs in C++0x.

I agree with those who think that creating a potentially large array on the stack, which usually has only a little space available, isn't good. The argument is: if you know the size beforehand, you can use a static array. And if you don't know the size beforehand, you will write unsafe code.

C99 VLAs could provide a small benefit of being able to create small arrays without wasting space or calling constructors for unused elements, but they would introduce rather large changes to the type system (you need to be able to specify types depending on runtime values; this does not yet exist in current C++, except for new operator type-specifiers, but they are treated specially, so that the runtime-ness doesn't escape the scope of the new operator).
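The special treatment of the new operator can be seen in a small sketch. The helper name make_rows is hypothetical; the point is only that the runtime bound is accepted inside the new-expression while the resulting pointer type does not mention it.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: allocate n rows of 4 ints each; n is only known at run time.
auto make_rows(std::size_t n) -> int (*)[4] {
    return new int[n][4]();  // only the first bound may be a runtime value
}

// The runtime bound n is not part of the resulting pointer type.
static_assert(std::is_same<decltype(make_rows(0)), int (*)[4]>::value,
              "new int[n][4] yields int(*)[4], whatever n is");
```

The caller is responsible for releasing the rows with delete[].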

You can use std::vector, but it is not quite the same, as it uses dynamic memory, and making it use one's own stack-allocator isn't exactly easy (alignment is an issue, too). It also doesn't solve the same problem, because a vector is a resizable container, whereas VLAs are fixed-size. The C++ Dynamic Array proposal is intended to introduce a library-based solution, as an alternative to a language-based VLA. However, it's not going to be part of C++0x, as far as I know.
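To make the "fixed-size versus resizable" distinction concrete, here is a rough sketch of a container whose length is chosen at run time and then never changes. FixedArray is a hypothetical name and this is not the interface of the Dynamic Array proposal; it also still allocates from the heap, so it only illustrates the library-based shape of such a solution.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical FixedArray: length chosen at run time, then never changed.
// This is NOT the interface of the Dynamic Array proposal, only an illustration.
template <typename T>
class FixedArray {
public:
    explicit FixedArray(std::size_t n)
        : size_(n), data_(new T[n]()) {}  // heap storage, value-initialized

    std::size_t size() const { return size_; }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    // Raw pointers serve as iterators, so range-based for loops work.
    T* begin() { return data_.get(); }
    T* end() { return data_.get() + size_; }

private:
    std::size_t size_;
    std::unique_ptr<T[]> data_;
};
```

There is deliberately no push_back or resize, which is the property that separates it from std::vector.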

184
9/2/2015 9:11:04 PM

(Background: I have some experience implementing C and C++ compilers.)

Variable-length arrays in C99 were basically a misstep. In order to support VLAs, C99 had to make the following concessions to common sense:

  • sizeof x is no longer always a compile-time constant; the compiler must sometimes generate code to evaluate a sizeof-expression at runtime.

  • Allowing two-dimensional VLAs (int A[x][y]) required a new syntax for declaring functions that take 2D VLAs as parameters: void foo(int n, int A[][*]).

  • Less importantly in the C++ world, but extremely important for C's target audience of embedded-systems programmers, declaring a VLA means chomping an arbitrarily large chunk of your stack. This is a recipe for a stack overflow and crash. (Anytime you declare int A[n], you're implicitly asserting that you have 2GB of stack to spare. After all, if you know "n is definitely less than 1000 here", then you would just declare int A[1000]. Substituting the 32-bit integer n for 1000 is an admission that you have no idea what the behavior of your program ought to be.)
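The "n is definitely less than 1000" alternative from the last bullet might look like the following sketch. The names kMaxItems and sum_of_squares are made up for illustration, and throwing on an oversized n is just one possible policy.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

constexpr std::size_t kMaxItems = 1000;  // assumed upper bound, for illustration

// With a constant bound, sizeof stays a compile-time constant.
static_assert(sizeof(int[kMaxItems]) == kMaxItems * sizeof(int),
              "array size is known at compile time");

// Computes 0^2 + 1^2 + ... + (n-1)^2 using a fixed-size stack buffer.
long sum_of_squares(std::size_t n) {
    if (n > kMaxItems) {
        throw std::length_error("n exceeds kMaxItems");  // reject instead of overflowing
    }
    std::array<long, kMaxItems> buf{};  // stack usage is fixed, whatever n is
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<long>(i * i);
    }
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += buf[i];
    }
    return total;
}
```

The stack cost is the same for every call, so it can be budgeted ahead of time, and an unexpected n is reported rather than silently consuming the stack.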

Okay, so let's move to talking about C++ now. In C++, we have the same strong distinction between "type system" and "value system" that C89 does… but we've really started to rely on it in ways that C has not. For example:

template<typename T> struct S { ... };
int A[n];
S<decltype(A)> s;  // equivalently, S<int[n]> s;

If n weren't a compile-time constant (i.e., if A were of variably modified type), then what on earth would be the type of s? Would s's type also be determined only at runtime?
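For contrast, here is how the same snippet behaves in standard C++ when n is a constant expression. This is a sketch with an invented member (bytes) added to S so that there is something to check; it shows that the array bound is baked into the type of the instantiation.

```cpp
#include <cstddef>
#include <type_traits>

template <typename T>
struct S {
    static constexpr std::size_t bytes = sizeof(T);  // fixed when S<T> is instantiated
};

constexpr int n = 3;  // must be a constant expression in standard C++
int A[n];

// The bound is part of the type, so each bound gives a distinct instantiation.
static_assert(std::is_same<S<decltype(A)>, S<int[3]>>::value,
              "same bound, same type");
static_assert(!std::is_same<S<int[3]>, S<int[4]>>::value,
              "different bound, different type");
static_assert(S<decltype(A)>::bytes == 3 * sizeof(int),
              "sizeof(T) is known at compile time");
```

Every one of these checks is evaluated by the compiler, which is exactly what a runtime-valued n would make impossible.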

What about this:

template<typename T> bool myfunc(T& t1, T& t2) { ... }
int A1[n1], A2[n2];
myfunc(A1, A2);

The compiler must generate code for some instantiation of myfunc. What should that code look like? How can we statically generate that code, if we don't know the type of A1 at compile time?

Worse, what if it turns out at runtime that n1 != n2, so that !std::is_same<decltype(A1), decltype(A2)>()? In that case, the call to myfunc shouldn't even compile, because template type deduction should fail! How could we possibly emulate that behavior at runtime?
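With constant bounds, the compiler answers that question before the program ever runs. The sketch below gives myfunc an arbitrary body and adds a hypothetical detection trait, can_call_myfunc, to check at compile time whether a call would be accepted.

```cpp
#include <type_traits>
#include <utility>

template <typename T>
bool myfunc(T& t1, T& t2) {
    return sizeof(t1) == sizeof(t2);  // both parameters share one deduced T
}

// Helper that maps any well-formed list of types to void.
template <typename...>
struct make_void { using type = void; };

// Hypothetical trait: true if myfunc(a, b) compiles for lvalues of type A and B.
template <typename A, typename B, typename = void>
struct can_call_myfunc : std::false_type {};

template <typename A, typename B>
struct can_call_myfunc<A, B,
    typename make_void<decltype(myfunc(std::declval<A&>(), std::declval<B&>()))>::type>
    : std::true_type {};

static_assert(can_call_myfunc<int[3], int[3]>::value,
              "equal bounds: T deduces to int[3]");
static_assert(!can_call_myfunc<int[3], int[4]>::value,
              "unequal bounds: deduction fails");
```

Both outcomes are decided during compilation, so there is no obvious place for a runtime comparison of n1 and n2 to fit in.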

Basically, C++ is moving in the direction of pushing more and more decisions into compile-time: template code generation, constexpr function evaluation, and so on. Meanwhile, C99 was busy pushing traditionally compile-time decisions (e.g. sizeof) into the runtime. With this in mind, does it really even make sense to expend any effort trying to integrate C99-style VLAs into C++?

As every other answerer has already pointed out, C++ provides lots of heap-allocation mechanisms (std::unique_ptr<int[]> A(new int[n]); or std::vector<int> A(n); being the obvious ones) when you really want to convey the idea "I have no idea how much RAM I might need." And C++ provides a nifty exception-handling model for dealing with the inevitable situation that the amount of RAM you need is greater than the amount of RAM you have. But hopefully this answer gives you a good idea of why C99-style VLAs were not a good fit for C++, and not really even a good fit for C99. ;)
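A minimal sketch of that heap-plus-exceptions approach follows. The function name try_allocate is hypothetical, and returning false is just one way to react to a failed allocation.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if storage for n ints could be obtained twice,
// once through std::vector and once through std::unique_ptr<int[]>.
bool try_allocate(std::size_t n) {
    try {
        std::vector<int> a(n, 1);                // throws std::length_error if n > max_size()
        std::unique_ptr<int[]> b(new int[n]());  // throws std::bad_alloc on failure
        if (n > 0) {
            b[n - 1] = a[n - 1];                 // touch both buffers
        }
        return true;
    } catch (const std::bad_alloc&) {
        return false;  // not enough memory available
    } catch (const std::length_error&) {
        return false;  // request larger than the container can ever hold
    }
}
```

Unlike a VLA that overruns the stack, a failed request here surfaces as an exception the caller can handle.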


For more on the topic, see N3810 "Alternatives for Array Extensions", Bjarne Stroustrup's October 2013 paper on VLAs. Bjarne's POV is very different from mine; N3810 focuses more on finding a good C++ish syntax for the things, and on discouraging the use of raw arrays in C++, whereas I focused more on the implications for metaprogramming and the type system. I don't know if he considers the metaprogramming/type-system implications solved, solvable, or merely uninteresting.


Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow