Problems caused by raw arrays in C++

05 Aug 2020 - John Z. Li

All basic types in C++ (bool, char, integer, floating-point number, enum, pointer, union), except references, have trivial copy semantics, which generalize easily to compound types that consist of basic types. The copy constructor, copy assignment operator, move constructor and move assignment operator can be consistently generated by the compiler with well-defined semantics. That is, copying an object of a class consisting of basic types means a shallow copy, and moving such an object is like copying, except that the “moved-from” object should not be used again. Because the reference types (the lvalue reference type and the rvalue reference type) are mainly used in function calls and seldom used in constructing more complex types, they are deliberately omitted here. (A class that contains a reference member probably indicates a design mistake in most cases.)
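To make the point about compiler-generated operations concrete, here is a minimal sketch, assuming C++17. The struct Pixel and the function duplicate are hypothetical names invented for illustration. The static_asserts check that the copy and move operations of a struct built from basic types are generated by the compiler and are trivial.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example type: a compound type built only from basic types.
struct Pixel {
    int x;
    int y;
    float intensity;
    bool visible;
};

// All copy and move operations are generated by the compiler and are trivial.
static_assert(std::is_trivially_copyable_v<Pixel>);
static_assert(std::is_trivially_copy_assignable_v<Pixel>);
static_assert(std::is_trivially_move_constructible_v<Pixel>);

// Returns a member-wise (shallow) copy made by the implicit copy constructor.
inline Pixel duplicate(const Pixel& p) {
    Pixel q = p;
    assert(q.x == p.x && q.y == p.y);  // the copy holds the same values
    return q;
}
```

Modifying the value returned by duplicate leaves the original object untouched, since the two objects share no state.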

Semantically, an array is just like a struct, except that

Conceptually, arrays should not be so special. A well-designed language should treat arrays as just a special case of structs, so that the semantics of structs naturally apply to arrays. If this were the case in C++, it would result in a much cleaner language:

  1. A pointer to an array behaves the same as a pointer to a struct.
  2. Passing an array to a function is just like passing a struct, nothing special.
  3. Returning an array from a function is just like returning a struct, nothing special.
  4. Allocating an array with new is just like allocating a struct with new. You would not need new[] and delete[] anymore. A whole lot of headaches would be removed from the language. For example, you would not have to worry about calling delete on a pointer when you really should have used delete[] because the chunk of memory was allocated with new[].
  5. Basic types have six relational operators (==, <, <=, >, >=, !=) naturally defined for them. When programmers use basic types as building blocks to form more advanced types, the six relational operators can be automatically generalized by performing lexicographical comparison of their corresponding elements. Recall that C strings are just null-terminated char arrays. This means you would be able to compare them directly with the operators <, <=, > and >=. (You would not need strcmp. Isn’t that neat?) A lot of math-related code would also benefit from it.
  6. make_shared and make_unique work with arrays just as they do with any other type.
  7. etc.
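The contrast behind points 2 and 3 can be sketched as follows, assuming C++17. Vec3, doubled and caller_keeps_original are hypothetical names used only for illustration. The same three ints are passed, returned and assigned as a whole once they sit inside a struct, while a raw array parameter is adjusted to a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical wrapper: the same three ints, but inside a struct.
struct Vec3 {
    int data[3];
};

// A raw array parameter is silently adjusted to a pointer, so these two
// function types are identical and the array is never passed by value.
static_assert(std::is_same_v<void(int[3]), void(int*)>);

// The struct is passed and returned by value; the argument is a real copy.
inline Vec3 doubled(Vec3 v) {
    for (std::size_t i = 0; i < 3; ++i) {
        v.data[i] *= 2;
    }
    return v;
}

// Usage: the caller's object is untouched because the callee got a copy.
inline bool caller_keeps_original() {
    Vec3 a{{1, 2, 3}};
    Vec3 b = doubled(a);
    Vec3 c{};
    c = b;  // whole-object assignment, which a raw int[3] does not allow
    assert(a.data[0] == 1);
    return b.data[2] == 6 && c.data[1] == 4;
}
```

Wrapping the array in a struct changes nothing about its layout in practice, only how the language treats it, which is the inconsistency the list above describes.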

The semantics of the array type in C, which C++ inherited, are a convenient mistake. The result is that arrays don’t play well with many parts of C++. Want to store an array in an STL container? No. Want to use an array in a std::tuple, std::variant, std::optional or std::any? NO, NO, NO, NO.
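One way to see why raw arrays are rejected is to check them against the properties that containers and wrapper types rely on. The following is a rough sketch, assuming C++17. The trait copyable_as_value and the struct Wrapped are hypothetical helpers written for this illustration, not part of the standard library, and the trait only approximates what each library type actually requires.

```cpp
#include <type_traits>

// Hypothetical helper trait (not part of the standard library): true when T
// can be copy-constructed and copy-assigned as a whole object, which is
// roughly what a container such as std::vector expects from its elements.
template <typename T>
inline constexpr bool copyable_as_value =
    std::is_copy_constructible_v<T> && std::is_copy_assignable_v<T>;

// Hypothetical struct holding the same data as int[3].
struct Wrapped {
    int data[3];
};

static_assert(copyable_as_value<int>);      // a basic type qualifies
static_assert(copyable_as_value<Wrapped>);  // so does a struct of basic types
static_assert(!copyable_as_value<int[3]>);  // the raw array does not
```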

As an ugly fix, one can use std::array instead in new C++ code. The sad news is that std::array is a class template (meaning longer compilation times), and std::array<T, N> is a bit verbose.
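For comparison, here is a small sketch of what std::array buys, assuming C++17. The function names make_triple, make_rows and first_row_is_smaller are hypothetical. It shows an array being returned by value, stored in a std::vector and a std::optional, and compared lexicographically with the built-in relational operators.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <vector>

// std::array can be returned by value, like a struct.
inline std::array<int, 3> make_triple(int a, int b, int c) {
    return {a, b, c};
}

// It can be stored in a standard container.
inline std::vector<std::array<int, 3>> make_rows() {
    return {make_triple(1, 2, 3), make_triple(1, 2, 4)};
}

// Relational operators compare element by element, lexicographically.
inline bool first_row_is_smaller() {
    auto rows = make_rows();
    std::optional<std::array<int, 3>> maybe = rows[0];  // also fits in optional
    assert(maybe.has_value());
    return rows[0] < rows[1] && *maybe == rows[0];
}
```

The verbosity is visible in every signature above, which is the price of getting struct-like behavior from a library type rather than from the language itself.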