Problems caused by raw arrays in C++
05 Aug 2020 - John Z. Li
All basic types in C++ (bool, char, integer, floating-point number, enum, pointer, union), except references, have trivial copy semantics, which can easily be generalized to compound types that consist of basic types. The copy constructor, copy assignment operator, move constructor and move assignment operator can be consistently generated by the compiler with well-defined semantics. That is, copying an object of a class consisting of basic types means a shallow copy, and moving such an object is like copying, except that the “moved-from” object should not be used again. Because the reference types (lvalue references and rvalue references) are mainly used in function calls and seldom used in constructing more complex types, they are deliberately omitted here. (A class that contains a reference member probably implies a design mistake in most cases.)
Semantically, an array is just like a struct, except that
- an array only contains elements of the same type.
- the elements of an array are accessed by index, while the elements of a struct are accessed by field name.
Conceptually, arrays should not be so special. A well-designed language should treat arrays as just a special case of structs, so that the semantics of structs naturally apply to arrays. If this were the case in C++, it would result in a much clearer language:
- A pointer to an array behaves the same as a pointer to a struct.
- Passing an array to a function is just like passing a struct, nothing special.
- Returning an array from a function is just like returning a struct, nothing special.
- Using `new` on an array is just like using `new` on a struct. You would not need `new[]` and `delete[]` anymore. A whole lot of headaches would be removed from the language. For example, you would not have to worry about calling `delete` on a pointer when you really should use `delete[]` because the chunk of memory was allocated with `new[]`.
- Basic types have six relational operators (==, <, <=, >, >=, !=) naturally defined for them. When programmers use basic types as building blocks to form more advanced types, the six relational operators can be automatically generalized by performing lexicographical comparison of their corresponding elements. Recall that C strings are just null-terminated char arrays. This means you would be able to compare them directly with the operators `<`, `<=`, `>`, `>=`. (You would not need `strcmp`. Isn’t that neat?) A lot of math-related code would also benefit from it.
- Using `make_shared` and `make_unique` with arrays.
- etc.
The semantics of the array type in C, which C++ inherited, are a convenient mistake. The result is that arrays don’t play well with many parts of C++. Want to store an array in an STL container? No. Want to use an array in a std::tuple, std::variant, std::optional, or std::any? NO, NO, NO, NO.
As an ugly fix, one could use `std::array` instead in new C++ code. The sad news is that `std::array` is a class template (meaning long compilation time), and `std::array<T, N>` is a bit verbose.