ES.expr
ES.expr: Expressions
Expressions manipulate values.
ES.40: Avoid complicated expressions
Reason
Complicated expressions are error-prone.
Example
// bad: assignment hidden in subexpression
while ((c = getc()) != -1)
// bad: two non-local variables assigned in sub-expressions
while ((cin >> c1, cin >> c2), c1 == c2)
// better, but possibly still too complicated
for (char c1, c2; cin >> c1 >> c2 && c1 == c2;)
// OK: if i and j are not aliased
int x = ++i + ++j;
// OK: if i != j and i != k
v[i] = v[j] + v[k];
// bad: multiple assignments "hidden" in subexpressions
x = a + (b = f()) + (c = g()) * 7;
// bad: relies on commonly misunderstood precedence rules
x = a & b + c * d && e ^ f == 7;
// bad: undefined behavior
x = x++ + x++ + ++x;
Some of these expressions are unconditionally bad (e.g., they rely on undefined behavior). Others are simply so complicated and/or unusual that even good programmers could misunderstand them or overlook a problem when in a hurry.
Note
C++17 tightens up the rules for the order of evaluation (left-to-right except right-to-left in assignments, and the order of evaluation of function arguments is unspecified; see ES.43), but that doesn't change the fact that complicated expressions are potentially confusing.
Note
A programmer should know and use the basic rules for expressions.
Example
x = k * y + z; // OK
auto t1 = k * y; // bad: unnecessarily verbose
x = t1 + z;
if (0 <= x && x < max) // OK
auto t1 = 0 <= x; // bad: unnecessarily verbose
auto t2 = x < max;
if (t1 && t2) // ...
Enforcement
Tricky. How complicated must an expression be to be considered complicated? Writing computations as statements with one operation each is also confusing. Things to consider:
- side effects: side effects on multiple non-local variables (for some definition of non-local) can be suspect, especially if the side effects are in separate subexpressions
- writes to aliased variables
- more than N operators (and what should N be?)
- reliance of subtle precedence rules
- uses undefined behavior (can we catch all undefined behavior?)
- implementation defined behavior?
- ???
ES.41: If in doubt about operator precedence, parenthesize
Reason
Avoid errors. Readability. Not everyone has the operator table memorized.
Example
const unsigned int flag = 2;
unsigned int a = flag;
if (a & flag != 0) // bad: means a&(flag != 0)
Note: We recommend that programmers know their precedence table for the arithmetic operations, the logical operations, but consider mixing bitwise logical operations with other operators in need of parentheses.
if ((a & flag) != 0) // OK: works as intended
Note
You should know enough not to need parentheses for:
if (a < 0 || a <= max) {
// ...
}
Enforcement
- Flag combinations of bitwise-logical operators and other operators.
- Flag assignment operators not as the leftmost operator.
- ???
ES.42: Keep use of pointers simple and straightforward
Reason
Complicated pointer manipulation is a major source of errors.
Note
Use gsl::span
instead.
Pointers should only refer to single objects.
Pointer arithmetic is fragile and easy to get wrong, the source of many, many bad bugs and security violations.
span
is a bounds-checked, safe type for accessing arrays of data.
Access into an array with known bounds using a constant as a subscript can be validated by the compiler.
Example, bad
void f(int* p, int count)
{
if (count < 2) return;
int* q = p + 1; // BAD
ptrdiff_t d;
int n;
d = (p - &n); // OK
d = (q - p); // OK
int n = *p++; // BAD
if (count < 6) return;
p[4] = 1; // BAD
p[count - 1] = 2; // BAD
use(&p[0], 3); // BAD
}
Example, good
void f(span<int> a) // BETTER: use span in the function declaration
{
if (a.size() < 2) return;
int n = a[0]; // OK
span<int> q = a.subspan(1); // OK
if (a.size() < 6) return;
a[4] = 1; // OK
a[a.size() - 1] = 2; // OK
use(a.data(), 3); // OK
}
Note
Subscripting with a variable is difficult for both tools and humans to validate as safe.
span
is a run-time bounds-checked, safe type for accessing arrays of data.
at()
is another alternative that ensures single accesses are bounds-checked.
If iterators are needed to access an array, use the iterators from a span
constructed over the array.
Example, bad
void f(array<int, 10> a, int pos)
{
a[pos / 2] = 1; // BAD
a[pos - 1] = 2; // BAD
a[-1] = 3; // BAD (but easily caught by tools) -- no replacement, just don't do this
a[10] = 4; // BAD (but easily caught by tools) -- no replacement, just don't do this
}
Example, good
Use a span
:
void f1(span<int, 10> a, int pos) // A1: Change parameter type to use span
{
a[pos / 2] = 1; // OK
a[pos - 1] = 2; // OK
}
void f2(array<int, 10> arr, int pos) // A2: Add local span and use that
{
span<int> a = {arr.data(), pos};
a[pos / 2] = 1; // OK
a[pos - 1] = 2; // OK
}
Use at()
:
void f3(array<int, 10> a, int pos) // ALTERNATIVE B: Use at() for access
{
at(a, pos / 2) = 1; // OK
at(a, pos - 1) = 2; // OK
}
Example, bad
void f()
{
int arr[COUNT];
for (int i = 0; i < COUNT; ++i)
arr[i] = i; // BAD, cannot use non-constant indexer
}
Example, good
Use a span
:
void f1()
{
int arr[COUNT];
span<int> av = arr;
for (int i = 0; i < COUNT; ++i)
av[i] = i;
}
Use a span
and range-for
:
void f1a()
{
int arr[COUNT];
span<int, COUNT> av = arr;
int i = 0;
for (auto& e : av)
e = i++;
}
Use at()
for access:
void f2()
{
int arr[COUNT];
for (int i = 0; i < COUNT; ++i)
at(arr, i) = i;
}
Use a range-for
:
void f3()
{
int arr[COUNT];
int i = 0;
for (auto& e : arr)
e = i++;
}
Note
Tooling can offer rewrites of array accesses that involve dynamic index expressions to use at()
instead:
static int a[10];
void f(int i, int j)
{
a[i + j] = 12; // BAD, could be rewritten as ...
at(a, i + j) = 12; // OK -- bounds-checked
}
Example
Turning an array into a pointer (as the language does essentially always) removes opportunities for checking, so avoid it
void g(int* p);
void f()
{
int a[5];
g(a); // BAD: are we trying to pass an array?
g(&a[0]); // OK: passing one object
}
If you want to pass an array, say so:
void g(int* p, size_t length); // old (dangerous) code
void g1(span<int> av); // BETTER: get g() changed.
void f2()
{
int a[5];
span<int> av = a;
g(av.data(), av.size()); // OK, if you have no choice
g1(a); // OK -- no decay here, instead use implicit span ctor
}
Enforcement
- Flag any arithmetic operation on an expression of pointer type that results in a value of pointer type.
- Flag any indexing expression on an expression or variable of array type (either static array or
std::array
) where the indexer is not a compile-time constant expression with a value between0
and the upper bound of the array. - Flag any expression that would rely on implicit conversion of an array type to a pointer type.
This rule is part of the bounds-safety profile.
ES.43: Avoid expressions with undefined order of evaluation
Reason
You have no idea what such code does. Portability. Even if it does something sensible for you, it might do something different on another compiler (e.g., the next release of your compiler) or with a different optimizer setting.
Note
C++17 tightens up the rules for the order of evaluation: left-to-right except right-to-left in assignments, and the order of evaluation of function arguments is unspecified.
However, remember that your code might be compiled with a pre-C++17 compiler (e.g., through cut-and-paste) so don't be too clever.
Example
v[i] = ++i; // the result is undefined
A good rule of thumb is that you should not read a value twice in an expression where you write to it.
Enforcement
Can be detected by a good analyzer.
ES.44: Don't depend on order of evaluation of function arguments
Reason
Because that order is unspecified.
Note
C++17 tightens up the rules for the order of evaluation, but the order of evaluation of function arguments is still unspecified.
Example
int i = 0;
f(++i, ++i);
Before C++17, the behavior is undefined, so the behavior could be anything (e.g., f(2, 2)
).
Since C++17, this code does not have undefined behavior, but it is still not specified which argument is evaluated first. The call will be f(1, 2)
or f(2, 1)
, but you don't know which.
Example
Overloaded operators can lead to order of evaluation problems:
f1()->m(f2()); // m(f1(), f2())
cout << f1() << f2(); // operator<<(operator<<(cout, f1()), f2())
In C++17, these examples work as expected (left to right) and assignments are evaluated right to left (just as ='s binding is right-to-left)
f1() = f2(); // undefined behavior in C++14; in C++17, f2() is evaluated before f1()
Enforcement
Can be detected by a good analyzer.
ES.45: Avoid "magic constants"; use symbolic constants
Reason
Unnamed constants embedded in expressions are easily overlooked and often hard to understand:
Example
for (int m = 1; m <= 12; ++m) // don't: magic constant 12
cout << month[m] << '\n';
No, we don't all know that there are 12 months, numbered 1..12, in a year. Better:
// months are indexed 1..12
constexpr int first_month = 1;
constexpr int last_month = 12;
for (int m = first_month; m <= last_month; ++m) // better
cout << month[m] << '\n';
Better still, don't expose constants:
for (auto m : month)
cout << m << '\n';
Enforcement
Flag literals in code. Give a pass to 0
, 1
, nullptr
, \n
, ""
, and others on a positive list.
ES.46: Avoid lossy (narrowing, truncating) arithmetic conversions
Reason
A narrowing conversion destroys information, often unexpectedly so.
Example, bad
A key example is basic narrowing:
double d = 7.9;
int i = d; // bad: narrowing: i becomes 7
i = (int) d; // bad: we're going to claim this is still not explicit enough
void f(int x, long y, double d)
{
char c1 = x; // bad: narrowing
char c2 = y; // bad: narrowing
char c3 = d; // bad: narrowing
}
Note
The guidelines support library offers a narrow_cast
operation for specifying that narrowing is acceptable and a narrow
("narrow if") that throws an exception if a narrowing would throw away legal values:
i = gsl::narrow_cast<int>(d); // OK (you asked for it): narrowing: i becomes 7
i = gsl::narrow<int>(d); // OK: throws narrowing_error
We also include lossy arithmetic casts, such as from a negative floating point type to an unsigned integral type:
double d = -7.9;
unsigned u = 0;
u = d; // bad: narrowing
u = gsl::narrow_cast<unsigned>(d); // OK (you asked for it): u becomes 4294967289
u = gsl::narrow<unsigned>(d); // OK: throws narrowing_error
Note
This rule does not apply to contextual conversions to bool:
if (ptr) do_something(*ptr); // OK: ptr is used as a condition
bool b = ptr; // bad: narrowing
Enforcement
A good analyzer can detect all narrowing conversions. However, flagging all narrowing conversions will lead to a lot of false positives. Suggestions:
- Flag all floating-point to integer conversions (maybe only
float
->char
anddouble
->int
. Here be dragons! we need data). - Flag all
long
->char
(I suspectint
->char
is very common. Here be dragons! we need data). - Consider narrowing conversions for function arguments especially suspect.
ES.47: Use nullptr
rather than 0
or NULL
Reason
Readability. Minimize surprises: nullptr
cannot be confused with an
int
. nullptr
also has a well-specified (very restrictive) type, and thus
works in more scenarios where type deduction might do the wrong thing on NULL
or 0
.
Example
Consider:
void f(int);
void f(char*);
f(0); // call f(int)
f(nullptr); // call f(char*)
Enforcement
Flag uses of 0
and NULL
for pointers. The transformation might be helped by simple program transformation.
ES.48: Avoid casts
Reason
Casts are a well-known source of errors and make some optimizations unreliable.
Example, bad
double d = 2;
auto p = (long*)&d;
auto q = (long long*)&d;
cout << d << ' ' << *p << ' ' << *q << '\n';
What would you think this fragment prints? The result is at best implementation defined. I got
2 0 4611686018427387904
Adding
*q = 666;
cout << d << ' ' << *p << ' ' << *q << '\n';
I got
3.29048e-321 666 666
Surprised? I'm just glad I didn't crash the program.
Note
Programmers who write casts typically assume that they know what they are doing, or that writing a cast makes the program "easier to read". In fact, they often disable the general rules for using values. Overload resolution and template instantiation usually pick the right function if there is a right function to pick. If there is not, maybe there ought to be, rather than applying a local fix (cast).
Notes
Casts are necessary in a systems programming language. For example, how else would we get the address of a device register into a pointer? However, casts are seriously overused as well as a major source of errors.
If you feel the need for a lot of casts, there might be a fundamental design problem.
The type profile bans reinterpret_cast
and C-style casts.
Never cast to (void)
to ignore a [[nodiscard]]
return value.
If you deliberately want to discard such a result, first think hard about whether that is really a good idea (there is usually a good reason the author of the function or of the return type used [[nodiscard]]
in the first place).
If you still think it's appropriate and your code reviewer agrees, use std::ignore =
to turn off the warning which is simple, portable, and easy to grep.
Alternatives
Casts are widely (mis)used. Modern C++ has rules and constructs that eliminate the need for casts in many contexts, such as
- Use templates
- Use
std::variant
- Rely on the well-defined, safe, implicit conversions between pointer types
- Use
std::ignore =
to ignore[[nodiscard]]
values.
Enforcement
- Flag all C-style casts, including to
void
. - Flag functional style casts using
Type(value)
. UseType{value}
instead which is not narrowing. (See ES.64.) - Flag identity casts between pointer types, where the source and target types are the same (#Pro-type-identitycast).
- Flag an explicit pointer cast that could be implicit.
ES.49: If you must use a cast, use a named cast
Reason
Readability. Error avoidance. Named casts are more specific than a C-style or functional cast, allowing the compiler to catch some errors.
The named casts are:
static_cast
const_cast
reinterpret_cast
dynamic_cast
std::move
//move(x)
is an rvalue reference tox
std::forward
//forward<T>(x)
is an rvalue or an lvalue reference tox
depending onT
gsl::narrow_cast
//narrow_cast<T>(x)
isstatic_cast<T>(x)
gsl::narrow
//narrow<T>(x)
isstatic_cast<T>(x)
ifstatic_cast<T>(x) == x
or it throwsnarrowing_error
Example
class B { /* ... */ };
class D { /* ... */ };
template<typename D> D* upcast(B* pb)
{
D* pd0 = pb; // error: no implicit conversion from B* to D*
D* pd1 = (D*)pb; // legal, but what is done?
D* pd2 = static_cast<D*>(pb); // error: D is not derived from B
D* pd3 = reinterpret_cast<D*>(pb); // OK: on your head be it!
D* pd4 = dynamic_cast<D*>(pb); // OK: return nullptr
// ...
}
The example was synthesized from real-world bugs where D
used to be derived from B
, but someone refactored the hierarchy.
The C-style cast is dangerous because it can do any kind of conversion, depriving us of any protection from mistakes (now or in the future).
Note
When converting between types with no information loss (e.g. from float
to
double
or from int32
to int64
), brace initialization might be used instead.
double d {some_float};
int64_t i {some_int32};
This makes it clear that the type conversion was intended and also prevents
conversions between types that might result in loss of precision. (It is a
compilation error to try to initialize a float
from a double
in this fashion,
for example.)
Note
reinterpret_cast
can be essential, but the essential uses (e.g., turning a machine address into pointer) are not type safe:
auto p = reinterpret_cast<Device_register>(0x800); // inherently dangerous
Enforcement
- Flag all C-style casts, including to
void
. - Flag functional style casts using
Type(value)
. UseType{value}
instead which is not narrowing. (See ES.64.) - The type profile bans
reinterpret_cast
. - The type profile warns when using
static_cast
between arithmetic types.
ES.50: Don't cast away const
Reason
It makes a lie out of const
.
If the variable is actually declared const
, modifying it results in undefined behavior.
Example, bad
void f(const int& x)
{
const_cast<int&>(x) = 42; // BAD
}
static int i = 0;
static const int j = 0;
f(i); // silent side effect
f(j); // undefined behavior
Example
Sometimes, you might be tempted to resort to const_cast
to avoid code duplication, such as when two accessor functions that differ only in const
-ness have similar implementations. For example:
class Bar;
class Foo {
public:
// BAD, duplicates logic
Bar& get_bar()
{
/* complex logic around getting a non-const reference to my_bar */
}
const Bar& get_bar() const
{
/* same complex logic around getting a const reference to my_bar */
}
private:
Bar my_bar;
};
Instead, prefer to share implementations. Normally, you can just have the non-const
function call the const
function. However, when there is complex logic this can lead to the following pattern that still resorts to a const_cast
:
class Foo {
public:
// not great, non-const calls const version but resorts to const_cast
Bar& get_bar()
{
return const_cast<Bar&>(static_cast<const Foo&>(*this).get_bar());
}
const Bar& get_bar() const
{
/* the complex logic around getting a const reference to my_bar */
}
private:
Bar my_bar;
};
Although this pattern is safe when applied correctly, because the caller must have had a non-const
object to begin with, it's not ideal because the safety is hard to enforce automatically as a checker rule.
Instead, prefer to put the common code in a common helper function -- and make it a template so that it deduces const
. This doesn't use any const_cast
at all:
class Foo {
public: // good
Bar& get_bar() { return get_bar_impl(*this); }
const Bar& get_bar() const { return get_bar_impl(*this); }
private:
Bar my_bar;
template<class T> // good, deduces whether T is const or non-const
static auto& get_bar_impl(T& t)
{ /* the complex logic around getting a possibly-const reference to my_bar */ }
};
Note: Don't do large non-dependent work inside a template, which leads to code bloat. For example, a further improvement would be if all or part of get_bar_impl
can be non-dependent and factored out into a common non-template function, for a potentially big reduction in code size.
Exception
You might need to cast away const
when calling const
-incorrect functions.
Prefer to wrap such functions in inline const
-correct wrappers to encapsulate the cast in one place.
Example
Sometimes, "cast away const
" is to allow the updating of some transient information of an otherwise immutable object.
Examples are caching, memoization, and precomputation.
Such examples are often handled as well or better using mutable
or an indirection than with a const_cast
.
Consider keeping previously computed results around for a costly operation:
int compute(int x); // compute a value for x; assume this to be costly
class Cache { // some type implementing a cache for an int->int operation
public:
pair<bool, int> find(int x) const; // is there a value for x?
void set(int x, int v); // make y the value for x
// ...
private:
// ...
};
class X {
public:
int get_val(int x)
{
auto p = cache.find(x);
if (p.first) return p.second;
int val = compute(x);
cache.set(x, val); // insert value for x
return val;
}
// ...
private:
Cache cache;
};
Here, get_val()
is logically constant, so we would like to make it a const
member.
To do this we still need to mutate cache
, so people sometimes resort to a const_cast
:
class X { // Suspicious solution based on casting
public:
int get_val(int x) const
{
auto p = cache.find(x);
if (p.first) return p.second;
int val = compute(x);
const_cast<Cache&>(cache).set(x, val); // ugly
return val;
}
// ...
private:
Cache cache;
};
Fortunately, there is a better solution:
State that cache
is mutable even for a const
object:
class X { // better solution
public:
int get_val(int x) const
{
auto p = cache.find(x);
if (p.first) return p.second;
int val = compute(x);
cache.set(x, val);
return val;
}
// ...
private:
mutable Cache cache;
};
An alternative solution would be to store a pointer to the cache
:
class X { // OK, but slightly messier solution
public:
int get_val(int x) const
{
auto p = cache->find(x);
if (p.first) return p.second;
int val = compute(x);
cache->set(x, val);
return val;
}
// ...
private:
unique_ptr<Cache> cache;
};
That solution is the most flexible, but requires explicit construction and destruction of *cache
(most likely in the constructor and destructor of X
).
In any variant, we must guard against data races on the cache
in multi-threaded code, possibly using a std::mutex
.
Enforcement
- Flag
const_cast
s. - This rule is part of the type-safety profile for the related Profile.
ES.55: Avoid the need for range checking
Reason
Constructs that cannot overflow do not overflow (and usually run faster):
Example
for (auto& x : v) // print all elements of v
cout << x << '\n';
auto p = find(v, x); // find x in v
Enforcement
Look for explicit range checks and heuristically suggest alternatives.
ES.56: Write std::move()
only when you need to explicitly move an object to another scope
Reason
We move, rather than copy, to avoid duplication and for improved performance.
A move typically leaves behind an empty object (C.64), which can be surprising or even dangerous, so we try to avoid moving from lvalues (they might be accessed later).
Notes
Moving is done implicitly when the source is an rvalue (e.g., value in a return
treatment or a function result), so don't pointlessly complicate code in those cases by writing move
explicitly. Instead, write short functions that return values, and both the function's return and the caller's accepting of the return will be optimized naturally.
In general, following the guidelines in this document (including not making variables' scopes needlessly large, writing short functions that return values, returning local variables) help eliminate most need for explicit std::move
.
Explicit move
is needed to explicitly move an object to another scope, notably to pass it to a "sink" function and in the implementations of the move operations themselves (move constructor, move assignment operator) and swap operations.
Example, bad
void sink(X&& x); // sink takes ownership of x
void user()
{
X x;
// error: cannot bind an lvalue to a rvalue reference
sink(x);
// OK: sink takes the contents of x, x must now be assumed to be empty
sink(std::move(x));
// ...
// probably a mistake
use(x);
}
Usually, a std::move()
is used as an argument to a &&
parameter.
And after you do that, assume the object has been moved from (see C.64) and don't read its state again until you first set it to a new value.
void f()
{
string s1 = "supercalifragilisticexpialidocious";
string s2 = s1; // ok, takes a copy
assert(s1 == "supercalifragilisticexpialidocious"); // ok
// bad, if you want to keep using s1's value
string s3 = move(s1);
// bad, assert will likely fail, s1 likely changed
assert(s1 == "supercalifragilisticexpialidocious");
}
Example
void sink(unique_ptr<widget> p); // pass ownership of p to sink()
void f()
{
auto w = make_unique<widget>();
// ...
sink(std::move(w)); // ok, give to sink()
// ...
sink(w); // Error: unique_ptr is carefully designed so that you cannot copy it
}
Notes
std::move()
is a cast to &&
in disguise; it doesn't itself move anything, but marks a named object as a candidate that can be moved from.
The language already knows the common cases where objects can be moved from, especially when returning values from functions, so don't complicate code with redundant std::move()
's.
Never write std::move()
just because you've heard "it's more efficient."
In general, don't believe claims of "efficiency" without data (???).
In general, don't complicate your code without reason (??).
Never write std::move()
on a const object, it is silently transformed into a copy (see Item 23 in Meyers15)
Example, bad
vector<int> make_vector()
{
vector<int> result;
// ... load result with data
return std::move(result); // bad; just write "return result;"
}
Never write return move(local_variable);
, because the language already knows the variable is a move candidate.
Writing move
in this code won't help, and can actually be detrimental because on some compilers it interferes with RVO (the return value optimization) by creating an additional reference alias to the local variable.
Example, bad
vector<int> v = std::move(make_vector()); // bad; the std::move is entirely redundant
Never write move
on a returned value such as x = move(f());
where f
returns by value.
The language already knows that a returned value is a temporary object that can be moved from.
Example
void mover(X&& x)
{
call_something(std::move(x)); // ok
call_something(std::forward<X>(x)); // bad, don't std::forward an rvalue reference
call_something(x); // suspicious, why not std::move?
}
template<class T>
void forwarder(T&& t)
{
call_something(std::move(t)); // bad, don't std::move a forwarding reference
call_something(std::forward<T>(t)); // ok
call_something(t); // suspicious, why not std::forward?
}
Enforcement
- Flag use of
std::move(x)
wherex
is an rvalue or the language will already treat it as an rvalue, includingreturn std::move(local_variable);
andstd::move(f())
on a function that returns by value. - Flag functions taking an
S&&
parameter if there is noconst S&
overload to take care of lvalues. - Flag a
std::move
d argument passed to a parameter, except when the parameter type is anX&&
rvalue reference or the type is move-only and the parameter is passed by value. - Flag when
std::move
is applied to a forwarding reference (T&&
whereT
is a template parameter type). Usestd::forward
instead. - Flag when
std::move
is applied to other than an rvalue reference to non-const. (More general case of the previous rule to cover the non-forwarding cases.) - Flag when
std::forward
is applied to an rvalue reference (X&&
whereX
is a non-template parameter type). Usestd::move
instead. - Flag when
std::forward
is applied to other than a forwarding reference. (More general case of the previous rule to cover the non-moving cases.) - Flag when an object is potentially moved from and the next operation is a
const
operation; there should first be an intervening non-const
operation, ideally assignment, to first reset the object's value.
ES.60: Avoid new
and delete
outside resource management functions
Reason
Direct resource management in application code is error-prone and tedious.
Note
This is also known as the rule of "No naked new
!"
Example, bad
void f(int n)
{
auto p = new X[n]; // n default constructed Xs
// ...
delete[] p;
}
There can be code in the ...
part that causes the delete
never to happen.
See also: R: Resource management
Enforcement
Flag naked new
s and naked delete
s.
ES.61: Delete arrays using delete[]
and non-arrays using delete
Reason
That's what the language requires and mistakes can lead to resource release errors and/or memory corruption.
Example, bad
void f(int n)
{
auto p = new X[n]; // n default constructed Xs
// ...
delete p; // error: just delete the object p, rather than delete the array p[]
}
Note
This example not only violates the no naked new
rule as in the previous example, it has many more problems.
Enforcement
- If the
new
and thedelete
are in the same scope, mistakes can be flagged. - If the
new
and thedelete
are in a constructor/destructor pair, mistakes can be flagged.
ES.62: Don't compare pointers into different arrays
Reason
The result of doing so is undefined.
Example, bad
void f()
{
int a1[7];
int a2[9];
if (&a1[5] < &a2[7]) {} // bad: undefined
if (0 < &a1[5] - &a2[7]) {} // bad: undefined
}
Note
This example has many more problems.
Enforcement
???
ES.63: Don't slice
Reason
Slicing -- that is, copying only part of an object using assignment or initialization -- most often leads to errors because the object was meant to be considered as a whole. In the rare cases where the slicing was deliberate the code can be surprising.
Example
class Shape { /* ... */ };
class Circle : public Shape { /* ... */ Point c; int r; };
Circle c {{0, 0}, 42};
Shape s {c}; // copy construct only the Shape part of Circle
s = c; // or copy assign only the Shape part of Circle
void assign(const Shape& src, Shape& dest)
{
dest = src;
}
Circle c2 {{1, 1}, 43};
assign(c, c2); // oops, not the whole state is transferred
assert(c == c2); // if we supply copying, we should also provide comparison,
// but this will likely return false
The result will be meaningless because the center and radius will not be copied from c
into s
.
The first defense against this is to define the base class Shape
not to allow this.
Alternative
If you mean to slice, define an explicit operation to do so. This saves readers from confusion. For example:
class Smiley : public Circle {
public:
Circle copy_circle();
// ...
};
Smiley sm { /* ... */ };
Circle c1 {sm}; // ideally prevented by the definition of Circle
Circle c2 {sm.copy_circle()};
Enforcement
Warn against slicing.
ES.64: Use the T{e}
notation for construction
Reason
The T{e}
construction syntax makes it explicit that construction is desired.
The T{e}
construction syntax doesn't allow narrowing.
T{e}
is the only safe and general expression for constructing a value of type T
from an expression e
.
The casts notations T(e)
and (T)e
are neither safe nor general.
Example
For built-in types, the construction notation protects against narrowing and reinterpretation
void use(char ch, int i, double d, char* p, long long lng)
{
int x1 = int{ch}; // OK, but redundant
int x2 = int{d}; // error: double->int narrowing; use a cast if you need to
int x3 = int{p}; // error: pointer to->int; use a reinterpret_cast if you really need to
int x4 = int{lng}; // error: long long->int narrowing; use a cast if you need to
int y1 = int(ch); // OK, but redundant
int y2 = int(d); // bad: double->int narrowing; use a cast if you need to
int y3 = int(p); // bad: pointer to->int; use a reinterpret_cast if you really need to
int y4 = int(lng); // bad: long long->int narrowing; use a cast if you need to
int z1 = (int)ch; // OK, but redundant
int z2 = (int)d; // bad: double->int narrowing; use a cast if you need to
int z3 = (int)p; // bad: pointer to->int; use a reinterpret_cast if you really need to
int z4 = (int)lng; // bad: long long->int narrowing; use a cast if you need to
}
The integer to/from pointer conversions are implementation defined when using the T(e)
or (T)e
notations, and non-portable
between platforms with different integer and pointer sizes.
Note
Avoid casts (explicit type conversion) and if you must prefer named casts.
Note
When unambiguous, the T
can be left out of T{e}
.
complex<double> f(complex<double>);
auto z = f({2*pi, 1});
Note
The construction notation is the most general initializer notation.
Exception
std::vector
and other containers were defined before we had {}
as a notation for construction.
Consider:
vector<string> vs {10}; // ten empty strings
vector<int> vi1 {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; // ten elements 1..10
vector<int> vi2 {10}; // one element with the value 10
How do we get a vector
of 10 default initialized int
s?
vector<int> v3(10); // ten elements with value 0
The use of ()
rather than {}
for number of elements is conventional (going back to the early 1980s), hard to change, but still
a design error: for a container where the element type can be confused with the number of elements, we have an ambiguity that
must be resolved.
The conventional resolution is to interpret {10}
as a list of one element and use (10)
to distinguish a size.
This mistake need not be repeated in new code. We can define a type to represent the number of elements:
struct Count { int n; };
template<typename T>
class Vector {
public:
Vector(Count n); // n default-initialized elements
Vector(initializer_list<T> init); // init.size() elements
// ...
};
Vector<int> v1{10};
Vector<int> v2{Count{10}};
Vector<Count> v3{Count{10}}; // yes, there is still a very minor problem
The main problem left is to find a suitable name for Count
.
Enforcement
Flag the C-style (T)e
and functional-style T(e)
casts.
ES.65: Don't dereference an invalid pointer
Reason
Dereferencing an invalid pointer, such as nullptr
, is undefined behavior, typically leading to immediate crashes,
wrong results, or memory corruption.
Note
This rule is an obvious and well-known language rule, but can be hard to follow. It takes good coding style, library support, and static analysis to eliminate violations without major overhead. This is a major part of the discussion of C++'s model for type- and resource-safety.
See also:
- Use RAII to avoid lifetime problems.
- Use unique_ptr to avoid lifetime problems.
- Use shared_ptr to avoid lifetime problems.
- Use references when
nullptr
isn't a possibility. - Use not_null to catch unexpected
nullptr
early. - Use the bounds profile to avoid range errors.
Example
void f()
{
int x = 0;
int* p = &x;
if (condition()) {
int y = 0;
p = &y;
} // invalidates p
*p = 42; // BAD, p might be invalid if the branch was taken
}
To resolve the problem, either extend the lifetime of the object the pointer is intended to refer to, or shorten the lifetime of the pointer (move the dereference to before the pointed-to object's lifetime ends).
void f1()
{
int x = 0;
int* p = &x;
int y = 0;
if (condition()) {
p = &y;
}
*p = 42; // OK, p points to x or y and both are still in scope
}
Unfortunately, most invalid pointer problems are harder to spot and harder to fix.
Example
void f(int* p)
{
int x = *p; // BAD: how do we know that p is valid?
}
There is a huge amount of such code.
Most works -- after lots of testing -- but in isolation it is impossible to tell whether p
could be the nullptr
.
Consequently, this is also a major source of errors.
There are many approaches to dealing with this potential problem:
void f1(int* p) // deal with nullptr
{
if (!p) {
// deal with nullptr (allocate, return, throw, make p point to something, whatever
}
int x = *p;
}
There are two potential problems with testing for nullptr
:
- it is not always obvious what to do if we find
nullptr
- the test can be redundant and/or relatively expensive
- it is not obvious if the test is to protect against a violation or part of the required logic.
void f2(int* p) // state that p is not supposed to be nullptr
{
assert(p);
int x = *p;
}
This would carry a cost only when the assertion checking was enabled and would give a compiler/analyzer useful information. This would work even better if/when C++ gets direct support for contracts:
void f3(int* p) // state that p is not supposed to be nullptr
[[expects: p]]
{
int x = *p;
}
Alternatively, we could use gsl::not_null
to ensure that p
is not the nullptr
.
void f(not_null<int*> p)
{
int x = *p;
}
These remedies take care of nullptr
only.
Remember that there are other ways of getting an invalid pointer.
Example
void f(int* p) // old code, doesn't use owner
{
delete p;
}
void g() // old code: uses naked new
{
auto q = new int{7};
f(q);
int x = *q; // BAD: dereferences invalid pointer
}
Example
void f()
{
vector<int> v(10);
int* p = &v[5];
v.push_back(99); // could reallocate v's elements
int x = *p; // BAD: dereferences potentially invalid pointer
}
Enforcement
This rule is part of the lifetime safety profile
- Flag a dereference of a pointer that points to an object that has gone out of scope
- Flag a dereference of a pointer that might have been invalidated by assigning a
nullptr
- Flag a dereference of a pointer that might have been invalidated by a
delete
- Flag a dereference to a pointer to a container element that might have been invalidated by dereference