콘텐츠로 이동

ES.expr

ES.expr: Expressions

Expressions manipulate values.

ES.40: Avoid complicated expressions

Reason

Complicated expressions are error-prone.

Example

// bad: assignment hidden in subexpression
while ((c = getc()) != -1)

// bad: two non-local variables assigned in sub-expressions
while ((cin >> c1, cin >> c2), c1 == c2)

// better, but possibly still too complicated
for (char c1, c2; cin >> c1 >> c2 && c1 == c2;)

// OK: if i and j are not aliased
int x = ++i + ++j;

// OK: if i != j and i != k
v[i] = v[j] + v[k];

// bad: multiple assignments "hidden" in subexpressions
x = a + (b = f()) + (c = g()) * 7;

// bad: relies on commonly misunderstood precedence rules
x = a & b + c * d && e ^ f == 7;

// bad: undefined behavior
x = x++ + x++ + ++x;

Some of these expressions are unconditionally bad (e.g., they rely on undefined behavior). Others are simply so complicated and/or unusual that even good programmers could misunderstand them or overlook a problem when in a hurry.

Note

C++17 tightens up the rules for the order of evaluation (left-to-right except right-to-left in assignments, and the order of evaluation of function arguments is unspecified; see ES.43), but that doesn't change the fact that complicated expressions are potentially confusing.

Note

A programmer should know and use the basic rules for expressions.

Example

x = k * y + z;             // OK

auto t1 = k * y;           // bad: unnecessarily verbose
x = t1 + z;

if (0 <= x && x < max)   // OK

auto t1 = 0 <= x;        // bad: unnecessarily verbose
auto t2 = x < max;
if (t1 && t2)            // ...

Enforcement

Tricky. How complicated must an expression be to be considered complicated? Writing computations as statements with one operation each is also confusing. Things to consider:

  • side effects: side effects on multiple non-local variables (for some definition of non-local) can be suspect, especially if the side effects are in separate subexpressions
  • writes to aliased variables
  • more than N operators (and what should N be?)
  • reliance of subtle precedence rules
  • uses undefined behavior (can we catch all undefined behavior?)
  • implementation defined behavior?
  • ???

ES.41: If in doubt about operator precedence, parenthesize

Reason

Avoid errors. Readability. Not everyone has the operator table memorized.

Example

const unsigned int flag = 2;
unsigned int a = flag;

if (a & flag != 0)  // bad: means a&(flag != 0)

Note: We recommend that programmers know their precedence table for the arithmetic operations, the logical operations, but consider mixing bitwise logical operations with other operators in need of parentheses.

if ((a & flag) != 0)  // OK: works as intended

Note

You should know enough not to need parentheses for:

if (a < 0 || a <= max) {
    // ...
}

Enforcement

  • Flag combinations of bitwise-logical operators and other operators.
  • Flag assignment operators not as the leftmost operator.
  • ???

ES.42: Keep use of pointers simple and straightforward

Reason

Complicated pointer manipulation is a major source of errors.

Note

Use gsl::span instead. Pointers should only refer to single objects. Pointer arithmetic is fragile and easy to get wrong, the source of many, many bad bugs and security violations. span is a bounds-checked, safe type for accessing arrays of data. Access into an array with known bounds using a constant as a subscript can be validated by the compiler.

Example, bad

void f(int* p, int count)
{
    if (count < 2) return;

    int* q = p + 1;    // BAD

    ptrdiff_t d;
    int n;
    d = (p - &n);      // OK
    d = (q - p);       // OK

    int n = *p++;      // BAD

    if (count < 6) return;

    p[4] = 1;          // BAD

    p[count - 1] = 2;  // BAD

    use(&p[0], 3);     // BAD
}

Example, good

void f(span<int> a) // BETTER: use span in the function declaration
{
    if (a.size() < 2) return;

    int n = a[0];      // OK

    span<int> q = a.subspan(1); // OK

    if (a.size() < 6) return;

    a[4] = 1;          // OK

    a[a.size() - 1] = 2;  // OK

    use(a.data(), 3);  // OK
}

Note

Subscripting with a variable is difficult for both tools and humans to validate as safe. span is a run-time bounds-checked, safe type for accessing arrays of data. at() is another alternative that ensures single accesses are bounds-checked. If iterators are needed to access an array, use the iterators from a span constructed over the array.

Example, bad

void f(array<int, 10> a, int pos)
{
    a[pos / 2] = 1; // BAD
    a[pos - 1] = 2; // BAD
    a[-1] = 3;    // BAD (but easily caught by tools) -- no replacement, just don't do this
    a[10] = 4;    // BAD (but easily caught by tools) -- no replacement, just don't do this
}

Example, good

Use a span:

void f1(span<int, 10> a, int pos) // A1: Change parameter type to use span
{
    a[pos / 2] = 1; // OK
    a[pos - 1] = 2; // OK
}

void f2(array<int, 10> arr, int pos) // A2: Add local span and use that
{
    span<int> a = {arr.data(), pos};
    a[pos / 2] = 1; // OK
    a[pos - 1] = 2; // OK
}

Use at():

void f3(array<int, 10> a, int pos) // ALTERNATIVE B: Use at() for access
{
    at(a, pos / 2) = 1; // OK
    at(a, pos - 1) = 2; // OK
}

Example, bad

void f()
{
    int arr[COUNT];
    for (int i = 0; i < COUNT; ++i)
        arr[i] = i; // BAD, cannot use non-constant indexer
}

Example, good

Use a span:

void f1()
{
    int arr[COUNT];
    span<int> av = arr;
    for (int i = 0; i < COUNT; ++i)
        av[i] = i;
}

Use a span and range-for:

void f1a()
{
     int arr[COUNT];
     span<int, COUNT> av = arr;
     int i = 0;
     for (auto& e : av)
         e = i++;
}

Use at() for access:

void f2()
{
    int arr[COUNT];
    for (int i = 0; i < COUNT; ++i)
        at(arr, i) = i;
}

Use a range-for:

void f3()
{
    int arr[COUNT];
    int i = 0;
    for (auto& e : arr)
         e = i++;
}

Note

Tooling can offer rewrites of array accesses that involve dynamic index expressions to use at() instead:

static int a[10];

void f(int i, int j)
{
    a[i + j] = 12;      // BAD, could be rewritten as ...
    at(a, i + j) = 12;  // OK -- bounds-checked
}

Example

Turning an array into a pointer (as the language does essentially always) removes opportunities for checking, so avoid it

void g(int* p);

void f()
{
    int a[5];
    g(a);        // BAD: are we trying to pass an array?
    g(&a[0]);    // OK: passing one object
}

If you want to pass an array, say so:

void g(int* p, size_t length);  // old (dangerous) code

void g1(span<int> av); // BETTER: get g() changed.

void f2()
{
    int a[5];
    span<int> av = a;

    g(av.data(), av.size());   // OK, if you have no choice
    g1(a);                     // OK -- no decay here, instead use implicit span ctor
}

Enforcement

  • Flag any arithmetic operation on an expression of pointer type that results in a value of pointer type.
  • Flag any indexing expression on an expression or variable of array type (either static array or std::array) where the indexer is not a compile-time constant expression with a value between 0 and the upper bound of the array.
  • Flag any expression that would rely on implicit conversion of an array type to a pointer type.

This rule is part of the bounds-safety profile.

ES.43: Avoid expressions with undefined order of evaluation

Reason

You have no idea what such code does. Portability. Even if it does something sensible for you, it might do something different on another compiler (e.g., the next release of your compiler) or with a different optimizer setting.

Note

C++17 tightens up the rules for the order of evaluation: left-to-right except right-to-left in assignments, and the order of evaluation of function arguments is unspecified.

However, remember that your code might be compiled with a pre-C++17 compiler (e.g., through cut-and-paste) so don't be too clever.

Example

v[i] = ++i;   //  the result is undefined

A good rule of thumb is that you should not read a value twice in an expression where you write to it.

Enforcement

Can be detected by a good analyzer.

ES.44: Don't depend on order of evaluation of function arguments

Reason

Because that order is unspecified.

Note

C++17 tightens up the rules for the order of evaluation, but the order of evaluation of function arguments is still unspecified.

Example

int i = 0;
f(++i, ++i);

Before C++17, the behavior is undefined, so the behavior could be anything (e.g., f(2, 2)). Since C++17, this code does not have undefined behavior, but it is still not specified which argument is evaluated first. The call will be f(1, 2) or f(2, 1), but you don't know which.

Example

Overloaded operators can lead to order of evaluation problems:

f1()->m(f2());          // m(f1(), f2())
cout << f1() << f2();   // operator<<(operator<<(cout, f1()), f2())

In C++17, these examples work as expected (left to right) and assignments are evaluated right to left (just as ='s binding is right-to-left)

f1() = f2();    // undefined behavior in C++14; in C++17, f2() is evaluated before f1()

Enforcement

Can be detected by a good analyzer.

ES.45: Avoid "magic constants"; use symbolic constants

Reason

Unnamed constants embedded in expressions are easily overlooked and often hard to understand:

Example

for (int m = 1; m <= 12; ++m)   // don't: magic constant 12
    cout << month[m] << '\n';

No, we don't all know that there are 12 months, numbered 1..12, in a year. Better:

// months are indexed 1..12
constexpr int first_month = 1;
constexpr int last_month = 12;

for (int m = first_month; m <= last_month; ++m)   // better
    cout << month[m] << '\n';

Better still, don't expose constants:

for (auto m : month)
    cout << m << '\n';

Enforcement

Flag literals in code. Give a pass to 0, 1, nullptr, \n, "", and others on a positive list.

ES.46: Avoid lossy (narrowing, truncating) arithmetic conversions

Reason

A narrowing conversion destroys information, often unexpectedly so.

Example, bad

A key example is basic narrowing:

double d = 7.9;
int i = d;    // bad: narrowing: i becomes 7
i = (int) d;  // bad: we're going to claim this is still not explicit enough

void f(int x, long y, double d)
{
    char c1 = x;   // bad: narrowing
    char c2 = y;   // bad: narrowing
    char c3 = d;   // bad: narrowing
}

Note

The guidelines support library offers a narrow_cast operation for specifying that narrowing is acceptable and a narrow ("narrow if") that throws an exception if a narrowing would throw away legal values:

i = gsl::narrow_cast<int>(d);   // OK (you asked for it): narrowing: i becomes 7
i = gsl::narrow<int>(d);        // OK: throws narrowing_error

We also include lossy arithmetic casts, such as from a negative floating point type to an unsigned integral type:

double d = -7.9;
unsigned u = 0;

u = d;                               // bad: narrowing
u = gsl::narrow_cast<unsigned>(d);   // OK (you asked for it): u becomes 4294967289
u = gsl::narrow<unsigned>(d);        // OK: throws narrowing_error

Note

This rule does not apply to contextual conversions to bool:

if (ptr) do_something(*ptr);   // OK: ptr is used as a condition
bool b = ptr;                  // bad: narrowing

Enforcement

A good analyzer can detect all narrowing conversions. However, flagging all narrowing conversions will lead to a lot of false positives. Suggestions:

  • Flag all floating-point to integer conversions (maybe only float->char and double->int. Here be dragons! we need data).
  • Flag all long->char (I suspect int->char is very common. Here be dragons! we need data).
  • Consider narrowing conversions for function arguments especially suspect.

ES.47: Use nullptr rather than 0 or NULL

Reason

Readability. Minimize surprises: nullptr cannot be confused with an int. nullptr also has a well-specified (very restrictive) type, and thus works in more scenarios where type deduction might do the wrong thing on NULL or 0.

Example

Consider:

void f(int);
void f(char*);
f(0);         // call f(int)
f(nullptr);   // call f(char*)

Enforcement

Flag uses of 0 and NULL for pointers. The transformation might be helped by simple program transformation.

ES.48: Avoid casts

Reason

Casts are a well-known source of errors and make some optimizations unreliable.

Example, bad

double d = 2;
auto p = (long*)&d;
auto q = (long long*)&d;
cout << d << ' ' << *p << ' ' << *q << '\n';

What would you think this fragment prints? The result is at best implementation defined. I got

2 0 4611686018427387904

Adding

*q = 666;
cout << d << ' ' << *p << ' ' << *q << '\n';

I got

3.29048e-321 666 666

Surprised? I'm just glad I didn't crash the program.

Note

Programmers who write casts typically assume that they know what they are doing, or that writing a cast makes the program "easier to read". In fact, they often disable the general rules for using values. Overload resolution and template instantiation usually pick the right function if there is a right function to pick. If there is not, maybe there ought to be, rather than applying a local fix (cast).

Notes

Casts are necessary in a systems programming language. For example, how else would we get the address of a device register into a pointer? However, casts are seriously overused as well as a major source of errors.

If you feel the need for a lot of casts, there might be a fundamental design problem.

The type profile bans reinterpret_cast and C-style casts.

Never cast to (void) to ignore a [[nodiscard]]return value. If you deliberately want to discard such a result, first think hard about whether that is really a good idea (there is usually a good reason the author of the function or of the return type used [[nodiscard]] in the first place). If you still think it's appropriate and your code reviewer agrees, use std::ignore = to turn off the warning which is simple, portable, and easy to grep.

Alternatives

Casts are widely (mis)used. Modern C++ has rules and constructs that eliminate the need for casts in many contexts, such as

  • Use templates
  • Use std::variant
  • Rely on the well-defined, safe, implicit conversions between pointer types
  • Use std::ignore = to ignore [[nodiscard]] values.

Enforcement

  • Flag all C-style casts, including to void.
  • Flag functional style casts using Type(value). Use Type{value} instead which is not narrowing. (See ES.64.)
  • Flag identity casts between pointer types, where the source and target types are the same (#Pro-type-identitycast).
  • Flag an explicit pointer cast that could be implicit.

ES.49: If you must use a cast, use a named cast

Reason

Readability. Error avoidance. Named casts are more specific than a C-style or functional cast, allowing the compiler to catch some errors.

The named casts are:

  • static_cast
  • const_cast
  • reinterpret_cast
  • dynamic_cast
  • std::move // move(x) is an rvalue reference to x
  • std::forward // forward<T>(x) is an rvalue or an lvalue reference to x depending on T
  • gsl::narrow_cast // narrow_cast<T>(x) is static_cast<T>(x)
  • gsl::narrow // narrow<T>(x) is static_cast<T>(x) if static_cast<T>(x) == x or it throws narrowing_error

Example

class B { /* ... */ };
class D { /* ... */ };

template<typename D> D* upcast(B* pb)
{
    D* pd0 = pb;                        // error: no implicit conversion from B* to D*
    D* pd1 = (D*)pb;                    // legal, but what is done?
    D* pd2 = static_cast<D*>(pb);       // error: D is not derived from B
    D* pd3 = reinterpret_cast<D*>(pb);  // OK: on your head be it!
    D* pd4 = dynamic_cast<D*>(pb);      // OK: return nullptr
    // ...
}

The example was synthesized from real-world bugs where D used to be derived from B, but someone refactored the hierarchy. The C-style cast is dangerous because it can do any kind of conversion, depriving us of any protection from mistakes (now or in the future).

Note

When converting between types with no information loss (e.g. from float to double or from int32 to int64), brace initialization might be used instead.

double d {some_float};
int64_t i {some_int32};

This makes it clear that the type conversion was intended and also prevents conversions between types that might result in loss of precision. (It is a compilation error to try to initialize a float from a double in this fashion, for example.)

Note

reinterpret_cast can be essential, but the essential uses (e.g., turning a machine address into pointer) are not type safe:

auto p = reinterpret_cast<Device_register>(0x800);  // inherently dangerous

Enforcement

  • Flag all C-style casts, including to void.
  • Flag functional style casts using Type(value). Use Type{value} instead which is not narrowing. (See ES.64.)
  • The type profile bans reinterpret_cast.
  • The type profile warns when using static_cast between arithmetic types.

ES.50: Don't cast away const

Reason

It makes a lie out of const. If the variable is actually declared const, modifying it results in undefined behavior.

Example, bad

void f(const int& x)
{
    const_cast<int&>(x) = 42;   // BAD
}

static int i = 0;
static const int j = 0;

f(i); // silent side effect
f(j); // undefined behavior

Example

Sometimes, you might be tempted to resort to const_cast to avoid code duplication, such as when two accessor functions that differ only in const-ness have similar implementations. For example:

class Bar;

class Foo {
public:
    // BAD, duplicates logic
    Bar& get_bar()
    {
        /* complex logic around getting a non-const reference to my_bar */
    }

    const Bar& get_bar() const
    {
        /* same complex logic around getting a const reference to my_bar */
    }
private:
    Bar my_bar;
};

Instead, prefer to share implementations. Normally, you can just have the non-const function call the const function. However, when there is complex logic this can lead to the following pattern that still resorts to a const_cast:

class Foo {
public:
    // not great, non-const calls const version but resorts to const_cast
    Bar& get_bar()
    {
        return const_cast<Bar&>(static_cast<const Foo&>(*this).get_bar());
    }
    const Bar& get_bar() const
    {
        /* the complex logic around getting a const reference to my_bar */
    }
private:
    Bar my_bar;
};

Although this pattern is safe when applied correctly, because the caller must have had a non-const object to begin with, it's not ideal because the safety is hard to enforce automatically as a checker rule.

Instead, prefer to put the common code in a common helper function -- and make it a template so that it deduces const. This doesn't use any const_cast at all:

class Foo {
public:                         // good
          Bar& get_bar()       { return get_bar_impl(*this); }
    const Bar& get_bar() const { return get_bar_impl(*this); }
private:
    Bar my_bar;

    template<class T>           // good, deduces whether T is const or non-const
    static auto& get_bar_impl(T& t)
        { /* the complex logic around getting a possibly-const reference to my_bar */ }
};

Note: Don't do large non-dependent work inside a template, which leads to code bloat. For example, a further improvement would be if all or part of get_bar_impl can be non-dependent and factored out into a common non-template function, for a potentially big reduction in code size.

Exception

You might need to cast away const when calling const-incorrect functions. Prefer to wrap such functions in inline const-correct wrappers to encapsulate the cast in one place.

Example

Sometimes, "cast away const" is to allow the updating of some transient information of an otherwise immutable object. Examples are caching, memoization, and precomputation. Such examples are often handled as well or better using mutable or an indirection than with a const_cast.

Consider keeping previously computed results around for a costly operation:

int compute(int x); // compute a value for x; assume this to be costly

class Cache {   // some type implementing a cache for an int->int operation
public:
    pair<bool, int> find(int x) const;   // is there a value for x?
    void set(int x, int v);             // make y the value for x
    // ...
private:
    // ...
};

class X {
public:
    int get_val(int x)
    {
        auto p = cache.find(x);
        if (p.first) return p.second;
        int val = compute(x);
        cache.set(x, val); // insert value for x
        return val;
    }
    // ...
private:
    Cache cache;
};

Here, get_val() is logically constant, so we would like to make it a const member. To do this we still need to mutate cache, so people sometimes resort to a const_cast:

class X {   // Suspicious solution based on casting
public:
    int get_val(int x) const
    {
        auto p = cache.find(x);
        if (p.first) return p.second;
        int val = compute(x);
        const_cast<Cache&>(cache).set(x, val);   // ugly
        return val;
    }
    // ...
private:
    Cache cache;
};

Fortunately, there is a better solution: State that cache is mutable even for a const object:

class X {   // better solution
public:
    int get_val(int x) const
    {
        auto p = cache.find(x);
        if (p.first) return p.second;
        int val = compute(x);
        cache.set(x, val);
        return val;
    }
    // ...
private:
    mutable Cache cache;
};

An alternative solution would be to store a pointer to the cache:

class X {   // OK, but slightly messier solution
public:
    int get_val(int x) const
    {
        auto p = cache->find(x);
        if (p.first) return p.second;
        int val = compute(x);
        cache->set(x, val);
        return val;
    }
    // ...
private:
    unique_ptr<Cache> cache;
};

That solution is the most flexible, but requires explicit construction and destruction of *cache (most likely in the constructor and destructor of X).

In any variant, we must guard against data races on the cache in multi-threaded code, possibly using a std::mutex.

Enforcement

ES.55: Avoid the need for range checking

Reason

Constructs that cannot overflow do not overflow (and usually run faster):

Example

for (auto& x : v)      // print all elements of v
    cout << x << '\n';

auto p = find(v, x);   // find x in v

Enforcement

Look for explicit range checks and heuristically suggest alternatives.

ES.56: Write std::move() only when you need to explicitly move an object to another scope

Reason

We move, rather than copy, to avoid duplication and for improved performance.

A move typically leaves behind an empty object (C.64), which can be surprising or even dangerous, so we try to avoid moving from lvalues (they might be accessed later).

Notes

Moving is done implicitly when the source is an rvalue (e.g., value in a return treatment or a function result), so don't pointlessly complicate code in those cases by writing move explicitly. Instead, write short functions that return values, and both the function's return and the caller's accepting of the return will be optimized naturally.

In general, following the guidelines in this document (including not making variables' scopes needlessly large, writing short functions that return values, returning local variables) help eliminate most need for explicit std::move.

Explicit move is needed to explicitly move an object to another scope, notably to pass it to a "sink" function and in the implementations of the move operations themselves (move constructor, move assignment operator) and swap operations.

Example, bad

void sink(X&& x);   // sink takes ownership of x

void user()
{
    X x;
    // error: cannot bind an lvalue to a rvalue reference
    sink(x);
    // OK: sink takes the contents of x, x must now be assumed to be empty
    sink(std::move(x));

    // ...

    // probably a mistake
    use(x);
}

Usually, a std::move() is used as an argument to a && parameter. And after you do that, assume the object has been moved from (see C.64) and don't read its state again until you first set it to a new value.

void f()
{
    string s1 = "supercalifragilisticexpialidocious";

    string s2 = s1;             // ok, takes a copy
    assert(s1 == "supercalifragilisticexpialidocious");  // ok

    // bad, if you want to keep using s1's value
    string s3 = move(s1);

    // bad, assert will likely fail, s1 likely changed
    assert(s1 == "supercalifragilisticexpialidocious");
}

Example

void sink(unique_ptr<widget> p);  // pass ownership of p to sink()

void f()
{
    auto w = make_unique<widget>();
    // ...
    sink(std::move(w));               // ok, give to sink()
    // ...
    sink(w);    // Error: unique_ptr is carefully designed so that you cannot copy it
}

Notes

std::move() is a cast to && in disguise; it doesn't itself move anything, but marks a named object as a candidate that can be moved from. The language already knows the common cases where objects can be moved from, especially when returning values from functions, so don't complicate code with redundant std::move()'s.

Never write std::move() just because you've heard "it's more efficient." In general, don't believe claims of "efficiency" without data (???). In general, don't complicate your code without reason (??). Never write std::move() on a const object, it is silently transformed into a copy (see Item 23 in Meyers15)

Example, bad

vector<int> make_vector()
{
    vector<int> result;
    // ... load result with data
    return std::move(result);       // bad; just write "return result;"
}

Never write return move(local_variable);, because the language already knows the variable is a move candidate. Writing move in this code won't help, and can actually be detrimental because on some compilers it interferes with RVO (the return value optimization) by creating an additional reference alias to the local variable.

Example, bad

vector<int> v = std::move(make_vector());   // bad; the std::move is entirely redundant

Never write move on a returned value such as x = move(f()); where f returns by value. The language already knows that a returned value is a temporary object that can be moved from.

Example

void mover(X&& x)
{
    call_something(std::move(x));         // ok
    call_something(std::forward<X>(x));   // bad, don't std::forward an rvalue reference
    call_something(x);                    // suspicious, why not std::move?
}

template<class T>
void forwarder(T&& t)
{
    call_something(std::move(t));         // bad, don't std::move a forwarding reference
    call_something(std::forward<T>(t));   // ok
    call_something(t);                    // suspicious, why not std::forward?
}

Enforcement

  • Flag use of std::move(x) where x is an rvalue or the language will already treat it as an rvalue, including return std::move(local_variable); and std::move(f()) on a function that returns by value.
  • Flag functions taking an S&& parameter if there is no const S& overload to take care of lvalues.
  • Flag a std::moved argument passed to a parameter, except when the parameter type is an X&& rvalue reference or the type is move-only and the parameter is passed by value.
  • Flag when std::move is applied to a forwarding reference (T&& where T is a template parameter type). Use std::forward instead.
  • Flag when std::move is applied to other than an rvalue reference to non-const. (More general case of the previous rule to cover the non-forwarding cases.)
  • Flag when std::forward is applied to an rvalue reference (X&& where X is a non-template parameter type). Use std::move instead.
  • Flag when std::forward is applied to other than a forwarding reference. (More general case of the previous rule to cover the non-moving cases.)
  • Flag when an object is potentially moved from and the next operation is a const operation; there should first be an intervening non-const operation, ideally assignment, to first reset the object's value.

ES.60: Avoid new and delete outside resource management functions

Reason

Direct resource management in application code is error-prone and tedious.

Note

This is also known as the rule of "No naked new!"

Example, bad

void f(int n)
{
    auto p = new X[n];   // n default constructed Xs
    // ...
    delete[] p;
}

There can be code in the ... part that causes the delete never to happen.

See also: R: Resource management

Enforcement

Flag naked news and naked deletes.

ES.61: Delete arrays using delete[] and non-arrays using delete

Reason

That's what the language requires and mistakes can lead to resource release errors and/or memory corruption.

Example, bad

void f(int n)
{
    auto p = new X[n];   // n default constructed Xs
    // ...
    delete p;   // error: just delete the object p, rather than delete the array p[]
}

Note

This example not only violates the no naked new rule as in the previous example, it has many more problems.

Enforcement

  • If the new and the delete are in the same scope, mistakes can be flagged.
  • If the new and the delete are in a constructor/destructor pair, mistakes can be flagged.

ES.62: Don't compare pointers into different arrays

Reason

The result of doing so is undefined.

Example, bad

void f()
{
    int a1[7];
    int a2[9];
    if (&a1[5] < &a2[7]) {}       // bad: undefined
    if (0 < &a1[5] - &a2[7]) {}   // bad: undefined
}

Note

This example has many more problems.

Enforcement

???

ES.63: Don't slice

Reason

Slicing -- that is, copying only part of an object using assignment or initialization -- most often leads to errors because the object was meant to be considered as a whole. In the rare cases where the slicing was deliberate the code can be surprising.

Example

class Shape { /* ... */ };
class Circle : public Shape { /* ... */ Point c; int r; };

Circle c {{0, 0}, 42};
Shape s {c};    // copy construct only the Shape part of Circle
s = c;          // or copy assign only the Shape part of Circle

void assign(const Shape& src, Shape& dest)
{
    dest = src;
}
Circle c2 {{1, 1}, 43};
assign(c, c2);   // oops, not the whole state is transferred
assert(c == c2); // if we supply copying, we should also provide comparison,
                 // but this will likely return false

The result will be meaningless because the center and radius will not be copied from c into s. The first defense against this is to define the base class Shape not to allow this.

Alternative

If you mean to slice, define an explicit operation to do so. This saves readers from confusion. For example:

class Smiley : public Circle {
    public:
    Circle copy_circle();
    // ...
};

Smiley sm { /* ... */ };
Circle c1 {sm};  // ideally prevented by the definition of Circle
Circle c2 {sm.copy_circle()};

Enforcement

Warn against slicing.

ES.64: Use the T{e}notation for construction

Reason

The T{e} construction syntax makes it explicit that construction is desired. The T{e} construction syntax doesn't allow narrowing. T{e} is the only safe and general expression for constructing a value of type T from an expression e. The casts notations T(e) and (T)e are neither safe nor general.

Example

For built-in types, the construction notation protects against narrowing and reinterpretation

void use(char ch, int i, double d, char* p, long long lng)
{
    int x1 = int{ch};     // OK, but redundant
    int x2 = int{d};      // error: double->int narrowing; use a cast if you need to
    int x3 = int{p};      // error: pointer to->int; use a reinterpret_cast if you really need to
    int x4 = int{lng};    // error: long long->int narrowing; use a cast if you need to

    int y1 = int(ch);     // OK, but redundant
    int y2 = int(d);      // bad: double->int narrowing; use a cast if you need to
    int y3 = int(p);      // bad: pointer to->int; use a reinterpret_cast if you really need to
    int y4 = int(lng);    // bad: long long->int narrowing; use a cast if you need to

    int z1 = (int)ch;     // OK, but redundant
    int z2 = (int)d;      // bad: double->int narrowing; use a cast if you need to
    int z3 = (int)p;      // bad: pointer to->int; use a reinterpret_cast if you really need to
    int z4 = (int)lng;    // bad: long long->int narrowing; use a cast if you need to
}

The integer to/from pointer conversions are implementation defined when using the T(e) or (T)e notations, and non-portable between platforms with different integer and pointer sizes.

Note

Avoid casts (explicit type conversion) and if you must prefer named casts.

Note

When unambiguous, the T can be left out of T{e}.

complex<double> f(complex<double>);

auto z = f({2*pi, 1});

Note

The construction notation is the most general initializer notation.

Exception

std::vector and other containers were defined before we had {} as a notation for construction. Consider:

vector<string> vs {10};                           // ten empty strings
vector<int> vi1 {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};  // ten elements 1..10
vector<int> vi2 {10};                             // one element with the value 10

How do we get a vector of 10 default initialized ints?

vector<int> v3(10); // ten elements with value 0

The use of () rather than {} for number of elements is conventional (going back to the early 1980s), hard to change, but still a design error: for a container where the element type can be confused with the number of elements, we have an ambiguity that must be resolved. The conventional resolution is to interpret {10} as a list of one element and use (10) to distinguish a size.

This mistake need not be repeated in new code. We can define a type to represent the number of elements:

struct Count { int n; };

template<typename T>
class Vector {
public:
    Vector(Count n);                     // n default-initialized elements
    Vector(initializer_list<T> init);    // init.size() elements
    // ...
};

Vector<int> v1{10};
Vector<int> v2{Count{10}};
Vector<Count> v3{Count{10}};    // yes, there is still a very minor problem

The main problem left is to find a suitable name for Count.

Enforcement

Flag the C-style (T)e and functional-style T(e) casts.

ES.65: Don't dereference an invalid pointer

Reason

Dereferencing an invalid pointer, such as nullptr, is undefined behavior, typically leading to immediate crashes, wrong results, or memory corruption.

Note

This rule is an obvious and well-known language rule, but can be hard to follow. It takes good coding style, library support, and static analysis to eliminate violations without major overhead. This is a major part of the discussion of C++'s model for type- and resource-safety.

See also:

Example

void f()
{
    int x = 0;
    int* p = &x;

    if (condition()) {
        int y = 0;
        p = &y;
    } // invalidates p

    *p = 42;            // BAD, p might be invalid if the branch was taken
}

To resolve the problem, either extend the lifetime of the object the pointer is intended to refer to, or shorten the lifetime of the pointer (move the dereference to before the pointed-to object's lifetime ends).

void f1()
{
    int x = 0;
    int* p = &x;

    int y = 0;
    if (condition()) {
        p = &y;
    }

    *p = 42;            // OK, p points to x or y and both are still in scope
}

Unfortunately, most invalid pointer problems are harder to spot and harder to fix.

Example

void f(int* p)
{
    int x = *p; // BAD: how do we know that p is valid?
}

There is a huge amount of such code. Most works -- after lots of testing -- but in isolation it is impossible to tell whether p could be the nullptr. Consequently, this is also a major source of errors. There are many approaches to dealing with this potential problem:

void f1(int* p) // deal with nullptr
{
    if (!p) {
        // deal with nullptr (allocate, return, throw, make p point to something, whatever
    }
    int x = *p;
}

There are two potential problems with testing for nullptr:

  • it is not always obvious what to do if we find nullptr
  • the test can be redundant and/or relatively expensive
  • it is not obvious if the test is to protect against a violation or part of the required logic.
void f2(int* p) // state that p is not supposed to be nullptr
{
    assert(p);
    int x = *p;
}

This would carry a cost only when the assertion checking was enabled and would give a compiler/analyzer useful information. This would work even better if/when C++ gets direct support for contracts:

void f3(int* p) // state that p is not supposed to be nullptr
    [[expects: p]]
{
    int x = *p;
}

Alternatively, we could use gsl::not_null to ensure that p is not the nullptr.

void f(not_null<int*> p)
{
    int x = *p;
}

These remedies take care of nullptr only. Remember that there are other ways of getting an invalid pointer.

Example

void f(int* p)  // old code, doesn't use owner
{
    delete p;
}

void g()        // old code: uses naked new
{
    auto q = new int{7};
    f(q);
    int x = *q; // BAD: dereferences invalid pointer
}

Example

void f()
{
    vector<int> v(10);
    int* p = &v[5];
    v.push_back(99); // could reallocate v's elements
    int x = *p; // BAD: dereferences potentially invalid pointer
}

Enforcement

This rule is part of the lifetime safety profile

  • Flag a dereference of a pointer that points to an object that has gone out of scope
  • Flag a dereference of a pointer that might have been invalidated by assigning a nullptr
  • Flag a dereference of a pointer that might have been invalidated by a delete
  • Flag a dereference to a pointer to a container element that might have been invalidated by dereference