콘텐츠로 이동

F.def

F.def: Function definitions

A function definition is a function declaration that also specifies the function's implementation, the function body.

F.1: "Package" meaningful operations as carefully named functions

Reason

Factoring out common code makes code more readable, more likely to be reused, and limit errors from complex code. If something is a well-specified action, separate it out from its surrounding code and give it a name.

Example, don't

void read_and_print(istream& is)    // read and print an int
{
    int x;
    if (is >> x)
        cout << "the int is " << x << '\n';
    else
        cerr << "no int on input\n";
}

Almost everything is wrong with read_and_print. It reads, it writes (to a fixed ostream), it writes error messages (to a fixed ostream), it handles only ints. There is nothing to reuse, logically separate operations are intermingled and local variables are in scope after the end of their logical use. For a tiny example, this looks OK, but if the input operation, the output operation, and the error handling had been more complicated the tangled mess could become hard to understand.

Note

If you write a non-trivial lambda that potentially can be used in more than one place, give it a name by assigning it to a (usually non-local) variable.

Example

sort(a, b, [](T x, T y) { return x.rank() < y.rank() && x.value() < y.value(); });

Naming that lambda breaks up the expression into its logical parts and provides a strong hint to the meaning of the lambda.

auto lessT = [](T x, T y) { return x.rank() < y.rank() && x.value() < y.value(); };

sort(a, b, lessT);

The shortest code is not always the best for performance or maintainability.

Exception

Loop bodies, including lambdas used as loop bodies, rarely need to be named. However, large loop bodies (e.g., dozens of lines or dozens of pages) can be a problem. The rule Keep functions short and simple implies "Keep loop bodies short." Similarly, lambdas used as callback arguments are sometimes non-trivial, yet unlikely to be reusable.

Enforcement

F.2: A function should perform a single logical operation

Reason

A function that performs a single operation is simpler to understand, test, and reuse.

Example

Consider:

void read_and_print()    // bad
{
    int x;
    cin >> x;
    // check for errors
    cout << x << "\n";
}

This is a monolith that is tied to a specific input and will never find another (different) use. Instead, break functions up into suitable logical parts and parameterize:

int read(istream& is)    // better
{
    int x;
    is >> x;
    // check for errors
    return x;
}

void print(ostream& os, int x)
{
    os << x << "\n";
}

These can now be combined where needed:

void read_and_print()
{
    auto x = read(cin);
    print(cout, x);
}

If there was a need, we could further templatize read() and print() on the data type, the I/O mechanism, the response to errors, etc. Example:

auto read = [](auto& input, auto& value)    // better
{
    input >> value;
    // check for errors
};

void print(auto& output, const auto& value)
{
    output << value << "\n";
}

Enforcement

  • Consider functions with more than one "out" parameter suspicious. Use return values instead, including tuple for multiple return values.
  • Consider "large" functions that don't fit on one editor screen suspicious. Consider factoring such a function into smaller well-named suboperations.
  • Consider functions with 7 or more parameters suspicious.

F.3: Keep functions short and simple

Reason

Large functions are hard to read, more likely to contain complex code, and more likely to have variables in larger than minimal scopes. Functions with complex control structures are more likely to be long and more likely to hide logical errors

Example

Consider:

double simple_func(double val, int flag1, int flag2)
    // simple_func: takes a value and calculates the expected ASIC output,
    // given the two mode flags.
{
    double intermediate;
    if (flag1 > 0) {
        intermediate = func1(val);
        if (flag2 % 2)
             intermediate = sqrt(intermediate);
    }
    else if (flag1 == -1) {
        intermediate = func1(-val);
        if (flag2 % 2)
             intermediate = sqrt(-intermediate);
        flag1 = -flag1;
    }
    if (abs(flag2) > 10) {
        intermediate = func2(intermediate);
    }
    switch (flag2 / 10) {
    case 1: if (flag1 == -1) return finalize(intermediate, 1.171);
            break;
    case 2: return finalize(intermediate, 13.1);
    default: break;
    }
    return finalize(intermediate, 0.);
}

This is too complex. How would you know if all possible alternatives have been correctly handled? Yes, it breaks other rules also.

We can refactor:

double func1_muon(double val, int flag)
{
    // ???
}

double func1_tau(double val, int flag1, int flag2)
{
    // ???
}

double simple_func(double val, int flag1, int flag2)
    // simple_func: takes a value and calculates the expected ASIC output,
    // given the two mode flags.
{
    if (flag1 > 0)
        return func1_muon(val, flag2);
    if (flag1 == -1)
        // handled by func1_tau: flag1 = -flag1;
        return func1_tau(-val, flag1, flag2);
    return 0.;
}

Note

"It doesn't fit on a screen" is often a good practical definition of "far too large." One-to-five-line functions should be considered normal.

Note

Break large functions up into smaller cohesive and named functions. Small simple functions are easily inlined where the cost of a function call is significant.

Enforcement

  • Flag functions that do not "fit on a screen." How big is a screen? Try 60 lines by 140 characters; that's roughly the maximum that's comfortable for a book page.
  • Flag functions that are too complex. How complex is too complex? You could use cyclomatic complexity. Try "more than 10 logical paths through." Count a simple switch as one path.

F.4: If a function might have to be evaluated at compile time, declare it constexpr

Reason

constexpr is needed to tell the compiler to allow compile-time evaluation.

Example

The (in)famous factorial:

constexpr int fac(int n)
{
    constexpr int max_exp = 17;      // constexpr enables max_exp to be used in Expects
    Expects(0 <= n && n < max_exp);  // prevent silliness and overflow
    int x = 1;
    for (int i = 2; i <= n; ++i) x *= i;
    return x;
}

This is C++14. For C++11, use a recursive formulation of fac().

Note

constexpr does not guarantee compile-time evaluation; it just guarantees that the function can be evaluated at compile time for constant expression arguments if the programmer requires it or the compiler decides to do so to optimize.

constexpr int min(int x, int y) { return x < y ? x : y; }

void test(int v)
{
    int m1 = min(-1, 2);            // probably compile-time evaluation
    constexpr int m2 = min(-1, 2);  // compile-time evaluation
    int m3 = min(-1, v);            // run-time evaluation
    constexpr int m4 = min(-1, v);  // error: cannot evaluate at compile time
}

Note

Don't try to make all functions constexpr. Most computation is best done at run time.

Note

Any API that might eventually depend on high-level run-time configuration or business logic should not be made constexpr. Such customization can not be evaluated by the compiler, and any constexpr functions that depended upon that API would have to be refactored or drop constexpr.

Enforcement

Impossible and unnecessary. The compiler gives an error if a non-constexpr function is called where a constant is required.

F.5: If a function is very small and time-critical, declare it inline

Reason

Some optimizers are good at inlining without hints from the programmer, but don't rely on it. Measure! Over the last 40 years or so, we have been promised compilers that can inline better than humans without hints from humans. We are still waiting. Specifying inline (explicitly, or implicitly when writing member functions inside a class definition) encourages the compiler to do a better job.

Example

inline string cat(const string& s, const string& s2) { return s + s2; }

Exception

Do not put an inline function in what is meant to be a stable interface unless you are certain that it will not change. An inline function is part of the ABI.

Note

constexpr implies inline.

Note

Member functions defined in-class are inline by default.

Exception

Function templates (including member functions of class templates A<T>::function() and member function templates A::function<T>()) are normally defined in headers and therefore inline.

Enforcement

Flag inline functions that are more than three statements and could have been declared out of line (such as class member functions).

F.6: If your function must not throw, declare it noexcept

Reason

If an exception is not supposed to be thrown, the program cannot be assumed to cope with the error and should be terminated as soon as possible. Declaring a function noexcept helps optimizers by reducing the number of alternative execution paths. It also speeds up the exit after failure.

Example

Put noexcept on every function written completely in C or in any other language without exceptions. The C++ Standard Library does that implicitly for all functions in the C Standard Library.

Note

constexpr functions can throw when evaluated at run time, so you might need conditional noexcept for some of those.

Example

You can use noexcept even on functions that can throw:

vector<string> collect(istream& is) noexcept
{
    vector<string> res;
    for (string s; is >> s;)
        res.push_back(s);
    return res;
}

If collect() runs out of memory, the program crashes. Unless the program is crafted to survive memory exhaustion, that might be just the right thing to do; terminate() might generate suitable error log information (but after memory runs out it is hard to do anything clever).

Note

You must be aware of the execution environment that your code is running when deciding whether to tag a function noexcept, especially because of the issue of throwing and allocation. Code that is intended to be perfectly general (like the standard library and other utility code of that sort) needs to support environments where a bad_alloc exception could be handled meaningfully. However, most programs and execution environments cannot meaningfully handle a failure to allocate, and aborting the program is the cleanest and simplest response to an allocation failure in those cases. If you know that your application code cannot respond to an allocation failure, it could be appropriate to add noexcept even on functions that allocate.

Put another way: In most programs, most functions can throw (e.g., because they use new, call functions that do, or use library functions that reports failure by throwing), so don't just sprinkle noexcept all over the place without considering whether the possible exceptions can be handled.

noexcept is most useful (and most clearly correct) for frequently used, low-level functions.

Note

Destructors, swap functions, move operations, and default constructors should never throw. See also C.44.

Enforcement

  • Flag functions that are not noexcept, yet cannot throw.
  • Flag throwing swap, move, destructors, and default constructors.

F.7: For general use, take T* or T& arguments rather than smart pointers

Reason

Passing a smart pointer transfers or shares ownership and should only be used when ownership semantics are intended. A function that does not manipulate lifetime should take raw pointers or references instead.

Passing by smart pointer restricts the use of a function to callers that use smart pointers. A function that needs a widget should be able to accept any widget object, not just ones whose lifetimes are managed by a particular kind of smart pointer.

Passing a shared smart pointer (e.g., std::shared_ptr) implies a run-time cost.

Example

// accepts any int*
void f(int*);

// can only accept ints for which you want to transfer ownership
void g(unique_ptr<int>);

// can only accept ints for which you are willing to share ownership
void g(shared_ptr<int>);

// doesn't change ownership, but requires a particular ownership of the caller
void h(const unique_ptr<int>&);

// accepts any int
void h(int&);

Example, bad

// callee
void f(shared_ptr<widget>& w)
{
    // ...
    use(*w); // only use of w -- the lifetime is not used at all
    // ...
};

// caller
shared_ptr<widget> my_widget = /* ... */;
f(my_widget);

widget stack_widget;
f(stack_widget); // error

Example, good

// callee
void f(widget& w)
{
    // ...
    use(w);
    // ...
};

// caller
shared_ptr<widget> my_widget = /* ... */;
f(*my_widget);

widget stack_widget;
f(stack_widget); // ok -- now this works

Note

We can catch many common cases of dangling pointers statically (see lifetime safety profile). Function arguments naturally live for the lifetime of the function call, and so have fewer lifetime problems.

Enforcement

  • (Simple) Warn if a function takes a parameter of a smart pointer type (that overloads operator-> or operator*) that is copyable but the function only calls any of: operator*, operator-> or get(). Suggest using a T* or T& instead.
  • Flag a parameter of a smart pointer type (a type that overloads operator-> or operator*) that is copyable/movable but never copied/moved from in the function body, and that is never modified, and that is not passed along to another function that could do so. That means the ownership semantics are not used. Suggest using a T* or T& instead.

See also:

F.8: Prefer pure functions

Reason

Pure functions are easier to reason about, sometimes easier to optimize (and even parallelize), and sometimes can be memoized.

Example

template<class T>
auto square(T t) { return t * t; }

Enforcement

Not possible.

F.9: Unused parameters should be unnamed

Reason

Readability. Suppression of unused parameter warnings.

Example

widget* find(const set<widget>& s, const widget& w, Hint);   // once upon a time, a hint was used

Note

Allowing parameters to be unnamed was introduced in the early 1980s to address this problem.

If parameters are conditionally unused, declare them with the [[maybe_unused]] attribute. For example:

template <typename Value>
Value* find(const set<Value>& s, const Value& v, [[maybe_unused]] Hint h)
{
    if constexpr (sizeof(Value) > CacheSize)
    {
        // a hint is used only if Value is of a certain size
    }
}

Enforcement

Flag named unused parameters.

F.10: If an operation can be reused, give it a name

Reason

Documentation, readability, opportunity for reuse.

Example

struct Rec {
    string name;
    string addr;
    int id;         // unique identifier
};

bool same(const Rec& a, const Rec& b)
{
    return a.id == b.id;
}

vector<Rec*> find_id(const string& name);    // find all records for "name"

auto x = find_if(vr.begin(), vr.end(),
    [&](Rec& r) {
        if (r.name.size() != n.size()) return false; // name to compare to is in n
        for (int i = 0; i < r.name.size(); ++i)
            if (tolower(r.name[i]) != tolower(n[i])) return false;
        return true;
    }
);

There is a useful function lurking here (case insensitive string comparison), as there often is when lambda arguments get large.

bool compare_insensitive(const string& a, const string& b)
{
    if (a.size() != b.size()) return false;
    for (int i = 0; i < a.size(); ++i) if (tolower(a[i]) != tolower(b[i])) return false;
    return true;
}

auto x = find_if(vr.begin(), vr.end(),
    [&](Rec& r) { return compare_insensitive(r.name, n); }
);

Or maybe (if you prefer to avoid the implicit name binding to n):

auto cmp_to_n = [&n](const string& a) { return compare_insensitive(a, n); };

auto x = find_if(vr.begin(), vr.end(),
    [](const Rec& r) { return cmp_to_n(r.name); }
);

Note

whether functions, lambdas, or operators.

Exception

  • Lambdas logically used only locally, such as an argument to for_each and similar control flow algorithms.
  • Lambdas as initializers

Enforcement

  • (hard) flag similar lambdas
  • ???

F.11: Use an unnamed lambda if you need a simple function object in one place only

Reason

That makes the code concise and gives better locality than alternatives.

Example

auto earlyUsersEnd = std::remove_if(users.begin(), users.end(),
                                    [](const User &a) { return a.id > 100; });

Exception

Naming a lambda can be useful for clarity even if it is used only once.

Enforcement

  • Look for identical and near identical lambdas (to be replaced with named functions or named lambdas).