F.def
F.def: Function definitions
A function definition is a function declaration that also specifies the function's implementation, the function body.
F.1: "Package" meaningful operations as carefully named functions
Reason
Factoring out common code makes code more readable, more likely to be reused, and limit errors from complex code. If something is a well-specified action, separate it out from its surrounding code and give it a name.
Example, don't
void read_and_print(istream& is) // read and print an int
{
int x;
if (is >> x)
cout << "the int is " << x << '\n';
else
cerr << "no int on input\n";
}
Almost everything is wrong with read_and_print
.
It reads, it writes (to a fixed ostream
), it writes error messages (to a fixed ostream
), it handles only int
s.
There is nothing to reuse, logically separate operations are intermingled and local variables are in scope after the end of their logical use.
For a tiny example, this looks OK, but if the input operation, the output operation, and the error handling had been more complicated the tangled
mess could become hard to understand.
Note
If you write a non-trivial lambda that potentially can be used in more than one place, give it a name by assigning it to a (usually non-local) variable.
Example
sort(a, b, [](T x, T y) { return x.rank() < y.rank() && x.value() < y.value(); });
Naming that lambda breaks up the expression into its logical parts and provides a strong hint to the meaning of the lambda.
auto lessT = [](T x, T y) { return x.rank() < y.rank() && x.value() < y.value(); };
sort(a, b, lessT);
The shortest code is not always the best for performance or maintainability.
Exception
Loop bodies, including lambdas used as loop bodies, rarely need to be named. However, large loop bodies (e.g., dozens of lines or dozens of pages) can be a problem. The rule Keep functions short and simple implies "Keep loop bodies short." Similarly, lambdas used as callback arguments are sometimes non-trivial, yet unlikely to be reusable.
Enforcement
- See Keep functions short and simple
- Flag identical and very similar lambdas used in different places.
F.2: A function should perform a single logical operation
Reason
A function that performs a single operation is simpler to understand, test, and reuse.
Example
Consider:
void read_and_print() // bad
{
int x;
cin >> x;
// check for errors
cout << x << "\n";
}
This is a monolith that is tied to a specific input and will never find another (different) use. Instead, break functions up into suitable logical parts and parameterize:
int read(istream& is) // better
{
int x;
is >> x;
// check for errors
return x;
}
void print(ostream& os, int x)
{
os << x << "\n";
}
These can now be combined where needed:
void read_and_print()
{
auto x = read(cin);
print(cout, x);
}
If there was a need, we could further templatize read()
and print()
on the data type, the I/O mechanism, the response to errors, etc. Example:
auto read = [](auto& input, auto& value) // better
{
input >> value;
// check for errors
};
void print(auto& output, const auto& value)
{
output << value << "\n";
}
Enforcement
- Consider functions with more than one "out" parameter suspicious. Use return values instead, including
tuple
for multiple return values. - Consider "large" functions that don't fit on one editor screen suspicious. Consider factoring such a function into smaller well-named suboperations.
- Consider functions with 7 or more parameters suspicious.
F.3: Keep functions short and simple
Reason
Large functions are hard to read, more likely to contain complex code, and more likely to have variables in larger than minimal scopes. Functions with complex control structures are more likely to be long and more likely to hide logical errors
Example
Consider:
double simple_func(double val, int flag1, int flag2)
// simple_func: takes a value and calculates the expected ASIC output,
// given the two mode flags.
{
double intermediate;
if (flag1 > 0) {
intermediate = func1(val);
if (flag2 % 2)
intermediate = sqrt(intermediate);
}
else if (flag1 == -1) {
intermediate = func1(-val);
if (flag2 % 2)
intermediate = sqrt(-intermediate);
flag1 = -flag1;
}
if (abs(flag2) > 10) {
intermediate = func2(intermediate);
}
switch (flag2 / 10) {
case 1: if (flag1 == -1) return finalize(intermediate, 1.171);
break;
case 2: return finalize(intermediate, 13.1);
default: break;
}
return finalize(intermediate, 0.);
}
This is too complex. How would you know if all possible alternatives have been correctly handled? Yes, it breaks other rules also.
We can refactor:
double func1_muon(double val, int flag)
{
// ???
}
double func1_tau(double val, int flag1, int flag2)
{
// ???
}
double simple_func(double val, int flag1, int flag2)
// simple_func: takes a value and calculates the expected ASIC output,
// given the two mode flags.
{
if (flag1 > 0)
return func1_muon(val, flag2);
if (flag1 == -1)
// handled by func1_tau: flag1 = -flag1;
return func1_tau(-val, flag1, flag2);
return 0.;
}
Note
"It doesn't fit on a screen" is often a good practical definition of "far too large." One-to-five-line functions should be considered normal.
Note
Break large functions up into smaller cohesive and named functions. Small simple functions are easily inlined where the cost of a function call is significant.
Enforcement
- Flag functions that do not "fit on a screen." How big is a screen? Try 60 lines by 140 characters; that's roughly the maximum that's comfortable for a book page.
- Flag functions that are too complex. How complex is too complex? You could use cyclomatic complexity. Try "more than 10 logical paths through." Count a simple switch as one path.
F.4: If a function might have to be evaluated at compile time, declare it constexpr
Reason
constexpr
is needed to tell the compiler to allow compile-time evaluation.
Example
The (in)famous factorial:
constexpr int fac(int n)
{
constexpr int max_exp = 17; // constexpr enables max_exp to be used in Expects
Expects(0 <= n && n < max_exp); // prevent silliness and overflow
int x = 1;
for (int i = 2; i <= n; ++i) x *= i;
return x;
}
This is C++14.
For C++11, use a recursive formulation of fac()
.
Note
constexpr
does not guarantee compile-time evaluation;
it just guarantees that the function can be evaluated at compile time for constant expression arguments if the programmer requires it or the compiler decides to do so to optimize.
constexpr int min(int x, int y) { return x < y ? x : y; }
void test(int v)
{
int m1 = min(-1, 2); // probably compile-time evaluation
constexpr int m2 = min(-1, 2); // compile-time evaluation
int m3 = min(-1, v); // run-time evaluation
constexpr int m4 = min(-1, v); // error: cannot evaluate at compile time
}
Note
Don't try to make all functions constexpr
.
Most computation is best done at run time.
Note
Any API that might eventually depend on high-level run-time configuration or
business logic should not be made constexpr
. Such customization can not be
evaluated by the compiler, and any constexpr
functions that depended upon
that API would have to be refactored or drop constexpr
.
Enforcement
Impossible and unnecessary.
The compiler gives an error if a non-constexpr
function is called where a constant is required.
F.5: If a function is very small and time-critical, declare it inline
Reason
Some optimizers are good at inlining without hints from the programmer, but don't rely on it. Measure! Over the last 40 years or so, we have been promised compilers that can inline better than humans without hints from humans. We are still waiting. Specifying inline (explicitly, or implicitly when writing member functions inside a class definition) encourages the compiler to do a better job.
Example
inline string cat(const string& s, const string& s2) { return s + s2; }
Exception
Do not put an inline
function in what is meant to be a stable interface unless you are certain that it will not change.
An inline function is part of the ABI.
Note
constexpr
implies inline
.
Note
Member functions defined in-class are inline
by default.
Exception
Function templates (including member functions of class templates A<T>::function()
and member function templates A::function<T>()
) are normally defined in headers and therefore inline.
Enforcement
Flag inline
functions that are more than three statements and could have been declared out of line (such as class member functions).
F.6: If your function must not throw, declare it noexcept
Reason
If an exception is not supposed to be thrown, the program cannot be assumed to cope with the error and should be terminated as soon as possible. Declaring a function noexcept
helps optimizers by reducing the number of alternative execution paths. It also speeds up the exit after failure.
Example
Put noexcept
on every function written completely in C or in any other language without exceptions.
The C++ Standard Library does that implicitly for all functions in the C Standard Library.
Note
constexpr
functions can throw when evaluated at run time, so you might need conditional noexcept
for some of those.
Example
You can use noexcept
even on functions that can throw:
vector<string> collect(istream& is) noexcept
{
vector<string> res;
for (string s; is >> s;)
res.push_back(s);
return res;
}
If collect()
runs out of memory, the program crashes.
Unless the program is crafted to survive memory exhaustion, that might be just the right thing to do;
terminate()
might generate suitable error log information (but after memory runs out it is hard to do anything clever).
Note
You must be aware of the execution environment that your code is running when
deciding whether to tag a function noexcept
, especially because of the issue
of throwing and allocation. Code that is intended to be perfectly general (like
the standard library and other utility code of that sort) needs to support
environments where a bad_alloc
exception could be handled meaningfully.
However, most programs and execution environments cannot meaningfully
handle a failure to allocate, and aborting the program is the cleanest and
simplest response to an allocation failure in those cases. If you know that
your application code cannot respond to an allocation failure, it could be
appropriate to add noexcept
even on functions that allocate.
Put another way: In most programs, most functions can throw (e.g., because they
use new
, call functions that do, or use library functions that reports failure
by throwing), so don't just sprinkle noexcept
all over the place without
considering whether the possible exceptions can be handled.
noexcept
is most useful (and most clearly correct) for frequently used,
low-level functions.
Note
Destructors, swap
functions, move operations, and default constructors should never throw.
See also C.44.
Enforcement
- Flag functions that are not
noexcept
, yet cannot throw. - Flag throwing
swap
,move
, destructors, and default constructors.
F.7: For general use, take T*
or T&
arguments rather than smart pointers
Reason
Passing a smart pointer transfers or shares ownership and should only be used when ownership semantics are intended. A function that does not manipulate lifetime should take raw pointers or references instead.
Passing by smart pointer restricts the use of a function to callers that use smart pointers.
A function that needs a widget
should be able to accept any widget
object, not just ones whose lifetimes are managed by a particular kind of smart pointer.
Passing a shared smart pointer (e.g., std::shared_ptr
) implies a run-time cost.
Example
// accepts any int*
void f(int*);
// can only accept ints for which you want to transfer ownership
void g(unique_ptr<int>);
// can only accept ints for which you are willing to share ownership
void g(shared_ptr<int>);
// doesn't change ownership, but requires a particular ownership of the caller
void h(const unique_ptr<int>&);
// accepts any int
void h(int&);
Example, bad
// callee
void f(shared_ptr<widget>& w)
{
// ...
use(*w); // only use of w -- the lifetime is not used at all
// ...
};
// caller
shared_ptr<widget> my_widget = /* ... */;
f(my_widget);
widget stack_widget;
f(stack_widget); // error
Example, good
// callee
void f(widget& w)
{
// ...
use(w);
// ...
};
// caller
shared_ptr<widget> my_widget = /* ... */;
f(*my_widget);
widget stack_widget;
f(stack_widget); // ok -- now this works
Note
We can catch many common cases of dangling pointers statically (see lifetime safety profile). Function arguments naturally live for the lifetime of the function call, and so have fewer lifetime problems.
Enforcement
- (Simple) Warn if a function takes a parameter of a smart pointer type (that overloads
operator->
oroperator*
) that is copyable but the function only calls any of:operator*
,operator->
orget()
. Suggest using aT*
orT&
instead. - Flag a parameter of a smart pointer type (a type that overloads
operator->
oroperator*
) that is copyable/movable but never copied/moved from in the function body, and that is never modified, and that is not passed along to another function that could do so. That means the ownership semantics are not used. Suggest using aT*
orT&
instead.
See also:
F.8: Prefer pure functions
Reason
Pure functions are easier to reason about, sometimes easier to optimize (and even parallelize), and sometimes can be memoized.
Example
template<class T>
auto square(T t) { return t * t; }
Enforcement
Not possible.
F.9: Unused parameters should be unnamed
Reason
Readability. Suppression of unused parameter warnings.
Example
widget* find(const set<widget>& s, const widget& w, Hint); // once upon a time, a hint was used
Note
Allowing parameters to be unnamed was introduced in the early 1980s to address this problem.
If parameters are conditionally unused, declare them with the [[maybe_unused]]
attribute.
For example:
template <typename Value>
Value* find(const set<Value>& s, const Value& v, [[maybe_unused]] Hint h)
{
if constexpr (sizeof(Value) > CacheSize)
{
// a hint is used only if Value is of a certain size
}
}
Enforcement
Flag named unused parameters.
F.10: If an operation can be reused, give it a name
Reason
Documentation, readability, opportunity for reuse.
Example
struct Rec {
string name;
string addr;
int id; // unique identifier
};
bool same(const Rec& a, const Rec& b)
{
return a.id == b.id;
}
vector<Rec*> find_id(const string& name); // find all records for "name"
auto x = find_if(vr.begin(), vr.end(),
[&](Rec& r) {
if (r.name.size() != n.size()) return false; // name to compare to is in n
for (int i = 0; i < r.name.size(); ++i)
if (tolower(r.name[i]) != tolower(n[i])) return false;
return true;
}
);
There is a useful function lurking here (case insensitive string comparison), as there often is when lambda arguments get large.
bool compare_insensitive(const string& a, const string& b)
{
if (a.size() != b.size()) return false;
for (int i = 0; i < a.size(); ++i) if (tolower(a[i]) != tolower(b[i])) return false;
return true;
}
auto x = find_if(vr.begin(), vr.end(),
[&](Rec& r) { return compare_insensitive(r.name, n); }
);
Or maybe (if you prefer to avoid the implicit name binding to n):
auto cmp_to_n = [&n](const string& a) { return compare_insensitive(a, n); };
auto x = find_if(vr.begin(), vr.end(),
[](const Rec& r) { return cmp_to_n(r.name); }
);
Note
whether functions, lambdas, or operators.
Exception
- Lambdas logically used only locally, such as an argument to
for_each
and similar control flow algorithms. - Lambdas as initializers
Enforcement
- (hard) flag similar lambdas
- ???
F.11: Use an unnamed lambda if you need a simple function object in one place only
Reason
That makes the code concise and gives better locality than alternatives.
Example
auto earlyUsersEnd = std::remove_if(users.begin(), users.end(),
[](const User &a) { return a.id > 100; });
Exception
Naming a lambda can be useful for clarity even if it is used only once.
Enforcement
- Look for identical and near identical lambdas (to be replaced with named functions or named lambdas).