Friday, May 5

Using Modern C++ Techniques with Arduino

C++ has been quickly modernizing itself over the last few years. Starting with the introduction of C++11, the language has made a huge step forward and things have changed under the hood. To the average Arduino user, some of this is irrelevant, maybe most of it, but the language still gives us some nice features that we can take advantage of as we program our microcontrollers.

Modern C++ allows us to write cleaner, more concise code, and make the code we write more reusable. The following are some techniques using new features of C++ that don’t add memory overhead, reduce speed, or increase size because they’re all handled by the compiler. Using these features of the language you no longer have to worry about specifying a 16-bit variable, calling the wrong function with NULL, or peppering your constructors with initializations. The old ways are still available and you can still use them, but at the very least, after reading this you’ll be more aware of the newer features as we start to see them roll out in Arduino code.

How big are your integers?

C++11 introduced a series of fixed-width integer type aliases that give you exactly the number of bits (in multiples of 8) you want. There are both signed and unsigned versions. If you use int16_t for a variable, you know it’s 16 bits, regardless of which Arduino is being targeted.

int8_t/uint8_t - a signed/unsigned type that is exactly 8 bits in size.
int16_t/uint16_t - a signed/unsigned type that is exactly 16 bits in size.
int32_t/uint32_t - a signed/unsigned type that is exactly 32 bits in size.
int64_t/uint64_t - a signed/unsigned type that is exactly 64 bits in size.

These are aliases of the underlying types, so on Arduino Uno, int8_t is the same type as a signed char, and uint16_t is the same type as an unsigned int. One note: if you’re editing a ‘.cpp’ file, you’ll have to #include <stdint.h>, but in a ‘.ino’ file, you don’t.
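As a quick sketch of why the exact width matters, here’s an 8-bit value that wraps around the same way on every board. The names kStep and nextBrightness are made up for this example; they aren’t part of any Arduino API.

```cpp
#include <stdint.h>

// These checks cost nothing at run time; the build fails if a width is wrong.
static_assert(sizeof(int16_t) == 2, "int16_t is 16 bits (two 8-bit bytes)");
static_assert(sizeof(uint32_t) == 4, "uint32_t is 32 bits (four 8-bit bytes)");

// Hypothetical step size for the example.
constexpr uint8_t kStep = 10;

// Hypothetical helper: advance an 8-bit brightness value by a fixed step.
// The sum is computed as an int and then converted back to uint8_t,
// so it wraps past 255 regardless of how big a plain int is.
constexpr uint8_t nextBrightness(uint8_t current) {
  return static_cast<uint8_t>(current + kStep);
}
```

Calling nextBrightness(250) gives 4, because 260 doesn’t fit in 8 bits and wraps around.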

Anonymous Namespaces

The concept of an anonymous namespace has been around for a while in C++, but it came to prominence with the introduction of C++11. In C++11, the anonymous namespace is now the preferred way to specify a variable, function, or class that is accessible only in the current file (i.e., it has internal, rather than external, linkage). It’s also a nice way to consolidate things which only need to be accessed within the current file. Anonymous namespaces are also called ‘unnamed namespaces’ because, as the name suggests, they are defined without a name:

namespace {
  ... 
}

Anonymous namespaces take the place of declaring things static. Anything you may have declared as static or const, or written as a #define, can now be put into an anonymous namespace and have the same effect: anything defined inside cannot be accessed outside of the ‘.cpp’ file that the namespace is in. In the Arduino IDE, though, if you add a tab and give the new tab a name that doesn’t end in ‘.cpp’ or ‘.h’, then the file is given an extension of ‘.ino’. When you click the ‘Verify’ button, all the ‘.ino’ files are concatenated into a single ‘.cpp’ file. This means that anything you declare as being static or const or in an anonymous namespace will be available in each ‘.ino’ file because, when they are concatenated, they all end up in the same ‘.cpp’ file. This usually isn’t a problem for small Arduino programs, but it’s good to be aware of it.

Okay, so now we know about internal and external linkages and how files are concatenated before being compiled. How does this help us in our code? Well, we now know how to define variables so that they won’t leak into areas they’re not supposed to be.

Rather than write:

// This should be used sparingly anyway
#define SOME_VAR 1000    
// Static variables declared in a file are local to the file
static int16_t count = 0;    
// const variables declared in a file are local to the file as well
const int16_t numLeds = 4;  

You can now write:

namespace {
  const int16_t SOME_VAR = 1000;  // Now it's type-safe
  int16_t count = 0;  // No need to use static
  const int16_t numLeds = 4;  // Still declared const.

  class thisClassHasACommonName {
    ...
  };
}

Everything’s contained within the anonymous namespace at the beginning of the file. The compiler won’t get confused if there’s another SOME_VAR, count or numLeds defined in a different file. And unlike static, classes declared in an anonymous namespace are local to the file as well.
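Here’s a minimal sketch of how that might look in a real file. wrapIndex and nextLed are hypothetical names, and constexpr is only there so the helper can be checked at compile time.

```cpp
#include <stdint.h>

namespace {
  // Everything in here has internal linkage: only this file can see it.
  const int16_t numLeds = 4;
  int16_t count = 0;

  // Hypothetical helper: maps a non-negative index into 0..numLeds-1.
  constexpr int16_t wrapIndex(int16_t index) {
    return static_cast<int16_t>(index % numLeds);
  }
}

// A normal function with external linkage that other files may call.
// It uses the private pieces above without exposing them.
int16_t nextLed() {
  count = wrapIndex(static_cast<int16_t>(count + 1));
  return count;
}
```

Another file can call nextLed(), but it can’t touch count, numLeds or wrapIndex directly, and it’s free to define its own things with those names.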

Automatic for the people

The auto keyword, added in C++11, allows you to define a variable without spelling out its type. Once defined, though, its type can’t be changed, just like any other C++ variable. The C++ compiler uses deduction to figure out the variable’s type from its initializer.

auto i = 5;          // i is of type int
auto j = 7.5f;       // j is of type float
auto k = GetResult();  // k is of whatever type GetResult() returns 

To specify that you want a reference, you can do this:

auto& temp1 = myValue; 
const auto& temp2 = myValue;

The proper type is deduced for pointers as well:

 
int myValue = 5; 
auto myValuePtr = &myValue; // myValuePtr is a pointer to an int.

auto is a great shorthand for especially long, complicated types. It works great for defining iterator instances!
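Iterators aside, auto also pairs nicely with the C++11 range-based for loop. A small sketch, assuming a made-up ledPins array (the pin numbers are arbitrary, and highestPin is a hypothetical helper):

```cpp
#include <stdint.h>

// Hypothetical pin list; the numbers are arbitrary.
constexpr uint8_t ledPins[] = {3, 5, 6, 9};

// auto picks up the type of the initializer.
constexpr auto pinCount = sizeof(ledPins) / sizeof(ledPins[0]);  // an unsigned size type
constexpr auto firstPin = ledPins[0];                            // a (const) uint8_t

// Returns the largest pin number in the list.
uint8_t highestPin() {
  auto highest = ledPins[0];  // deduced as uint8_t (the const is dropped)
  for (auto pin : ledPins) {  // pin is a uint8_t copy of each element
    if (pin > highest) {
      highest = pin;
    }
  }
  return highest;
}
```

Nothing here names the element type inside the function body, so changing ledPins to hold a wider type later would only mean touching the array declaration and the return type.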

Using using

There have been a couple of ways to create aliases in C++: The lowly (and dangerous) #define and the less dangerous typedef. The use of typedef is preferred to a #define, but it does have a couple of issues. The first is readability. Consider the following typedef:

typedef void(*fp)(int, const char*);

At first glance, especially if you don’t use C++ a lot, it can be difficult to determine that this is creating an alias, fp, that is a pointer to a function that returns void and takes an int and a string parameter.  Now, let’s see the C++11 way:

using fp = void(*)(int, const char*);

We can now see that fp is an alias for something, and it’s a bit easier to tell that the something is a function pointer.

The second difference is a bit more esoteric, at least in Arduino programming: aliases declared with using can be templatized, while typedefs cannot.
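To make that concrete, here’s a small sketch of a templated alias. Handler, doubleIt and halve are invented names for illustration, and constexpr is used only so the result can be checked at compile time.

```cpp
#include <stdint.h>

// One alias template covers a whole family of function-pointer types.
// A plain typedef can't take a template parameter like this.
template <typename T>
using Handler = T (*)(T);

// Two hypothetical functions that match the pattern.
constexpr int16_t doubleIt(int16_t v) { return static_cast<int16_t>(v * 2); }
constexpr float halve(float v) { return v / 2.0f; }

// The same alias, instantiated for two different types.
constexpr Handler<int16_t> intHandler = doubleIt;  // int16_t (*)(int16_t)
constexpr Handler<float> floatHandler = halve;     // float (*)(float)
```

With typedef, you’d have to write out a separate alias for each type, or wrap the typedef in a template struct.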

Nullptr

In C++98, NULL is actually defined as an integer constant, usually plain 0; it is not a pointer. To be backwards compatible, the C++ compiler will allow you to initialize a pointer variable using 0 or NULL.

When you start using auto, though, you’ll start seeing code like this:


auto result = GetResult(...);

if (result == 0) { ... }

Just from looking at this, you can’t tell if GetResult returns a pointer or an integer. However, even among those who still use C, not many will check if result == 0; they’ll check if result == NULL.

auto result = GetResult(...);

if (result == NULL) { ... }

If, later on, you change GetResult to return an int, there’s no compiler error – the statement is still valid, although it still looks like it should be a pointer.

C++11 changes this by introducing nullptr, a null pointer literal with its own type (std::nullptr_t) that converts to any pointer type but never to an integer. Now, you type:

auto result = GetResult(...);

if (result == nullptr) { ... }

Now you know that GetResult can only return a pointer. And if someone changes it to return an int, they’ll get a compile error unless they also change the if statement. The introduction of nullptr also means that it’s safer to overload a method on an integral type and a pointer type. For example:

void SetValue(int i);     // (1)
void SetValue(Widget* w); // (2)
... 
SetValue(5);    // Calls 1 
SetValue(NULL); // Also calls 1 

Because NULL isn’t a pointer, the second call to SetValue calls the version that takes an integer parameter. We can now call the second SetValue properly by using nullptr:

SetValue(nullptr);  // Calls 2

This is why it’s usually considered dangerous to overload a function or method based on an integer parameter and a pointer (to anything). It’s safer now, but still frowned upon.
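Here’s a self-contained sketch of that overload pair that you can compile to see which version gets picked. These are variants of the two SetValue overloads that return a number purely so the choice can be checked; Widget is an empty stand-in type.

```cpp
struct Widget {};

// Variants of the two overloads that report which one was picked.
// (The int return value exists only so the choice can be checked.)
constexpr int SetValue(int) { return 1; }      // (1)
constexpr int SetValue(Widget*) { return 2; }  // (2)

// Checked by the compiler; nothing here runs on the microcontroller.
static_assert(SetValue(5) == 1, "an int argument selects (1)");
static_assert(SetValue(0) == 1, "a literal 0 is an int, so it also selects (1)");
static_assert(SetValue(nullptr) == 2, "nullptr can only select (2)");
```

The literal 0 is the interesting case: it’s a perfectly good null pointer constant, but overload resolution still prefers the exact int match.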

Default Initialization

Consider the following class:

class Foo {
  public:  // Without this, the constructors would be private by default
    Foo() : fooString(nullptr) { ... }
    Foo(const char* str) : fooString(nullptr) { ... }
    Foo(const Foo& other) : fooString(nullptr) { ... }
    ...
  private:
    char* fooString;
};

We’ve initialized the variable with nullptr in every constructor, which is good. But if another member variable is added to this class, we now have to add three more initializations, one for each constructor. If your class has several variables, you have to add initializers for all of them in all the constructors. C++11 gives you the option to initialize variables inline with the declaration.

... 
private:
  char* fooString = nullptr;

With C++11, we can specify a default initial value. We can still override this in each constructor if we need to, but if we don’t, it doesn’t matter how many constructors we add: we only need to set the value in one place. If we’ve separated our class out into a ‘.h’ file and a ‘.cpp’ file, then an added benefit is that we only have to open the ‘.h’ file to add and initialize a variable.
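A complete, if contrived, sketch of the idea follows. Blinker and its members are hypothetical names, the default values are arbitrary, and the constructors are marked constexpr only so the results can be checked at compile time.

```cpp
#include <stdint.h>

class Blinker {
  public:
    // Uses both defaults.
    constexpr Blinker() {}
    // Overrides the pin; the interval keeps its default.
    constexpr explicit Blinker(uint8_t pin) : pin_(pin) {}
    // Overrides both.
    constexpr Blinker(uint8_t pin, uint16_t intervalMs)
        : pin_(pin), intervalMs_(intervalMs) {}

    constexpr uint8_t pin() const { return pin_; }
    constexpr uint16_t intervalMs() const { return intervalMs_; }

  private:
    // Defaults live in one place, next to the declarations.
    uint8_t pin_ = 13;
    uint16_t intervalMs_ = 500;
};
```

Adding a fourth constructor, or a third member with its own default, doesn’t require touching any of the existing constructors.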

Scoping Your Enums

One of the things that C++ tries to do is allow the programmer to encapsulate things so that, for example, when you name a variable, you’re not accidentally reusing a name that already belongs to something else. C++ gives you tools to allow you to do this, such as namespaces and classes. The lowly enum, however, leaks its entries into the surrounding scope:


enum Colour {
  white,
  blue,
  yellow
};
// Doesn't compile.  There's already something in this scope with the name 'white'
auto white = 5;  

The variable ‘white’ can’t be defined because ‘white’ is part of an enum, and that enum leaks its entries into the surrounding scope. C++11 introduces scoped enums which allow a couple of things that C++98 didn’t allow. The first, as the name implies, is that the enums are fully scoped. The way to create a scoped enum is with the use of the ‘class’ keyword:


enum class Colour {
  white,
  blue,
  yellow
};

auto white = 5; // This now works.

Colour c = white; // Error, nothing in this scope has been defined called white.
Colour c = Colour::white; // Correct.

By default, scoped enums have an underlying type: int, so whatever size an int is on your platform, that’s the size that sizeof will return for your enum. Before C++11, unscoped enums also had an underlying type, but the compiler tried to be smart about it, so it would determine what the underlying type was and not tell us – it could optimize it for size and create an underlying type that was the smallest that could fit the number of entries. Or it could optimize for speed and create a type that was the fastest for the platform. All this means is that the compiler knew what type an enum was, but we didn’t, so we couldn’t forward declare the enum in a different file. For example,


file1.h:

enum Colour {
  white,
  blue,
  yellow
};

file2.h:

enum Colour; // Error, in this file, the compiler doesn't know what the type of Colour is.

void SomeFunction(Colour c);

The only way is to #include file1.h in file2.h. For small projects, this is fine, but in bigger projects, adding an entry to the Colour enum will mean a recompilation of anything that includes file1.h or file2.h. Scoped enums have a default underlying type, int, so the compiler always knows what size they are. And you can change the size if you wish:


file1.h:

enum class Colour: int8_t {  // int8_t comes from <stdint.h>
  white,
  blue,
  yellow
};

file2.h:

enum class Colour: int8_t;

void SomeFunction(Colour c);

Now any file that includes file2.h doesn’t necessarily have to be recompiled (file2.cpp will have to be, because you’ll have to #include file1.h in it in order to get it to compile properly).
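To see the size guarantee in action, here’s a minimal single-file sketch. toIndex is a made-up helper; it’s there because a scoped enum doesn’t convert to an integer on its own, so you need an explicit cast when you want the number.

```cpp
#include <stdint.h>

enum class Colour : int8_t {
  white,
  blue,
  yellow
};

// The enum is exactly as big as the type we asked for.
static_assert(sizeof(Colour) == sizeof(int8_t), "Colour occupies one byte");

// Hypothetical helper: convert an entry to its underlying number.
constexpr int8_t toIndex(Colour c) {
  return static_cast<int8_t>(c);
}
```

Entries are numbered from zero in declaration order, so toIndex(Colour::yellow) is 2.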

In Conclusion

You do have a choice when moving forward programming your microcontroller. There’s no reason to use any of these if you don’t want to. I’ve found, however, that they help clean up my code and help me catch bugs at compile time rather than at run time. Many are just syntactic sugar, and you may not like the change. Some are just niceties that make working in a complicated language a bit nicer. What’s holding you back from trying them out?


Filed under: Hackaday Columns, Microcontrollers
