Friday, October 2

Embed with Elliot: Interrupts, the Ugly

Welcome to part three of “Interrupts: The Good, the Bad, and the Ugly”. We’ve already professed our love for interrupts, showing how they are useful for solving multiple common microcontroller tasks with aplomb. That was surely Good. And then we dipped into some of the scheduling and priority problems that can crop up, especially if your interrupt service routines (ISRs) run for too long, or do more than they should. That was Bad, but we can combat those problems by writing lightweight ISRs.

This installment, for better or worse, uncovers our least favorite side effect of running interrupts on a small microcontroller, and that is that your assumptions about what your code is doing can be wrong, and sometimes with disastrous consequences. It’s gonna get Ugly.

TL;DR: Once you’ve started changing variables from inside interrupts, you can no longer count on their values staying constant — you never know when the interrupt is going to strike! Murphy’s law says that it will hit at the worst times. The solution is to temporarily turn off interrupts for critical blocks of code, so that your own ISRs can’t pull the rug out from under your feet. (Sounds easy, but read on!)

In this installment, we’ll cover two faces of essentially the same problem, and demonstrate one and a half solutions. But don’t worry. By the end of this article you should have the confidence to write interrupts without fear, because you’ll know the enemy.

Race Conditions

If you remember from our column on the volatile keyword, when you’d like to share data between an ISR and your main body of code, you have to use a global variable because the ISR can’t take any arguments. Because the compiler can’t see that the global variables are changing anywhere, we mark them with the volatile keyword to tell the compiler not to optimize those variables away. We should heed our own advice: variables that are accessed by the ISR can change at any time without notice.

The “obvious” pitfall when sharing variables with ISRs is the race condition: your code sets the variable’s value here and then uses it again there. The “race” in question is whether your code can get fast enough from point A to point B without an interrupt occurring in the middle.

Let’s start off with a quiz, written for an AVR microcontroller:

 
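#include <stdint.h>
#include <avr/interrupt.h>

volatile uint16_t raw_accelerometer;
// The flag is shared with the ISR too, so it also needs to be volatile
volatile enum {NO, YES} has_new_accelerometer_reading;

ISR(INT1_vect){
        raw_accelerometer = read_accelerometer_z();
        has_new_accelerometer_reading = YES;
}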

...

int main(void){
        while (1){

                if (has_new_accelerometer_reading == YES){
                        if (raw_accelerometer != 0){
                                display(raw_accelerometer);
                        }
                        // Flag that we've handled this sample
                        has_new_accelerometer_reading = NO;
                }
        }
}

We’ve apparently already read the last Embed with Elliot, because the ISR here is short and does the minimum it needs to. All the logic and the print command are in the main loop where they won’t cause trouble for other potential interrupts. Gold star!

The problem, however, is that we see occasional zero values printed out on our display. How can that be? We only display the accelerometer value after explicitly testing that it isn’t zero.

The key word in the above sentence is “after”. Think about the worst possible time for an accelerometer update interrupt to occur. How about after testing that raw_accelerometer isn’t zero, but before printing the value out? Yup, that would do it.

Of course, the chance of an interrupt firing off just exactly during the last instruction in the test statement is relatively small. It won’t happen that often, will it? Bearing in mind that your code runs through the main() loop a very large number of times, if the accelerometer updates often enough, you can be sure that the glitch will happen. It will just happen infrequently, and that’s the worst kind of bug to trace and squash.
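To make the glitch concrete, here is a minimal host-side sketch that forces the "interrupt" to land at exactly the wrong moment. It runs on a PC, not on the AVR: fake_isr(), display_calls, last_displayed, and check_then_display_with_interrupt() are hypothetical stand-ins invented for illustration, and the ISR is just an ordinary function call placed between the test and the use.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the shared variable and the display. */
volatile uint16_t raw_accelerometer;
uint16_t last_displayed;
int display_calls;

void display(uint16_t value){
    last_displayed = value;
    display_calls++;
}

/* Plays the role of the ISR: a new sample overwrites the shared variable. */
void fake_isr(uint16_t new_sample){
    raw_accelerometer = new_sample;
}

/* The body of the quiz's main loop, with the "interrupt" forced to
   strike between the test and the use. */
void check_then_display_with_interrupt(uint16_t new_sample){
    if (raw_accelerometer != 0){        /* test passes on the old value */
        fake_isr(new_sample);           /* interrupt strikes right here */
        display(raw_accelerometer);     /* use sees the new value */
    }
}
```

Start with a non-zero reading in raw_accelerometer, call check_then_display_with_interrupt(0), and the zero sails right past the test and onto the display.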

Shadow Variables: A Tentative Solution

Here’s a trial solution:

 
int main(void){
        uint16_t local_accelerometer_data = 0;
        while (1){

                if (has_new_accelerometer_reading == YES){
                        local_accelerometer_data = raw_accelerometer;
                        if (local_accelerometer_data != 0){
                                display(local_accelerometer_data);
                        }
                        // Flag that we've handled this sample
                        has_new_accelerometer_reading = NO;
                }
        }
}

See what happened there? We added a variable that isn’t shared with the ISR, local_accelerometer_data, and then we work on that copy instead of the variable that the ISR is able to change. Now both the test for zero-ness and the display statement are guaranteed to be based on the same value.

This is only a half solution because the raw_accelerometer variable can still change while we’re working on its local shadow copy. Here, it’s no problem, but if we were going to store our local copy back into the shared variable we’d run the risk that the raw_accelerometer variable had changed while we were working on our local copy, and by writing our local copy back, we’d overwrite the just-changed value. This isn’t trivial, because we’d lose a data point, but at least the work done on the local_accelerometer_data is correct for the data we had on hand at the time that we copied it.

If you’re working on a fancy-schmancy ARM machine with your luxurious 32-bit integers, this half solution might work for you. For those of us accessing shared 16-bit variables on 8-bit machines, there’s one more complication that we’ll have to cover before we present the silver-bullet solution to all of this.

Atomicity and the 8-Bit Blues

Atomicity is the property of being indivisible. (Like atoms were until the early 1900s.) When you write a single operation in a higher-level language, it can end up compiling into a single instruction in machine code, but more often than not it’ll take a few instructions. If these instructions can get interrupted in a way that changes the outcome, they’re not atomic.

The bad news: almost nothing is atomic. For instance, even something simple like ++i; can translate into three machine instructions: one to fetch the value of i from memory into a temporary register, one to add to the register, and a third to write the result back into memory.

The non-atomicity in ++i; is fairly benign because the temporary register works just like the shadow variable did above. If i is changed in memory, while the addition operation is taking place in the CPU’s registers, the memory will get overwritten by the addition operation. But at least the value stored back into i will be the value at the start of the operation plus one.
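The three steps can be spelled out by hand. The following is a host-side sketch under our own assumptions, not compiler output: shared_i and increment_with_interrupt() are hypothetical names, and the ISR's write is simulated by a plain assignment dropped in after the load.

```c
#include <stdint.h>

volatile uint8_t shared_i;   /* counter shared with a (simulated) ISR */

/* ++shared_i written out as load / add / store, with a simulated ISR
   write landing just after the load. isr_value is hypothetical. */
uint8_t increment_with_interrupt(uint8_t isr_value){
    uint8_t reg = shared_i;   /* 1: fetch the value into a register */
    shared_i = isr_value;     /* simulated ISR changes memory */
    reg = (uint8_t)(reg + 1); /* 2: add to the register */
    shared_i = reg;           /* 3: write back, clobbering the ISR's value */
    return shared_i;
}
```

With shared_i at 5 and the simulated ISR writing 100, the result is 6: the ISR's write is lost, but the stored value is still the starting value plus one.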

On an 8-bit machine, dealing with 16-bit or longer shared variables, much stranger stuff can happen. Accessing 16-bit variables is fundamentally different because even creating the shadow variable version of the variable isn’t atomic, taking one instruction per byte, and is thus interruptible. To see this, we’ll need to turn to a bit of assembly language, and think back to making a local copy of our shared 16-bit raw_accelerometer variable.

For this example we’re using avr-gcc and it comes with a whole slew of tools. One such tool is avr-objdump which takes the compiled machine code and disassembles it back into “readable” assembly language. Specifically, if you’ve compiled the code using the debugging flags (-g) then the output of something like avr-objdump -S myCode.elf > myCode.lst returns your original code interspersed with the assembly-language version of it.

 
     local_accelerometer_data = raw_accelerometer;
b8:     80 91 00 01     lds     r24, 0x0100
bc:     90 91 01 01     lds     r25, 0x0101

Anyway, you don’t have to be an assembly language guru to see that what we thought was one operation takes two different assembly language instructions. The first instruction copies the lower eight bits of our raw_accelerometer data into the r24 working register, and the second the upper eight bits.

Now where’s the worst place that the ISR could strike and change the shared raw_accelerometer value out from under our feet? Between the two lds instructions. When variable access or manipulation takes more than a single machine instruction, there’s yet another chance for a failure of atomicity, and for horrible glitches, caused by variables shared with ISRs. And in particular, this breaks our previous shadow-variable “solution” to the race condition problem.

And notice just how ugly this is. When the interrupt hits between the two lds commands, the copy of the variable includes the low bits from one reading, and the high bits from an entirely different value. It’s a hybrid of the two values, a number that never was meant to be. And it can arise any time you use a 16-bit variable that’s shared with an ISR.

We’ve cooked up some example code to demo the phenomenon for you. It will run on any AVR microcontroller, including an AVR-based Arduino. The code is written to blink an LED every time a corrupted value is discovered, and it ends up looking like a firefly convention even though we’ve throttled the interrupt speed down to one interrupt every 65,536 CPU cycles. We’re going to need to fix this!
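Separately from that AVR demo, the hybrid value is easy to reproduce on paper or on a PC. This is a host-side model, assuming the 16-bit variable lives in memory as two separate bytes the way it does on an 8-bit AVR. The names shared_lo, shared_hi, isr_write(), and torn_read() are hypothetical, and the interrupt is simulated by a function call between the two byte loads.

```c
#include <stdint.h>

/* A shared 16-bit value stored as two bytes, as on an 8-bit machine. */
volatile uint8_t shared_lo, shared_hi;

/* Plays the role of the ISR: stores a fresh 16-bit reading. */
void isr_write(uint16_t value){
    shared_lo = (uint8_t)(value & 0xFF);
    shared_hi = (uint8_t)(value >> 8);
}

/* Copies the shared value byte by byte, with the "interrupt"
   forced to land between the two loads. */
uint16_t torn_read(uint16_t next_value){
    uint8_t lo = shared_lo;     /* first lds: low byte of the old value */
    isr_write(next_value);      /* interrupt hits between the loads */
    uint8_t hi = shared_hi;     /* second lds: high byte of the new value */
    return (uint16_t)((hi << 8) | lo);
}
```

If the old reading is 0x01FF (511) and the new one is 0x0200 (512), the copy comes out as 0x02FF (767): the low byte of the old value glued to the high byte of the new one, even though the true value only moved by one count.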

Arduino Aside

The worst failure of atomicity above is caused by sharing a 16-bit variable with the ISR on an 8-bit machine. If we only needed to read eight bits from the accelerometer, the “shadow variable” solution would have worked just fine. We’re reiterating this here because we see a lot of 16-bit ints used in Arduino code when a shorter data type would suffice.

And we’re not blaming the Arduino users — most of the built-in Arduino example code uses int for everything, including pin numbers. The 16-bit int on an AVR Arduino has a range from -32,768 to 32,767. It’s probably good future-proofing for when Arduinos have more than 255 pins, but we can’t imagine needing to access pin number -32,000, whatever that would mean. The rationale behind just using int for everything, we suppose, is that learning about different data types is tough.

If it were only a matter of wasting RAM or CPU cycles, it’d be OK. But using ints by default means that you’re introducing all sorts of non-atomicity into your code without knowing it. As soon as you add interrupts into the mix, as you can see here, it’s a recipe for disaster: don’t use 16-bit numbers on an 8-bit machine unless you need to.

So how does anything ever work on Arduino? The libraries (mostly) aren’t written in this sloppy / naïve fashion (any more). In fact, we’ll take apart the millis() function once we’ve solved our atomicity problems once and for all. You’ll see that it does exactly the right thing.

Finally, True Atomicity

To re-recap: variables that are shared with an ISR can change without notice, and making a local copy of the variable only half-solves the problem because local copies only work when the operation to make the copy is atomic. How can we make sure that interrupts aren’t changing our variables while we’re not looking? The answer is shockingly simple: turn the interrupts off.

In fact, we didn’t even need to worry about making the shadow local variable at all if we were willing to just turn interrupts on and off again.

 
int main(void){
        while (1){

                if (has_new_accelerometer_reading == YES){
                        
                        cli();  // clears the global interrupt flag
                
                        if (raw_accelerometer != 0){
                                display(raw_accelerometer);
                        }
                        // Flag that we've handled this sample
                        has_new_accelerometer_reading = NO;

                        sei();  // re-enables interrupts
                }
        }
}

Simple enough, and it works. Heck, on the AVR, the interrupt disable/enable commands each translate directly into a single machine instruction that takes only one cycle. Hard to beat that. Unless you’re disabling interrupts for a relatively long time.

But then we could use the local variable copy trick:

 
int main(void){
        uint16_t local_accelerometer_data = 0;
        while (1){
                if (has_new_accelerometer_reading == YES){
                        
                        cli();  // clears the global interrupt flag
                        local_accelerometer_data = raw_accelerometer;
                        sei();  // re-enables interrupts
                        
                        if (local_accelerometer_data != 0){
                                display(local_accelerometer_data);
                        }
                        // Flag that we've handled this sample
                        has_new_accelerometer_reading = NO;
                }
        }
}

Now the assignment of the local variable is all that needs protecting, and so the interrupt is only out of action for a couple cycles. That’s pretty awesome. (Again, remember the caveat about the underlying “raw” data getting out of sync with the local copy.)

Arduino’s Millis

One thing our code above didn’t do was to check if interrupts were set in the first place. We simply assumed that interrupts were on, and after we were done with our critical section, turned them back on. But if they weren’t on in the first place, then we’ve changed something unintentionally. The solution is to record the value of the status register, SREG, which contains the global interrupt enable bit, and restore it after the critical section is over.

As promised, here is the millisecond counter routine from the Arduino library. (It’s in “wiring.c” if you’re interested.)

 
unsigned long millis()
{
        unsigned long m;
        uint8_t oldSREG = SREG;

        // disable interrupts while we read timer0_millis or we might get an
        // inconsistent value (e.g. in the middle of a write to timer0_millis)
        cli();
        m = timer0_millis;
        SREG = oldSREG;

        return m;
}

Note that it does the right things for just the right reasons, which is why your Arduino timer code actually works. First, it copies over the value of SREG, which contains the old global interrupt enable bit. Then it disables interrupts and creates a local copy of the current (ISR-shared) milliseconds counter. Finally, it restores SREG and returns the number of milliseconds. Well done.

Atomic Blocks

Making a general-purpose solution for protecting critical sections like this turns out to be a little bit tricky. At a minimum, we’d like to copy over the SREG contents as is done in millis(). The AVR library has a “util/atomic.h” header that defines the ATOMIC_BLOCK wrapper that basically does that for us.

There are two possible arguments, ATOMIC_RESTORESTATE and ATOMIC_FORCEON, and they correspond to the two cases where the block first figures out if the global interrupt flag was set beforehand and restores that state at the end, or just assumes that it was and sets it at the end, respectively.

And ATOMIC_BLOCK is actually cleverer than we’ve discussed so far. It includes a code trick that allows you to use return or break statements inside the block, and it still takes care of re-setting the global interrupt flag for you.
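The trick is roughly a one-pass for loop plus GCC's cleanup variable attribute, which runs a function whenever the variable goes out of scope, however the block is exited. What follows is a simplified host-side model of that idea, not the avr-libc source: fake_sreg, restore_sreg(), disable_and_continue(), MODEL_ATOMIC_BLOCK, and shared_ticks are all hypothetical names, and the status register is just an ordinary variable. It needs GCC or Clang for the attribute.

```c
#include <stdint.h>

/* Simulated status register; bit 7 plays the global interrupt enable bit. */
uint8_t fake_sreg = 0x80;

/* Cleanup handler: called automatically when `saved` leaves scope. */
static inline void restore_sreg(const uint8_t *saved){
    fake_sreg = *saved;
}

/* Clears the simulated enable bit and returns 1 so the loop runs once. */
static inline uint8_t disable_and_continue(void){
    fake_sreg &= 0x7F;
    return 1;
}

/* A for loop that executes its body exactly once. `saved` captures the
   old state first, then `todo` disables "interrupts". */
#define MODEL_ATOMIC_BLOCK \
    for (uint8_t saved __attribute__((cleanup(restore_sreg))) = fake_sreg, \
                 todo = disable_and_continue(); \
         todo; todo = 0)

uint32_t shared_ticks = 1234;

uint32_t read_ticks(void){
    MODEL_ATOMIC_BLOCK {
        return shared_ticks;   /* early return: the cleanup still runs */
    }
    return 0;                  /* not reached; keeps the compiler quiet */
}

/* Reports the simulated enable bit as seen from inside the block. */
uint8_t flag_inside_block(void){
    MODEL_ATOMIC_BLOCK {
        return (uint8_t)(fake_sreg & 0x80);
    }
    return 0xFF;               /* not reached */
}
```

Because the restore hangs off the variable's scope rather than off the closing brace, a return or break from inside the block can't skip it.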

As an example, here’s Arduino’s millis() re-written to take advantage of the standard GCC AVR library:

 
#include <util/atomic.h>   // provides ATOMIC_BLOCK

unsigned long millis(){
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE){
            return timer0_millis;
    }
}

That’s a lot clearer than the original, in our estimation. Using ATOMIC_BLOCK saves you a lot of hassle, and makes the code easier to read to boot. Indeed, if you want to save the call overhead, you can simply use timer0_millis yourself from your main routine as long as you wrap it in an ATOMIC_BLOCK:

 
ATOMIC_BLOCK(ATOMIC_RESTORESTATE){
    if (timer0_millis > alarm_time){
        do_whatever();
    }
}

The Last Possible Refinement

If you’re one of those optimizer-type people, you’ll have noticed that only one particular ISR shares our variable with the critical section. Instead of disabling all interrupts, you could imagine disabling only the particular interrupt that’s causing the trouble.

This will also get us atomicity, and we can’t think of any reason not to do so except for code complexity. It means giving up on the one-size-fits-all ATOMIC_BLOCK-style global interrupt manipulations, but it’s probably worth it if you’ve got real-time constraints on other interrupts that you don’t want to block. Nonetheless, we’ve seen a lot of global-interrupt disabling in practice. Would any readers care to chime in with some real-world examples where only specific interrupts were disabled to protect a critical atomic section?
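To get the ball rolling, here is a rough sketch of what that might look like for our accelerometer example. It assumes an ATmega328P-class part where external interrupt 1 is enabled by the INT1 bit in the EIMSK register; check your chip's datasheet, because the register and bit names vary. The function name read_accelerometer_masked() is hypothetical, and the non-AVR branch only provides stand-ins so the logic can be compiled and tried on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host stand-ins (assumptions) so the sketch compiles off-target. */
volatile uint8_t EIMSK;
#define INT1 1
#endif

volatile uint16_t raw_accelerometer;

/* Copies the shared reading with only the INT1 interrupt masked,
   leaving every other interrupt free to run. */
uint16_t read_accelerometer_masked(void){
    uint8_t was_enabled = EIMSK & (1 << INT1);  /* remember the old state */
    EIMSK &= (uint8_t)~(1 << INT1);             /* mask just this interrupt */
    uint16_t copy = raw_accelerometer;          /* the two-byte copy */
    EIMSK |= was_enabled;                       /* put it back as it was */
    return copy;
}
```

Like the SREG trick in millis(), this remembers whether the interrupt was enabled in the first place and only turns it back on if it was.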

Conclusion

This has been an epic trip through the topic of interrupts, and we hope you liked it. Interrupts are the most powerful tool in the microcontroller arsenal because they directly answer the number-one concern with microcontroller applications: interfacing with real-world peripherals. Mastering interrupts brings you from a microcontroller beginner fully into the intermediate camp, so it’s worth it.

Interrupts also introduce a degree of multitasking, which brings along with it such issues as scheduling and priorities (the bad) and atomicity failures and race conditions (the ugly). Knowing about these issues is half the battle. Are there any other big pitfalls that come with using interrupts that we’ve missed? What are your favorite interrupt horror stories? Post up in the comments.


Filed under: Hackaday Columns
