Thread Safety in Embedded Systems?

tatteredpotato · May 19, 2010

I'm working on an embedded system that runs a Nios II soft core microprocessor on an Altera FPGA. This device has an rs232 port that receives data via interrupts and I have created a FIFO ring buffer structure (in C) to store these characters until the main program logic can deal with them. Now I've gotten the structure to function correctly however I've realized that there is a possible mutual exclusion issue since the interrupt handlers write to the buffer which at this point isn't thread safe. I realize that this is only a single threaded system, but the ISR is acting as an effective second thread.

I'm trying to think of the best way to approach this since this isn't a typical thread safety issue, since pre-emption can only happen in one direction.

Code:

typedef struct{
	char* buffer;
	int size, read_index, write_index, elements;
} ring_buffer;

void new_ring_buffer(ring_buffer** buf, int size) {
    *buf = (ring_buffer*)malloc(sizeof(ring_buffer));
    (*buf) -> buffer = (char*)malloc(size * sizeof(char));
    (*buf) -> size = size;
    (*buf) -> read_index = 0;
    (*buf) -> write_index = 0;
    (*buf) -> elements = 0;
}

void delete_ring_buffer(ring_buffer* buf) {
    free(buf->buffer);
    free(buf);
}

char pop_ring_buffer(ring_buffer *buf) {
    if (buf->elements == 0) {
        // Empty case... not sure this is the optimal solution.
        return '\0';
    }
    char c = buf->buffer[buf->read_index];
    buf->read_index = (buf->read_index + 1) % buf->size;
    buf->elements--;
    return c;
}

void push_ring_buffer(ring_buffer *buf, char c) {
    buf->buffer[buf->write_index] = c;
    buf->write_index = (buf->write_index + 1) % buf -> size;
    buf->elements++;
    if (buf->elements > buf->size) {
        buf->read_index++;
        buf->elements = buf->size;
    }
    return;
}

void clear_ring_buffer(ring_buffer *buf) {
    buf->read_index = buf->write_index;
    buf->elements = 0;
}

Also feel free to critisize my code... I've mostly done OOP (C++/Java/C#) so I'm still getting the hang of the C programming style.

Cogman · May 19, 2010

[code]insert code here[/code]

You should keep your interrupts as short as possible, and if all possible avoid using common data inside them to avoid the very issue your describing.

The x86 architecture IIRC has a flag to make code uniterruptable. Perhaps the Nios II has a similar flag (I don't know, I've never dealt with a Nios II).

Other then that, you'll probably have to create a mutex/semiphore type system. Where this is single threaded, that should be as setting a write bit somewhere and clearing it when you are done. Nothing more fancy is really needed. (unless you expect this to someday work on a multi-core system.)

Cogman · May 19, 2010

Did you program the interrupt? If not, Your code looks fine. The interrupt shouldn't screw things up.

degibson · May 19, 2010

I recommend you look at whatever synchronization primitives come with your libraries that came with your Nios-II. If it doesn't come with locks, look up Nios-II's hardware support operation for synchronization (probably compare-and-swap or fetch-and-add, or both). You can make your own locks from that.

IF you're sure you will never have truly concurrent access, you can try to get away with setting and clearing bits to handle synchronization. It's almost certainly not worth your effort, however, if locks are available.

Code Criticism:

Code:

    if (buf->elements > buf->size) {
        buf->read_index++;
        buf->elements = buf->size;
    }

I suggest you change this to:

Code:

    if (buf->elements > buf->size) {
        buf->read_index = (buf->read_index+1) &#37; buf->size;
        buf->elements = buf->size;
    }

tatteredpotato · May 19, 2010

I've actually taken a closer look at the code, and all the references to the reading of the buffer occur within functions called by the interrupt handler for a timer, so I think it's a moot point for now. I do think many of the interrupt handlers are getting to be a little too bloated so it may be something I have to take a look at in the future if some of the code gets rearranged.

It's possible to disable interrupts on the Nios II processor, however I guess that would be the easy solution if there aren't many calls, however I don't think it should be a solution used too regularly.

tatteredpotato · May 19, 2010

degibson said:

Code Criticism:

Code:

    if (buf->elements > buf->size) {
        buf->read_index++;
        buf->elements = buf->size;
    }

I suggest you change this to:

Code:

    if (buf->elements > buf->size) {
        buf->read_index = (buf->read_index+1) % buf->size;
        buf->elements = buf->size;
    }

Good catch... Thanks.

PandaBear · May 26, 2010

I think there is a serious problem with the push and pop routine if there is no mutex / semaphore wrapper around it.

While you are having the read and write index done correctly (in circular way), using element count to determine whether there is an item in the circular buffer is going to get you out of sync eventually, because the element count is written / updated by both the push and pop (hence the main code and the interrupt).

What you need to do is to make comparison between the head and tail (aka your read and write index) every single time you add or remove element. The logic of the check for add should be if the wrap around occur, i.e. write index + 1 == read index for write, and read index == write index for read. Also make sure the condition that read index == write index should be empty, and write index + 1 == read index should be full.

Since only write index is updated when write, and read index is updated when read, there is no write after read issue, and no need for mutex / semaphore / etc, as long as these indexes are volatile.

This is my first assignment after I got out of school (10 years ago), so I remember it well.

Kirby · May 26, 2010

PandaBear said:
I think there is a serious problem with the push and pop routine if there is no mutex / semaphore wrapper around it.

While you are having the read and write index done correctly (in circular way), using element count to determine whether there is an item in the circular buffer is going to get you out of sync eventually, because the element count is written / updated by both the push and pop (hence the main code and the interrupt).

What you need to do is to make comparison between the head and tail (aka your read and write index) every single time you add or remove element. The logic of the check for add should be if the wrap around occur, i.e. write index + 1 == read index for write, and read index == write index for read. Also make sure the condition that read index == write index should be empty, and write index + 1 == read index should be full.

Since only write index is updated when write, and read index is updated when read, there is no write after read issue, and no need for mutex / semaphore / etc, as long as these indexes are volatile.

This is my first assignment after I got out of school (10 years ago), so I remember it well.

I'm working on a similar project right now, and this is what I'm doing. If you need to get the size of your buffer, just make a function that subtracts the pointers.

tatteredpotato · May 27, 2010

Yea I realize that the current implementation definitely isn't thread safe, however I've been sidetracked by another task lately. Thanks for the tip Panda... I've been looking for the simplest solution (and one that doesn't require me to make my own semaphore... although that may be a good exercise anyways).

The Nios processor (a soft processor for Altera FPGAs) doesn't have much in the way of mutex support built-in (either hardware or shipped software libraries from what I've found). Anyways when I do get around to updating this I'll try to remember to at least update my solution here.

PandaBear · May 27, 2010

nkgreen said:
I'm working on a similar project right now, and this is what I'm doing. If you need to get the size of your buffer, just make a function that subtracts the pointers.

The problem with using SIZE to estimate how much you can write is that by the time you get the size and the time you add stuff, the size would/could have changed. For a non semaphore/barrier protected mechanism, you have to check for one count every item you add (in this case it could be a character or a word, depends on what the index count is).

The reason it work is that you check your availability before you add / remove, and if race condition arrive, it will only make your situation better rather than worse. For example, if you try to add to the circular buffer and it is full, you do not add, but if the check is in a race with remove, then the worst it could do is to fail when an empty slot comes up. Likewise, if you try to remove an empty buffer, your fail could only compete with the addition of a new data, so you will still not cause fatal condition.

I think this is the only reasonable way to do interrupt because you cannot stop an interrupt from happening (especially if it is a real time system) for a time long enough for the other side to finish its business. Semaphore and Barrier block the failed scenario from proceeding, but would not guarantee timing of the execution.

Good luck.

degibson · May 27, 2010

PandaBear said:
What you need to do is to make comparison between the head and tail (aka your read and write index) every single time you add or remove element. The logic of the check for add should be if the wrap around occur, i.e. write index + 1 == read index for write, and read index == write index for read. Also make sure the condition that read index == write index should be empty, and write index + 1 == read index should be full.

Since only write index is updated when write, and read index is updated when read, there is no write after read issue, and no need for mutex / semaphore / etc, as long as these indexes are volatile.

This works fine in the specific case of 1 reader thread and 1 writer thread, but breaks down for multiple readers or multiple writers. For that case, you still need a mutex.

More on the Nios II:
Unfortunately, I can't find any reference to Nios II's memory consistency model, but it's certainly weak (Nios II's ISA contains a memory fence, called sync in their ISA manual, http://www.altera.com/literature/hb/nios2/n2cpu_nii51017.pdf). Moreover, the ISA appears to have no intrinsic atomic operations. Hence, if you need to make a mutex resilient against current threads, you're going to need to implement the Bakery Algorithm with syncs between all the loads and stores. You can elide the syncs if your mutex need only be resilient against interrupts on the same processor.

The J · May 27, 2010

According to the Programming Model Reference (http://www.altera.com/literature/hb/nios2/n2cpu_nii51003.pdf), you can use the 'ienable' register to enable or disable specific interrupts. If your buffer will be fed only by the UART interrupt routine, then you should be able to disable the UART interrupt while you are reading from your buffer and re-enable it when you are done.

PandaBear · May 27, 2010

degibson said:
This works fine in the specific case of 1 reader thread and 1 writer thread, but breaks down for multiple readers or multiple writers. For that case, you still need a mutex.

Yes, I agree, and this is usually used only for the bare metal one writer one reader scenario. Normally in embedded system if you have multiple sender and receiver you will be using an RTOS, but if you do not have one, in theory you have to do infinite loop waiting for all of them with semaphore / barrier, but you will be wasting a lot of resource (CPU cycles) on it.

In practice, this kind of loop is mainly used to get things off the interrupt or hardware buffer, and then the reader as a thread / task within the system will route it to the next / final reader.

PandaBear · May 27, 2010

The J said:
According to the Programming Model Reference (http://www.altera.com/literature/hb/nios2/n2cpu_nii51003.pdf), you can use the 'ienable' register to enable or disable specific interrupts. If your buffer will be fed only by the UART interrupt routine, then you should be able to disable the UART interrupt while you are reading from your buffer and re-enable it when you are done.

This approach is fine if you have a hardware buffer in the UART. In many situations where there is no buffer from the input and you have to make sure the system is in real time (i.e. sampling a real hardware, Analog input, etc), you cannot just turn off the interrupt as you wish. This is going beyond just software, because the system design must consider the worst case performance and deadline, in order to avoid this in the first place.

degibson · May 28, 2010

PandaBear said:
Yes, I agree, and this is usually used only for the bare metal one writer one reader scenario. Normally in embedded system if you have multiple sender and receiver you will be using an RTOS, but if you do not have one, in theory you have to do infinite loop waiting for all of them with semaphore / barrier, but you will be wasting a lot of resource (CPU cycles) on it.

If interrupts are enabled during the ISR, then the ISRs are reentrant and the code is therefore MWMR. An RTOS may choose to leave interrupts enabled, yes, but it is not only an RTOS that will do so (i.e., interrupts can be enabled by accident or bug).

If interrupts are disabled, and OP is otherwise careful to avoid multiple readers or multiple writers, everything is probably OK.

tatteredpotato · May 28, 2010

Since there is essentially 2 threads currently (main execution and ISR), my first thought was to use Dekker's algorithm for critical code. This would work without the atomic operations and is (IMO) a simpler implementation than the Bakery algorithm, however if an interrupt occurs while the main thread is in the critical section then the system would hang in the ISR indefinitely since there is no scheduler to eventually preempt the ISR and return execution to the main program thread.

Unless there's a way to delay the ISR execution, I don't see a way to implement these algorithms in an ISR. Most of them rely on preemption and waiting for the user of a resource to finish, but if the resource is in use when the ISR is called then it can wait all it wants but without preemption of the ISR, the original user will never finish.

I'm leaning toward disabling interrupts for the very small amount of time I would need to at this point (UART does have a hardware buffer)... It's not the most elegant solution but it will probably be the best in this application.

PandaBear · May 28, 2010

Normally ISR is not considered a thread, because they come up on their own and it is not entered from scheduler decision.

If your definition of a thread include ISR, then critical section involve waiting for ISR to finish and disabling / enabling of ISR. Main code should not preempt an ISR ever, only a higher priority ISR can preempt a lower priority ISR if choose to do so, otherwise it is a disaster waiting to happen.

I think you should take a step back and see what is the correct system behavior before you decide on what algorithm or model will work. In your system, if you have input coming in faster than you can consume / use, what is the correct behavior? Should it be dropping new input, kicking out old input, or just disable any input from happening? Thread model or algorithm will not answer these questions and assume the designer will have answers for these, because thread model only tells you when to pause and when to resume.

In your case, what you can do is to have the main code look for a flag that ISR will set when input arrives, and in the main loop (1 thread only, not including ISR) or a lower priority thread (if more than 1 thread, not including ISR) check for this flag at the convenience and retrieve the data from the hardware buffer. Only disable interrupt for a short time to retrieve data from hardware buffer if needed, and quickly enable it back. Check the hardware UART manual to see what is the correct / reference design.

This uart input buffering has been done for decades without threading model or anything more complicated than ISR. I think you are over analyzing the problem here.

PandaBear · May 28, 2010

Here is a design that is very common:

1) Hardware - enable interrupt, hardware buffer, reading the byte will automatically clear the interrupt and reduce the count / increase read index to next byte.

2) ISR - set a flag (or send a message, set an event) when a byte arrive, depend on design it may kick out the oldest unread byte to make room for new byte, or cause an overflow error and drop new input if overflow

3) Main loop, or other threads responsible for parsing the bytes. The main code or thread is responsible for copying the input character to an internal buffer (command, data, response, etc) and check if a complete message has arrived. If it has, either start execution of a procedure directly or send a message, set event to notify others to do it. If execute procedure directly, you need to be sure that it will finish before the deadline so we will not have an overflown hardware buffer in UART.

Feel free to copy this, as it is the industrial standard way of doing UART input.

PandaBear · May 28, 2010

degibson said:
If interrupts are enabled during the ISR, then the ISRs are reentrant and the code is therefore MWMR. An RTOS may choose to leave interrupts enabled, yes, but it is not only an RTOS that will do so (i.e., interrupts can be enabled by accident or bug).

If interrupts are disabled, and OP is otherwise careful to avoid multiple readers or multiple writers, everything is probably OK.

Agree, what I meant was during main code / threads operation one should not disable interrupt for any longer than necessary (i.e. adjusting registers that should only be adjust when interrupt is disabled).

The J · May 28, 2010

PandaBear said:
This approach is fine if you have a hardware buffer in the UART. In many situations where there is no buffer from the input and you have to make sure the system is in real time (i.e. sampling a real hardware, Analog input, etc), you cannot just turn off the interrupt as you wish. This is going beyond just software, because the system design must consider the worst case performance and deadline, in order to avoid this in the first place.

Admittedly I work with PIC24Hs, which have a 4-place hardware buffer. That being my only experience with embedded programming, I need to stop for a moment and tell myself that this isn't a PIC.

With that said, I think that turning off the UART interrupt while reading the buffer can work as long as the circular buffer-reading code is sufficiently quick (or the baud rate is sufficiently slow) and as long as the interrupt is re-enabled as soon as possible. Having the hardware buffer seems like it helps most in the instances where the OP is having to service nested interrupts (does this architecture support that?) and therefore servicing the UART takes longer than normal. The OP will have to take that into account when he determines what baud rate he can run at. If the OP didn't have a hardware buffer, he would need to run a slower baud rate to account for the worst case of hitting a bunch of nested interrupts.

I think I'm spoiled, though, because I have a whole 40MHz to play with! :awe:

ObscureCaucasian said:
Unless there's a way to delay the ISR execution, I don't see a way to implement these algorithms in an ISR. Most of them rely on preemption and waiting for the user of a resource to finish, but if the resource is in use when the ISR is called then it can wait all it wants but without preemption of the ISR, the original user will never finish.

I'm leaning toward disabling interrupts for the very small amount of time I would need to at this point (UART does have a hardware buffer)... It's not the most elegant solution but it will probably be the best in this application.

Depending on how the architecture works, temporarily disabling the interrupt would have the effect of delaying its execution. In addition to your interrupt enable register is another register than contains interrupt flags. These are set when the corresponding interrupt fires. It's possible that your soft core will always set the interrupt flag to indicate that the interrupt has triggered, even if you have interrupts disabled. So, at the very least you know if the interrupt had fired and can act on it.

However, let's say that you disable your interrupt and then the flag is set. Now you re-enable the interrupt. What happens? It's possible that the CPU will immediately jump to the interrupt routine because that flag is set. In this manner, disabling the UART interrupt will effectively delay it. You should be fine as long as you are fast enough to handle the baud rate at which data is being sent to you.

You'll have to experiment or dig up the soft core docs to figure out how this all works.

PandaBear said:
In your case, what you can do is to have the main code look for a flag that ISR will set when input arrives, and in the main loop (1 thread only, not including ISR) or a lower priority thread (if more than 1 thread, not including ISR) check for this flag at the convenience and retrieve the data from the hardware buffer. Only disable interrupt for a short time to retrieve data from hardware buffer if needed, and quickly enable it back. Check the hardware UART manual to see what is the correct / reference design.

This uart input buffering has been done for decades without threading model or anything more complicated than ISR. I think you are over analyzing the problem here.

This approach could work very well and sounds like it'd be easy to implement since you shouldn't need the circular buffer to do this. Since you are reading right from the UART register/buffer, you can simply use a normal single-ended buffer. Just watch for the character that signifies the end of your command or data and go from there. If the interrupt flags work as I described above, then you wouldn't even need the ISR anymore; just check the interrupt flag and manually clear it. As long as your main loop is fast enough to read data without missing any (depends on the baud rate you are after).

PandaBear's approach also gives you some diagnostic info if you use the ISR he suggested. As PandaBear mentioned, you would have an interrrupt that set a separate flag to indicate that new data is available. Your main loop code will read the new data and clear that flag. When the interrupt fires again, it first checks the state of the flag. If it is clear then you know that you got the data in time. If it is still set then you know that you have missed data (because your main loop code never cleared it). You can then act on it by perhaps setting another flag or doing something to let you know that something was up.

If you do want a faster baud rate or your main loop code is really slow (300 checks/second would mean you must go less than 300 bytes/second or 2400 baud), then your use of the interrupt and circular buffer I think would be your best bet. Your interrupt should only get the data from the UART module and shove it into the buffer. Your main loop code will disable the interrupt, pull some data out (you can do one char at a time), and then re-enable the interrupt. Now, rather than being limited by the speed of your whole main loop, you are instead limited only by the speed of your interrupt plus the time that the interrupt is disabled (plus the time of other interrupts if they can nest) in the worst case. This time should be much lower than the time needed for your whole main loop.

tatteredpotato · Jun 1, 2010

PandaBear said:
I think you are over analyzing the problem here.

You're probably right on that one... I guess I was just making sure I had the right strategy for this system. I realized that there was a possibility for issues to arise from using a shared resource in your main loop and in an ISR and wanted to ensure I was handling it correctly.

Thanks for all the input.

degibson · Jun 1, 2010

Is your serial channel flow controlled? Can it be enabled? If so, you needn't worry about occasional software slowness.

Or you can modify your serial protocol to tolerate occasional loss.

tatteredpotato · Jun 1, 2010

degibson said:
Is your serial channel flow controlled? Can it be enabled? If so, you needn't worry about occasional software slowness.

Or you can modify your serial protocol to tolerate occasional loss.

It's going to be running the Modbus protocol which requires acknowledgment messages. A loss would mean the device would get a bad message (checksum values are included) and wouldn't send a response, or would send an error response. The master SHOULD attempt to repeat the action, however this will eventually be controlled via a PLC and the code on that is out of my hands.

PandaBear · Jun 3, 2010

The J's last recommendation is spot on regarding to how high throughput data handling are typically done with circular buffer to buy time between data flow rate and processing rate. If you run into performance issue this is as simple of a solution as it gets.

Regarding to switching flow control instead of solving this in software. I'd be very careful about this, depends on who the end user of the system is. Many industrial system design with only 3 wires for UART and as a defensive measure, I'd not design something that takes more than 3 wires or need them to switch to a different flow control type to work with. Stick with what you have unless you know for sure that the other end oks with it (with written approval), and even if you can get away with it, asking for flow control switch because you couldn't come up with a better software solution looks bad on your professional reputation.

Thread Safety in Embedded Systems?

Diamond Member

Lifer

Lifer

Golden Member

Diamond Member

Diamond Member

Golden Member

Lifer

Diamond Member

Golden Member

Golden Member

Senior member

Golden Member

Golden Member

Golden Member

Diamond Member

Golden Member

Golden Member

Golden Member

Senior member

Diamond Member

Golden Member

Diamond Member

Golden Member