In his most recent Embedded Bridge Newsletter, Gary Stringham describes solutions to a common read-modify-write race-condition hazard on device registers accessed in parallel by multiple software units. Some of the solutions are really neat!
I have seen the “write 1 clears” solution before in real hardware, but I was not aware of the other two variants. The idea of having a “write mask” in one half of a 32-bit word is really clever.
However, this got me thinking about what the fundamental issue here really is.
As I see it, it is the fact that the processor cannot atomically address small enough units. The read-modify-write that started the discussion in Embedded Bridge #37 was needed in order to get the current state of a configuration register, change a setting that occupies only a few bits of it, and write the result back to the register. That is how most configuration registers I have seen in practice work.
But if each setting could be given its own register, the problem would go away. Each operation would target a unique address, achieving the same effect as the bit-wise masks or write-1 solutions proposed. The core problem is that hardware tends to pack multiple settings into a single register, since it has been considered too expensive to spend a full 32-bit register on information that might cover a range as small as [0,1]. Probably because register addresses are scarce: you cannot let 1000 settings cause each simple device to use up 1000 words of physical address space.
But is that really an issue, if we look forward?
It seems to me that, as 64-bit instruction sets and addressing systems penetrate down into more and more embedded systems, a simple solution would be to throw address space at the problem. I don’t think it is uneconomical to allocate huge chunks of memory space to each device, giving each setting its own register, when you have 64-bit virtual addresses to work with. There is no way you can fill up a physical memory system (I guess that will come back to haunt me some day)… even the highest-end machines today use only something like 40 bits to actually address physical memory.
The software would be simpler and more robust, with virtually no cost.
Another solution that I have also seen starting to appear is to dispense with register settings altogether, and instead define a command API that the processor “calls” by placing command packets into some memory area. This does require quite a bit of silicon for a decoder, but it provides for a much higher level of interaction with devices. As hardware devices get defined in successively higher-level languages (C, C++, UML, MATLAB, …), and their programming interfaces and associated drivers get autogenerated, this solution makes eminent sense.
Hm. Interesting… assuming Moore’s law is still a good predictor (say, one extra address bit roughly every 18 months), that would mean we’ll start hitting 64 bits somewhere around 2046. Let’s see: 40 bits today makes 60 years, which brings us back to 1950. Yep, almost spookily accurate.