Most of the time when we talk about the impact of multicore processing on software, we complain that it makes the software more complicated, since it now has to cope with the additional complexities of parallelism. There are some cases, however, where moving to multicore hardware allows a software structure to be simplified. Integrated Modular Avionics (IMA) and the honestly idiotic design of the ARINC 653 standard is one such case.
The idea behind IMA is to make it easier to write safety-critical software by allowing a single processor to run code certified at different safety levels. The underlying problem is that current regulations require every set of programs that can run together on a shared computer system to be verified to the level of the most critical program in the set. This makes excellent sense: you do not want a failure in the non-critical entertainment program to mess up the very important flight control software, right?
But since it is rarely the case that you can partition your system in such a way that the programs at the same level of integrity nicely fill up a single available processor, you risk either leaving processors underutilized or having to qualify software to a higher level than its function requires. Enter the IMA vision and the ARINC 653 standard. The idea here is to have a powerful and correct operating system temporally divide a processor into multiple logical partitions, where each partition is isolated from the others. Inside each partition, you then have programs of a single level of criticality, but each partition can have a different level. Very similar to classic virtualization in this respect.
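To make the partitioning idea concrete, here is a minimal sketch in C of what a static, ARINC 653-style major-frame schedule might look like. All type names, partition names, and durations are invented for illustration; this is not any real ARINC 653 API.

```c
#include <stdint.h>

/* Hypothetical criticality levels, loosely modeled on the usual A-E scale. */
typedef enum { LEVEL_A, LEVEL_B, LEVEL_C, LEVEL_D, LEVEL_E } criticality_t;

/* One time window in the major frame: which partition runs, and for how long. */
typedef struct {
    const char   *partition;    /* partition name                        */
    criticality_t level;        /* criticality of the software inside it */
    uint32_t      duration_us;  /* window length in microseconds         */
} sched_window_t;

/* A fixed, cyclic major frame that the partitioning OS walks forever.
 * All names and numbers are made up for illustration. */
static const sched_window_t major_frame[] = {
    { "flight_control", LEVEL_A, 10000 },
    { "navigation",     LEVEL_B,  5000 },
    { "maintenance",    LEVEL_D,  3000 },
    { "entertainment",  LEVEL_E,  2000 },
};

/* Placeholder hooks standing in for the partitioning kernel. */
extern void switch_to_partition(const char *name);
extern void run_partition_for(uint32_t duration_us);

void cyclic_scheduler(void)
{
    for (;;) {
        for (unsigned i = 0; i < sizeof major_frame / sizeof major_frame[0]; i++) {
            switch_to_partition(major_frame[i].partition);  /* full state clean here */
            run_partition_for(major_frame[i].duration_us);
        }
    }
}
```

With four windows in a 20 ms frame like this invented one, a partition switch, and hence a full state clean, would happen every few milliseconds.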
However, the problem with this comes in implementation and efficiency. Basically, to ensure that partitions cannot starve each other out of CPU resources, or peek at or change each other's state, you have to clean out the entire CPU state each time you change partition. This means flushing caches and TLBs and making sure all outstanding memory operations have completed. On a single processor, this is obviously at odds with any kind of performance engineering. And it happens several times per second.
Now that airborne systems are starting to be built using multicore processors, the solution chosen by Green Hills and other providers is to divide the multicore complex temporally as well. So you need to do a synchronized total state clean across all cores when changing partition, and inside each partition you still have to write parallel software to use the cores. You get the worst of both worlds: huge inefficiencies and overheads, along with the complexities of writing parallel software.
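To see where the overheads come from, here is a rough sketch of what each core might have to do at a partition boundary in such a design. The function names are invented placeholders, not any real RTOS or ARINC 653 API; on real hardware they would map to cache and MMU maintenance operations plus an inter-core rendezvous.

```c
#include <stdint.h>

/* Hypothetical per-core hooks; on a real chip these correspond to cache/MMU
 * maintenance instructions and a software barrier across the cores. */
extern void memory_barrier(void);              /* wait for outstanding memory traffic   */
extern void flush_data_cache(void);            /* write back and invalidate the D-cache */
extern void invalidate_instruction_cache(void);
extern void invalidate_tlb(void);              /* drop all address translations         */
extern void barrier_wait_all_cores(void);      /* rendezvous: every core must arrive    */
extern void load_partition_context(uint32_t partition_id);

/* Executed on every core at every partition boundary. Nothing useful runs
 * while this is going on, and the next partition starts with cold caches
 * and an empty TLB on every core. */
void partition_switch(uint32_t next_partition)
{
    memory_barrier();
    flush_data_cache();
    invalidate_instruction_cache();
    invalidate_tlb();
    barrier_wait_all_cores();
    load_partition_context(next_partition);
}
```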
One word: ridiculous.
The obvious solution here is instead to take advantage of the fact that multicore is making processor cores cheap and plentiful. The fundamental assumption behind IMA, that processors are few and expensive resources, is not really true any more. Instead, it makes perfect sense to engineer a system so that each safety level gets its own physical CPU to run on.
This brings several benefits:
- Slightly increased safety, since you have physical separation that is not dependent on a software OS kernel.
- Better resource utilization, since wasteful cache flushes and TLB flushes are avoided.
- Simpler software engineering, since you do not have to write parallel code.
- Lower clock frequencies and lower power consumption, since each processor core can be slower.
- Simpler software structure, without the extra baggage of a supervisor OS scheduling partitions.
- If you need timing assurances for some software, you can put that software on a specially simplified processor core without caches or other complex performance-enhancing features; the ARM966 core is a good example. In the same spirit, compute-intensive algorithms can be assigned to DSPs or specialized math processors.
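As a contrast to the schedule table above, here is a minimal sketch of how a static assignment of criticality levels to physical cores might be described. Again, all names, addresses, and sizes are invented for illustration: each core gets one criticality level, a private memory window, and its own set of devices, configured once at boot and never changed afterwards.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { LEVEL_A, LEVEL_B, LEVEL_C, LEVEL_D, LEVEL_E } criticality_t;

/* Boot-time description of one core's world: its criticality level,
 * a private memory region, and the peripherals only it may touch. */
typedef struct {
    criticality_t level;
    uintptr_t     mem_base;     /* start of this core's private RAM  */
    size_t        mem_size;
    uint32_t      device_mask;  /* bitmask of peripherals owned here */
} core_assignment_t;

/* Example for a hypothetical four-core part; all values are invented. */
static const core_assignment_t core_map[4] = {
    { LEVEL_A, 0x40000000u, 0x00100000u, 0x0003u },  /* core 0: flight control */
    { LEVEL_B, 0x40100000u, 0x00100000u, 0x000Cu },  /* core 1: navigation     */
    { LEVEL_D, 0x40200000u, 0x00200000u, 0x0030u },  /* core 2: maintenance    */
    { LEVEL_E, 0x40400000u, 0x00400000u, 0x00C0u },  /* core 3: entertainment  */
};

/* Placeholder hooks: program the MPU/bus fabric once, then start the
 * application for this core. There are no partition switches after this. */
extern void configure_protection(const core_assignment_t *a);
extern void start_application(uint32_t core_id);

void per_core_boot(uint32_t core_id)
{
    configure_protection(&core_map[core_id]);
    start_application(core_id);
}
```

Once set up, there is no supervisor scheduling partitions and no recurring cache or TLB flushing; the hardware protection does the separation.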
Some detail notes:
- Even if some program does require the full power of a multicore device to run, this solution still makes sense: in every case where you use IMA-style time partitioning, you are wasting performance on switching between partitions.
- You do need hardware where you can assign particular resources to each core on a multicore device; there are usually shared resources on a chip that have to be partitioned or managed in a controlled way.
- The best off-the-shelf hardware to buy would seem to be a classic AMP design like a TI OMAP or some automotive devices. Shared-memory-default designs like Intel and AMD processors seem utterly unsuitable.
- Hardware hypervisors on a multicore chip would be very helpful for implementing control over shared hardware resources in cases where multiple cores can indeed access the same hardware.
- A simple existing example of this kind of design is the Freescale 5514 and 5516 automotive processors, which have two separate CPU cores. One core is intended to handle interrupts and other low-latency hardware interfaces, and the other does the heavy computations (a minimal sketch of this style of split follows below). The result is greater performance on both counts, with simpler software and cheaper hardware.
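Here is a minimal sketch of how such a split between an interrupt-handling core and a compute core might look in software, with the two communicating through a small mailbox in shared on-chip RAM. The names and the mailbox itself are invented for illustration, not the actual Freescale API; real code would also add memory barriers appropriate to the chip's memory ordering.

```c
#include <stdbool.h>
#include <stdint.h>

/* A tiny single-producer/single-consumer ring buffer in shared on-chip RAM.
 * The I/O core is the only writer and the compute core the only reader,
 * so no lock is needed (a real implementation would add memory barriers). */
#define MBOX_SLOTS 16u

typedef struct {
    volatile uint32_t head;            /* written only by the I/O core     */
    volatile uint32_t tail;            /* written only by the compute core */
    uint32_t          samples[MBOX_SLOTS];
} mailbox_t;

static mailbox_t mbox;                 /* placed in RAM both cores can see */

/* Runs on the I/O core, straight from the peripheral interrupt handler. */
bool io_core_push(uint32_t sample)
{
    uint32_t next = (mbox.head + 1u) % MBOX_SLOTS;
    if (next == mbox.tail)
        return false;                  /* mailbox full: drop or count the loss */
    mbox.samples[mbox.head] = sample;
    mbox.head = next;
    return true;
}

/* Runs on the compute core, polled from its main processing loop. */
bool compute_core_pop(uint32_t *sample)
{
    if (mbox.tail == mbox.head)
        return false;                  /* nothing queued */
    *sample = mbox.samples[mbox.tail];
    mbox.tail = (mbox.tail + 1u) % MBOX_SLOTS;
    return true;
}
```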
I filed this under “futile rants”, since I do not believe we will move away from the IMA mindset for a long time to come. But you could always hope. I really think aggressive use of appropriately sized processor cores is a brilliant way ahead for real-time software system design.
See http://jakob.engbloms.se/archives/83 for a denial-of-service attack that has some implications for this idea: you need to ensure memory access fairness between cores to be really safe. I did not think of that when I wrote this post, but I still think it can be solved with an appropriate chip system architecture.