I presented a tutorial about the “verification of virtual platform models” at DVCon Europe last week. The tutorial was prepared by me and Ola Dahl at Ericsson, but Ola unfortunately could not attend and present his part – so I had to learn his slides and style and do my best to be an Ola stand-in (tall order, we really missed you there Ola!). The title maybe did not entirely describe the contents – it was more a discussion about how to think about correctness, and in particular about specifications vs implementations. The best part was the animated discussion that we got going in the room, including some new insights from the audience that really added to the presented content.
Updated: Included an important point on software correctness that I forgot in the first publication.
Our goal with the tutorial was not to provide solutions or answers, but to make people reflect and think about how they work. That part worked out very well, I think. We had a ninety-minute slot, but my initial presentation only filled about sixty of those minutes. That is a very scary situation to be in as a speaker – what if nobody says anything to fill out the remaining time? But there was no need to worry. Instead, I had to cut the debate and discussion short thirty minutes later. When two audience members started debating with each other based on the presentation, it had clearly fulfilled its goal.
Short Summary
The core topic of the tutorial was how to think about specifications vs implementations, and how to test implementations. This diagram was key – maybe we can call it the specification quadrant:
The tutorial included a number of examples of how the specification quadrant goes wrong in practice. This one drew the most laughs and nods of recognition from the audience. It seems that hardware people leading the way without talking to the other teams is a common experience:
One important aspect of testing is positive and negative testing. Typically, VPs are used to ensure that things work – you want that operating system to boot and the software stack to run before silicon arrives. However, it is also important to be able to test negative scenarios: for example, that the platform fails to boot if some secure-boot steps are not followed correctly, or refuses to work if the hardware setup is incorrect. You need to design virtual platforms to support both. The slide addressing this was intentionally provocative:
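To go beyond the slide and make the positive/negative distinction concrete, here is a minimal sketch (not from the tutorial) of a pair of boot tests. The SecureBootRom class is just a toy stand-in for the boot logic of a real platform – in practice you would drive the actual virtual platform through whatever control API it offers – but the shape of the two tests is the point: one proves that a good image boots, the other proves that a bad image is refused.

```python
# Minimal sketch: a positive and a negative secure-boot test.
# SecureBootRom is a toy stand-in for a real VP's boot logic.
import unittest


class SecureBootRom:
    """Toy model of a boot ROM that verifies an image signature."""

    TRUSTED_SIGNATURES = {"valid-signature"}

    def boot(self, image_signature: str) -> str:
        if image_signature in self.TRUSTED_SIGNATURES:
            return "booted"
        return "rejected"


class SecureBootTests(unittest.TestCase):
    def test_positive_signed_image_boots(self):
        # The usual VP use case: prove that the stack comes up.
        self.assertEqual(SecureBootRom().boot("valid-signature"), "booted")

    def test_negative_unsigned_image_is_refused(self):
        # Just as important: prove that the platform refuses to boot
        # when the secure-boot checks fail.
        self.assertEqual(SecureBootRom().boot("tampered-signature"), "rejected")


if __name__ == "__main__":
    unittest.main()
```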
Next, we talked about the relationship between unit/module testing and integration testing. Essentially, if you only do integration testing, you have no stable ground to stand on when discussing issues. When each team just tests their work against the deliverables of another team, every issue turns into a bug hunt.
Instead, a key message was that every team has to do some form of unit testing that is entirely under their control and that does not include artifacts borrowed from other teams. This goes for the developers of each module as well as for the developers of the integration. An integration requires integration tests, but also tests of the integration tests. If you do not know that your tests are tested, how do you know they do anything useful?
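As a concrete (and entirely made-up) illustration of what such a self-contained unit test could look like for a model team: the device below and its registers are invented for the example, and the “driver” is a few lines of test code written by the model team directly from the specification, with nothing borrowed from the software team.

```python
# Sketch of a unit test entirely under the model team's control.
# TimerModel and its registers are invented for illustration.
import unittest


class TimerModel:
    """Toy device model: a countdown timer with a control register."""

    def __init__(self):
        self.regs = {"CTRL": 0, "COUNT": 0}

    def write(self, reg: str, value: int) -> None:
        self.regs[reg] = value & 0xFFFFFFFF

    def read(self, reg: str) -> int:
        return self.regs[reg]

    def tick(self) -> None:
        # Count down only when the enable bit (bit 0 of CTRL) is set.
        if self.regs["CTRL"] & 0x1 and self.regs["COUNT"] > 0:
            self.regs["COUNT"] -= 1


class TimerModelUnitTests(unittest.TestCase):
    def test_counts_down_when_enabled(self):
        timer = TimerModel()
        timer.write("COUNT", 3)
        timer.write("CTRL", 0x1)
        for _ in range(3):
            timer.tick()
        self.assertEqual(timer.read("COUNT"), 0)

    def test_holds_value_when_disabled(self):
        timer = TimerModel()
        timer.write("COUNT", 3)
        timer.tick()
        self.assertEqual(timer.read("COUNT"), 3)


if __name__ == "__main__":
    unittest.main()
```

The point is not the code itself, but that nothing in it belongs to another team: if these tests pass, the model team knows it is doing the thing right, regardless of what happens at integration.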
A key concept here is the difference between doing the thing right (DTR) and doing the right thing (DRT). DTR means unit testing your work so that you are sure that what you do is correct from your own perspective. DRT means doing what other teams expect from your team, or maybe what the users expect from the product overall. If all teams involved are doing the thing right and we still have issues on integration, it means we have a DRT bug – that is, a discussion about requirements and the interpretation of specifications. Such discussions are far more productive than just debugging without really knowing what the expected result is.
Key points, in slide form:
After this slide, it was on to discussions…
Applies Everywhere!
One attendee noted that the concepts really apply to many other settings. We talked about a specification feeding hardware, software, and virtual platform implementations, plus testing software on virtual platforms.
But it applies equally to other cases where specifications drive dependent deliverables and you do integration between units from different teams. In all cases, the key is to think about how the process works and how to ensure eventual consistency between the deliverables.
Iteration vs Waterfall
Another line of discussion was about the flow of information and implementation. The slides as presented are kind of waterfall-based, in the sense that we show implementations derived from a specification. But how does that specification come to be, in practice? It is clearly something that is developed over time, but it is also affected by discoveries from the implementation teams.
For example, it is common to use a VP to evaluate hardware design decisions early – to collect feedback from software developers. Really, all the arrows in the specification quadrant could be bidirectional. Ideally, the development process is an agile process where everyone is providing feedback to the common specification and iterating towards the eventual design together.
Formal Methods and Development as Test
Another comment, following on the above, was that formal methods are another way to evaluate and test a specification. I mentioned that building a VP is an excellent way to get another set of eyes on a spec, and an audience member pointed out that building checkers and properties for formal methods has much the same effect.
Being Stricter than Hardware
Looping back to the issue of the VP being more forgiving than hardware, the audience reminded us that a VP can also be used to stress-test software to ensure its robustness – especially in light of multiple different hardware implementations, as well as hardware sometimes being more forgiving than the specification. I had to draw this as an image too (which was not shown at DVCon):
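To sketch what “stricter than hardware” can mean in a model (an invented example, not the image from the slide set): real silicon might silently ignore an access to a reserved offset, while the model can turn it into a loud failure so that sloppy software is caught before it meets a less forgiving hardware implementation.

```python
# Sketch: a device model that is deliberately stricter than silicon.
# The register layout and the strict/forgiving switch are invented.

SPEC_REGISTERS = {   # offsets defined in the specification
    0x00: "CTRL",
    0x04: "STATUS",
    0x08: "DATA",
}


class StrictDevice:
    def __init__(self, strict: bool = True):
        self.strict = strict
        self.storage = {offset: 0 for offset in SPEC_REGISTERS}

    def write(self, offset: int, value: int) -> None:
        if offset not in SPEC_REGISTERS:
            if self.strict:
                # Stricter than hardware: a silent no-op becomes an error.
                raise ValueError(f"write to undefined offset {offset:#04x}")
            return  # forgiving mode mimics silicon that ignores the access
        self.storage[offset] = value

    def read(self, offset: int) -> int:
        if offset not in SPEC_REGISTERS:
            if self.strict:
                raise ValueError(f"read from undefined offset {offset:#04x}")
            return 0
        return self.storage[offset]


# Driver code that "works" on forgiving silicon fails fast against the model.
device = StrictDevice(strict=True)
device.write(0x00, 1)       # CTRL is in the spec: fine
try:
    device.write(0x0C, 1)   # not in the spec: caught immediately
except ValueError as error:
    print("caught:", error)
```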
Specification is not that Easy
Another complaint was that the specification quadrant oversimplifies the specification itself. In practice, don’t we have several different specifications, or at least several views of it? The information that an RTL developer needs is not entirely the same as what a software driver developer expects. It is easy to claim you have a single specification, just with different aspects presented separately. Right? However, that easily leads to this anti-pattern presented early in the tutorial:
Still, how do we handle that? To me, the key is probably to make sure that there is only one expression of each piece of information. For example, most stakeholders want the register map for a device. It should be possible to express that exactly once, and to generate the different views or skeleton implementations from that single underlying specification. As soon as the information is maintained in multiple separate documents that are updated separately, chaos ensues.
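As a toy illustration of the “express it once” idea (not a real tool, and the register names are made up), a single register description can be the source from which both a C header for the driver team and a documentation table are generated:

```python
# Sketch: one register description, several generated views.
# The map, the header format, and the table layout are all illustrative.

REGISTER_MAP = [
    # (name, offset, description)
    ("CTRL",   0x00, "Control register"),
    ("STATUS", 0x04, "Status register"),
    ("DATA",   0x08, "Data in/out register"),
]


def generate_c_header(device: str) -> str:
    """View for the driver team: register offsets as C defines."""
    lines = [f"/* Generated from the {device} register map - do not edit */"]
    for name, offset, _ in REGISTER_MAP:
        lines.append(f"#define {device}_{name}_OFFSET 0x{offset:02X}")
    return "\n".join(lines)


def generate_doc_table() -> str:
    """View for the documentation: a simple register table."""
    lines = ["| Offset | Name   | Description          |",
             "|--------|--------|----------------------|"]
    for name, offset, description in REGISTER_MAP:
        lines.append(f"| 0x{offset:02X}   | {name:<6} | {description:<20} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(generate_c_header("TIMER"))
    print()
    print(generate_doc_table())
```

The exact format of the single source matters less than the fact that there is exactly one place to update.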
Finding Solid Ground
In the end, I think the key message of the tutorial is the absolute need to know what the truth is when developing interdependent artifacts. When the software does not work on the hardware, you need to have something to fall back on to tell you where the error is. To me, that is what we call “specification” in the presentation. But you might call it something else.
The key is that we need some piece of stable, solid ground to stand on – something that everyone can agree represents “the truth”. Otherwise, we get stuck in eternal debug and discussion loops. Having that is a key to disciplined and successful execution.
As Archimedes (I think) said: Give me a lever, and a place to stand, and I will move the earth.