
In August, a strange security vulnerability dubbed “GhostWrite” was making the rounds in the press. In essence, a vector store instruction on an Alibaba T-Head C910 RISC-V-based processor writes straight to a physical address, without doing a virtual-to-physical translation or checking any kind of access rights. That is just totally weird. How could that be implemented, and how could it slip through testing?
The Vuln
The Ghostwrite website describes the issue like this:
The GhostWrite vulnerability affects the T-Head XuanTie C910 and C920 RISC-V CPUs. This vulnerability allows unprivileged attackers, even those with limited access, to read and write any part of the computer’s memory and to control peripheral devices like network cards. GhostWrite renders the CPU’s security features ineffective and cannot be fixed without disabling around half of the CPU’s functionality.
Short and sweet. Many more details are found in the paper describing how the issue was found (using differential fuzzing, comparing different RISC-V processor implementations to each other). The paper describes the issue like this:
- Address-handling: RISCVuzz finds different bugs around virtual address handling. The vse128.v instruction on the C910 does not translate the provided virtual address to a physical address but instead interprets it directly as a physical address, giving attackers a physical write primitive (cf. Section VI). Additionally, on the C910, reading from physically-backed virtual address ‘0’ locks the CPU, requiring a hard reset.
This is a functionality bug. It is not a subtle issue with speculative execution like so many of the vulnerabilities from the past few years. It is just a broken processor where an instruction does not do what the specification says it should, and the actual behavior is totally bonkers. From what I understand from the paper, the instruction is also off in its writing behavior: it only writes a single byte on each invocation, even if given a rather wide vector.
Say What?
The facts are clear: this is functional behavior that is plain wrong. The question is how it could happen.
There are three possible explanations:
- Incompetence (or just lack of time)
- Intention
- Intention and incompetence
I bet on the first. Somehow, the implementation of the instruction ended up wrong, and the functional testing of the core failed to find the issue.
If this was an intentional backdoor, as some people appear to suspect, it is an incompetent backdoor since it breaks the behavior of an architectural instruction in such a way that correct code would fail.
The key problem is really the whole “skip virtual memory translation” bit: since the instruction ignores virtual memory, any code actually using it as intended would fail. Imagine a simple case of a Linux user-space program allocating some memory for a vector (a code sketch follows the list)…
- The allocated address is a virtual address
- The code performs some vector computations and stores the results using the instruction in question
- The data would then be stored to a physical address, which is basically a random location in physical RAM. Or even devices.
- Other code tries to read back the stored data from the virtual address. This operation does go through the translation, and thus reads from some place where the data most definitely is not located.
- Kaboom.
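As a rough sketch of that sequence in code: the vse128_store helper below is hypothetical, standing in for inline assembly that issues the buggy instruction (here implemented with memcpy to show correct-core behavior). On a correct core the read-back check passes; on the buggy C910, the bytes would instead land at an unrelated physical address and the check would fail.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for inline assembly issuing vse128.v.
 * On a correct core it behaves like a 16-byte store through the
 * MMU; on the buggy C910, the destination address would instead
 * be used as a physical address (and only one byte written). */
static void vse128_store(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, 16);   /* correct-core behavior */
}

int main(void)
{
    unsigned char *buf = malloc(16);        /* step 1: a virtual address */
    memset(buf, 0xAA, 16);

    unsigned char result[16];
    memset(result, 0x55, sizeof result);    /* step 2: computed vector data */

    vse128_store(buf, result);              /* step 3: store via the instruction */

    /* step 4: read back through the same virtual address */
    if (memcmp(buf, result, 16) != 0) {
        fprintf(stderr, "read-back mismatch: the data is gone\n");  /* step 5 */
        return 1;
    }
    puts("read-back OK (a correct core gets here)");
    free(buf);
    return 0;
}
```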
What is Going On?
The only conclusion from the above is that testing, if it happened at all, was very perfunctory.

On the implementation side, it might sound like this vector store instruction has its own path to memory, as sketched in the image above. However, that would be a really strange way of designing the processor memory path.

More likely, the store is performed using some kind of special mode. Since the stores skip the cache, according to the paper, maybe this is an optimized mode for writing vectors that are not expected to be reloaded (what x86 calls non-temporal stores). Maybe that is what is happening, and somehow the code for that mode also accidentally bypassed the MMU. As a programmer, I can see how that could happen, at least in software.
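For comparison, this is roughly what a cache-bypassing store looks like on x86, using the SSE2 streaming-store intrinsic. The crucial contrast with the C910 bug is that a non-temporal store only changes caching behavior; the destination is still a virtual address that gets translated and permission-checked by the MMU.

```c
#include <emmintrin.h>   /* SSE2: _mm_stream_si128, _mm_set1_epi8 */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Streaming stores require a 16-byte-aligned destination */
    __m128i *dst = aligned_alloc(16, sizeof(__m128i));
    __m128i v = _mm_set1_epi8(0x55);

    /* Non-temporal store: bypasses the cache hierarchy, but the
     * address is still translated by the MMU, unlike on the C910. */
    _mm_stream_si128(dst, v);
    _mm_sfence();   /* order the streaming store before the read-back */

    printf("read-back: 0x%02x\n", ((unsigned char *)dst)[0]);
    free(dst);
    return 0;
}
```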
But on the testing side… any kind of functional testing should have caught this, right? Just use the instruction in a piece of user code and the problem should be obvious. Clearly, that did not happen.
Which might not be all that strange, since processor validation code that tests instruction semantics is typically run bare-metal, without an operating system. When simulating RTL, every cycle takes a painful amount of time to run, so test code is not in general based on booting a full operating system. When running bare-metal with either no MMU active or an MMU set up with a 1-to-1 mapping, virtual and physical addresses are the same, and the instruction could thus appear to work. Except for maybe the fact that it apparently does not even write what it should…
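A toy model makes that blind spot concrete. Everything below is invented for illustration: a fake MMU with either an identity mapping or a shifted mapping (standing in for real page tables), plus a store that, like the buggy instruction, skips translation. The write-then-read-back test passes under the identity mapping and fails under the shifted one.

```c
#include <stdio.h>

#define MEM_SIZE 256
static unsigned char phys_mem[MEM_SIZE];

/* Toy MMU: identity mapping, or a shifted mapping standing in
 * for a real page-table setup. */
static unsigned translate(unsigned va, int identity)
{
    return identity ? va : (va + 128) % MEM_SIZE;
}

/* The buggy store: uses the virtual address directly as physical */
static void buggy_store(unsigned va, unsigned char val)
{
    phys_mem[va % MEM_SIZE] = val;
}

/* A correct load: goes through address translation */
static unsigned char correct_load(unsigned va, int identity)
{
    return phys_mem[translate(va, identity)];
}

static void run_test(int identity, const char *name)
{
    buggy_store(42, 0x55);
    unsigned char got = correct_load(42, identity);
    printf("%s mapping: read back 0x%02x -> test %s\n",
           name, got, got == 0x55 ? "PASSES" : "FAILS");
}

int main(void)
{
    run_test(1, "identity");   /* bare-metal validation setup: bug hidden */
    run_test(0, "shifted");    /* OS-style mapping: bug exposed */
    return 0;
}
```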
It just makes no sense. Maybe no testing was done at all, at least not on this particular aspect of the implementation. The other bugs reported in the paper on other T-Head processors do seem to point to a processor design team that is not very good at testing, and that is definitely not doing a lot of negative testing to make sure incorrect instructions are handled correctly. It all feels like a typical rushed job: making things work well enough to get through some set of tests, but not taking the time to check everything in detail or to stop and think about possible additional tests.
I guess there should be a healthy market for RISC-V test suites and test software.
A Case for Microcode!
A surprising point made in the paper was this call for microcode:
We assume that with a microcode layer, as on x86, GhostWrite could be mitigated. An x86 microcode update can hook and patch instructions [5], which could have been used to hook the broken vector instruction and simply raise an illegal-instruction exception. Given the increasing complexity of RISC-V CPUs, we advocate such a microcode layer on RISC-V to have the possibility of mitigating CPU vulnerabilities.
From the perspective of security, having microcode in a processor is a feature, not the performance-sucking complexity-inducing bug it was often considered in the past! I must admit that I never thought of it that way.
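To make the hooking idea a bit more concrete, here is a conceptual sketch in C of what such a hook might do, assuming vector instructions have already been forced to trap (for example, by turning the vector unit off). Everything here is invented for illustration; real microcode works far below this level, and a real hook would match the full encoding of the broken instruction.

```c
#include <stdbool.h>
#include <stdint.h>

#define OPCODE_MASK     0x7fu
#define OPCODE_STORE_FP 0x27u  /* major opcode shared by RISC-V vector
                                  stores (and scalar FP stores) */

/* Hypothetical hook, called with the word of a trapped instruction.
 * Returns true if the instruction should be rejected with an
 * illegal-instruction exception rather than executed, cutting off
 * the physical-write primitive. */
static bool hook_instruction(uint32_t insn)
{
    /* Coarse check for illustration only: a real hook would decode
     * the full encoding of the broken instruction, so that correct
     * vector and scalar FP stores still execute normally. */
    if ((insn & OPCODE_MASK) == OPCODE_STORE_FP) {
        return true;   /* raise illegal-instruction */
    }
    return false;      /* execute as usual */
}
```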
Thanks for another well-written article!
Re “it apparently does not even write what it should”: perhaps this is another bug, related to masking!
The observations re “a healthy market for RISC-V test suites” and “a case for microcode” are very astute!
Hello,
Thank you for your good article.
Could I get some advice from you? Is it possible to ask here?
My question is somewhat outside the topic of your article, though it is on the level of the topics you mentioned.
Thanks