Intel Blog: Parallelizing a Virtual Platform Model

There are many ways to use threading and parallelization to improve the performance of virtual platforms. It is not always easy to successfully use parallelization – it very much depends on the nature of the workloads and model setup – but when it works it can really help. I recently published a long blog post at Intel, detailing an idealized example of threading for a device model that is shipping in the Simics training package.

Continue reading “Intel Blog: Parallelizing a Virtual Platform Model”

The first Computer and System Architecture Unraveled Event in Kista – Great Speakers, Great Fun!

On the evening of the last Wednesday in September, we had our first CaSA, Computer and System Architecture Unraveled, event. CaSA is a meetup in Kista (Sweden) for people interested in computer architecture, system architecture, and how software and hardware interact down towards the lower levels of the stack. The topic for the inaugural event was “Core Count Explosion: A Challenge for Hardware and Software”, and it was great in some many ways!

Continue reading “The first Computer and System Architecture Unraveled Event in Kista – Great Speakers, Great Fun!”

Intel Blog: Demonstrating Simics Threading using RISC-V Simple

In my third post based on the Simics RISC-V simple virtual platform, I use the it to demonstrate how the Intel Simics simulator uses multiple host threads to simulate multiple target cores. The RISC-V platform is nice in that it has less noise than more complex platforms, allowing for clear and simple measurements.

What’s in a Kilowatt Hour?

The current price spikes for electricity in Europe has driven a new interest in saving energy, and part of doing that is to understand just how much energy different things use. I realized while I knew that modern LED lights are magically efficient, just how much electricity is used by other machines? No idea! So, I set out to find some examples the utility you get from a one kilowatt hour of electricity.

Updated in November 2022 with additional data.

Continue reading “What’s in a Kilowatt Hour?”

Intel Blog: Catching a Tricky Bug by Running Simics on Simics

I recently published a long post on the Intel Community Blog, talking about how my colleague Evgeny solved a nicely complicated bug using Simics-on-Simics. The bug involved UEFI, an operating system, SMM, SMI, and virtualization. Just another day in the office (or more like a year, given how long it took to get this one resolved).

SystemC Evolution Fika: Parallel SystemC

The SystemC Evolution Fika on April 7 had threading/parallelism as its theme. There were four speakers who presented various angles on how to parallelize SystemC models. The presentations and following discussion provided a variety of perspectives on threading as it can be applied in virtual platforms and other computer architecture simulations. It was pretty clear that the presenters and audience had quite different ideas about just what the target domain looks like and the best way to introduce parallelism to SystemC. Here is my take on what was said.

Continue reading “SystemC Evolution Fika: Parallel SystemC”

Some Notes on Temporal Decoupling (Reposted)

This blog post was originally posted at Intel back in 2018, but it has since been retired from the Intel blog system. As it is of general interest (in my opinion), here is a reposting (with a few small updates here and there).

Temporal decoupling is a key technology in virtual platforms, and can speed up the execution of a system by several orders of magnitude. In my own experiments, I have seen it provide a speedup of more than 1000x. Here, I will dig a little deeper into temporal decoupling and its semantic effects.

Continue reading “Some Notes on Temporal Decoupling (Reposted)”

CACM on DSAs

The July 2020 edition of the Communications of the ACM (CACM) had a front-page theme of “Domains-Specific Hardware Accelerators”, or DSAs. It contained two articles about the subject, one about an academic genomics accelerator, and one about the Google TPU. Hardware accelerators dedicated to particular types of computation are basically everywhere today, and an accepted part of the evolution of computers. The CACM articles have some good tidbits and points about how accelerators are designed and used today. At the same time, I also found a youtube talk about the first hardware accelerator, the IBM Stretch HARVEST, showing both contrasts with today as well as a remarkable continuity in concept.

Continue reading “CACM on DSAs”

Intel Blog: A Mountain and Threading for Simics 6

A new short blog post on my Intel Developer Zone blog talks about the improved threading simulation core we have added in Simics version 6… and about how a colleague of mine climbed to the top of the highest mountain in Europe and showed a flag with our new Simics icon! Read the story at https://software.intel.com/en-us/blogs/2019/09/10/simics-6-at-the-mountain-top.

Microsoft REPT: You CAN Reverse from a Core Dump!

There are some things in computing that seem “obviously” true and that “clearly” make it “impossible” to do some things.  One example of this is the idea that you cannot go backwards in time from the current state of a program or computer system and recover previous state by just reversing the semantics of the instructions in the program.  In particular, that you cannot take a core dump from a failed system and reverse-execute back from it – how could you?  In order to do reverse debugging and reverse execution, you “have to” record the state at the first point in time that you want to be able to go back to, and then record all changes to the state. Turns out I was wrong, as shown by a recent Usenix OSDI paper.

Continue reading “Microsoft REPT: You CAN Reverse from a Core Dump!”

Talking about Temporal Decoupling at DVCon Europe

This year’s Design and Verification Conference and Exhibition (DVCon Europe) takes place on October 24 and 25 (2018).  DVCon Europe has turned into the  best conference for virtual platform topics, and this year is no exception. There are some good talks coming!

Continue reading “Talking about Temporal Decoupling at DVCon Europe”

gem5 Full Speed Ahead (FSA)

I had many interesting conversations at the HiPEAC 2017 conference in Stockholm back in January 2017. One topic that came up several times was the GEM5 research simulator, and some cool tricks implemented in it in order to speed up the execution of computer architecture experiments. Later, I located some research papers explaining the “full speed ahead” technology in more detail. The mix of fast simulation using virtualization and clever tricks with cache warming is worth a blog post.

Continue reading “gem5 Full Speed Ahead (FSA)”

SiCS Multicore Day 2016 – In Review

sics-logo The SiCS Multicore Day took place last week, for the tenth year in a row!  It is still a very good event to learn about multicore and computer architecture, and meet with a broad selection of industry and academic people interested in multicore in various ways.  While multicore is not bright shiny new thing it once was, it is still an exciting area of research – even if much of the innovation is moving away from the traditional field of making a bunch of processor cores work together, towards system-level optimizations.  For the past few years, SiCS has had to good taste to publish all the lectures online, so you can go to their Youtube playlist and see all the talks for free, right now!

Continue reading “SiCS Multicore Day 2016 – In Review”

Clocks or Cores? Choose One

Once upon a time, when multicore processors were novelties, multicore was motivated by the simple fact that it was impossible to keep raising the clock frequency of processors. More “clocks” simply would result in an overheated mess. Instead, by adding more cores, much more performance could be obtained without having to go to extreme frequencies and power budgets. The first multicore processors pretty much kept clock frequencies of the single-core processors preceding them, and that has remained the mainstream fact until today. Desktop and laptop processors tend to stay at 4 cores or less. But when you go beyond 4 cores, clock frequencies tend to start to go down in order to keep power consumption per package under control. A nice example of this can be found in Intel’s Xeon lineup.
Continue reading “Clocks or Cores? Choose One”

Intel Blog: Finding a Linux Kernel bug by running Simics on Simics

intel sw smallI love bug and debug stories in general. Bugs are a fun and interesting part of software engineering, programming, and systems development. Stories that involve running Simics on Simics to find bugs are a particular category that is fascinating, as it shows how to apply serious software technology to solve problems related to said serious software technology.  On the Intel Software and Services blog, I just posted a story about just that: debugging a Linux kernel bug provoked by Simics, by running Simics on a small network of machines inside of Simics. See https://blogs.intel.com/evangelists/2016/05/30/finding-kernel-1-2-3-bug-running-wind-river-simics-simics/ for the full story.

Continue reading “Intel Blog: Finding a Linux Kernel bug by running Simics on Simics”