Windows 10 Reboot Loop – CUDA & Alienware

Late last year I was trying to do some machine learning work on my brand new Alienware 15 R4 gaming laptop. I had bought the laptop in order to have something portable with sufficient performance to actually do convolutional neural network (CNN) training and inference “on the road”. The GTX 1060 in the laptop is just as powerful as my home desktop machine, and should run Tensorflow and Keras well. I had the setup working on the desktop already, and copied the code over to the laptop. When trying to run the code the first time, I got some rather strange errors that I finally figured out meant that I was missing the CUDA toolkit. I downloaded CUDA version 10, installed, and the machine rebooted into the Windows 10 automatic repair mode.

Continue reading “Windows 10 Reboot Loop – CUDA & Alienware”

Intel Blog Post: Using Wind River® Simics® to Inspire Teachers and Researchers in Costa Rica

A while ago, I visited my Intel colleagues in Costa Rica and ran a workshop for university teachers and researchers, showing how Simics could be used in academia.  I worked with a very smart and talented intern, Jose Fernando Molina, and after a rather long process I have published an interview with him on my Intel blog: https://software.intel.com/en-us/blogs/2017/12/05/windriver-simics-to-inspire-teachers-costarica

Continue reading “Intel Blog Post: Using Wind River® Simics® to Inspire Teachers and Researchers in Costa Rica”

Neat Register Design to Avoid Races

raceconditionIn his most recent Embedded Bridge Newsletter, Gary Stringham describes a solution to a common read-modify-write race-condition hazard on device registers accessed by multiple software units in parallel. Some of the solutions are really neat!

I have seen the “write 1 clears” solution before in real hardware, but I was not aware of the other two variants. The idea of having a “write mask” in one half of a 32-bit word is really clever.

However, this got me thinking about what the fundamental issue here really is.

Continue reading “Neat Register Design to Avoid Races”

Conquering Software with Software High-Level Synthesis

46daclogoThis post is a follow-up to the DAC panel discussion we had yesterday on how to conquer hardware-dependent software development. Most of the panel turned into a very useful dialogue on virtual platforms and how they are created, not really discussing how to actually use them for easing low-level software development. We did get to software eventually though, and had another good dialogue with the audience. Thanks to the tough DAC participants who held out to the end of the last panel of the last day!

As is often the case, after the panel has ended, I realized several good and important points that I never got around to making… and of those one struck me as worthy of a blog post in its own right.It is the issue of how high-level synthesis can help software design.

Continue reading “Conquering Software with Software High-Level Synthesis”

I Want One… Trillion Instructions…

There is an eternal debate going on in virtual platform land over what the right kind of abstraction is for each job. Depending on background, people favor different levels. For those with a hardware background, more details tend to be the comfort zone, while for those with a software background like myself, we are quite comfortable with less details. I recently did some experiments about the use of quite low levels of hardware modeling details for early architecture exploration and system specification.

Continue reading “I Want One… Trillion Instructions…”

Shaking a Linux Device Driver on a Virtual Platform

To continue from last week’s post about my Linux device driver and hardware teaching setup in Simics, here is a lesson I learnt this week when doing some performance analysis based on various hardware speeds.

Continue reading “Shaking a Linux Device Driver on a Virtual Platform”