The 59th Design Automation Conference (DAC) took place in San Francisco, July 10-14, 2022. As always, the DAC provided a great place to learn about what is going on in EDA. The DAC is really three events in one: there is an industry trade-show/exhibition, a research conference that is considered the premier one in EDA, and an engineering track where practitioners present their work in a less formal setting.
I had two talks in the engineering track – one on the Intel device modeling language (which actually won the best presentation award in the embedded sub-track), and one on using simulation technology to build hardware software-first.
The DAC was almost overwhelming in the richness of people and companies, but this blog tries to summarize the most prominent observations.
The Conference Overall
The event was held across all three floors of the Moscone West conference center, with exhibitions on floors 1 and 2 and research presentations on floor 3. This kind of worked, but was a bit suboptimal for traffic compared to having it all on one floor, I would say.
Everyone was happy to get back to in-person events. Even so, attendance was not what it used to be. Some of it could be because of last-minute cancellations due to people getting Covid (heard of several cases of that), and some might still be from travel restrictions between some countries (in particular China).
The exhibition is definitely shrinking year by year, some of it as a result of the ongoing consolidation in the EDA industry. The conference side is not shrinking though. The engineering track has expanded over the years, and the research track had record submissions. The conference had poster sessions every day which were quite well attended. The receptions in the evening worked really well for networking and I made quite a few new connections (which is the kind of thing that never happens in a virtual conference).
The big three players all had big booths, and there was a separate Cadence booth at the “Design on Cloud” section of the conference. Siemens also had two – one main, and then the UltraSoC team had their own little booth (which was a bit odd as they were acquired in 2020).
AI/Machine Learning – both as hardware to design, and as a technology driving design tools
- 9% of all research papers involved AI.
- The huge processing demands of AI/ML drives the need to design better accelerators.
- AI/ML is starting to be used in tool flows to optimize the execution of tools and to find more optimal solutions, smarter test cases, etc.
- All the keynote speakers had something to say about AI, and the third keynote was very specifically about ML.
Cloud – as a way to run EDA tools
- Google Cloud and Microsoft Azure were both exhibiting at the conference, showcasing their EDA-optimized cloud setups. There are special hardware setups being offered that are optimized for the characteristics of EDA workloads (see below).
- There was a “Design on Cloud” special corner on floor 1.
- Synopsys and Cadence are marketing their cloud-based offerings quite heavily.
- I took a look at just how some of these flows work. It is obviously comparatively easy to just offload a server job on somebody else’s computer (i.e., Cloud). For interactive usage, at least Cadence OnCloud uses a browser to display and interact with the classic desktop interface of their tools. Simple, easy, consistent, and sensible in my opinion.
Chiplets and 3D integration
- Chiplets are getting huge!
- Moore’s law is slowing down – chiplets offer a way to put more things “on a chip”, without relying on better processes to give us more transistors to use.
- Chiplets mean that packaging becomes a critical part of EDA tool flows, leading to a new wave of tools and tool sales. For EDA, chiplets are a current business driver. They drive demand for rather new types of tools.
- Chiplets can be considered as reuse at the silicon level, not just at the design level as with IP. You could imagine pre-fabricating a set of chiplets, keeping them on a shelf, and shipping them to customers for instant use. This could actually be easier and simpler than classic IP integration, if we can make the interfaces work well.
System companies doing their own silicon
- There is a definite trend towards big users of silicon building their own silicon and not just buying standard solutions from the traditional chip companies.
- Important effect on EDA tooling market and foundry market – it keeps growing even if there are swings in the overall semiconductor market.
Open-source and RISC-V – these get conflated, even though they are really quite different.
- There was a section of the show floor dedicated to RISC-V and OpenHW (open-source hardware IP blocks). They had an open-source theatre, where various companies presented about either open tools or things related to RISC-V (even if they were closed-source).
- Open EDA tools – there were sessions dedicated to whether we can do useful EDA with open-source tools.
- It is clear that at least users of EDA tools are looking at the open-source software phenomenon and wondering if the same could not happen in EDA. However, there is nothing really like the extensive use of open-source software that is seen in many other fields. Core EDA tools do remain proprietary.
- RISC-V – there are some open-source cores, but mostly cores are private and/or commercial. The distinction between an open instruction-set standard and open-source software is often lost it seems.
- If you just need a base instruction set, everyone is converging on RISC-V. The standard is free and open, and there is no point in rolling your own complete instruction set anymore.
Speaking of RISC-V… this was a very common topic in papers and posters, especially from the smaller players. In a way, ARM was almost palpably absent from the DAC – I guess they just prefer their own shows.
RISC-V is attracting a lot of attention, even though from an EDA perspective it is really just another instruction set for a core, and that does not affect anything about the RTL in the backend… It is also clear that not everyone understands the difference between an open standard and an open-source implementation of something.
There seem to be two main takes on how to use RISC-V:
- As a standard instruction set architecture to rival ARM (and other ISAs) for cores sold as IPs. SiFive and Andes are both doing that, building cores with RISC-V as the ISA, and selling them. In the past they would have had to invent their own ISA (Andes started with their own ISA, but today they claim all their sales are on RISC-V).
- As a customizable instruction set for custom cores. You start with some RISC-V instruction set variant, and then add your own application-specific instructions in a custom core. Doing this with ARM would require an architectural license and a lot more work. With RISC-V, there is no license cost associated with the ISA at least, and you might be able to start from an open-source core. Or do one from scratch.
The customizable aspect is clearly more interesting from a tooling perspective. Customized processors mean more processor designers, and therefore more customers. At the DAC, we had several examples of this:
- Codasip – provides tools to build custom processors. They have switched to using RISC-V as the starting-point ISA, because it just makes everything simpler. Less custom tooling to create, a bigger ecosystem of at least basic software tools.
- Imperas – provides instruction-set simulators that can be customized, and some tools to explore the impact on software of using custom instructions.
- Meta – a panel presentation from Meta’s VR/AR division seemed to indicate that they have designed multiple custom little controller cores to optimize power/performance.
What is missing today in the market is something like the polished tooling for Cadence Xtensa processors – even though most RISC-V people I talked to felt that its model was actually mostly selling pre-configured cores. The customization aspect is not all that important. To me this seems like an obvious play, but maybe something like Codasip is the solution. Even though Codasip did not think they competed with Xtensas, as the Codasip tools allow much more customization.
I listened to a panel presentation by one of the SiFive founders, and he was very gung-ho about the technical benefits of the ISA itself. Claiming that its modern design makes for more efficient processors than alternatives like ARM and x86. I am not convinced that this matters much for larger advanced cores, but he was pretty adamant.
The open-source angle on RISC-V is confusing to many people. There is an ecosystem of simple open-source cores around, and a certain open-source/stuff-should-be-free attitude around the ISA. Long-term there is no reason why RISC-V will not have the same kind of commercial tool support as ARM. And most users will likely buy a ready-to-use commercial core in a way that is no different from IP licensing in the past. Currently, RISC-V is clearly not supported by the same extremely well-developed ecosystem that surrounds the cores from ARM.
Keynote: Mark Papermaster, AMD
Brought in the Ansys and Synopsys CEOs, as recordings!
Chiplet: Moore’s law slowing down. Solution is to go for chiplet designs. Which drives the need for more automation of design as there are more moving parts when a design is split across chips and spread out in both 2D and 3D.
Pre-silicon: Claims 225x increase in use of emulation gates at AMD from 2013 to 2023. Doing firmware presilicon on emulator and FPGA. Did not mention virtual platforms, but that does not mean they are not using them anyway.
Cloud: AMD is using Google and Microsoft Cloud to run EDA software! Hybrid model.
Specialized compute engines: “Hybrid execution engines” provide 10x or more perf per watt compared to GPU(?). They want to get to 30x improvement in HPC and ML workloads by 2025. Measured as energy for training. AMD seems to be adopting Xilinx Vitis AI as their front-end tool for AI, including for running on CPU. Their oneAPI?
Small computer architecture note: 3D v-cache provides performance improvements for EDA software like Synopsys VCS. Claim 66% improvements in throughput.
His presentation contained a good illustration of different types of accelerators… but one that seems to have been used elsewhere before:
Keynote: Anirudh Devgan, Cadence
Semiconductor industry is not cyclical, just a steady growth. 2021: 553BUSD, doubling to 1000BUSD in 2028. I believe more in the Needham analyst – long-term trend is clear, but supply/demand for manufacturing will not always balance.
Exploding design cost – but if you amortize it over the volume, maybe it is getting cheaper since volumes keep increasing.
EDA development and improvements: it is there! It might look like EDA is “doing the same”, but we can produce much bigger designs with smaller teams today compared to 10-20 years ago.
Moore’s law is slowing down. Growth in performance comes from putting more things on a chip. In 2022, we still have another four generations to go, and then we can use 3D IC to keep scaling performance for yet another decade. Specialized compute accelerators needed to handle all the data being produced.
System companies doing more and more silicon.
About the metaverse trend: wants to include the use of digital twins in the metaverse!
Software has to become more parallel – even for coupled problems, not just the easy parallel problems. This is a huge area of investment for the EDA industry.
AI – can provide a 100x improvement in productivity. For him, AI is about pattern-based algorithms. Requires data. Still needs the basics in place, and do AI on top of that. AI could be about doing many runs on a design and getting value out of the variants – which is what a designer does manually. Based on proper physics-based algorithms.
- Gradient-based algorithms do not work on optimizations like place-and-route and synthesis
- AI can maybe be used to do gradient-free optimization that can work with problems like those. Can run faster and produce better results than a human can produce. With reinforcement learning, you can do a cold start without a lot of initial training.
- AI can be used to drive smarter verification. Look across many runs to see patterns.
Keynote: Steve Teig, Perceive
This talk was all about machine learning, and how we are doing it wrong as an industry. Much more technical and focused compared to the previous two. Steve is the CEO of Perceive, a company building better machine-learning solutions.
Trend in deep learning (DL): more and more parameters, running on bigger and bigger piles of GPUs (single GPU to a rack to a building to a campus full of GPUs). Where did efficiency go? DL is “anti efficiency”, bragging about just how big and expensive their models are. Training a giant model costs 8MUSD or more, and produces more carbon than your car will for its entire life.
Bloated giant models are also not trustworthy. We don’t really know what is going on. He showed some examples of where adding noise to an image totally changes the results from trained networks.
Message: we have to make machine learning into a science. Rigorous machine learning is compression, finding structure in data. With that, he went through three points about machine learning…
Myth: average accuracy is what you should optimize
- Can be useless even if average accuracy is very high:
- Serious error vs non-serious error
- It is almost never what customers want – “works mostly” is quite useless
- You should look for whether the models make mistakes.
- Penalize errors based on severity, not based on how often they happen.
- Not all data points are equal!
- Training set should be weighted based on the amount of unique information. Balancing rarely works.
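To make the point about severity concrete, here is a minimal sketch of my own (not something shown in the talk). It compares two imagined pedestrian detectors; the labels, the error counts, and the severity weights are all made-up assumptions. The model with the higher average accuracy turns out to be the worse one once errors are penalized by severity.

```python
# Assumed severity of each kind of mistake, keyed by (truth, prediction).
# Correct predictions are not listed and cost nothing. Values are illustrative.
SEVERITY = {
    ("pedestrian", "clear"): 100.0,  # missed pedestrian: serious error
    ("clear", "pedestrian"): 1.0,    # false alarm: non-serious error
}

def average_accuracy(pairs):
    """Fraction of samples where the prediction matches the truth."""
    correct = sum(1 for truth, predicted in pairs if truth == predicted)
    return correct / len(pairs)

def severity_weighted_cost(pairs):
    """Mean cost per sample, where each error is weighted by its severity."""
    return sum(SEVERITY.get(pair, 0.0) for pair in pairs) / len(pairs)

# Hypothetical evaluation logs of 100 samples each.
model_a = ([("clear", "clear")] * 97
           + [("pedestrian", "clear")] * 2
           + [("clear", "pedestrian")] * 1)
model_b = ([("clear", "clear")] * 95
           + [("clear", "pedestrian")] * 5)

accuracy_a, cost_a = average_accuracy(model_a), severity_weighted_cost(model_a)
accuracy_b, cost_b = average_accuracy(model_b), severity_weighted_cost(model_b)

print(f"model A: accuracy {accuracy_a:.2f}, weighted cost {cost_a:.2f}")
print(f"model B: accuracy {accuracy_b:.2f}, weighted cost {cost_b:.2f}")
```

Model A wins on average accuracy, but model B never makes the serious mistake, which is presumably what a customer would actually care about.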
Misunderstanding: networks are richly expressive
- Only functions you can build are combinational – functions that need memory cannot be represented.
- CNNs, TCNs, Attention, Transformers, … None of them have any state or memory
- Neural networks are not actually Turing-complete or universal approximators. Only under certain circumstances that do not happen in reality.
- Recurrent networks are just finite state machines with some registers in them. Equivalent to a regexp in terms of the languages they can recognize – they are FSMs that can count.
- Today’s neural networks cannot add memory, and thus they cannot even really parse C code perfectly. Cannot recognize a palindrome or repetition.
- So no, they are not conscious.
- Heavy lifting is done by activation functions – and if we only use ReLU it is no surprise that we need so many of them to do anything.
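The memory argument above is the classic automata-theory one, and a small sketch may help. This is my own illustration, not an example from the talk: it matches nested braces (a tiny piece of what parsing C code requires) once with an unbounded counter and once as a finite-state machine with a fixed number of states. The function names and the `max_depth` parameter are hypothetical.

```python
def balanced_unbounded(text):
    """Brace matcher whose counter can grow without limit."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False  # closing brace without a matching opener
    return depth == 0

def balanced_finite_state(text, max_depth):
    """Same matcher restricted to the states 0..max_depth plus 'reject'."""
    state = 0
    for ch in text:
        if ch == "{":
            if state == max_depth:
                return False  # out of states: deeper nesting cannot be tracked
            state += 1
        elif ch == "}":
            if state == 0:
                return False
            state -= 1
    return state == 0

shallow = "{{}}"
deep = "{{{{}}}}"

print(balanced_unbounded(deep))        # correctly balanced input
print(balanced_finite_state(shallow, 3))
print(balanced_finite_state(deep, 3))  # same kind of input, nested one level too deep
```

Whatever fixed `max_depth` you pick, there is always a correctly nested input that is one level deeper, so the finite-state version can count but only up to a bound.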
Mistake: compression hurts accuracy
- Learning, inference of structure from data, is a compression operation
- “The simplest model is the best” (but what is model, simplest, best)?
- More structure -> better compression, fewer arbitrary choices, and with less arbitrariness, higher-quality results.
- Naive models often have far too many parameters, and that makes them brittle as they are too specific, too many arbitrary choices. They have really over-learned.
- Ad-hoc bit reduction. Use 8 bits instead of 32 bits? Often just arbitrary.
- Ad-hoc network architectures
- Principled: Kolmogorov complexity. The smallest program you can write to generate the data. Uncomputable, but it gives us a mental framework.
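For reference, this is roughly what the "ad-hoc bit reduction" criticized above looks like in practice. It is a minimal sketch of plain symmetric linear quantization from floating point to 8-bit integers, written by me with made-up weights; it is not the principled method the talk argued for, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    largest = max(abs(w) for w in weights)
    scale = largest / 127.0 if largest > 0 else 1.0  # avoid dividing by zero
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]

weights = [0.82, -0.31, 0.05, -1.27, 0.64]  # made-up example values
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding to the nearest integer step bounds the error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized)
print(f"scale {scale:.4f}, worst-case error {max_error:.5f}")
```

The storage shrinks by 4x compared to 32-bit values, but nothing in the procedure says why 8 bits is the right number, which is exactly the arbitrariness being pointed out.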
His team has achieved 100x compression from principled design, getting better performance and lower power. For example: train ResNet-50 with 25M parameters, on 1M images from ImageNet. They compress it by a factor of 100. Meaning they can run it on a very small chip off of internal memory. Way more efficient. He believes we could find another 100x.
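A quick back-of-the-envelope calculation shows why a factor of 100 matters for running from on-chip memory. The parameter count and compression factor are the ones quoted above; the 4 bytes per parameter is my assumption (32-bit weights), as is reading the compression factor as a reduction in storage size.

```python
PARAMETERS = 25_000_000    # model size quoted in the talk
BYTES_PER_PARAMETER = 4    # assumption: 32-bit weights
COMPRESSION_FACTOR = 100   # compression quoted in the talk

original_mb = PARAMETERS * BYTES_PER_PARAMETER / 1_000_000
compressed_mb = original_mb / COMPRESSION_FACTOR

print(f"uncompressed weights: {original_mb:.0f} MB")
print(f"after compression:    {compressed_mb:.0f} MB")
```

Under those assumptions the weights go from a size that needs external DRAM to one that plausibly fits in on-chip SRAM.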
Analysts and EDA
There were two analyst presentations at the DAC. Charles Shi from Needham & Company provided one view of the market with a focus on ecosystem and trends. Jay Vleeschhouwer from Griffin Securities talked more about the stock price and revenue numbers. These talks are not very technical, but I find they offer some rather useful insights into the business side.
Charles Shi: “EDA Powers Through Semiconductor Cycles”
Charles Shi’s main message was that the EDA stocks (Cadence and Synopsys, primarily, as well as TSMC) will come through the current market situation in better shape than most stocks. While the overall semiconductor market is probably going into an oversupply-driven downturn fairly soon, the hardware design ecosystem is going to continue to do well.
Charles shared a diagram that I have seen repeated in a few more places, showing the various players involved in the (fabless) semiconductor industry.
His presentation provided a good rundown of current industry trends.
Moore’s law is losing speed. The cost per transistor is not falling as much after 7nm – seems pretty constant since 2018. “Transistors are no longer free” – going to a next-generation process gives you the same cost per transistor. Where you used to get a whole lot more at the same price. And this was before the latest price increases from the foundries. As a result of increased ambition in chip design, the cost per chip is actually increasing. Which is a new effect – until now, semiconductors have fundamentally been a deflationary pressure on the global economy.
Scaling is slowing down. It has already hit analog and memory, and logic is starting to be affected. This means that a chip with the same market position tends to get bigger across generations. Charles used data about Apple’s iPhone chips to show how they were slowly becoming bigger.
The slowed scaling means that large designs tend to come up against the reticle limit, and that is in turn a driver for chiplet-based designs. Chiplets will be the only possible way to do leading-edge designs since there is no way to get enough content into a single manufacturable chip.
Finally, the above pressures make it more profitable for companies to bring chip design in-house, provided they have the volume. When chips become more expensive to produce, the immense cost of design becomes comparatively smaller.
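The amortization argument can be sketched with a few lines of arithmetic. All the numbers below are hypothetical and chosen only to show the direction of the effect: for a fixed one-time design cost and volume, a higher per-chip production cost means design makes up a smaller share of the total.

```python
def design_cost_share(design_cost, unit_cost, volume):
    """Fraction of total spend (design plus production) that is design cost."""
    total = design_cost + unit_cost * volume
    return design_cost / total

DESIGN_COST = 500e6  # hypothetical one-time design cost in USD
VOLUME = 50e6        # hypothetical number of chips produced

cheap_chip_share = design_cost_share(DESIGN_COST, unit_cost=20.0, volume=VOLUME)
expensive_chip_share = design_cost_share(DESIGN_COST, unit_cost=60.0, volume=VOLUME)

print(f"design share at 20 USD per chip: {cheap_chip_share:.0%}")
print(f"design share at 60 USD per chip: {expensive_chip_share:.0%}")
```

The same function also shows the "provided they have the volume" condition: lower the volume and the design share climbs quickly.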
The rise of more in-house designs and the technical complexities of chiplets are both really good news for EDA, as they both drive the sale of more tools.
Jay Vleeschhouwer: “EDA: A View from Wall Street”
This was far more about financials and size.
The EDA industry passed 10 BUSD revenue in 2021. Expect to see 11BUSD in 2022 – growth is accelerating. The “big three” (or four) take a bigger share of the industry: 85% today, up from 75% a decade ago, and from less than 70% in 2008. If Ansys’ EDA revenues are included, the big four companies together get 90% of total EDA revenue.
Ansys is always interesting in this context. Jay counts 20-25% of their revenue as EDA, which still means they are much smaller than Cadence, Synopsys, and Siemens.
The combined stock value of CDNS + SNPS is about 90BUSD, or 11x 2022 revenue. This is a significant increase in the valuation compared to 6 years ago, when the combined market cap of CDNS+SNPS+MENT was just 17BUSD!
- Total of 59BUSD of semiconductor industry R&D spending in 2021.
- Intel has the largest R&D budget in the industry, does about 25% of total industry R&D.
- Intel spends about 5% of its total R&D budget on EDA, which makes it the largest spender on commercial EDA at 600 MUSD.
- All the players have accelerated their spend on R&D in 2021 and early 2022.
2022 revenues in EDA, estimates and analysis:
- Cadence will grow 14% to 3.4 BUSD. They have gained market share in recent years (basically taking it from Synopsys).
- Synopsys EDA will grow 13% to 4.64 BUSD (note that part of Synopsys total revenue is the software business that Jay does not count into EDA).
- The EDA part of Siemens will sell about 1.8 BUSD (estimate). Siemens EDA has maintained approximately 20% market share since being acquired in 2017. Siemens has good momentum in physical verification (Calibre) and PCB.
- Ansys is about 380MUSD in EDA – definitely much smaller than the big three.
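As a sanity check, the revenue estimates above can be set against the combined market value quoted earlier to reproduce the roughly 11x multiple. This is my own arithmetic on the numbers in this post. Note that the Synopsys figure here is the EDA-only estimate, so a multiple computed on total Synopsys revenue would come out somewhat lower.

```python
cadence_revenue = 3.4        # BUSD, 2022 estimate
synopsys_eda_revenue = 4.64  # BUSD, 2022 estimate, EDA part only
combined_market_cap = 90.0   # BUSD, Cadence plus Synopsys

combined_revenue = cadence_revenue + synopsys_eda_revenue
revenue_multiple = combined_market_cap / combined_revenue

print(f"combined revenue: {combined_revenue:.2f} BUSD")
print(f"market cap / revenue: {revenue_multiple:.1f}x")
```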
Current growth areas for the EDA vendors, as Jay sees it: hardware-based verification (emulation+prototyping) and IP. If you think about it, that is a bit different from the other perspectives that talk about tools for chiplet design – or maybe it is a matter of time horizon. Could be that the big chiplet revenues are still to come.
- Hardware-based verification is about 600 MUSD total in 2021.
- Synopsys has 65% market share in EDA IP. Total EDA IP market in 2021 was 1.7 BUSD. Note that this analysis does not include IP providers that are not counted as part of EDA. Not sure where Jay draws that line.
Boxes on the Show Floor
There is always something interesting to see on the show floor.
The Azure EDA Server
Not all cloud servers are alike. Both Google Cloud and Microsoft Azure were at the DAC talking about their more specialized offerings for EDA. Amazon Web Services was notably absent. EDA tools are compute-intense, and sometimes memory-intense. And in some cases, they do not necessarily scale well to multicore.
Microsoft Azure had an example of an optimized design on the floor. The board had two Cascade Lake processors, each with 16 physical cores — but the customer only gets to use 12 cores, as the other four are dedicated to the management system and hypervisor (if I got it right). The concept is that the 12 cores can be used to fully run customer software. The board also contained an FPGA-based custom smart NIC for network execution offload, further reducing the load on application processors. It is even possible to customize the configuration to shut down all but four cores on each processor to get maximum turbo frequencies for workloads that are poorly threaded (or maybe have annoying license terms).
Yet More FPGA Prototypes
To my surprise, there were several independent hardware accelerators for RTL on the floor, in addition to the big-three vendors.
- S2C are a long-term player in the market. Talked to a sales guy who told me they might sell a few units to some dept at a company like Intel, but the big deals get taken in bundles by Synopsys and Cadence. Notable that they are starting to use Intel FPGAs as the basis for their prototypes, in addition to the Xilinx FPGAs that for some reason seem to be the most common.
- Corigine – company that is only a couple of years old. Founded by ex-Cadence Palladium and ex-Synopsys Zebu engineers to build a single box that can be either a prototype or an emulator depending on how it is used. Claims more features, higher speed, and lower price than Cadence and Synopsys.
- X-Epic – another new company. From Hong Kong. They had a large booth.
Automotive Virtual Platforms
There was an engineering track session about digital twins for automotive, organized by Martin Barnasconi from NXP. Both he and his colleague Manfred Tanner got Covid though, so Martin was missing and Manfred showed up as a recorded presentation instead of live.
Manfred Tanner: “Automotive Digital Twin from a VP Perspective”
Manfred talked about the ecosystem and supply chain involved in building virtual platforms for automotive (with virtual platforms being part of a digital twin).
The virtual platforms build up towards digital twins:
- IP and SoC models – single blocks or chips
- ECU model – multiple chips
- Vehicle model – multiple ECUs
- Integrate with automotive-specific simulations, like network simulation
- Environment models – add the interaction with a simulation of the environment, other simulation environments
He made the very good (and familiar) point that a simulation model must be designed around the question you want to answer.
This slide provided a good summary of the sometimes hard modeling and simulation problems:
Manfred called for more standards to make it easier to build up combined platforms. Currently, he is not seeing larger simulations being put together, going beyond the single ECU. Need to standardize how to do Ethernet or CAN across models from different sources.
The talk ended with another good summary slide:
Fred Hannert: “Virtualization at GM”
Fred Hannert provided a packed talk, describing how General Motors is currently defining and using virtual platforms for automotive. He is part of GM’s “Virtual ECU” (VECU) development team. His team has done an impressive job of defining just what it is that they are working on.
First of all, there are five VECU levels, numbered 0 to 4, with the real ECU as level 5:
- Level 0 – Simulink and similar abstract algorithm/control law modeling – no code yet
- Level 1 – Software application, host-compiled, on its own with unit tests
- Level 2 – Software application, host-compiled, with an emulation layer to run on the host
- Level 3 – host-compiled binaries, production software almost complete, with emulation layer to run on host
- Level 4 – target-compiled actual complete binary. This level was further split into four sublevels, basically corresponding to levels of completeness in the model. They can start with a model that can boot an operating system, and go on to add more and more details and connectivity.
- Level 5 – the real ECU, obviously running the real code.
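The level scheme above maps naturally onto an ordered enumeration. The sketch below is only my illustration of the list; the member names and helper functions are hypothetical and not GM's own terminology, and the four sublevels of level 4 are left out.

```python
from enum import IntEnum

class VecuLevel(IntEnum):
    """Levels 0 to 5 as described in the talk; names are my own shorthand."""
    ALGORITHM_MODEL = 0       # Simulink-style control law model, no code yet
    HOST_UNIT_TESTED = 1      # host-compiled application with unit tests
    HOST_WITH_EMULATION = 2   # host-compiled, with emulation layer
    HOST_NEAR_PRODUCTION = 3  # host-compiled, almost complete production software
    TARGET_BINARY = 4         # target-compiled complete binary on a model
    REAL_ECU = 5              # the physical ECU

def is_virtual(level):
    """Everything below the real ECU runs without the physical hardware."""
    return level < VecuLevel.REAL_ECU

def runs_target_binary(level):
    """Levels 4 and 5 execute the binary compiled for the real target."""
    return level >= VecuLevel.TARGET_BINARY

virtual_levels = [level for level in VecuLevel if is_virtual(level)]
print([level.name for level in virtual_levels])
```

Using an ordered enum makes comparisons like "at least level 3" read directly as `level >= VecuLevel.HOST_NEAR_PRODUCTION`.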
For execution, they want to run purely on software. If they were to employ hardware like FPGA prototypes (like some vendors have been pushing to get RTL-equivalence), the simulation would just not scale. They need 1000s of parallel runs for automatic regression testing.
He had some interesting things to say about the quality of models they get from industry today:
I found the part about modeling wireless quite accurate. That is not an easy thing to do, for many reasons. I have seen the same myself.
Fred also said that GM now requires simulation models to be made available for any new chip being acquired. That is definitely a good idea, and I think he is right to prioritize model availability over any particular standard. As long as the models can be made to interact and connect to other simulation tools, exactly how they are built should not matter to the user.
He also said that the main obstacle to usability is simulation speed. Apparently, many of the models they have got are not particularly great in that area. Sounds familiar… It is not all that easy building truly fast models.
To finish up this long post… what about San Francisco? It has been a while since I last went there, and I must say it was quite sad to see the state of the city today. There were many closed shops and restaurants, and the city felt a bit more worn and dirty than I recall it. Downtown is obviously the worst hit, and definitely the area around the Moscone center. Still, the tourists appear to be back in massive numbers, so hopefully the city can recover.
Covid was both a presence and not. It really felt like people are quite done with it, at least those who choose to attend a large event like the DAC. Compared to the way it used to be, people behaved as if there was no pandemic. Still, maybe half the attendees chose to wear facemasks, and masks were mandatory in a few of the larger booths. The conference handed out buttons to indicate your level of acceptable social distancing:
The effect of the pandemic was clearly seen in no-shows. Several people who would have attended could not come due to sickness. In my first session, the session chair, the back-up session chair, and two out of six speakers were missing! Many posters never showed up to the poster sessions, possibly for the same reason.
Best Presentation Award for our DML Paper
As already said, I went to the DAC to do two presentations.
The first, “Automatic Checkpoint Support in the Device Modeling Language (DML)”, was prepared by my colleague Erik Carstensen and myself. It was in the Embedded Systems sub-track of the Designer track. Our presentation covers two features added to the device modeling language in the last year: saved, to automatically make model variables into checkpointable and inspectable attributes, and the ability to have arguments to the after statement. These are actually just a few of the improvements made to DML in the past year or two. In addition to having been open-sourced in late 2021, the language is constantly evolving to better solve the problems of our modeler users.
The presentation won the best presentation award (in its track)! The last time I attended an in-person DAC, in Las Vegas in 2019, I also won the best presentation award (for a talk on cloud-based simulation). I never expected a repeat of that.
The second talk, “Building Systems Software-First”, was an invited presentation in the back-end subtrack of the Designer track. I talked about how you can use simulations of various types and abstractions to bring software into hardware design (and vice versa). I showed some examples of how we use Intel ISIM to simulate across domains, including not just the functional aspects of a design, but also power, thermal, and performance.