Earlier this year, Mark Horowitz gave a talk at the 1st Berkeley Symposium on Energy Efficient Systems titled “Why Design Must Change: Rethinking Digital Design.” I did not have the chance to attend that symposium, but I was lucky enough to attend an encore presentation by Mark a few days ago at the University of Texas at Austin as part of the Computer Architecture Seminar Series. If you don’t know who Mark Horowitz is, then you truly must have been hiding underneath a rock. Suffice it to say he is pretty well regarded in the electrical engineering community: he has received several best paper awards over the years, is the current chair of electrical engineering at Stanford University, and co-founded a little company called Rambus. So yes, when he speaks, people generally tend to listen. The complete presentation slides as well as a recording of the Berkeley presentation can be found here. What follows is a quick summary of the talk and a few comments of my own.
The main point around which the talk revolves is as follows: design costs and power dissipation have gone through the roof over the last few decades. The good news is that some of the major contributing factors, such as die growth and super frequency scaling, have more or less stopped. The bad news is that voltage scaling has also hit its limits. As voltage ceases to scale downwards, our biggest tool for scaling energy becomes significantly less effective. Mark is not optimistic about overcoming these challenges, since getting around limits set by fundamental physics is rather difficult. For example, he is very concerned about the on/off current ratio. That is a valid concern, but I’m fairly certain that the process engineers have quite a few neat tricks up their sleeves for several of the upcoming process generations. Mark is convinced that while silicon is not going away any time soon, the growth rate is going to slow down significantly, and eventually we will think of silicon the same way we think about concrete and steel. Thus, instead of hoping that the process guys will save the day, we really need to figure out how to use what we already have, and by this he means we need to figure out how to reduce the amount of waste in our systems.
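As a quick reminder of why voltage is such a powerful knob, here is the standard first-order model of CMOS switching energy and power. This is textbook material rather than something taken from Mark’s slides, and the symbols (activity factor $\alpha$, switched capacitance $C$, supply voltage $V_{DD}$, clock frequency $f$) are my own choice of notation. It also ignores leakage, which is exactly where the on/off current ratio comes in.

```latex
E_{\text{switch}} \approx \alpha \, C \, V_{DD}^{2},
\qquad
P_{\text{dyn}} \approx \alpha \, C \, V_{DD}^{2} \, f
```

Because energy goes with the square of the supply voltage, dropping $V_{DD}$ from 1.0 V to 0.7 V would cut switching energy to roughly $0.7^2 = 0.49$ of its original value, all else being equal. Once the voltage stops coming down, that quadratic saving is gone.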
Some of the waste stems from the fact that maybe we are simply doing more work than we really need to. After all, if we do less work, we will need less energy. This, of course, could be caused by using the wrong tool for the job and thus creating more work than needed. The fact is that for specific tasks, ASIC designs are more efficient than DSPs and vector engines, which in turn are more efficient than CPUs. Unfortunately, ASICs, while efficient, are also prohibitively expensive, and few markets can justify their use. This brings us full circle, back to where we started a few paragraphs ago: designing specific chips that do what we need, and do it well while consuming little power, is way too expensive. The solution: a chip generator. Now, before you panic and say we’ve been there and tried that, relax. Mark is not talking about silicon compilers that take your nicely written high-level C++ algorithm and convert it into perfectly working, super power-efficient silicon that has been verified automatically. That would be a nice dream indeed.
Instead, the message that Mark is sending to the chip and SoC design houses is that the designs they are putting together are most likely not the optimal solutions for the applications their customers have in mind. Rather than trying to guess what the customers want, he would prefer that these companies put their efforts into developing a chip generator that lets the eventual customer configure the final SoC as needed. Want half the cache? No problem. Want to eliminate some unneeded I/Os? No sweat. Need to configure your memory differently to optimize performance, or add some extra math processing? No sweat. Essentially, let the customers figure out what they need, and just focus on developing a tool that will put it together for them. I have to admit, Mark, this does sound fantastic. As a matter of fact, it is something that occurred to me a while ago while working on several SoCs that were very similar, but not similar enough for some of our customers, and thus required separate design spins. The problem is that while a chip generator might be more feasible than a silicon compiler, it is still immensely difficult to pull off.
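To make the idea a bit more concrete, here is a toy sketch of what the customer-facing side of such a generator might look like. To be clear, this is purely my own illustration and not anything Mark showed: every name in it (`SoCConfig`, `generate`, the block names, the default values) is hypothetical, and a real generator would have to emit RTL, constraints, and verification collateral rather than a list of strings.

```python
from dataclasses import dataclass, replace

# Everything here is a hypothetical illustration: the class, the function,
# the block names, and the default values are not from the talk.


@dataclass(frozen=True)
class SoCConfig:
    """Knobs a customer might be allowed to turn."""
    cache_kb: int = 512
    io_blocks: tuple = ("uart", "spi", "usb", "ethernet")
    memory_banks: int = 4
    math_units: int = 0


def generate(cfg: SoCConfig) -> list:
    """Validate a configuration and return the blocks to instantiate."""
    # Reject sizes that are not a positive power of two.
    if cfg.cache_kb <= 0 or cfg.cache_kb & (cfg.cache_kb - 1):
        raise ValueError("cache size must be a positive power of two (KB)")
    if cfg.memory_banks < 1:
        raise ValueError("need at least one memory bank")
    if cfg.math_units < 0:
        raise ValueError("number of math units cannot be negative")

    blocks = ["cpu_core", f"cache_{cfg.cache_kb}kb"]
    blocks += [f"mem_bank_{i}" for i in range(cfg.memory_banks)]
    blocks += [f"io_{name}" for name in cfg.io_blocks]
    blocks += [f"math_unit_{i}" for i in range(cfg.math_units)]
    return blocks


# Start from the vendor's baseline and tweak it: half the cache,
# fewer I/Os, and two extra math units.
baseline = SoCConfig()
custom = replace(
    baseline,
    cache_kb=baseline.cache_kb // 2,
    io_blocks=("uart", "spi"),
    math_units=2,
)

if __name__ == "__main__":
    print(generate(custom))
```

The easy part is the configuration object. The hard part, hidden behind that innocent-looking `generate` call, is guaranteeing that every legal combination of knobs actually produces a working, verified chip.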
For one thing, if chips are difficult to design, it is quite conceivable that designing something that will design these chips is even more so. Is it worth it? Also, while process scaling might slow down one day, and library updates might become less frequent, that day is not here yet. If migrating a chip from one process to another is a major undertaking, then optimizing the generator for the next process might be more work than redoing some designs while adding a few customer-requested enhancements. I’ve worked on several tools that had to be ported to new processes, and most of the time it took more work than anticipated. Finally, parameterizing a few things here and there is probably possible, but making the things you as a company own configurable, and then ensuring that they will play nice with third-party IP from other companies that can also be parameterized, is another story. Don’t get me wrong, I completely agree with Mark’s vision, and I do think that the future requires what he is suggesting. But process scaling really needs to slow down significantly for this to happen, and that is simply not yet the case. Too bad I forgot to ask him about the time horizon he had in mind for the chip generator to become a reality.