ASICs Behind the Scenes
A lot of people love to talk about ASICs and merchant silicon, but very few really understand the basics. Now there’s a quick way to fix that: watch the excellent Tech Field Day video with Dave Zacks from Cisco Systems.
If you know nothing about ASICs, the first part of the video is pure gold – it describes the basics of the underlying technologies and development processes as well as differences between generic CPUs, FPGAs and ASICs.
The second part of the video is the reason Cisco decided to do this presentation: Dave started praising the beauties of Cisco’s ASICs, including the recirculation and programmable pipeline… but even though there’s plenty of marketing there, the video is still well worth your time.
BTW, the programmable ASIC pipeline isn’t exactly a Cisco-only feature. It’s also available in Intel’s FM6000 and probably in new HP ASICs we discussed on Software Gone Wild.
Finally: we all know that someone paid for that video to be shot, but whenever you get something valuable, it’s time to say thank you regardless of how and why it was produced (particularly if it’s free) – so THANK YOU Dave and Cisco Enterprise Switching!
Is the new ASIC just a Cisco custom FPGA, or are the technologies fundamentally different?
An ASIC includes a fixed hardware configuration with an almost fixed software instruction set - you cannot change this without re-engineering and replacing the chip once it's been constructed. That's why a Cat3550 will never do IPv6 or GRE in hardware, for example.
A programmable ASIC is somewhere in between. It has a fixed hardware configuration but was designed and built with certain features & goals in mind that might be relevant in the future. The instruction set is not fixed and can be updated by software (a.k.a. "microcode").
As Ivan already mentioned, this concept is not entirely new; AFAIK it's been there with the EARL ASICs of the Cat6K platform for quite a while, as well as with the Cat3560-X / Cat3750-X platform.
The difference is that UADP takes it to an entirely different scale in what you / Cisco can do with it.
All ASICs are programmable: you can use either Verilog or VHDL. The thing is, FPGAs, CPUs and ASICs are all programmable.
Tosach is right, but he should also know that most of the differences between FPGAs and ASICs are in terms of adaptability (heat distribution, space distribution, power consumption), because FPGAs are meant for prototyping.
"It has a fixed hardware configuration but was designed and built with certain features & goals in mind that might be relevant in the future."
This refers to the UASIC, Unified (probably serves as a trademark here, but I'll use it for simplicity's sake), because it's made for long-term use and technology adaptation regardless of further implementations.
ASIC: Fixed H/w functionality
Programmable ASIC: A pipeline that changes based on Software instructions written in C or Assembly or special microcode.
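The distinction above can be illustrated with a toy model. This is only a sketch under loud assumptions: the class names (`FixedAsic`, `ProgrammableAsic`), the stage functions and the two-slot limit are all hypothetical and do not describe UADP or any real chip. It shows a pipeline whose stages are frozen at build time next to one whose stage slots can be reloaded by software, within the limits of the fixed hardware.

```python
from typing import Callable, Dict, Sequence, Tuple

Packet = Dict[str, str]
Stage = Callable[[Packet], Packet]


def _visit(pkt: Packet, name: str) -> Packet:
    """Return a copy of the packet with the stage name appended to its path."""
    return {**pkt, "path": pkt.get("path", "") + name + ">"}


# Hypothetical pipeline stages; real stages would do table lookups and rewrites.
def l2_lookup(pkt: Packet) -> Packet:
    return _visit(pkt, "L2")


def ipv4_lookup(pkt: Packet) -> Packet:
    return _visit(pkt, "IPv4")


def ipv6_lookup(pkt: Packet) -> Packet:
    return _visit(pkt, "IPv6")


class FixedAsic:
    """Toy fixed-function chip: the stages are chosen when the chip is built."""

    def __init__(self) -> None:
        self._stages: Tuple[Stage, ...] = (l2_lookup, ipv4_lookup)

    def forward(self, pkt: Packet) -> Packet:
        for stage in self._stages:
            pkt = stage(pkt)
        return pkt


class ProgrammableAsic(FixedAsic):
    """Toy programmable chip: fixed number of slots, reloadable contents."""

    SLOTS = 2  # assumed hardware limit; software cannot add more slots

    def load_microcode(self, stages: Sequence[Stage]) -> None:
        if len(stages) > self.SLOTS:
            raise ValueError("not enough pipeline slots in hardware")
        self._stages = tuple(stages)


if __name__ == "__main__":
    chip = ProgrammableAsic()
    print(chip.forward({})["path"])   # L2>IPv4>
    chip.load_microcode([l2_lookup, ipv6_lookup])
    print(chip.forward({})["path"])   # L2>IPv6>
```

In this model the fixed chip can never gain the IPv6 stage, while the programmable one can swap it in but still cannot exceed the slot count it was built with.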
1) As far as programmability is concerned, is UADP similar to the Barefoot chip? BTW, what is the use of programmability, and how is it different from TCAMs?
2) This talk seems to open the debate about doing overlays with general-purpose CPUs. Given that chips like UADP optimize recirculation, isn't doing overlays with a general-purpose CPU costly?
3) Doesn't a similar question arise for NFV deployments? That is, isn't it costly to switch using a general-purpose CPU when compared to an ASIC?
FPGAs are almost never used as the main "engine" due to several constraints (see the presentation).
Other ASRs utilize other chips, often merchant silicon (ASR9K -> Broadcom Trident / Typhoon / Tomahawk, ASR90x -> Broadcom) which also might be NPUs.
The performance gain comes from parallelization:
The glorious Cisco 7200 platform (predecessor of the ASR1K) had only one CPU with a single core that could serve one thread. So everything had to be scheduled in a way that CPU time was distributed "well enough" between all the stuff going on (e.g. packet forwarding, CLI, routing protocols, QoS, ...). That went pretty decently if you had a moderate amount of traffic & features, but got ugly with lots of features and traffic because no task got "enough" CPU time - a pretty big deal if you have real-time traffic (jitter and so on; IEEE 1588 / PTP over that platform -> bad idea).
Now an ASR1K has one or maybe more QFPs per ESP, with I think 100-something cores each now and whatever number of threads per core. You can also overload that thing for sure - but you need _a lot_ of traffic & features until you run out of CPU / NPU time.
It's basically the same concept as with PC CPUs: A decade ago we had single-core CPUs where a single process could block the whole box for a while. Then that eased with Intel introducing HyperThreading - 2 threads on one CPU. After that it moved on with dual-, quad-, octa-, ... core CPUs which might also run HyperThreading. Nowadays, nobody cares if a process takes 100% CPU time on 3 or 4 logical cores if there are still 3 or 4 available.
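The scheduling argument can be put into rough numbers with a toy model. It is a sketch, not a description of how IOS or a QFP actually schedules work: it assumes plain round-robin scheduling, tasks that always use their full time slice, and an even spread of tasks across cores. The function name `worst_case_wait_ms` and all figures below are made up for illustration.

```python
import math


def worst_case_wait_ms(tasks: int, cores: int, slice_ms: float) -> float:
    """Longest time a ready task waits for its next time slice.

    Assumes round-robin scheduling, every task consuming its whole
    slice, and tasks spread evenly over the cores.
    """
    if tasks < 1 or cores < 1:
        raise ValueError("tasks and cores must be positive")
    # The most heavily loaded core serves this many tasks in turn.
    tasks_on_busiest_core = math.ceil(tasks / cores)
    # A task waits for every other task sharing its core.
    return (tasks_on_busiest_core - 1) * slice_ms


if __name__ == "__main__":
    # One single-threaded core juggling 8 busy tasks with 1 ms slices.
    print(worst_case_wait_ms(tasks=8, cores=1, slice_ms=1.0))     # 7.0
    # The same 8 tasks on a hypothetical 128-core part: no sharing needed.
    print(worst_case_wait_ms(tasks=8, cores=128, slice_ms=1.0))   # 0.0
    # Overload is still possible, it just takes far more work to get there.
    print(worst_case_wait_ms(tasks=300, cores=128, slice_ms=1.0))  # 2.0
```

Under these assumptions the wait, and with it the jitter seen by real-time traffic, shrinks roughly in proportion to the number of cores until every task has a core of its own.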
I'm not a silicon engineer, so I really don't have any "real insight" into this, but at the end of the day, I think it's not a question of whether you can do a certain task with an ASIC, a p-ASIC, an FPGA, an NPU or a general CPU - for sure, you can.
But it's a matter of what you need, what you want to do in the future with that chip and what price tag you need:
Merchant silicon is "cheap" but might not have what you need; custom ASICs take long to develop and are thus costly; FPGAs are flexible but expensive and power-hungry and probably have a bigger form factor (physical dimensions); NPUs are something in between; and general CPUs don't yet provide the scale of an NPU (never heard of a 100+ core CPU).
Thank you for the great answer! Now it is pretty clear to me.
Nobody in his right mind would call Broadcom Tomahawk an NPU ;))
An unbelievably bad choice of naming from Cisco marketing, I guess, and pretty self-inflicted.