We had a chance to talk with Krste Asanovic, SiFive's Chief Architect, who used some of the time to present the company's new out-of-order application processor core IP, the SiFive U8 series.
As the name implies, U8 succeeds the U7 series launched in 2018, and one of the key goals of this processor is to disrupt the endpoint, enterprise and networking markets, as well as edge computing and autonomous systems.
Digging a bit deeper, endpoints include DTV / Smart Home, AR, VR, MR, set-top boxes, game consoles and digital imaging. Enterprise and networking covers 5G wireless, core/edge routers, base stations and access points, while edge and autonomous addresses autonomous vehicles, ADAS, IVI / cluster, head-up displays, telematics, military and aerospace, as well as robots and drones. Looking at this list, one can imagine that SiFive is going after a market that is currently dominated by ARM.
SiFive calls the U series a scalable and power-efficient 64-bit microarchitecture for embedded intelligence.
SiFive U8 series starts with U84
The design goals for the U8 series were 1.5 times the performance per watt, 2 times the area efficiency and class-defining scalability. The efficiency figures are based on SiFive's internal estimates of SPECint/GHz per watt and power per mm² for a SiFive U84 core with L2 cache in 16nm, versus estimates of a competitor core implemented in 16nm.
To reach the desired performance per watt, the company focused on a superscalar, out-of-order core with a 10-12 stage pipeline and triple issue. Area efficiency comes from a small, low-power, configurable design that is at the same time extensible. The scalability part includes a parameterized microarchitecture with composable cache, an optional FPU, and a nine-core mix-and-match cluster.
2.6GHz clock at 7nm, 3.1 times faster than U7
The U84, the first of many cores from this generation, was shown in a quad-core configuration with 2MB of L2 cache that measures 2.63 mm². The core size without L2 cache is just 0.28 mm². Compared to the previous U7 series, the new U84 has 2.3 times higher IPC and 3.1 times better total performance at 1.4 times higher frequency. At 7nm, U84 is expected to reach a 2.6GHz clock.
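As a rough sanity check on these figures, the short Python sketch below multiplies the quoted IPC and frequency gains and splits the quoted die area. It assumes a first-order model in which performance scales with IPC times clock frequency, and it assumes that 0.28 mm² is the area of a single core while 2.63 mm² covers all four cores plus the L2 cache. Neither assumption comes from SiFive, and the variable names are purely illustrative.

```python
# Figures quoted in the article for U84 relative to U7.
ipc_gain = 2.3          # IPC improvement
freq_gain = 1.4         # clock frequency improvement
quoted_perf_gain = 3.1  # total performance improvement as quoted

# Assumption: performance scales as IPC x frequency (first-order model).
modeled_perf_gain = ipc_gain * freq_gain

# Area figures quoted for the quad-core configuration, in mm^2.
complex_area_mm2 = 2.63  # four cores plus 2MB L2 (assumed to be the whole complex)
core_area_mm2 = 0.28     # one core without L2 (assumed to be per core)
num_cores = 4

cores_area_mm2 = num_cores * core_area_mm2
remaining_area_mm2 = complex_area_mm2 - cores_area_mm2  # L2 and everything else

print(f"Modeled performance gain: {modeled_perf_gain:.2f}x (quoted: {quoted_perf_gain}x)")
print(f"Area of four cores: {cores_area_mm2:.2f} mm^2")
print(f"Area left for L2 and the rest: {remaining_area_mm2:.2f} mm^2")
```

Under this simple model the product comes out at about 3.2 times, slightly above the quoted 3.1 times. That gap is plausible if the quoted gains are rounded or were measured on different workloads. The area split suggests that the four cores take up less than half of the quoted complex area.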
We don't want to go too deep into the microarchitecture, but the new U8 features advanced multi-level branch prediction, an MMU with an L2-backed TLB, multiple-issue execution units and efficient register renaming. The U8 fetch unit comes with an L1 cache of configurable size and organization, support for the compressed instruction set with out-of-order resolution, and direct and indirect jump and return capability from the fetch queue.
The U8 dispatch and rename unit supports superscalar rename and dispatch as well as register update and program counter tracking with instruction group tracking.
The U8 integer execution unit supports power-optimized issue queues, flexible issue capability from the issue queues and low-latency forwarding.
The U8 class processor core IP comes in two variants today: the SiFive U84, a high-performance, scalable 8-series core, and the U87, a high-performance, scalable 8-series core with vector processing.
Modality
The SiFive RISC-V SoC supports a heterogeneous core complex where a customer can mix U8 series cores with U7 series and even S7 series cores, with PCIe and USB support and SRAM/DRAM or even HBM2E support.
Krste confirmed that SiFive is working with some lead customers and that U84 will be generally available next year. The chip has been taped out, and the U87 is expected toward the end of next year.
James Prior, who takes care of PR and analyst relations for the company, confirmed that the company is seeing a lot of interest from endpoint and edge AI, automotive, and datacenter attach markets. The SiFive core offers 10 percent more performance than the Cortex-A72 in half the area.
The A72 has been rather popular in the automotive business, and the fact that someone can offer slightly more performance in half the die area makes a compelling case for a lot of customers.
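To put the Cortex-A72 comparison into one number, the sketch below converts "10 percent more performance in half the area" into performance per unit area. It assumes both figures are relative to the A72 on a comparable process, which the article does not spell out, and the variable names are illustrative only.

```python
# Figures quoted relative to Cortex-A72 (A72 = 1.0 for both metrics).
relative_performance = 1.10  # 10 percent more performance
relative_area = 0.50         # half the area

# Performance per unit area relative to the A72.
perf_per_area_gain = relative_performance / relative_area

print(f"Performance per area vs A72: {perf_per_area_gain:.1f}x")
```

The result, roughly 2.2 times, is in the same range as the 2 times area efficiency goal stated for the U8 series. This is only a loose check, since SiFive bases that goal on its own internal estimates.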