The 7 nm microprocessor is engineered to meet the demands IBM's clients face in gaining AI-based insights from their data without compromising response time for high-volume transactional workloads. Big Blue said that the IBM Telum is designed with a new dedicated on-chip accelerator for AI inference, enabling real-time AI embedded directly in transactional workloads, alongside improvements in performance, security and availability.
The microprocessor contains eight processor cores, clocked at over 5GHz, with each core supported by a redesigned 32MB private level-2 cache. The level-2 caches interact to form a 256MB virtual level-3 cache and a 2GB virtual level-4 cache. Along with improvements to the processor core itself, the 1.5x growth of cache per core over the z15 generation is designed to enable a significant increase in both per-thread performance and the total capacity IBM can deliver in the next-generation IBM Z system. IBM claimed Telum’s performance improvements are vital for rapid response times in complex transaction systems, especially when augmented with real-time AI inference.
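The cache figures above can be checked with simple arithmetic. The following is a minimal sketch in Python using only the numbers quoted in the article; the function names are illustrative, and the reading that the 2GB level-4 figure corresponds to several chips' worth of level-2 cache is an inference from the ratio, not something the announcement spells out here.

```python
# Figures quoted in the article.
CORES_PER_CHIP = 8
L2_PER_CORE_MB = 32
VIRTUAL_L4_MB = 2 * 1024  # 2GB expressed in MB


def virtual_l3_mb(cores: int = CORES_PER_CHIP, l2_mb: int = L2_PER_CORE_MB) -> int:
    """Sum of the private level-2 caches on one chip."""
    return cores * l2_mb


def l4_to_l3_ratio(l4_mb: int = VIRTUAL_L4_MB) -> float:
    """How many chips' worth of level-2 cache the level-4 figure equals.

    Assumption: this ratio is only an inference from the quoted sizes.
    """
    return l4_mb / virtual_l3_mb()


print(virtual_l3_mb())   # 256
print(l4_to_l3_ratio())  # 8.0
```

Eight cores at 32MB each account exactly for the 256MB virtual level-3 figure, and the 2GB level-4 figure is eight times that.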
IBM said that Telum also features significant innovation in security, with transparent encryption of main memory. Telum’s Secure Execution improvements are designed to provide increased performance and usability for Hyper Protected Virtual Servers and trusted execution environments, which IBM says makes Telum an optimal choice for processing sensitive data in hybrid cloud architectures.
The predecessor IBM z15 chip was designed to enable industry-leading seven-nines (99.99999%) availability for IBM Z and LinuxONE systems. Telum is engineered to improve availability further, with key innovations including a redesigned 8-channel memory interface that can tolerate complete channel or DIMM failures and is designed to recover data transparently without impact to response time.
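To put "seven nines" in concrete terms, the availability figure can be converted into permitted downtime per year. This is a minimal sketch of the standard conversion; the function name is illustrative and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def downtime_seconds_per_year(nines: int) -> float:
    """Maximum yearly downtime for an availability of N nines.

    Seven nines means 99.99999% availability, so the unavailable
    fraction is 10**-7.
    """
    unavailable_fraction = 10 ** -nines
    return unavailable_fraction * SECONDS_PER_YEAR


print(round(downtime_seconds_per_year(7), 2))  # 3.15
```

At seven nines, the budget works out to roughly three seconds of downtime per year.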
Telum adds a new integrated AI accelerator with more than 6 TFLOPs of compute capacity per chip. Every core has access to this accelerator and can dynamically leverage its entire compute capacity to minimize inference latency. Because the accelerator is centralized and connected directly to the cache infrastructure, Telum is designed to enable extremely low-latency inference for response-time-sensitive workloads. With planned system support for up to 200 TFLOPs, the AI acceleration is also designed to scale up to the requirements of the most demanding workloads.
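The per-chip and system-level figures can be related with a quick calculation. The sketch below is illustrative only: the article gives "more than 6 TFLOPs" per chip and "up to 200 TFLOPs" per system but no chip count, so the 32-chip configuration used in the second call is a hypothetical value chosen for the example.

```python
PER_CHIP_TFLOPS_MIN = 6.0  # article: "more than 6", so a lower bound
SYSTEM_TFLOPS_MAX = 200.0  # article: "up to 200" at system level


def max_chips_implied(system_tflops: float = SYSTEM_TFLOPS_MAX,
                      per_chip_tflops: float = PER_CHIP_TFLOPS_MIN) -> float:
    """Upper bound on chip count if every chip delivers at least per_chip_tflops."""
    return system_tflops / per_chip_tflops


def tflops_per_chip(chips: int, system_tflops: float = SYSTEM_TFLOPS_MAX) -> float:
    """Per-chip throughput implied by a given (hypothetical) chip count."""
    return system_tflops / chips


print(round(max_chips_implied(), 1))  # 33.3
print(tflops_per_chip(32))            # 6.25 (32 chips is an assumed count)
```

Under these assumptions, a fully scaled system would hold at most about 33 chips, and a 32-chip configuration would need 6.25 TFLOPs per chip, which is consistent with the "more than 6" figure.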
Keeping data on IBM Z offers many latency and data protection advantages. The IBM Telum processor is designed to help clients maximize these benefits, providing low and consistent latency for embedding AI into response-time-sensitive transactions. This can enable customers to use the results of AI inference to better control the outcome of transactions before they complete. For example, AI can be applied to risk mitigation in clearing and settlement applications, predicting which trades or transactions have high risk exposure and proposing solutions for a more efficient settlement process. Faster remediation of questionable transactions can help clients prevent costly consequences and negative business impact.