

# C H I P S



# Inside Tiger Lake: Intel's Next Generation Mobile Client CPU

### **Xavier Vera**

Principal Engineer, Client Performance Architecture, Intel Corporation



.... all at the same power envelopes with increased power efficiency





# Outline



### Intel 10nm SuperFin Process Technology

### **SoC Architecture**

### Willow Cove Core

### X<sup>e</sup> Graphics

### SoC Uncore, IO and Display

### **Power Management**





# **New High Performance Transistor**

Innovation Across The Entire Stack, From Channel To Interconnects







# **Improved Metal Stack**

Innovation Across The Entire Process Stack, From Channel To Interconnects



### **Super MIM Capacitor**

More than 5x increase in MIM capacitance. Thin layers of different Hi-K materials, each just a few Angstroms thick, stacked in a repeating "superlattice."

**Novel Thin Barrier** reduces via resistance by 30%.







# Architecture and Design Improvements





# **Introducing Tiger Lake**







# Willow Cove Core



- Built on the Sunny Cove architectural foundation
- Redesigned fundamental circuits to take advantage of SuperFin technology
- Redesigned caching architecture with a larger, non-inclusive 1.25MB MLC
- Control-flow Enforcement Technology to help protect against return/jump-oriented attacks
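The return-protection half of control-flow enforcement can be illustrated with a shadow stack: a second copy of each return address that ordinary stores cannot modify. The following is a minimal toy model, not the hardware mechanism itself; `ToyMachine`, `ControlFlowViolation`, and the addresses are hypothetical names chosen for illustration.

```python
class ControlFlowViolation(Exception):
    """Raised when a return address does not match its shadow copy."""


class ToyMachine:
    """Toy model of call/ret with a shadow stack (illustrative only)."""

    def __init__(self):
        self.data_stack = []    # writable by the program, and by an attacker
        self.shadow_stack = []  # in this model, only call/ret touch it

    def call(self, return_address):
        # A call pushes the return address onto both stacks.
        self.data_stack.append(return_address)
        self.shadow_stack.append(return_address)

    def ret(self):
        # A return pops both copies and refuses to continue on a mismatch.
        address = self.data_stack.pop()
        expected = self.shadow_stack.pop()
        if address != expected:
            raise ControlFlowViolation(
                f"return to {address:#x}, expected {expected:#x}"
            )
        return address


machine = ToyMachine()
machine.call(0x401000)
machine.data_stack[-1] = 0xDEADBEEF  # simulated stack-smashing overwrite
try:
    machine.ret()
except ControlFlowViolation as err:
    print("blocked:", err)
```

In this sketch the overwritten return address is caught at `ret` because the shadow copy still holds the original value.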





# **The Result**

- Large Frequency Gains
- Increased Power Efficiency





[Figure: power efficiency chart; axis label: Voltage]











# X<sup>e</sup> Graphics

- Large improvements in performance per watt efficiency
- Up to 96EUs with increased capabilities
- 3.8MB L3 cache
- Increased bandwidth to LLC and Memory

[Figure: X<sup>e</sup> graphics block diagram showing GTI, GAM, copy engine, 3.8 MB L3 cache, media engine, shared functions, geometry, raster, pixel backend, HiZ/depth, and groups of EUs, each with instruction cache and thread dispatch, dataport (LD/ST), media sampler, sampler, SLM, and L1 + texture cache.]





# **Fabrics and Memory**

### **Coherent Fabric**

- 2x increase in coherent fabric bandwidth
  - Dual ring microarchitecture
- 50% larger LLC, now non-inclusive
- IO caching

### Memory

- More efficient memory bandwidth for graphics and cores
  - Support for up to ~86GB/s of memory bandwidth
  - Deeper, narrower dual memory controller for higher efficiency
- Architectural support for LP4x-4267 and DDR4-3200 (initial) and up to LP5-5400
- Intel<sup>®</sup> Total Memory Encryption
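The ~86GB/s figure above is consistent with simple peak-bandwidth arithmetic: transfer rate multiplied by bus width. The sketch below assumes a 128-bit total memory interface, which is not stated on the slide; `peak_bandwidth_gbs` and `MEMORY_CONFIGS` are illustrative names.

```python
def peak_bandwidth_gbs(transfer_rate_mts: int, bus_width_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    bus_width_bits defaults to 128, an assumption not taken from the slide.
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


# Transfer rates (MT/s) for the memory types listed above.
MEMORY_CONFIGS = {
    "DDR4-3200": 3200,
    "LP4x-4267": 4267,
    "LP5-5400": 5400,
}

for name, rate in MEMORY_CONFIGS.items():
    print(f"{name}: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

Under that assumption LP5-5400 works out to 86.4 GB/s, in line with the quoted maximum. These are theoretical peaks; sustained bandwidth is lower.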





# Display

### Connectivity

- The future calls for more displays, higher resolutions and better image quality
  - 8K displays
- Added a dedicated fabric path (DIP) to memory to maintain quality of service
  - Up to 64GB/s of isochronous bandwidth
- External display connectivity through DP/HDMI protocols
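To put the isochronous bandwidth figure in context, the raw scan-out traffic of a single 8K display can be estimated from resolution, refresh rate, and pixel size. This is a rough sketch that assumes 60 Hz and 4 bytes per pixel (neither is given on the slide) and ignores blanking, compression, and multiple display planes; `scanout_bandwidth_gbs` is an illustrative name.

```python
def scanout_bandwidth_gbs(width: int, height: int, refresh_hz: int,
                          bytes_per_pixel: int = 4) -> float:
    """Raw pixel fetch rate for one display plane, in GB/s (1 GB = 1e9 bytes)."""
    return width * height * bytes_per_pixel * refresh_hz / 1e9


# Assumed configuration: 7680x4320 at 60 Hz, 4 bytes per pixel.
eight_k = scanout_bandwidth_gbs(7680, 4320, 60)
share = eight_k / 64.0  # fraction of the quoted 64 GB/s path

print(f"8K60 scan-out: {eight_k:.2f} GB/s ({share:.1%} of 64 GB/s)")
```

Under these assumptions one uncompressed 8K60 plane needs roughly 8 GB/s, which leaves headroom for additional displays and planes.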





# IO

### PCIe Gen4 on CPU for low latency, high bandwidth device access to memory

- Full 8GB/s bandwidth to memory
- ~100ns less latency when attached to CPU vs PCH
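The 8GB/s figure matches the usual link-rate arithmetic for a four-lane Gen4 link. The sketch below assumes an x4 link, which the slide does not state; PCIe Gen4 signals at 16 GT/s per lane with 128b/130b encoding. `pcie_bandwidth_gbs` is an illustrative name.

```python
def pcie_bandwidth_gbs(gt_per_s: float, lanes: int,
                       encoding: float = 128 / 130) -> float:
    """Per-direction link bandwidth in GB/s after line encoding.

    Protocol overhead (packet headers, flow control) is not modeled.
    """
    bits_per_second = gt_per_s * 1e9 * lanes * encoding
    return bits_per_second / 8 / 1e9


# Assumption: a x4 link; Gen4 runs at 16 GT/s per lane.
gen4_x4 = pcie_bandwidth_gbs(16.0, 4)
print(f"PCIe Gen4 x4: {gen4_x4:.2f} GB/s per direction")
```

The result is just under 8 GB/s per direction, so "full 8GB/s" reads as the rounded link rate.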

### Integrated Thunderbolt 4 and USB4 support

- Up to 40Gb/s bandwidth on each port
- USB tunneling

### Integrated Display output via Type-C

- DP alternate mode/tunneling over Thunderbolt
- DP-in ports so that a discrete graphics card's display output can be muxed onto the Type-C port
- HDMI/DP/Type-C connectors supported



# **Power Management**

- Moved extensive logic to gated power domains
- Increased FIVR efficiency
- Deeper package C-state that turns off all clocks in the CPU
- Hardware-based save and restore logic
- Autonomous DVFS in coherent fabric and memory subsystem to optimize frequency and voltage based on bandwidth and latency





The goal is to adapt frequencies (and voltages) to the required performance level.










- Workload is in a core centric phase with few requests
- Core increases frequency to match required performance
- Fabric and memory reduce frequency to match low number of requests







- Workload enters a phase where requests go out to fabric and hit the large LLC
- Core reduces its frequency, while the fabric raises its own to match the increasing number of requests
- Memory stays at low frequency since requests are served from cache







- Workload enters a phase where data requests miss in the LLC and go to memory
- Core keeps reducing frequency
- Memory increases frequency to match the increasing demand
- Fabric can reduce frequency







The working set fits within the core caches: the core raises its frequency to deliver maximum performance, while fabric and memory stay at their minimum frequency to serve the few requests.
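The phases above can be mimicked with a toy model in which each domain independently picks the lowest frequency step that covers its current demand. This is only an illustrative sketch, not the actual power-management algorithm; the frequency steps, the demand values, and the names `LEVELS`, `Phase`, `pick_level`, and `plan` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical normalized frequency steps available to every domain.
LEVELS = (0.25, 0.5, 0.75, 1.0)


def pick_level(demand: float) -> float:
    """Return the lowest step that covers a demand in the range [0, 1]."""
    for level in LEVELS:
        if level >= demand:
            return level
    return LEVELS[-1]


@dataclass
class Phase:
    name: str
    core_demand: float
    fabric_demand: float
    memory_demand: float


# Made-up demand values chosen to resemble the three workload phases.
PHASES = [
    Phase("core-centric", 1.0, 0.1, 0.1),
    Phase("LLC hits", 0.7, 0.9, 0.1),
    Phase("LLC misses", 0.4, 0.5, 0.9),
]


def plan(phase: Phase) -> dict:
    """Choose one frequency step per domain for the given phase."""
    return {
        "core": pick_level(phase.core_demand),
        "fabric": pick_level(phase.fabric_demand),
        "memory": pick_level(phase.memory_demand),
    }


for phase in PHASES:
    print(phase.name, plan(phase))
```

With these made-up inputs the core steps down from phase to phase, the fabric peaks while requests hit the LLC, and memory only ramps up once requests miss, which mirrors the qualitative behavior described in the bullets.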







# **Tiger Lake SoC Architecture**

Leveraging process technology improvements, the Tiger Lake SoC architecture delivers significant advancements across a wide set of SoC IPs, with:

- More than a generational increase in CPU performance from the Willow Cove CPU core
- Massive improvements in graphics power efficiency in X<sup>e</sup>-LP graphics IP
- Improved fabric and memory to deliver more bandwidth *efficiently*
- Rich I/O... and much more!







# Legal Disclaimers

Capacitance and resistance measurements and CPU utilization rates are based on silicon projections and preliminary development board measurements as of March 2020. They may not reflect all publicly available updates and are subject to change.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.

Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks.

Refer to https://software.intel.com/articles/optimization-notice for more information regarding performance and optimization choices in Intel software products.

Your costs and results may vary.

Intel technologies may require enabled hardware, software or service activation. No product or component can be absolutely secure.

Intel contributes to the development of benchmarks by participating in, sponsoring, and/or contributing technical support to various benchmarking groups, including the BenchmarkXPRT Development Community administered by Principled Technologies.

Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.





