NVIDIA Ada Lovelace ‘GeForce RTX 40’ Gaming GPU Detailed: Double The ROPs, Huge L2 Cache & 50% More FP32 Units Than Ampere, 4th Gen Tensor & 3rd Gen RT Cores

Details regarding the NVIDIA Ada Lovelace gaming GPU, which will power the GeForce RTX 40 series graphics cards, have been revealed. The new information comes from Kopite7kimi and covers the block diagram of the next-gen architecture.

NVIDIA GeForce Ada Lovelace GPU SM Block Diagram Detailed: Bigger & Better Than Ever For Gamers!

The NVIDIA Ada Lovelace GPU architecture is no mystery anymore. We have learned the specific configurations that will power the next-gen AD10* series SKUs for GeForce RTX 40 series graphics cards, and we have also seen leaked specifications of the lineup. Now, it’s time to talk purely about the next-generation graphics chip itself.

NVIDIA AD102 ‘Ada Lovelace’ Gaming GPU ‘SM’ Block Diagram (Image Credits: Kopite7kimi):

NVIDIA GA102 ‘Ampere’ Gaming GPU ‘SM’ Block Diagram:

Starting with the GPU configuration, Kopite7kimi compares the top AD102 GPU to various other GPUs from the green team. These include the gaming-focused Ampere GA102 and Turing TU102, along with the HPC-focused Hopper GH100 and Ampere GA100. I’ll only compare the AD102 to its gaming predecessors since the HPC-focused designs are vastly different from consumer-centric offerings.

The NVIDIA Ada Lovelace AD102 GPU will feature up to 12 GPCs (Graphics Processing Clusters). This is an increase of roughly 70% versus GA102, which features only 7 GPCs. Each GPC will consist of 6 TPCs, each with 2 SMs, which is the same configuration as the existing chip. Each SM (Streaming Multiprocessor) will house four sub-cores, which is also the same as the GA102 GPU. What’s changed is the FP32 & INT32 core configuration. Going by the leaked table, each SM will include 128 FP32 units, but the combined FP32+INT32 count will go up to 192. This is because the FP32 units are no longer shared with the INT32 units: the 128 FP32 cores are separate from the 64 INT32 cores.

So in total, each SM will consist of 128 FP32 plus 64 INT32 units for a total of 192 units, which works out to 32 FP32 plus 16 INT32 units per sub-core. And since there are a total of 144 SMs (12 per GPC), we are looking at 18,432 FP32 units and 9,216 INT32 units on the full chip. Each SM will also include two warp schedulers (32 threads/clk) for 64 warps per SM. This is a 50% increase in cores per SM (FP32+INT32) and a 33% increase in warps/threads vs the GA102 GPU.

NVIDIA Ada Lovelace GPU Specs ‘Preliminary’:

GPU Name | AD102 | vs GA102 | vs TU102 | vs GA100 | vs GH100
GPC | 12 (Per GPU) | 1.7x | 2x | 1.5x | 1.5x
TPC | 6 (Per GPC) | Same | Same | 0.75x | 0.67x
SM | 2 (Per TPC) | Same | Same | Same | Same
Sub-Core | 4 (Per SM) | Same | Same | Same | Same
FP32 | 128 (Per SM) | Same | 2x | 2x | Same
FP32+INT32 | 192 (Per SM) | 1.5x | 1.5x | 1.5x | Same
Warps | 64 (Per SM) | 1.33x | 2x | Same | Same
Threads | 2048 (Per SM) | 1.33x | 2x | Same | Same
L1 Cache | 192 KB (Per SM) | 1.5x | 2x | Same | 0.75x
L2 Cache | 96 MB (Per GPU) | 16x | 16x | 2.4x | 1.6x
ROPs | 32 (Per GPC) | 2x | 2x | 2x | 2x
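
The per-level counts in the table above can be multiplied out to full-chip totals. The snippet below is only a sketch of that arithmetic: the constants are copied from the rumored table, the function name `ad102_totals` is made up for illustration, and the INT32 count is taken as the difference between the FP32+INT32 and FP32 rows.

```python
# Per-level counts copied from the preliminary AD102 table (rumored, unconfirmed).
GPCS_PER_GPU = 12
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
FP32_PER_SM = 128
FP32_INT32_PER_SM = 192
WARPS_PER_SM = 64
THREADS_PER_WARP = 32  # warp width on NVIDIA GPUs


def ad102_totals():
    """Multiply the GPC > TPC > SM hierarchy out to full-chip totals."""
    tpcs = GPCS_PER_GPU * TPCS_PER_GPC
    sms = tpcs * SMS_PER_TPC
    return {
        "tpcs": tpcs,
        "sms": sms,
        "fp32_units": sms * FP32_PER_SM,
        # Assumption: INT32 units are the FP32+INT32 row minus the FP32 row.
        "int32_units": sms * (FP32_INT32_PER_SM - FP32_PER_SM),
        "threads_per_sm": WARPS_PER_SM * THREADS_PER_WARP,
    }


if __name__ == "__main__":
    for name, value in ad102_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the totals come out to 72 TPCs, 144 SMs and 18,432 FP32 units, which lines up with the TPC, SM and CUDA core rows in the rumored spec sheet, and 64 warps of 32 threads give the 2048 threads per SM listed above.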

Moving over to the cache, this is another segment where NVIDIA has given a big boost over the existing Ampere GPUs. The Ada Lovelace GPUs will pack 192 KB of L1 cache per SM, an increase of 50% over Ampere. Across the 144 SMs listed in the spec table, that’s a total of 27 MB of L1 cache on the top AD102 GPU. The L2 cache will be increased to 96 MB as mentioned in the leaks. This is a 16x increase over the Ampere GA102 GPU, which hosts just 6 MB of L2 cache. The L2 cache will be shared across the GPU.
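
As a rough check on the cache figures, the sketch below works out the aggregate L1 capacity and the generational ratios. It assumes the 144-SM count from the rumored spec sheet for AD102, 84 SMs for GA102, and a GA102 L1 size of 128 KB per SM (implied by the 1.5x figure); the helper names are made up for illustration.

```python
KB_PER_MB = 1024


def total_l1_mb(l1_kb_per_sm, sm_count):
    """Aggregate L1 capacity in MB for a given per-SM size and SM count."""
    return l1_kb_per_sm * sm_count / KB_PER_MB


def ratio(new, old):
    """Generational scaling factor."""
    return new / old


if __name__ == "__main__":
    # Rumored AD102: 192 KB per SM, 144 SMs. GA102: 128 KB per SM, 84 SMs.
    print("AD102 total L1 (MB):", total_l1_mb(192, 144))
    print("GA102 total L1 (MB):", total_l1_mb(128, 84))
    print("L1 per SM scaling:", ratio(192, 128))
    print("L2 scaling (96 MB vs 6 MB):", ratio(96, 6))
```

With those inputs the aggregate L1 works out to 27 MB on AD102 against 10.5 MB on GA102, while the per-SM and L2 ratios reproduce the 1.5x and 16x figures from the table.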

Finally, we have the ROPs, which are also increased to 32 per GPC, an increase of 2x over Ampere. You are looking at up to 384 ROPs on the next-gen flagship versus just 112 on the fastest Ampere GPU, the RTX 3090 Ti. The Ada Lovelace GPUs will also feature the latest 4th Generation Tensor and 3rd Generation RT (Raytracing) cores, which should help boost DLSS & raytracing performance to the next level.
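
The ROP totals follow from the per-GPC count times the number of GPCs. A minimal sketch, assuming 16 ROPs per GPC on GA102 (implied by the 2x figure) and using a made-up helper name:

```python
def total_rops(rops_per_gpc, gpc_count):
    """Full-chip ROP count from the per-GPC figure."""
    return rops_per_gpc * gpc_count


if __name__ == "__main__":
    ad102 = total_rops(32, 12)  # rumored AD102 configuration
    ga102 = total_rops(16, 7)   # GA102: half the per-GPC ROPs, 7 GPCs
    print("AD102 ROPs:", ad102)
    print("GA102 ROPs:", ga102)
    print(f"Total scaling: {ad102 / ga102:.2f}x")
```

The per-GPC count only doubles, but because the GPC count also rises from 7 to 12, the total ROP count grows by roughly 3.4x.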

The NVIDIA GeForce RTX 40 series graphics cards featuring the next-gen Ada Lovelace gaming GPUs are expected to launch in the second half of 2022 & are said to utilize the same TSMC 4N process node as the Hopper H100 GPU.

NVIDIA CUDA GPU (RUMORED) Preliminary:

GPU | TU102 | GA102 | AD102
Flagship SKU | RTX 2080 Ti | RTX 3090 Ti | RTX 4090?
Architecture | Turing | Ampere | Ada Lovelace
Process | TSMC 12nm NFF | Samsung 8nm | TSMC 4N?
Die Size | 754 mm² | 628 mm² | ~600 mm²
Graphics Processing Clusters (GPC) | 6 | 7 | 12
Texture Processing Clusters (TPC) | 36 | 42 | 72
Streaming Multiprocessors (SM) | 72 | 84 | 144
CUDA Cores | 4608 | 10752 | 18432
L2 Cache | 6 MB | 6 MB | 96 MB
Theoretical TFLOPs | 16 TFLOPs | 40 TFLOPs | ~90 TFLOPs?
Memory Type | GDDR6 | GDDR6X | GDDR6X
Memory Capacity | 11 GB (2080 Ti) | 24 GB (3090 Ti) | 24 GB (4090?)
Memory Speed | 14 Gbps | 21 Gbps | 24 Gbps?
Memory Bandwidth | 616 GB/s | 1,008 GB/s | 1,152 GB/s?
Memory Bus | 384-bit | 384-bit | 384-bit
PCIe Interface | PCIe Gen 3.0 | PCIe Gen 4.0 | PCIe Gen 4.0
TGP | 250W | 350W | 600W?
Release | Sep. 2018 | Sep. 2020 | 2H 2022 (TBC)
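
The bandwidth and TFLOPs rows in the table above can be sanity-checked with two standard formulas: bandwidth is the per-pin data rate times the bus width divided by 8, and peak FP32 throughput is 2 FLOPs (one fused multiply-add) per CUDA core per clock. The sketch below applies them to the listed figures. The function names are made up for illustration, and the clock speeds it prints are only what the TFLOPs numbers would imply, not leaked values.

```python
def bandwidth_gbs(speed_gbps, bus_width_bits):
    """Memory bandwidth in GB/s from per-pin speed (Gbps) and bus width (bits)."""
    return speed_gbps * bus_width_bits / 8


def implied_clock_ghz(tflops, cuda_cores):
    """Clock (GHz) implied by a peak FP32 figure, assuming 2 FLOPs per core per clock."""
    return tflops * 1e12 / (2 * cuda_cores) / 1e9


if __name__ == "__main__":
    print("GA102 bandwidth (GB/s):", bandwidth_gbs(21, 384))
    print("AD102 bandwidth (GB/s):", bandwidth_gbs(24, 384))
    # The 616 GB/s listed for the RTX 2080 Ti matches a 352-bit bus at 14 Gbps.
    print("RTX 2080 Ti bandwidth (GB/s):", bandwidth_gbs(14, 352))
    print(f"GA102 implied clock: {implied_clock_ghz(40, 10752):.2f} GHz")
    print(f"AD102 implied clock: {implied_clock_ghz(90, 18432):.2f} GHz")
```

The 21 Gbps and 24 Gbps entries on a 384-bit bus reproduce the 1,008 GB/s and 1,152 GB/s figures. On the compute side, the rumored ~90 TFLOPs with 18,432 cores would imply a clock of roughly 2.4 GHz, so that figure should be read as a ballpark rather than a specification.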
