
The Nvidia GeForce RTX 3090 is the next-generation halo card from Team Green, and it's going to be a monster. It's now confirmed as Nvidia's next halo graphics card thanks to Micron's inadvertent posting of memory details (the PDF has since been removed). With that piece of knowledge, we've dissected the rest of what we expect to find in the RTX 3090. Nvidia has a countdown to the 21st anniversary of its first GPU, the GeForce 256, slated to end on September 1. The battle for the best graphics cards and the top of the GPU hierarchy is about to get heated. We've talked about Nvidia Ampere and the RTX 30-series as a whole elsewhere, so this discussion is focused purely on the GeForce RTX 3090. Let's dig into the details of what we know, including the expected GPU and memory specifications, release date, price, features, and more.

First, the GeForce RTX 3090 branding is the first 90-series suffix we've seen since the GTX 690 back in 2012. That was a dual-GPU variant of the GTX 680, but based on the Micron documentation, the RTX 3090 will still be a single GPU. Spoiler: multi-GPU support in games is practically dead, or at best on life support. Why bring back the 90 branding? Simple: it opens the door for a new tier of performance and pricing. That's not good news for our wallets. We discussed Micron's inadvertent posting and more in a recent Tom's Hardware show, which you can view below.

The Micron posting gives us one extremely concrete set of data. Unless Nvidia changes something between now and the unveiling, the GeForce RTX 3090 will have 12GB of GDDR6X memory clocked somewhere between 19 and 21 Gbps per pin. Let's be clear: it's going to be 21Gbps. Nvidia's GTX 1080 Ti was the first 11GB GPU, and it was a surprise; Nvidia had multiple references to build off: turning the dial to 11, 11GB, 11Gbps clocks. The same applies to 21Gbps. This is the 21st anniversary of the GeForce 256, the "world's first GPU" according to Nvidia, which coined the GPU acronym for the occasion. There's also a 21-day countdown going on right now. Add that to the specs from Micron, and 21Gbps is effectively confirmed. If I'm wrong, I'll eat my GPU hat.

This is a big deal, as it's the first time a GPU will have over 1TBps of memory bandwidth while using something other than HBM2 memory. (AMD's Radeon VII also reaches 1TBps, via 16GB of HBM2.) We don't have exact details on how much companies pay for HBM2 vs. GDDR6X, but there's a big premium with HBM2: you need a silicon interposer, plus the memory itself costs more. To put this in perspective, the RTX 2080 Ti 'only' has 616GBps, so this is effectively a 64% boost in memory bandwidth (there's a quick sanity check of that math below).

That leads into the rest of the GPU specs, but let's first point out that the RTX 2080 Ti has 27% more memory bandwidth than the GTX 1080 Ti. It also has 20% more theoretical computational performance (TFLOPS), and architectural updates mean it makes better use of those resources. In short, GPU TFLOPS often scales similarly to memory bandwidth. As we've already pointed out, the move to 21Gbps GDDR6X increases raw memory bandwidth by 64% relative to the RTX 2080 Ti, so we also expect the RTX 3090 to deliver around 50-75% more computational performance. Do you know what would make for a nice target? 21 TFLOPS. Yeah, baby! How it gets there isn't critical, but there are a few options.
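Here's a minimal sketch of that bandwidth math in Python. It assumes a 384-bit memory bus (12 x 32-bit GDDR6X chips, which is what a 12GB configuration implies); Micron's document doesn't spell out the RTX 3090's bus width, so treat that as an assumption rather than a confirmed spec.

```python
# Back-of-the-envelope memory bandwidth check.
BUS_WIDTH_BITS = 384      # assumption: 12 x 32-bit GDDR6X chips for a 12GB card
DATA_RATE_GBPS = 21       # per-pin data rate, our 21Gbps reading of the Micron leak
RTX_2080_TI_GBPS = 616    # RTX 2080 Ti bandwidth in GB/s, for comparison

bandwidth_gb_s = DATA_RATE_GBPS * BUS_WIDTH_BITS / 8  # bits per clock -> bytes per second
print(f"Rumored RTX 3090 bandwidth: {bandwidth_gb_s:.0f} GB/s")                  # 1008 GB/s, just over 1TBps
print(f"Uplift over RTX 2080 Ti: {bandwidth_gb_s / RTX_2080_TI_GBPS - 1:.0%}")   # ~64%
```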
We know from the Nvidia A100 that Ampere can reach massive sizes on TSMC's 7nm process. The GA100 is an 826mm² die, which is relatively close to the maximum reticle size (you can't make a chip physically larger than the reticle). The GA100 also supports FP64 (64-bit floating-point) computation, which is necessary for its target market of scientific research. GeForce cards don't need FP64 and typically deliver only 1/32 the FP32 rate for FP64 work, rather than the 1/2 rate found in the bigger GP100, GV100, and GA100 chips. Option one is that Nvidia strips out all the FP64 functionality, adds RT (ray tracing) cores in its place, and still ends up with a big chip that has up to 128 SMs. This is more or less what happened with the Pascal generation: GP100 used HBM2 and GP102 used GDDR5/GDDR5X, but both had a maximum configuration of 3,840 FP32 CUDA cores. Some of those SMs would end up disabled to improve yields via binning, but if Nvidia goes with 118 SMs and 7,552 CUDA cores and clocks the chip at 1.4GHz (boost), it would have a theoretical performance of 21.1 TFLOPS (the sketch at the end of this post shows the math). Oh, and it would use around 50W more power. Learn more about this powerhouse GPU card by visiting OUR FORUM.
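For anyone who wants to verify that 21.1 TFLOPS figure, here's a quick sketch. The 118 SM / 7,552 CUDA core / 1.4GHz configuration is the hypothetical one discussed above, not a confirmed spec, and it assumes Ampere keeps 64 FP32 CUDA cores per SM (as on Turing) with the usual two FP32 operations per core per clock.

```python
# Hypothetical RTX 3090 configuration from the discussion above -- not confirmed specs.
SM_COUNT = 118
CORES_PER_SM = 64             # assumption: 64 FP32 CUDA cores per SM, as on Turing
BOOST_CLOCK_GHZ = 1.4
OPS_PER_CORE_PER_CLOCK = 2    # a fused multiply-add counts as two FP32 operations

cuda_cores = SM_COUNT * CORES_PER_SM                                    # 7,552
tflops = cuda_cores * OPS_PER_CORE_PER_CLOCK * BOOST_CLOCK_GHZ / 1000   # GFLOPS -> TFLOPS
print(f"{cuda_cores} CUDA cores at {BOOST_CLOCK_GHZ} GHz boost: {tflops:.1f} TFLOPS")  # ~21.1
```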