Maxwell Lite A Little Too Light?

admin | February 4th, 2015 - 6:21 am

During a troll around the internet, at a typically late hour on a tranquil evening approaching January’s frosty departure, I had the misfortune to tread in a rather troubling tale of graphical discord. Most would declare it nothing more than a particularly wet wave of liquid news, though I call it a tale with intent to paint an appropriately portentous backdrop.

The story in question was broken to me in the form of a podcast, authored by the venerable and voracious technophiles at PC Perspective. Being a frequent frequenter of their YouTube channel, I had casually dropped by to digest my weekly feed. Following a brief browse of their non-fiction section, my hungry eyes wandered over the words…

“Nvidia GTX 970 3.5gb Memory Issue explained”

They then proceeded to roll upwards, giving me a fleeting glimpse of unkempt eyebrow followed by a wash of royal maroon peppered with pulsating orange hoops. A life-altering spiritual experience? A severe psychedelic episode? Neither. Merely ephemeral ocular echoes of my studio’s spotlights (GU10 LEDs) waltzing to a silent Danube as distant corners of consciousness grudgingly comprehended the terrifying truth.

The attributed video was under twenty minutes, not an epic, but comfortably long enough to snatch confusion from the jaws of simplicity. Having made impatient assumptions based on the headline, I anticipated the content would commence with an obsessive but legitimate consumer complaint, subtly segue into an artful official explanation and conclude with an elegant political balancing act from atop a proverbial picket fence of grass green and ruby red.

One cursory click followed by 936 spellbinding seconds, and my prophecy had come to pass. All that remained was to determine how to be sufficiently convoluted in my summary of such statistically fortified, fact-enriched and judiciously informative IT journalism.

Upon beginning the sixth paragraph of what only the definitively patient were presently engaged in assimilating, I sensed my mission to obfuscate had already been accomplished and that their sanity henceforth depended on divine lucidity.

Thus, let us presently reveal, in the briefest and most rewarding fashion, who and what has warranted a certain Emerald Giant’s artfully engineered explanation.

Once upon a PCB, there were two top tier GPUs, 980 and 970, the Maxwell twins. Separated at birth, named with their respective numbers, cloned, branded, marketed and finally, sold in vast volumes to frame driven elitists.

Some recruited the broader brother to promote their perpetual pursuit of peerless pixel virtuosity while others with less cash, or a reduced need for benchmark bedazzlement, opted for the slighter sibling…though not so slight as to be mocked or ridiculed.

Turning to the tedious basics. On September 19th, 2014, the day of launch, Nvidia had solemnly sworn, via various methods of statistical divulgence, that both of these marvellous Maxwells fostered four gigabytes of video RAM, 64 reverential ROPs and 2MB of Level 2 cache, equalling the quotas flaunted by their crimson nemeses, the R9 290 and R9 290X.

Some four months later, in the more immediate past, a thread initiated by a seemingly misinformed customer took firm root on Nvidia’s forum and began to feverishly flourish with the numerous and supportive findings of forensically observant enthusiasts, all of whom had adopted 970s.

They were dismayed to discover detrimental performance anomalies during particularly intensive workloads and their grievances revolved around two principal issues.

1. The 970 was not consistently assigning its complete apportionment of memory, even when applications were exceeding the limits of their allotted resources. Under such circumstances, notably in Shadow of Mordor, RAM usage would appear to top out at 3.5GB, leaving 0.5GB unpopulated. When subjected to identical conditions, a 980 would nonchalantly transcend this mysterious perimeter and allocate all four gigabytes to the game or benchmark that required it.

2. On the exceedingly rare occasions that the 970 made use of its extremities, it would only do so at a dramatically diminished speed, supposedly transforming a silky, stimulating creature-slaying spree into a shockingly sporadic saga of choppy discontent. In other words, the net effect on frame rates and playability was less than acceptable.

A special utility, compiled by German CUDA programmer “Nai”, was used by owners of either or both cards to incrementally determine the RAM’s optimal bandwidth by dividing it into segments of 128MB and assessing each individually. The results appeared to coincide with the experiences encountered in games as the 970’s upper memory blocks were occupied and thus served as convincing evidence in support of the original claim.
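For the curious, the chunk-by-chunk idea can be mimicked on paper. The sketch below is emphatically not Nai's program, which timed real reads on the card itself. It is a toy model that carves a 4GB card into 128MB chunks and labels each one against the 3.5GB boundary reported on the forums. Every name in it is hypothetical and chosen purely for illustration.

```python
# Toy model only: this is NOT Nai's benchmark, which timed real reads on the GPU.
# It simply shows how 128MB chunks line up against the reported 3.5GB boundary.

CHUNK_MB = 128          # chunk size used by the utility, per the article
TOTAL_MB = 4 * 1024     # advertised VRAM on both cards
BOUNDARY_MB = 3584      # 3.5GB, where 970 owners saw usage top out


def label_chunks(total_mb=TOTAL_MB, chunk_mb=CHUNK_MB, boundary_mb=BOUNDARY_MB):
    """Return a list of (start_mb, label) pairs, one per chunk."""
    chunks = []
    for start in range(0, total_mb, chunk_mb):
        # A chunk counts as "main" only if it ends at or below the boundary.
        label = "main" if start + chunk_mb <= boundary_mb else "upper"
        chunks.append((start, label))
    return chunks


if __name__ == "__main__":
    chunks = label_chunks()
    upper = [start for start, label in chunks if label == "upper"]
    print(f"{len(chunks)} chunks in total, {len(upper)} above the boundary")
```

On this model a 4GB card yields 32 chunks, of which the last four sit above the boundary. Those are the blocks whose measured bandwidth set the forums alight.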

Initially, there were some suggestions that the applications themselves were ill-equipped to exploit their full complement of provisions and elected to substitute slower system RAM for video RAM they were unable to detect. Even Nai himself asserted his program had been coded in great haste and was not designed to stress or analyze video RAM in the precise fashion that many impulsive reviewers had presumed. A revised version was promptly completed and exhibited exactly the same behaviour.

By now, numerous inflammatory posts had permeated almost every forum actively engaged in debating the laws of luxury computing, and Nvidia decided it was time to publish a comprehensive and placatory response.

The unabridged version was hideously detailed, so here is a compacted and jocular paraphrase slavishly composed from what I was able to glean from green pastures.

–~~~~~~~~~~~~–

Green Eyes’ Acolytes Vs. The World

Nvidia Minion: The 970 has 4GB of memory just as advertised but we decided to separate the last 0.5 gigabytes from a central reserve of 3.5GB.

The Angry: And why pray did you do that, Giant Green Eyes?

Nvidia Minion: Please be merciful, I’m not Green Eyes, I’m but a humble minion who preaches on his behalf, can’t you tell by my thin, reedy voice.

The Wronged: Just answer the question.

Nvidia Minion: It was part of our new binning method to improve yield rates, you know, like how we disabled three streaming multiprocessors on your GPU and left them active on the 980.

The Outraged: What has that got to do with the memory?

Nvidia Minion: Listen, the manufacturing process for our ingeniously detailed designs is still comparatively cumbersome. But our lord and master is continually establishing more refined and efficient techniques to disable non-essential elements on chips that don’t qualify for our flagship products. That way we preserve as much performance as possible for you, the customer. This is one such example.

We were able to isolate and disable a section of level two cache, 256KB, exactly one eighth of the Maxwell’s total, along with 8 render output processors, but leave the adjoining memory controller active and ensure all four gigs of RAM remained accessible.

The Seething: I see, so not only do we have 512 megs of terminally crippled memory, we’ve also got fewer ROPs and less cache than we were led to believe.

Nvidia Minion: Well, had we not done this, we’d have had to fuse off two memory controllers, 16 ROPs, half a meg of cache and render a whole gig of RAM completely redundant. In other words, sell you a card with 3 gigs instead of 4, but we didn’t think you’d want that.
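(A brief aside for anyone keeping score. The tally below uses only the figures quoted in this exchange and rests on one simplifying assumption of my own: that the chip can be treated as eight equal slices, each bundling a memory controller with 512MB of RAM, 256KB of cache and 8 ROPs. The function and variable names are hypothetical.)

```python
# Back-of-envelope tally using only the figures quoted in the dialogue.
# Assumption (mine, for illustration): the chip is treated as eight equal
# slices, each pairing one memory controller (512MB) with 256KB of L2 and 8 ROPs.

SLICES = 8
L2_PER_SLICE_KB = 256
ROPS_PER_SLICE = 8
RAM_PER_CONTROLLER_MB = 512


def tally(l2_slices_off, controllers_off):
    """Return (L2 in KB, ROP count, accessible RAM in MB) for a given cut."""
    l2_kb = (SLICES - l2_slices_off) * L2_PER_SLICE_KB
    rops = (SLICES - l2_slices_off) * ROPS_PER_SLICE
    ram_mb = (SLICES - controllers_off) * RAM_PER_CONTROLLER_MB
    return l2_kb, rops, ram_mb


advertised = tally(0, 0)    # what the launch spec sheet claimed
shipped = tally(1, 0)       # one cache/ROP slice off, every controller kept
alternative = tally(2, 2)   # the cut the Minion says was avoided

if __name__ == "__main__":
    for name, result in [("advertised", advertised),
                         ("shipped", shipped),
                         ("alternative", alternative)]:
        print(name, result)
```

On those assumptions the card as sold comes out at 1792KB of cache and 56 ROPs with all 4096MB still reachable, while the avoided alternative lands on 1536KB, 48 ROPs and 3072MB.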

The Furious: Don’t be impertinent. What you SOLD us wasn’t what you promised. I can’t believe you’re actually trying to make such cynical and greedy marketing sound like some masterfully conceived and beneficial production decision. One more time, why did you split up the memory?

Nvidia Minion: The memory controller communicating with that last half gig doesn’t have its own portion of cache, because we disabled it. So in order for its data to reach the crossbar, the fat bit in the middle, it has to be delivered via the cache and crossbar port connected to its neighbouring controller. Remarkably, it can, the other controller is an obliging chap and more than happy to share his port and cache, so the two of them hook up through what we call a “buddy interface”, isn’t that cute?

The Enraged: Cute? As in shrewd? Absolutely!

Nvidia Minion: The one, very minor drawback is that this poor deprived controller can’t work as quickly as the other seven by virtue of having to share his partner’s cache. In fact he’s only one seventh as efficient, so we separated the RAM he talks to from the main section that those other seven controllers interact with and gave the latter priority over the former.

The Exploited: Please, that’s enough personifying. We’re not in primary school. Don’t try to turn this into a sugary syrupy piece of contemporary children’s fiction in the hope of persuading us that you have a heart, or melting ours. Just get on with it.

Nvidia Minion: Don’t worry, we won’t melt anything, that’s AMD’s job. Anyway, segmenting and categorizing the memory means applications that require less than 3.5 gigs, the vast majority, will only ever occupy the larger and faster section, whilst those that need more, very few, will be granted use of that last half gig. If we’d left the memory undivided, the entire pool’s bandwidth would have been relegated to 50% of its capacity.

The Appalled: Why?

Nvidia Minion: Because those last two memory controllers are sharing a crossbar port, remember? The second port was sacrificed along with the ROPs and cache. But each controller still has to satisfy the orders assigned by the tiny 1KB matchstick man who sprints along the memory bus, so that the SMMs on the other side of the crossbar can access and utilize the VRAM.

The Disgruntled: NO MORE PERSONIFYING.

Nvidia Minion: Beg your pardon. So if the RAM weren’t partitioned, the memory interface would attempt to operate as though the eighth port still existed, placing the seventh port under twice as much load as usual. Meanwhile, its two associated controllers, our odd couple, buddied up or otherwise, would effectively be fulfilling RAM requests at half the rate of the other six, all of which still have one port apiece and would hence always be forced to wait for these last two to catch up.

Let’s try an analogy. Imagine a RAID 1+0 array of hard disks made up of eight drives. Seven of the drives are identical but the eighth is a mismatch and slower than the others. We arrange these drives in an array consisting of four RAID 1 pairings, which are then linked together, or “striped” as one big RAID 0 array. As only seven of the comprising drives match, three pairings would be able to operate at their maximum speed, but the fourth would be encumbered by the runt drive.

As a RAID 0 array depends on every disk making simultaneous writes, this would result in the three healthy pairings or “mirrors” being impeded by the fourth, thereby compromising the throughput of the entire array.

–~~~~~~~~~~~~–

What we have done, in essence, is to isolate the eighth drive, keep it linked to the array via the seventh and designate it as an additional reserve of cache which the host system will only employ when the rest of the array has been fully populated. The seventh drive can then perform to its optimum level and in parallel to the remaining six whilst no storage space is sacrificed.
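(For those who prefer sums to similes, here is a rough sketch of the Minion's bandwidth argument. It is a deliberately crude model, not Nvidia's own arithmetic. It assumes eight controllers, seven crossbar ports with the seventh shared by two controllers, and striping that runs at the pace of the slowest member. Rates are in arbitrary "one port's worth" units and all names are hypothetical.)

```python
# Simplified model of the Minion's argument; units are "one port's worth".
# Assumptions (for illustration): eight controllers, seven crossbar ports,
# the seventh port shared by two controllers, and striped access that can
# only move as fast as its slowest controller.

CONTROLLERS = 8
PORT_RATE = 1.0   # throughput of a single crossbar port, arbitrary units


def ideal_rate():
    """All eight controllers, each with its own port (the 980's situation)."""
    return CONTROLLERS * PORT_RATE


def unpartitioned_rate():
    """Stripe across all eight controllers while two of them split one port."""
    slowest = PORT_RATE / 2           # each of the sharing pair gets half a port
    return CONTROLLERS * slowest      # the other six wait for the slowest pair


def partitioned_rates():
    """Seven controllers striped at full speed; the eighth served on its own.

    These are peak figures. The two segments cannot both reach them at once,
    because the seventh and eighth controllers still share a port.
    """
    fast = 7 * PORT_RATE
    slow = 1 * PORT_RATE
    return fast, slow


if __name__ == "__main__":
    fast, slow = partitioned_rates()
    print("unpartitioned share of ideal:", unpartitioned_rate() / ideal_rate())
    print("slow segment relative to fast:", slow / fast)
```

Under these assumptions the undivided pool manages half of the ideal rate, while the partitioned layout keeps seven eighths of it for the main segment and leaves the last half gig at one seventh of that. Taking the card's advertised 224GB/s peak, a figure from the spec sheet rather than from this story, one port's worth would be 28GB/s.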

The Irked: Very clever. So I guess this explains the stuttering.

Nvidia Minion: Stuttering?

The Provoked: That’s right, whenever memory occupancy breaches that three and a half gig boundary, we get a damned slide show.

Nvidia Minion: That’s odd, none of our tests manifested such a symptom, it’s not possible. You see, the last portion of RAM might be slower, but it’s addressed far less often and even when it is, it’s never in big chunks for prolonged periods, certainly not long enough to induce stuttering. We’re talking fractions of a second. The performance penalty is all but negligible because the detriment is subsumed into and aggregated across the whole interface.

The Aggrieved: Then how do you explain the symptoms in Shadow of Mordor?

Nvidia Minion: We’ve carried out our own tests, we’ve compared the 970 with the 980 in several games, including Mordor. We forced specific scenarios to ensure both the primary and auxiliary memory banks were accessed and others to exploit the primary bank alone. We conducted multiple iterations of benchmarks for each condition.

Considering the 970’s three missing SMMs, the performance discrepancy we recorded was firmly in line with what you’d expect. The results have been published, feel free to review them. We didn’t observe any stuttering over and above what is typical when you choose to run the latest, most demanding titles at high resolutions and with armfuls of optical pic n mix.

The Offended: The fact remains, that if those ROPs and cache had been left active, as you insisted they were when we all made our purchases then, under certain circumstances, no matter how small or insignificant you might feel they are, the performance of this card would have been higher.

Nvidia Minion: There’s no evidence I’ve seen to suggest that, the three SMMs are this card’s only hindrance and their absence has been known from day one.

The Misled: There is firm evidence, look at the results for Nai’s CUDA benchmark, clearly demonstrating the last portion of RAM is sub-par.

Nvidia Minion: Indeed, we openly admitted as such, but neither the RAM nor that particular benchmark, which is actually a diagnostic utility, has any bearing on real world performance.

The Deceived: And what about our collective experiences in games? Major degradation and choppiness that corresponds exactly to when that upper partition of RAM is called into service.

Nvidia Minion: Again, we’ve been unable to replicate those declines. I do wonder if perhaps you’re expecting a little too much. Seriously, ask yourself, would you really be applying settings as high as you are, or testing quite as vigorously if you hadn’t known about this memory anomaly?

The Apoplectic: Stop dancing around the issue. What if you hadn’t mentioned the disabled streaming multiprocessors but had informed us about the ROPs, cache and memory? Would things have been any different? Would you have admitted your folly and apologized for flagrant deceit and opportunism?

Nvidia Minion: That’s a little harsh, it was a misunderstanding between our engineering and technical PR divisions, these things happen.

The Upset: Perhaps you should have connected them with a “BUDDY INTERFACE”.

Nvidia Minion: For what it’s worth, I’m sorry if you feel offended, misled or exploited; it was absolutely not our intention. There’s nothing sinister or premeditated about this. We’ll republish the official spec since it was inaccurate, but in terms of value and performance you’ve still got exactly what you invested in.

The Irritated: Fine, so next time I buy a Rolex that turns out to be a counterfeit I should shut up and be happy, correct?

Nvidia Minion: That’s not the same thing. Materially, the counterfeit would be completely different from what you presumed you were purchasing.

The Manipulated: But performance would be exactly the same.

Nvidia Minion: You don’t buy Rolexes merely to tell the time, they’re a fashion statement. A counterfeit would never afford the wearer the same status.

The Ired: Assuming his friends or acquaintances were discerning enough to tell the difference. What if we were sold Ruby that turned out to be Garnet, or Emerald that turned out to be Peridot or diamond that turned out to be Moissanite, all convincing enough to fool everybody except jewellers, whom I never wanted to impress to begin with. What if I purchased tickets for a Rolling Stones concert and was instead rewarded with a 100% authentic Rolling Stones tribute act? Should I solicit a refund?

Nvidia Minion: I’m no legal expert sir, I can’t participate in a hypothetical. Now if you’ll excuse me, I really must go, the master commands my presence. If you’re still not satisfied, you can always take it up with your place of purchase.

The Annoyed: Shame, things were just getting interesting.

Nvidia Minion: I know. Just bad luck, isn’t it? Work always gets in the way of life. Oh, by the way. Expect a new driver very soon.

The Disgusted: Don’t worry, we shall. Expect a class action. Farewell for now. Oh! and best of luck against Samsung and all your official vendors who’ll be seeking recompense once they’ve negotiated an avalanche of returns from the e-tailers you suggested we complain to. You know how these things work. Leave your troubles on the doorstep, and they’ll break in through the windows.

AMD Fanatics: And to think of the flaming we suffered for the Hawaii’s fan and throttle shenanigans.
