#1
RAMGATE in full swing. (GPU/Cuda Memory Bandwidth Performance Test now available) nice for GTX 970 testing ! ;)
I had some hopes that maybe RAMGATE wasn't going to be so bad... but in reality it's quite BAD.

Seems like NVIDIA's latest GTX 970 graphics chip is seriously bugged. It has 8 processors packed in a 4x2 package, sort of. One of those packages has one chip missing, so that package is running at 50% capacity. This would normally reduce the entire system speed to just 4x instead of 8x, because it would be the bottleneck. Apparently they found some way to still use 7 out of 8 processors, so the GTX 970 is a 7x chip.

However, they got a bit greedy... they also wanted that last (8th) 500 MB segment.

I don't quite understand why they did not simply connect all 8 of the 500 MB RAM chips to the 8 processors and 7 caches in such a way that the full 4 GB addressing range could be used. I can vaguely understand that chip 4 out of 4 would be bottlenecked... but by sending 1/7 of the workload there and the other 6/7 to the other 3x2 chips, the system should be able to run somewhat decently at 7x the speed... especially since processors are normally faster than memory chips.

Anyway, let's look at it one more time:

http://www.pcper.com/reviews/Graphic...ations-GTX-970

I suspect they made the decision to segment the space like MS-DOS once did... to prevent a queuing/hourglass effect because of the missing L2 cache chip. But the funny thing is... the cache chip is probably not even that important, for some things at least. Then again maybe it is, but still...

Now they will probably have to limit the chip to just 3.5 GB to limit access to the slow 0.5 GB part.

The odd thing is that by accessing both segments at the same time, they claim it can still run at full speed, which is of course a bit cheesy... what software would actually do that? Probably none...

Their chip memory requests go something like:

0,1,2,3,4,5,6,0,1,2,3,4,5,6

instead of:

0,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7

So one DRAM chip is probably being skipped completely most of the time, whenever something is accessed below 3.5 GB?! I find this a somewhat weird decision.
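The request pattern described above can be illustrated with a small model. This is only a sketch of the idea, not NVIDIA's actual address mapping: the 1 KB stripe size, the names `channel_for_address` and `channels_touched`, and the assumption that the first 3.5 GB is striped round-robin over channels 0..6 while the last 0.5 GB sits entirely on channel 7 are all made up for illustration.

```python
# Toy model of the segmented memory layout discussed above.
# ASSUMPTIONS (not from NVIDIA documentation): 1 KB stripes, round-robin
# striping over channels 0..6 for the first 3.5 GB, channel 7 for the rest.

GIB = 1024 ** 3
STRIPE_BYTES = 1024                 # hypothetical stripe size
FAST_CHANNELS = 7                   # channels serving the 3.5 GB segment
FAST_SEGMENT_BYTES = 7 * GIB // 2   # 3.5 GB
TOTAL_BYTES = 4 * GIB


def channel_for_address(address: int) -> int:
    """Return the (hypothetical) DRAM channel that serves a byte address."""
    if not 0 <= address < TOTAL_BYTES:
        raise ValueError("address outside the 4 GB range")
    if address < FAST_SEGMENT_BYTES:
        return (address // STRIPE_BYTES) % FAST_CHANNELS
    return 7  # the slow 0.5 GB segment lives on a single channel


def channels_touched(start: int, length: int) -> list[int]:
    """Channels hit, in order, by a linear read of `length` bytes from `start`."""
    return [
        channel_for_address(address)
        for address in range(start, start + length, STRIPE_BYTES)
    ]


if __name__ == "__main__":
    # Linear read inside the fast segment: cycles through channels 0..6.
    print(channels_touched(0, 14 * STRIPE_BYTES))
    # Linear read inside the slow segment: every request lands on channel 7.
    print(channels_touched(FAST_SEGMENT_BYTES, 4 * STRIPE_BYTES))
```

In this model a linear read below 3.5 GB never touches channel 7, and a linear read above 3.5 GB touches nothing else, which is the skipped-chip behaviour suspected above.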
But maybe they tested it and came to the conclusion that the missing cache chip has such a bad influence on total system performance that they decided to segment it.

Kinda funny... for my own corewars app the cache is almost useless, so... I guess they had games in mind... maybe they can re-configure the card for either gaming or CUDA.

I wonder if it's hard-wired or if it's somewhat flexible. If it's flexible they'd be a little bit in luck... if it can be done in the driver then they're lucky too... but if not... they've got a big **** storm coming LOL.

I am touchy/feely about RAM because it's the main bottleneck of would-be applications.

DON'T **** WITH RAM EVER! =D

Well, they ****ed it... now they're gonna be ****ed LOL.

Their solution is interesting from a processing perspective, but from a gamer/stutter perspective it's currently horrible, unless they can fix the stutter. If they can't fix the stutter, they're ****ed.

Anyway, what's worse is they now have an admin behaving like a NAZI or something and deleting postings from RAVING NVIDIA customers.

What's even funnier is they deleted my posting about my new RAM BANDWIDTH test as well?!

I've wanted to write a RAM BANDWIDTH test like this for a long time now... But seeing this horrible situation I can no longer stick my head in the sand. I had to write a tool to help people figure this out, inspired by Nai's benchmark.

I have made some interesting discoveries with this tool. Apparently NVIDIA may have more dirty secrets they don't want anybody to know about, hence their possible reason for deleting most of the postings about my new tool.

Anyway, the discovery is:

As more and more "CUDA memory objects" are created, the CUDA memory becomes slower and slower and slower.

I am not yet sure why this is. Is the driver being "taxed" because of all the objects? Is it perhaps processing a "list of memories"?

I could instead allocate one gigantic block and treat it as if it were multiple memory blocks... just to see if the multiple allocations are causing the slowdown.

Another theory could be that the higher memory addresses are being hosted on slower and slower chips, maybe to save costs?

However, I think I can vaguely remember reading about higher addresses being slower in some PDF a long time ago. Perhaps I am being delusional?

Well, whatever the case may be... now that RAMGATE is in full swing and we are talking about a 7/8 performance drop, you can bet your ass more and more people are going to jump on this and investigate graphics card memory... closer and closer...

Apparently other NVIDIA cards also had some dirty tricks... with "unbalanced memory". It wasn't as bad that time...

I have also been wondering why cheaper graphics cards always have less bandwidth... It's easy to say, well... that's just because there are fewer chips on it, etc... but is it really? Is it really that expensive to give more bandwidth to GPUs? For now I will believe it... but hmm...

Why not feed 512 bits from a single DRAM chip across 512 pins? Why would that be bad? Too much energy cost for 512 pins? Weird... I guess it needs to be processed anyway... but at least there is some caching effect that way... now it just seems to be 8x64 bits or so = 512? Don't think so. Or perhaps 4x64 bits, or 8x32 bits = 256 bits.

It says: "32 bit memory controller segment", so apparently all these chips are "32 bit" addressable. Perhaps older chips from the 32-bit addressing era... Perhaps 64-bit addressable chips cost more... hmm...

Well anyway... I don't like ADMIN NAZIS at all... and I am most certainly not going to put up with ANY of that CRAP whatsoever, EVER. So once again I will have to fall back on usenet to spread the message =D

There was a saying once: "Don't shoot the messenger" LOL. There are a lot of "messenger shootings" nowadays on FORUMS LOL.
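A quick calculation may help with the bus-width question raised above. This is a back-of-the-envelope sketch: it reads "32 bit memory controller" as the width of each controller's data path rather than an addressing limit (an interpretation, not something the post confirms), and it assumes the 7 Gbit/s per-pin GDDR5 data rate from the GTX 970's advertised specifications. The function name is made up for this example.

```python
# Back-of-the-envelope peak bandwidth from bus width and per-pin data rate.
# ASSUMPTIONS: 8 memory controllers of 32 bits each, 7 Gbit/s per pin
# (the GTX 970's advertised GDDR5 data rate); real throughput will be lower.

CONTROLLER_WIDTH_BITS = 32
DATA_RATE_GBIT_PER_PIN = 7.0


def peak_bandwidth_gb_per_s(controllers: int) -> float:
    """Theoretical peak in GB/s when `controllers` channels run in parallel."""
    bus_width_bits = controllers * CONTROLLER_WIDTH_BITS
    return bus_width_bits * DATA_RATE_GBIT_PER_PIN / 8  # 8 bits per byte


if __name__ == "__main__":
    for n in (8, 7, 1):
        bandwidth = peak_bandwidth_gb_per_s(n)
        print(f"{n} x 32 bit = {n * 32} bit bus -> {bandwidth:.0f} GB/s")
```

Under these assumptions the full 8-controller bus comes to 224 GB/s, seven controllers to 196 GB/s, and a single controller to 28 GB/s, which is why a segment served by one controller looks so slow next to the rest.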
Here is my Test CUDA Memory Bandwidth Performance application, with a nice GUI, block size setup, round setup, graph/chart, log/error messages, kernel source and PTX source.

The packed folder contains a WinRAR file containing the 3 files; 2 of them are necessary to run the application (*.exe and *.ptx).

http://www.skybuck.org/CUDA/BandwidthTest/

My GT520 is showing approx. 1.5 gigabyte/sec bandwidth with this float4 kernel and a short running time. It should be able to achieve 9 gigabyte/sec, so I'm not sure why it's so low... (maybe the kernel launch parameters could be better).

I am kinda curious what other graphics cards will show.

Also, this is a first version/release (0.03); maybe later I will update it a little bit so it has some better launch parameters/optimal calculation support for newer graphics cards. For now this will have to do!

The unpacked folder contains the 3 files unpacked in case anybody is having trouble extracting them.

I just added a little "save chart to file" button, which saves the chart into two files: one "bitmap" and one "wmf", which is a Windows vector graphics format that is much smaller. Basically all Windows systems should be able to read that file; you could then open it in MS Paint and re-save it as a jpg or so.

Here is an example of a single run: [image missing]

Here is an example of multiple runs: [image missing]

And finally I will convert the single one to jpg so it can be shown here: [image missing]

I hope you enjoy it... maybe this little app will shine some more light on things. I'd be curious to see some charts of the GTX 970 and perhaps other models like that as well, to see if there is indeed some truth to it all?!

Best of all, check the setup tab of my app. It allows other block sizes to be tested as well... the most interesting graphs are then rendered.

(The "memory: 1 GB" label is rounded to whole numbers for now, so either 1 or 2 or 3 or 4 GB and so forth... no 3.5 GB or so will be mentioned... Didn't have time yet to code that properly, but it's a minor issue... just a title above the graph... so if you have a graphics card memory system with 0.x GB and you're wondering why this tool doesn't state the correct fractional number... now you know.)

I've been up for a while... really enjoyed reading into this issue... and this posting is kinda long, mixed with text from the forum and of course this new text. And I am kinda feeling fuzzy... and you know me... my usenet postings are often fuzzy... I like fuzzy... don't you just like fuzzy?! =D

Eeeehhhum, what more can I write for your entertainment? Oh yeah, that's right... the fun I had reading about this RAMGATE!!!

I JUST LIKE TO SAY ONE THING AND ONE THING ONLY:

"BIG KUDOS/APPLAUSE to the people that figured this (one) OUT! Because this was one hell of a bitch to discover... IT WAS UNDOCUMENTED, maybe COVERED UP... it was questioned... it was disbelieved by some... certainly not me... WHEN MY FELLOW GAMERS NAG ABOUT STUTTER I BELIEVE THEM... BECAUSE THESE GUYS OWN THE GAMING WORLD, YEAH... THEY CAN SPOT A SNUTTERY SNOT SNOT SNOT STUTTER FROM FAAAAAR AWAY, THE OTHER SIDE OF THE GALAXY!!! LOLOLOL =D just like mmmeeeee, YEAH LOL" =D

We gamers are really sensitive to "MICRO STUTTER" or any "STUTTER" for that matter.

When ya think about it, it's kinda funny that NVIDIA thought they could hide this STUTTER?!

It's a bit like trying to hide a SPOOK/GOON right in front of RAMBO's NOSE LOLOLOL.

"HEY? HEY? HEYYYYYY? What's that I SMELL?" Rambo goes... "IS THAT A GOON I SMELL?" LOLOL.

And you know what... that guy at GURU 3D that had an ICON/avatar like CLINT MOTHA ****ING EASTWOOD! WAS FOKKING BRILLIANT/HILARIOUS/AWESOME/UNFORGETTABLE!

It was like BIG MISTER CLINT EASTWOOD himself FROM THAT WILD WEST MOVIE came to clear things up =D YEAH BABY =D

TITITITIT TUT TUT TUUU TITITITITI TU TU TUUUUU!
=D

And then add a bit of Dirty Harry entering a time machine into the wild west movie "The Good, the Bad and the Ugly" and then saying with a real bad/dark/deep beer/smoke/raw out-of-bed cracked-up voice:

"HRRREEYYY NVIDRRIA?! Wrrhartt's THRRRAT I HEEEAARRRRR?! Yourrrrrr rrrsselling BAD GRAPHICS CARDS I HEAR?!"
"HHMMRRR, YOU FEEEEL LUCKY, PUNK??!!!"
"I know whatrrr youurrr thinkingrrr?!"
"Willll... or willl they not discover the memory bug issue?!"
"I know whatrrr yourr thinking?!"
"Will they run at full speed, 8x times... or will they drop back to 1x time and notice it?!"
"Well, whatta ya think, punks?!"
"Huhh"
"You think we're not gonna detect it!?"

This here is a bad ass bug detector... It will blow your cover wide open! =D HAHAHAHA =D

Oh yeah... just making fun of this whole situation and especially NVIDIA... sorry, can't resist... this is so bad... it must be published.

Meanwhile my Outlook Express editor is also behaving badly... probably because of the graphics/wmf, or maybe an accidental key press somehow ****ed it up. Maybe something with shift or something.

Anyway, my Clint Eastwood impression was better the first few times or so... actually I have it on video camera... from when I was watching that forum... because it's something legendary... absolutely... this is going to be the stuff of legends... and you bet I got it on video tape! =D HAHAHAHA =D No matter what gets deleted... it's on my video tape, you get it! =D OOOHHHHHHH yeah =D

Gonna sit this one out and watch it play out... but maybe meanwhile I'll even improve my bandwidth test program.

It's just b-b-b-beautiful, this whole thing... hopefully this will cause more focus on RAAAAAMMMM.

We sure need better RRRRAAAAMMMM, now and forever, or at least for the coming decades!!

MORRRRREEE RRRRAAMMMM SPEED PLS!!! NOT LESS! NVIDIA got it backwards this time!

It felt like a time jump back to MS-DOS times. No, sorry, I certainly don't wanna experience a time machine in that way!

NO-NO-NO-NO-NOOooooo, THAT-WOULD-BE-BAD. EVEN BILL G. SHUDDERS at the THOUGHT of THAT... I BET! hahahahaha.

No segment/offset or segmentation bull**** EVER AGAIN! NNNNNOoooooooo. It's programmer horror.

Just when ya thought the segmentation horror days were over... IT GETS EVEN WORSE!!! AAAAAAAAHHHH

Now you don't even know it's segmented PERFORMANCE WISE?! AAAAAHHH

THE RAM IS THERE... BUT THE SPEEEEEEEED IS NOT?! NOWHERE TO BE FOUND?!!! At certain addresses?!!!

AHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH

Just the idea of that makes me go: AAAHHHHHHHHHHHHHHHHHHHHH

Just to be clear one more time ^

Yeah, this BURN IS GOOD I SAY! BURN THIS WITCH! AND BURN IT FAST =D

MY RECOMMENDATION FOR THE GTX 970 DESIGN: BURN IT AND NEVER EVER LOOK BACK AT IT EVER AGAIN =D NOT AS LONG AS YOU CANNOT GUARANTEE CONSISTENT PERFORMANCE OF THE RAM SYSTEM!

There ya have it again, folks: I hate inconsistency, that's my sentence as a software programmer.

A hardware (Nintendo, or was it the Cell chip designer?) programmer/designer once said: "I consider symmetry to be an esthetic".

It's basically the same thing: 8 processors, 8 caches, 8 RAM chips. That's symmetry.

8 processors, 7 caches, 8 RAM chips is asking for disaster LOL.

And now NVIDIA is in BIG disaster LOL =D

It was a nice try... and a failed try. LET'S BURN IT, AND LET'S BURN IT REAL GOOD.

If it was a terminator having to recover from 1 failed cache it might be nice, but for a gaming system... NOPE. NO STUTTERY ARNOLDS.

Break that piggy bank, them cents are gonna be necessary for lawyers. Most probably YES! =D

Total recall might be necessary too.

Bye,
  Bye,
  Skyburn (the-witch-gtx-970).
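As a footnote to the whole-number memory label mentioned earlier in this post: a fractional label needs very little code. The sketch below is in Python rather than the language of the actual tool, and `format_memory_label` is a hypothetical helper, not a function from the application.

```python
# Hypothetical helper: format a byte count as a GB label with one decimal
# when needed, instead of rounding to a whole number.

GIB = 1024 ** 3


def format_memory_label(total_bytes: int) -> str:
    """Return e.g. 'Memory: 3.5 GB' or 'Memory: 4 GB' (binary gigabytes)."""
    gigabytes = total_bytes / GIB
    # The 'g' format drops a trailing '.0', so whole sizes stay short.
    return f"Memory: {round(gigabytes, 1):g} GB"


if __name__ == "__main__":
    print(format_memory_label(7 * GIB // 2))
    print(format_memory_label(4 * GIB))
```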
#2
And Skybuck wouldn't be Skybuck if Skybuck didn't make up for the missing image links.

Apparently they somehow got filtered out... I was kinda hoping Outlook Express would auto-convert them to plain text links instead of images, but I guess not.

So I will just give you a link which you could have found yourself anyway if you followed the other link. But just to be clear... here there will be pictures of charts... charts of performance... boooeyeah:

http://www.skybuck.org/CUDA/BandwidthTest/Charts/

Yeah, and if you guys don't have webspace or don't want your name attached to pictures, send them to me and I will host them for you.

Remove one dot after each symbol and you should be good to go:

One more time, e-mail address in cryptic mode:

skybuck 2000 at hotmail dot com

Send me stuff (hopefully your charts) and peace out, mothers! YEEEEEAAAH =D

Bye,
  Skybuck.
#3
On Wed, 28 Jan 2015 07:21:02 +0100, Skybuck Flying wrote:

"Total recall might be necessary too."

You're a goddamned total retard. You starred in the movie.

"Bye,"

If ONLY!
#4
"Total recall might be necessary too."

"You're a goddamned total retard."

Lol no, you're welcome to try and make me look like a retard, but the only retard in this story is the person(s) that designed this stuttery RAM system! LOLOLOL.

If this doesn't get you fired then I don't know what will!

Bye,
  Skybuck.
#5
Hmmm...

Interestingly enough... I just came across a message in a thread on the NVIDIA forum, the thread that has now been read 750,000+ times in just a couple of days.

The message gave an explanation for the missing messages on the forum. An explanation that I had not heard before.

Apparently the NVIDIA forum had a functionality where people could file "reports/complaints". Just like in games.

And just like in games, apparently this functionality got abused, and people started hitting "reports/complaints" just to **** people off and sabotage the forum.

If enough reports/complaints were filed, then the message would be automatically hidden.

Whether this is the true reason for the removal of certain messages we will never know, but it does sound a little bit plausible.

Apparently this functionality has now been disabled to prevent further abuse.

Let's take all this information with a grain of salt.

But I do like mentioning it, because it's the first time I've heard an explanation like that on a forum! And I am LOLLING at those people hitting those report buttons! LOL. Nice strategy LOL.

Bye,
  Skybuck =D