A computer components & hardware forum. HardwareBanter


Very good news from AMD



 
 
  #1  
Old July 11th 11, 09:37 AM posted to alt.comp.lang.borland-delphi,alt.comp.periphs.videocards.nvidia,comp.arch,rec.games.corewar
Skybuck Flying[_7_]
external usenet poster
 
Posts: 460

(Not my subject description)

Status of my benchmarking of cuda, more will follow soon:

Well, time for some numbers...

Exact measurements still need to be done; so far I am counting the seconds
myself to get a rough estimate of what's going on.

Settings tested:

mElementCount.Value := 8000;
mBlockCount.Value := 10000;
mLoopCount.Value := 80000;

This means there are 10.000 blocks, each containing 8.000 elements. An
element is a 32 bit integer, so that's 4 bytes.

For a total of: 10.000 x 8.000 x 4 = 320.000.000 bytes (about 300 in the
mega range)

The loop count represents the number of times each block performs a memory
lookup within its element array of 8000.

This is done 80.000 times.
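
To make the setup above concrete, here is a minimal CPU-side sketch in Python. It is not the actual test program or kernel: it assumes each element stores a random index into its own block and that each lookup feeds the next one (a chained lookup). The names make_blocks and run_lookups are made up for illustration, and the sizes are scaled down so it finishes quickly.

```python
import random

def make_blocks(block_count, element_count, seed=0):
    """Build block_count arrays; each element holds a random index into its own block.
    (Assumption: the real benchmark's memory layout may differ.)"""
    rng = random.Random(seed)
    return [[rng.randrange(element_count) for _ in range(element_count)]
            for _ in range(block_count)]

def run_lookups(blocks, loop_count):
    """For each block, follow the chain index -> block[index] loop_count times."""
    total_lookups = 0
    checksum = 0
    for block in blocks:
        index = 0
        for _ in range(loop_count):
            index = block[index]  # one dependent memory lookup
            total_lookups += 1
        checksum += index  # keep a result so the work is observable
    return total_lookups, checksum

# Scaled-down settings (the posting uses 10.000 blocks and 80.000 loops).
blocks = make_blocks(block_count=10, element_count=8000)
lookups, checksum = run_lookups(blocks, loop_count=800)
print(lookups)  # 10 blocks x 800 loops = 8000 lookups
```

Because each lookup depends on the previous one, a loop like this mostly measures memory latency rather than raw bandwidth.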

So number of bytes transferred: 80.000 x 10.000 x 4 bytes = 3.200.000.000 (3
in the giga range)

It does all of this in approximately 16 to 20 seconds or so depending on
some minor kernel and/or launch tweaks.

So that's about 160 MB per second at 20 seconds (200 MB per second at 16
seconds)...

I am more interested in how many integers it could look up, which is about
80.000 x 10.000 in 20 seconds = 800.000.000 in 20 seconds.

So that's roughly: 40.000.000 in one second.
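
The arithmetic above can be checked with a few lines of Python. This is only a sketch using the settings and the hand-timed 16 to 20 seconds from this posting; the variable and function names are made up for illustration.

```python
element_count = 8_000
block_count = 10_000
loop_count = 80_000
bytes_per_element = 4  # 32 bit integer

memory_bytes = block_count * element_count * bytes_per_element
lookups = loop_count * block_count
bytes_transferred = lookups * bytes_per_element

def estimate(seconds):
    """Return (MB per second, lookups per second) for a hand-timed run."""
    return bytes_transferred / seconds / 1_000_000, lookups / seconds

print(memory_bytes)       # 320000000
print(bytes_transferred)  # 3200000000
for seconds in (16, 20):
    mb_per_second, lookups_per_second = estimate(seconds)
    print(seconds, mb_per_second, lookups_per_second)
```

At 20 seconds this gives 160 MB per second and 40 million lookups per second; at 16 seconds it gives 200 MB per second and 50 million lookups per second.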

These are just estimates for now...

I haven't done the same benchmark yet on the cpu, but the corewar simulator
was similar and did even more processing, with branches and such, and it
could do 160.000.000 in 1.5 seconds or so, or 1.0 second or so...

So things are starting to look a bit bad for cuda and the gt 520.

However, more exact measurements and comparisons need to be done... but
based on these first observations... the cuda memory lookup performance is
probably sucky and worse than cpu. I could be wrong though, but I don't
think so. I will continue with benchmarking it a bit more properly.

Some factors of influence:

For gpu:

1. The more memory it can use the faster it will probably be, but I am not
sure. The current benchmark has trouble allocating "test memory" on the host,
which needs to be transferred to cuda/device... so this is a bit of a nasty
situation. Perhaps a kernel could be written which initializes the memory...
so far this is done on the host to be sure it's done somewhat correctly.

For cpu:

1. The cpu has a very fast local cache where the problem size can probably
fit in quite well, this allows cpu to spin real fast over that memory.


CPU has about 16 GB/sec cache or so per core.

GPU has about 12 GB/sec, at least it said so on some chart. In reality it
seems to be lower, close to 9 GB/sec according to visual profiler, which is a
bit weird...

Anyway it could be interesting if other people could run my cuda benchmark
on their graphics card to see what results they get...

So I will probably release my test program in the near future, few days from
now or so, and the kernel in ptx form or maybe even cu form so people can
test it out

Consider this posting an "approximation"

Bye,
Skybuck.



  #2  
Old July 12th 11, 02:55 AM posted to alt.comp.lang.borland-delphi,alt.comp.periphs.videocards.nvidia,comp.arch,rec.games.corewar
Skybuck Flying[_7_]

Well now, time for some real numbers.

The numbers are fluctuating a little bit depending on the settings.

So far my prediction that more memory usage would lead to better gpu results
seems to be true.

Yesterday after I wrote the posting and went to bed I also thought to myself
that the results should probably be similar since they are probably both
using the same kind of memory technology, so it's the memory technology
itself that is limiting both systems.

That will probably need to change in the future... the way they interact
with the memory. The memory would need to be split up into multiple
independent sections with their own lanes somehow... that's what I am
thinking.

Anyway now let's go analyse the results:

Settings:

20.000 blocks
8.000 elements per block.
80.000 loops per block.

So 20.000x80.000 memory transactions were performed each on GPU and CPU.

First numbers:

Kernel execution time in seconds: 25.0861718750000000
CPU execution time in seconds : 31.0449578088835310

The CPU is 6 seconds slower.

In other words:

(new - old)/old * 100 = 24%, with old = kernel time and new = CPU time.

So the GPU is just 24% faster, it's a cheap GPU though.
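
As a quick check of the percentage above, here is a small Python sketch that plugs the two measured times into the same formula. The function name percent_slower is made up for illustration.

```python
gpu_seconds = 25.0861718750000000  # kernel execution time from this posting
cpu_seconds = 31.0449578088835310  # CPU execution time (debug build)

def percent_slower(old, new):
    """(new - old) / old * 100, as used in the posting."""
    return (new - old) / old * 100

difference = percent_slower(gpu_seconds, cpu_seconds)
print(round(difference))  # rounds to 24
```

Note that the formula is relative to the kernel time: the CPU run takes about 24% longer than the GPU run.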

However please keep in mind that this test only uses one core of the CPU.
So what remains to be explored is how the CPU would perform if it was
multi-threaded. That in itself would be an interesting test to see if
multi-cores can somehow increase MTS (memory transactions per second) or not.
My expectation is probably not, unless it's the memory controller itself
that's leading to the slowdown.

Other numbers from this test:

Cuda memory transactions per second: 63.780.157
CPU memory transactions per second : 51.538.159

Cuda/gpu does about 64 million per second.
CPU does about 51 million per second. (one core).

Interesting results, but not really.

Bye,
Skybuck.

  #3  
Old July 12th 11, 03:05 AM posted to alt.comp.lang.borland-delphi,alt.comp.periphs.videocards.nvidia,comp.arch,rec.games.corewar
Skybuck Flying[_7_]

However I made a little testing mistake.

The CPU program was still in debug compilation mode which adds overhead.

I ran the same test again but this time in release mode, here are the
results:

Kernel execution time in seconds: 25.0870683593750000
CPU execution time in seconds : 11.8696194628088207

Cuda memory transactions per second: 63777878.5898704829000000
CPU memory transactions per second : 134797918.7549603890000000

The cpu is twice as fast as the gpu!
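
The release-mode rates above follow directly from the settings (20.000 blocks x 80.000 loops) and the two measured times. A small Python sketch of that calculation, with variable names made up for illustration:

```python
transactions = 20_000 * 80_000  # blocks x loops per block

kernel_seconds = 25.0870683593750000  # GPU kernel time, release-mode run
cpu_seconds = 11.8696194628088207     # CPU time, release build

gpu_rate = transactions / kernel_seconds  # memory transactions per second
cpu_rate = transactions / cpu_seconds
ratio = cpu_rate / gpu_rate

print(f"GPU: {gpu_rate:,.0f} per second")
print(f"CPU: {cpu_rate:,.0f} per second")
print(f"CPU/GPU ratio: {ratio:.2f}")
```

The ratio comes out a little above 2, matching the roughly 64 million versus 134 million transactions per second reported here.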

This also matches more closely what I saw with redcode simulator:

134.797.918

134 million memory transactions for the cpu.

Some of them are probably lost to fewer cache hits, because the memory is a
bit more spread out and chained.

So far this seems to put a pretty big nail in the coffin for cuda, at least
for the current GT 520 GPU, which is 10 years younger than the AMD X2 3800+,
itself 10 years old by now.

So this 10 year old AMD X2 3800+ still kicks GT 520's ass twice over with
just a single core.

However that's just the total result.

Once MTS per euro is calculated, perhaps the GPU would win out...

Then there is also the heat issue. AMD X2 is twice as hot as GT 520.

These kinds of considerations would come into play for super computers.

(Power consumption costs, cooling costs and housing costs could also
play a role.)

Bye,
Skybuck.

 



