A computer components & hardware forum. HardwareBanter


new benchmark: core-2-core transfer-speed



 
 
March 2nd 08, 04:07 AM, posted to alt.comp.hardware.overclocking, alt.comp.hardware.overclocking.amd, comp.arch
Elcaro Nosille[_2_]

It's often said that the single-die design of AMD's Phenom is
superior to Intel's dual-die solution. Some time ago I wrote a
little benchmark that measures the speed of transfers from one
CPU core to another. It measures the throughput and latency of
linear and random memory accesses to cache lines that another
core has written just before. I ran it on my 3 GHz Core 2
Extreme quad-core, and someone gave me numbers from running it
on a Phenom overclocked to 2.53 GHz.

The numbers were surprising to me:

- First, it seems AMD didn't manage to get a real advantage from
its single-die design. Random memory accesses to a 16 kB block
that another core has just written into its L1 cache reach a
throughput of about 1 GB/s (!) on the Phenom, whereas my Core 2
Extreme reaches 3.9 GB/s between cores on the same die and
2.7 GB/s between cores on different dice.
For the same block size, the linear throughput between cores
is about 3 GB/s on the Phenom system; on my Core 2 Extreme it
is 6 GB/s between cores on the same die and 4.8 GB/s between
cores on different dice.
- Second, on Core 2 based CPUs, transferring from the L1 cache
of one core to the L1 cache of another core on the same die is
slower than when the data has already been written back to the
shared L2 cache and is transferred from there to the
destination core.
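
To put the random-access figures in perspective, here is a rough
conversion from throughput to time per cache line. It is only a
sketch under assumptions that are not part of the measurements:
64-byte cache lines, 1 GB = 10**9 bytes, and one full line moved
per random access. The function name ns_per_line is made up for
this illustration.

```python
# Rough conversion of the random-access throughputs quoted above
# into nanoseconds per cache line.
# Assumptions (not from the measurements): 64-byte cache lines,
# 1 GB = 10**9 bytes, one full line transferred per access.
CACHE_LINE_BYTES = 64


def ns_per_line(gb_per_s, line_bytes=CACHE_LINE_BYTES):
    """Time in ns to move one cache line at the given throughput."""
    bytes_per_ns = gb_per_s  # 1 GB/s equals 1 byte/ns when GB = 10**9 bytes
    return line_bytes / bytes_per_ns


# Random-access throughputs in GB/s, as quoted in the post.
RANDOM_GBPS = {
    "Phenom": 1.0,
    "Core 2 Extreme, same die": 3.9,
    "Core 2 Extreme, different dice": 2.7,
}

if __name__ == "__main__":
    for name, gbps in RANDOM_GBPS.items():
        print(f"{name}: {ns_per_line(gbps):.1f} ns per line")
```

Under these assumptions the Phenom figure works out to roughly
four times as long per line as the same-die Core 2 figure.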

Some people on a German board argued that these tests aren't
meaningful for real-world purposes because I only probe
transfers from one core to another in one direction while the
other cores do nothing.
So I wrote a new benchmark for Win32 with configurable
behaviour for:
- the pattern:
Linear measures the throughput of linear memory accesses;
random measures the throughput and latency of random memory
accesses. (Measuring the latency of linear accesses doesn't
make sense in my case because I don't do pointer-chasing on
linear accesses, so the memory accesses become pipelined.)
- the direction - unidirectional vs. bidirectional:
In a unidirectional transfer, one core produces data and
another consumes it; in a bidirectional transfer, both
cores are producers and consumers.
- the block-size:
The block is the entity produced by the thread on one core
and consumed by the thread on the other core. Block-sizes
range from "4k" to "64m".
- producers and consumers:
You can give the benchmark a number of core pairs to test.
When benchmarking unidirectional transfers, the first core
of a pair is the producer and the second is the consumer;
when benchmarking bidirectional transfers, both are
producers and consumers.
The core numbers run from 1 to N, where N is the number of
cores in the system. With Intel's quad-cores, the cores on
the same die are 1 and 2 or 3 and 4 (this relies on the
APIC IDs, and I've never seen a BIOS that assigns them
differently).
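
The remark about pointer-chasing in the pattern option can be
illustrated with a small sketch. It is not taken from the
benchmark's sources; it only shows the general idea: every slot
holds the index of the next slot to visit, so each access
depends on the result of the previous one and cannot be
overlapped with it. The chain is built with Sattolo's algorithm
so that it forms a single cycle through all slots. The names
make_chain and chase are made up here, and Python only
demonstrates the access pattern, not real cache timings.

```python
import random


def make_chain(n, rng):
    """Return a list nxt where following i -> nxt[i] visits all n
    slots in one single cycle (Sattolo's algorithm)."""
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j is always < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, start=0):
    """Follow the chain once around; each step needs the index
    loaded in the previous step, which is what exposes latency."""
    order = []
    i = start
    for _ in range(len(nxt)):
        order.append(i)
        i = nxt[i]
    return order


if __name__ == "__main__":
    # Assumed example size: a 16 kB block of 64-byte lines = 256 slots.
    chain = make_chain(256, random.Random(1))
    visited = chase(chain)
    print(len(set(visited)), "distinct slots visited")
```

A linear pattern, by contrast, knows every address in advance,
which is why its accesses can be pipelined and only throughput
is a meaningful figure there.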

You can download the benchmark, including the sources, at [1].
The .zip archive contains two batch files, quadcore.cmd and
dualcore.cmd. Both run a large number of benchmarks across
different patterns, directions, block sizes and core
configurations; one is for dual-core and the other for
quad-core systems. (I could also build a batch file for
8-core systems with two CPUs, or even larger ones.)
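
As a sketch of the kind of test matrix such a batch file walks
through, the following enumerates combinations of pattern,
direction, block size and core pair. Several things here are
assumptions and not taken from the archive: the block sizes
double at each step between "4k" and "64m", k and m mean 1024
and 1024**2 bytes, and the two core pairs shown (cores 1 and 4
assumed to sit on different dice) are just an example selection.
It does not reproduce the benchmark's actual command line.

```python
from itertools import product


def parse_size(text):
    """Turn a size like '4k' or '64m' into bytes.
    Assumes k = 1024 and m = 1024**2."""
    units = {"k": 1024, "m": 1024 ** 2}
    return int(text[:-1]) * units[text[-1].lower()]


def block_sizes(first="4k", last="64m"):
    """Assumed progression: double the size from first to last."""
    size, end = parse_size(first), parse_size(last)
    while size <= end:
        yield size
        size *= 2


PATTERNS = ("linear", "random")
DIRECTIONS = ("unidirectional", "bidirectional")
# Hypothetical quad-core selection: one same-die pair (1, 2) and
# one pair assumed to span both dice (1, 4).
CORE_PAIRS = ((1, 2), (1, 4))


def matrix():
    """Every combination a batch run of this shape would cover."""
    return list(product(PATTERNS, DIRECTIONS, block_sizes(), CORE_PAIRS))


if __name__ == "__main__":
    for pattern, direction, size, pair in matrix():
        print(pattern, direction, size, pair)
```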

It would be nice to see some results in any of the newsgroups
I posted to. You can copy the output of the batch file by
choosing the copy function from the console's system menu.

[1] http://depositfiles.com/files/3877959
 



