A computer components & hardware forum. HardwareBanter

CELL 2 "Enhanced Cell Broadband Engine" to be revealed soon



 
 
  #1  
April 12th 07, 05:52 AM, posted to comp.sys.super,comp.arch,comp.arch.embedded,comp.sys.ibm.pc.hardware.chips,comp.os.linux.advocacy
AirRaid

http://www.ps3coderz.com/index.php?o...70&Itemid=31

Cell2: Some Thoughts
Written by nblachford
Thursday, 12 April 2007

Next week we should see the first details of the second generation
Cell processor. Little has been said publicly to date about it, but
knowing what we do we can attempt to figure out what's planned.

What we do know:
Cell 2, or properly "Enhanced Cell Broadband Engine", has full speed
Double Precision floating point units.

16,000 of them are due to be used alongside Opterons in the
forthcoming Roadrunner Supercomputer, the first slated to reach a
PetaFLOP of computing power.

They are due to ship in a blade with up to 16 GigaBytes of RAM,
probably sometime in 2008.

Predicting Performance
No figures have been published as yet but there are ways of working
out expected performance.

One of these is a dead give-away - the recently released SDK 2.1
incorporates a compiler and simulator capable of simulating the
processor. You should be able to get a very good approximation of the
performance from that.

However, the simulator can only simulate; it can't tell you things
like clock speed or other technical specifications of the final
hardware. We can, however, work some things out...

The Roadrunner supercomputer is slated to achieve a PetaFLOP of
computing power (1 million GigaFLOPS). This will be done using the
Cell processors; the Opterons are used for I/O and control.

The use of Opterons for I/O implies the relatively weak PPE is not
getting a significant upgrade. If it were, the Opterons would not be
necessary.

It also tells us each Cell has to get a linpack 1K rating of at least
62.5 Double Precision GigaFLOPS (1 million GigaFLOPS divided across
16,000 chips). While Cell is fast and highly efficient, it is not that
efficient on linpack, so getting a good linpack score is going to be
difficult. Just adding full Double Precision units won't do it.
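
The 62.5 GigaFLOPS figure is just the target divided by the chip
count. A minimal sketch of that arithmetic, assuming exactly 16,000
Cell chips, 1 PetaFLOP taken as 1 million GigaFLOPS, and no FLOPs
credited to the Opterons (the variable names are only illustrative):

```python
# Per-chip sustained rate needed to reach Roadrunner's target.
# Assumptions taken from the post: 16,000 Cell chips, 1 PetaFLOP total,
# and the Opterons doing only I/O and control (contributing no FLOPs).
PETAFLOP_IN_GFLOPS = 1_000_000  # 1 PetaFLOP = 1 million GigaFLOPS
CELL_COUNT = 16_000


def required_gflops_per_chip(total_gflops, chips):
    """Sustained GigaFLOPS each chip must deliver to hit the total."""
    return total_gflops / chips


per_chip = required_gflops_per_chip(PETAFLOP_IN_GFLOPS, CELL_COUNT)
print(f"{per_chip} DP GigaFLOPS per Cell")  # prints "62.5 DP GigaFLOPS per Cell"
```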

There are a number of ways that the good linpack score can be
achieved:

The first is to use a mixed precision technique, which uses double
precision only when absolutely necessary and single precision at other
times. This works for linpack and has achieved very high rates already
[1]. This, however, may be considered "cheating" as it doesn't really
measure double precision performance.

The second method would be to upgrade the hardware. The floating point
hardware is obviously changing but we don't know about any other
changes as yet.
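
To make the first method concrete, here is a rough sketch of mixed
precision iterative refinement in plain Python. It is not the code
behind [1]. Single precision is emulated by rounding through the
struct module, and the helper names (f32, solve_single, refine) are
made up for illustration. A real implementation would factor the
matrix once and reuse the factors. This sketch simply re-runs the
elimination each time to stay short.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_single(a, b):
    """Gaussian elimination with partial pivoting, rounding every result to single."""
    n = len(b)
    # Augmented matrix [A | b], stored in emulated single precision.
    m = [[f32(v) for v in row] + [f32(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = f32(m[r][col] / m[col][col])
            for c in range(col, n + 1):
                m[r][c] = f32(m[r][c] - f32(factor * m[col][c]))
    # Back substitution, still in emulated single precision.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n]
        for c in range(r + 1, n):
            s = f32(s - f32(m[r][c] * x[c]))
        x[r] = f32(s / m[r][r])
    return x


def residual(a, b, x):
    """r = b - A x, the only step done in full double precision."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(a, b)]


def refine(a, b, iterations=5):
    """Start from a single precision solve, then correct it using double residuals."""
    x = solve_single(a, b)
    for _ in range(iterations):
        d = solve_single(a, residual(a, b, x))  # cheap single precision correction
        x = [xi + di for xi, di in zip(x, d)]   # accumulate in double precision
    return x


# Small, well-conditioned example system (illustrative values only).
A = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]]
x_true = [0.1, 0.2, 0.3]
b = [sum(aij * xj for aij, xj in zip(row, x_true)) for row in A]

err_single = max(abs(u - v) for u, v in zip(solve_single(A, b), x_true))
err_refined = max(abs(u - v) for u, v in zip(refine(A, b), x_true))
print(err_single, err_refined)
```

Almost all the arithmetic runs in single precision, yet the refined
answer ends up close to double precision accuracy. That is why it
scores well on linpack and why some would call it cheating.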

Double Precision requires twice the room for data compared to single
precision, so applications which switch to DP will have problems if
the current local store size is kept. I expect we'll see the local
store doubled in size to 512 KiloBytes.
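
A quick sketch of the capacity argument, assuming the current local
store is 256 KiloBytes per SPE (the figure that doubling to 512
implies), 4-byte singles and 8-byte doubles. It ignores the program
code that shares the local store with data:

```python
# How many floating point values fit in one SPE local store.
# Assumptions: 256 KiloBytes today, 512 KiloBytes if doubled,
# and the whole store used for data (in practice code lives there too).
KIB = 1024
SINGLE_BYTES = 4
DOUBLE_BYTES = 8


def elements_that_fit(local_store_kib, bytes_per_element):
    """Number of values of the given size that fit in the local store."""
    return (local_store_kib * KIB) // bytes_per_element


current_sp = elements_that_fit(256, SINGLE_BYTES)
current_dp = elements_that_fit(256, DOUBLE_BYTES)
doubled_dp = elements_that_fit(512, DOUBLE_BYTES)

print(current_sp, current_dp, doubled_dp)  # prints "65536 32768 65536"
```

Doubling the store only brings a DP application back to the working
set size that a single precision one has today.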

The second area to change is the memory controller. If data takes
twice the room it will also take twice the memory bandwidth. XDR
already runs at twice the rate of the interface in the standard Cell
so this is an option. XDR2 is even faster still so it could also be
used.

Since this chip will be used in higher end machines than the existing
Cell, it can be more expensive, so additional pins can be added if
they want. Adding more pins will allow for more memory controllers or
memory lanes to be added, again increasing bandwidth.

I think a doubling of memory bandwidth is highly likely and a
quadrupling is also a possibility because of the need to increase the
efficiency of the chip. Increasing the number of lanes also means more
memory chips can be connected, and we already know this chip is due in
blades with up to 16GB.

A clock frequency rise is a distinct possibility. However, pushing the
clock much higher raises power consumption sharply, so they won't go
too far. That said, I expect they'll be able to go safely above 4GHz
without problems.

Interconnects
One other possibility for change is in the I/O connections. The
Opterons will utilise HyperTransport 3, a very high speed I/O system.
Having this directly on Cell would allow it to be used with commodity
PC chipsets, which would save a lot of money for anyone interested in
building Cell workstations.

I don't think there will be any other big changes, though there will
likely be all sorts of tweaks and certainly there'll be additional
work on lowering the power consumption. The internal bus system (EIB)
may be beefed up a bit if the external memory bandwidth increases.

A lot will have been learned in the development of the first Cell
processor so there is the possibility of all sorts of other changes.

We also can't rule out the possibility of additional SPEs being added.
However there is nothing to indicate this will happen.

Conclusion
Cell2 is due to be discussed shortly. We already know it's a processor
designed for high performance Double Precision FLOPs, will access a
lot more memory and will appear in a large supercomputer. The
performance necessary means there is a need for higher efficiency in
the new processor. I expect this will mean adding higher capacity
local stores and higher memory bandwidth.

Contrary to common opinion, the existing Cell is actually very fast on
DP maths [2][3]. It should be at least in line with most desktop
processors, and well ahead in some cases. Beefing up the hardware in
Cell2 should make the DP Cell an absolute beast. Cell has already
shown performance 10-100 times faster than existing processors on
single precision, and this new processor should do the same for double
precision. Cell has already had a lot of interest from the scientific
& HPC community; expect this chip to bring a lot more.

  #2  
April 13th 07, 12:38 AM, posted to comp.sys.super,comp.arch,comp.arch.embedded,comp.sys.ibm.pc.hardware.chips,comp.os.linux.advocacy
Quadibloc

AirRaid wrote:
> What we do know:
> Cell 2, or properly "Enhanced Cell Broadband Engine" has full speed
> Double Precision floating point units.


Now that's a surprise. I remembered seeing a web site mentioning it
somewhere - and this feature, the one I would most like to see in such
a chip was, I thought, *specifically* indicated as something it
wouldn't have.

I'm very glad to hear the good news.

> Contrary to common opinion the existing Cell is actually very fast on
> DP maths [2][3]. It should be at least in line with most desktop
> processors, and well ahead in some cases.


Yes, that is true - it has more raw throughput than a typical
microcomputer chip on double-precision, despite the disparity between
its double-precision and single-precision speed. Nonetheless, the
fact that it's just "ordinary" in DP performance is a disappointment.

Except for the PlayStation 3, which uses units with one non-working
processor, all the machines using the existing Cell are very
expensive, so the Cell 2 isn't likely to be available to "the rest of
us" any time soon.

John Savard

  #3  
April 22nd 07, 08:47 PM, posted to comp.sys.super,comp.arch,comp.arch.embedded,comp.sys.ibm.pc.hardware.chips,comp.os.linux.advocacy
AirRaid

http://forum.beyond3d.com/showthread.php?t=40661

quote 'one':

"As announced a month ago, yesterday at Cool Chips X, IBM did a
presentation about the SPE in the new 65nm Cell B.E. for HPC with DP
enhancement. Tech-on has an article (reg required).

The Enhanced BE supports DDR2 (DDR2-800) up to 16GB. The DP FLOPS
increased from 25.6 Gflops to 102 Gflops, the DP latency is reduced
from 13 cycles to 9 cycles with a full pipeline and dual issue. It
supports denormal and expected NaN to be more IEEE compliant. Its SPU
ISA is v1.2, with 5 new DP instructions. The transistor count is 250
million (from 241 million for 90nm Cell), the chip area is 212 mm2
(from 235 mm2), and it consumes 100 watts (from 110 watts)."

http://techon.nikkeibp.co.jp/article...5/P1030856.jpg
http://techon.nikkeibp.co.jp/article...5/P1030852.jpg
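
Putting the quoted 102 Gflops figure next to the Roadrunner numbers
from the first post gives a rough idea of the linpack efficiency
needed. This is only a sketch. It assumes 102 Gflops is the peak DP
rate per chip, that 16,000 chips are used, and that the Opterons add
nothing to the total:

```python
# Rough linpack efficiency needed for 1 PetaFLOP.
# Assumptions: 102 Gflops is the peak DP rate of one enhanced Cell
# (figure from the quote above), 16,000 chips, 1 PetaFLOP target
# (both from the first post), no FLOPs from the Opterons.
CELL_COUNT = 16_000
TARGET_GFLOPS = 1_000_000
PEAK_DP_GFLOPS_PER_CHIP = 102

peak_total_pflops = CELL_COUNT * PEAK_DP_GFLOPS_PER_CHIP / 1_000_000
required_per_chip = TARGET_GFLOPS / CELL_COUNT
required_efficiency = required_per_chip / PEAK_DP_GFLOPS_PER_CHIP

print(f"peak: {peak_total_pflops:.3f} PetaFLOPS")        # peak: 1.632 PetaFLOPS
print(f"needed efficiency: {required_efficiency:.1%}")   # needed efficiency: 61.3%
```

Under those assumptions the machine has to sustain a little over 61%
of peak on linpack to reach a PetaFLOP.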




 



