#1
CELL 2 "Enhanced Cell Broadband Engine" to be revealed soon
http://www.ps3coderz.com/index.php?o...70&Itemi d=31
Cell2: Some Thoughts
Written by nblachford
Thursday, 12 April 2007

Next week we should see the first details of the second generation Cell processor. Little has been said publicly about it to date, but from what we do know we can attempt to figure out what's planned.

What we do know: Cell 2, or properly the "Enhanced Cell Broadband Engine", has full speed Double Precision floating point units. 16,000 of them are due to be used alongside Opterons in the forthcoming Roadrunner supercomputer, the first slated to reach a PetaFLOP of computing power. They are due to ship in a blade with up to 16 GigaBytes of RAM, probably sometime in 2008.

Predicting Performance

No figures have been published as yet, but there are ways of working out expected performance. One of these is a dead giveaway: the recently released SDK 2.1 incorporates a compiler and simulator capable of simulating the processor. You should be able to get a very good approximation of the performance from that. However, the simulator can only simulate; it can't tell you things like clock speed or other technical specifications of the final hardware.

We can however work some things out...

The Roadrunner supercomputer is slated to achieve a PetaFLOP of computing power (1 million GigaFLOPS). This will be done using the Cell processors; the Opterons are used for I/O and control. The use of Opterons for I/O implies the relatively weak PPE is not getting a significant upgrade. If it were, the Opterons would not be necessary. It also tells us each Cell has to get a linpack 1K rating of at least 62.5 Double Precision GigaFLOPS (1 million GigaFLOPS spread over 16,000 processors).

While Cell is fast and highly efficient, it is not that efficient on linpack. Getting a good linpack score is going to be difficult. Just adding full Double Precision units won't do it.

There are a number of ways that a good linpack score can be achieved:

The first is to use the technique which only uses double precision when absolutely necessary and single precision at other times.
This works for linpack and has achieved very high rates already [1]. This however may be considered "cheating", as it doesn't really measure double precision performance.

The second method would be to upgrade the hardware. The floating point hardware is obviously changing, but we don't know about any other changes as yet.

Double Precision requires twice the room for data compared to single precision, so applications which switch to DP will have problems if the current local store size is kept. I expect we'll see the local store doubled in size to 512 KiloBytes.

The second area to change is the memory controller. If data takes twice the room it will also take twice the memory bandwidth. XDR already runs at twice the rate of the interface in the standard Cell, so this is an option. XDR2 is faster still, so it could also be used. Since this chip will be used in higher end machines than the existing Cell it can be more expensive, and additional pins can be added if they want. Adding more pins will allow for more memory controllers or memory lanes to be added, again increasing bandwidth. I think a doubling of memory bandwidth is highly likely, and a quadrupling is also a possibility because of the need to increase the efficiency of the chip. Increasing the number of lanes also means more memory chips can be connected; we already know this chip is due in blades with up to 16GB.

A clock frequency rise is a distinct possibility; however, going up high raises power consumption sharply, so they won't go too far. That said, I expect they'll be able to go safely above 4GHz without problems.

Interconnects

One other possibility for change is in the I/O connections. The Opterons will utilise HyperTransport 3, a very high speed I/O system. Having this directly on Cell would allow it to be used with commodity PC chipsets, which would save a lot of money for anyone interested in building Cell workstations.
I don't think there will be any other big changes; there will likely be all sorts of tweaks, and certainly there'll be additional work on lowering the power consumption. The internal bus system (EIB) may be beefed up a bit if the external memory bandwidth increases.

A lot will have been learned in the development of the first Cell processor, so there is the possibility of all sorts of other changes. We also can't rule out the possibility of additional SPEs being added. However, there is nothing to indicate this will happen.

Conclusion

Cell2 is due to be discussed shortly. We already know it's a processor designed for high performance Double Precision FLOPs that will access a lot more memory and will appear in a large supercomputer. The performance necessary means there is a need for higher efficiency in the new processor; I expect this will mean adding higher capacity local stores and higher memory bandwidth.

Contrary to common opinion, the existing Cell is actually very fast on DP maths [2][3]. It should be at least in line with most desktop processors, and well ahead in some cases. Beefing up the hardware in Cell2 should make the DP Cell an absolute beast. Cell has already shown performance 10-100 times faster than existing processors on single precision; this new processor should do the same for double precision.

Cell has already had a lot of interest from the scientific & HPC community; expect this chip to bring a lot more.
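The first approach the article mentions for linpack, doing the bulk of the work in single precision and using double precision only where necessary, is usually realised as iterative refinement. Below is a minimal Python sketch of that idea, not the implementation behind reference [1]. Single precision is simulated by rounding through `struct`, all function names are hypothetical, and the solver re-runs the elimination on every sweep for brevity, whereas a real implementation would factor the matrix once and reuse the factors.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_single(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting,
    rounding every intermediate result to single precision."""
    n = len(b)
    # Augmented single-precision working copy; the inputs are left untouched.
    m = [[f32(v) for v in row] + [f32(rhs)] for row, rhs in zip(a, b)]
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            factor = f32(m[i][k] / m[k][k])
            for j in range(k, n + 1):
                m[i][j] = f32(m[i][j] - f32(factor * m[k][j]))
    # Back substitution, still in simulated single precision.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n]
        for j in range(i + 1, n):
            s = f32(s - f32(m[i][j] * x[j]))
        x[i] = f32(s / m[i][i])
    return x


def residual(a, b, x):
    """Compute r = b - a x in double precision (the 'only when necessary' part)."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(a, b)]


def solve_mixed(a, b, sweeps=5):
    """Single-precision solve followed by double-precision refinement sweeps."""
    x = solve_single(a, b)
    for _ in range(sweeps):
        r = residual(a, b, x)           # cheap, done in double precision
        d = solve_single(a, r)          # expensive, done in single precision
        x = [xi + di for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0], [1.0, 5.0, 1.0], [2.0, 1.0, 6.0]]
    x_true = [1.0 / 3.0, -2.0 / 7.0, 0.1]
    b = [sum(aij * xj for aij, xj in zip(row, x_true)) for row in a]
    for name, solver in (("single", solve_single), ("mixed", solve_mixed)):
        err = max(abs(u - v) for u, v in zip(solver(a, b), x_true))
        print(name, "max error:", err)
```

The expensive step (the elimination) runs entirely in single precision, and only the residual is formed in double precision. That is why the approach suits hardware whose single-precision units are much faster than its double-precision ones, and also why the article notes it can be seen as not really measuring double precision performance.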
#2
AirRaid wrote:
> What we do know: Cell 2, or properly "Enhanced Cell Broadband Engine" has full speed Double Precision floating point units.

Now that's a surprise. I remembered seeing a web site mentioning it somewhere, and this feature, the one I would most like to see in such a chip, was, I thought, *specifically* indicated as something it wouldn't have. I'm very glad to hear the good news.

> Contrary to common opinion the existing Cell is actually very fast on DP maths [2][3]. It should be at least in line with most desktop processors, and well ahead in some cases.

Yes, that is true: it has more raw throughput than a typical microcomputer chip on double-precision, despite the disparity between its double-precision and single-precision speed. Nonetheless, the fact that it's just "ordinary" in DP performance is a disappointment.

However, except for the PlayStation 3, which uses units with one non-working processor, all the machines using the existing Cell are very expensive, so the Cell 2 isn't likely to be available to "the rest of us" any time soon.

John Savard
#3
http://forum.beyond3d.com/showthread.php?t=40661
quote 'one': "As announced a month ago, yesterday at Cool Chips X, IBM did a presentation about the SPE in the new 65nm Cell B.E. for HPC with DP enhancement. Tech-on has an article (reg required). The Enhanced BE supports DDR2 (DDR2-800) up to 16GB. The DP FLOPS increased from 25.6 Gflops to 102 Gflops, the DP latency is reduced from 13 cycles to 9 cycles with a full pipeline and dual issue. It supports denormal and expected NaN to be more IEEE compliant. Its SPU ISA is v1.2, with 5 new DP instructions. The transistor count is 250 million (from 241 million for 90nm Cell), the chip area is 212 mm2 (from 235 mm2), and it consumes 100 watts (from 110 watts)."
http://techon.nikkeibp.co.jp/article...5/P1030856.jpg
http://techon.nikkeibp.co.jp/article...5/P1030852.jpg
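As a rough cross-check, the peak figures quoted above can be set against the Roadrunner estimate from the article in the first post. The sketch below only redoes that arithmetic in Python. It assumes, as the article does, that the whole PetaFLOP comes from 16,000 Cell processors with the Opterons contributing nothing, and that the quoted 102 Gflops is a per-chip peak; the constant and function names are made up for illustration.

```python
# Figures taken from the thread; none of them are verified here.
TARGET_FLOPS = 1e15         # Roadrunner goal: one PetaFLOP
CELL_COUNT = 16_000         # Cell processors in Roadrunner, per the article
PEAK_DP_GFLOPS = 102.0      # DP peak quoted for the enhanced chip
OLD_PEAK_DP_GFLOPS = 25.6   # DP figure quoted for the existing chip


def required_gflops_per_chip(target_flops, chips):
    """Sustained GigaFLOPS each chip must deliver to reach the target."""
    return target_flops / chips / 1e9


def required_efficiency(required_gflops, peak_gflops):
    """Fraction of peak that must be sustained; above 1.0 means unreachable."""
    return required_gflops / peak_gflops


if __name__ == "__main__":
    need = required_gflops_per_chip(TARGET_FLOPS, CELL_COUNT)
    print("required per chip (GFLOPS):", need)
    print("fraction of new peak:", required_efficiency(need, PEAK_DP_GFLOPS))
    print("fraction of old peak:", required_efficiency(need, OLD_PEAK_DP_GFLOPS))
```

Under these assumptions each chip would need to sustain roughly 61% of the quoted peak on linpack, while the figure quoted for the existing chip falls short of the requirement even at 100% efficiency.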