#21
Intel details future Larrabee graphics chip
On a sunny day (Fri, 08 Aug 2008 07:40:53 -0700) it happened John Larkin
wrote: That's the IBM "channel controller" concept: add complex, specialized DMA-based I/O controllers to take the load off the CPU. But if you have hundreds of CPUs, the strategy changes. John

Ultimately you will have to move bytes, from one CPU to the other, or from dedicated I/O to one CPU, and things have to happen at the right moment. Results will never be available before requests...

It is a bit like Usenet (smile): there are many 'processors' (readers, posters, lurkers) here; some output some data at some time in response to some event, which could be a question, and others read it later, much later perhaps. See the problem?

Watched the Olympic opening; I must say the Chinese make a beautiful event. It never got boring. The previous one was ugly and not worth looking at, but anyway, so many LEDs? And some projection! Seems they are ahead in many a field. Would you not be scared to death if you were a little girl hanging 25 meters above the floor from some steel cables? The Chinese are brave too :-)
#22
Intel details future Larrabee graphics chip
On Thu, 7 Aug 2008 07:44:19 -0700, "Chris M. Thomasson"
wrote: "Chris M. Thomasson" wrote in message ... "John Larkin" wrote in message ... [...] Using multicore properly will require undoing about 60 years of thinking, 60 years of believing that CPUs are expensive.

The bottleneck is the cache-coherency system.

I meant to say: /One/ bottleneck is the cache-coherency system.

I think the trend is to have the cores surround a common shared cache; a little local memory (and cache, if the local memory is slower for some reason) per CPU wouldn't hurt.

Cache coherency is simple if you don't insist on flat-out maximum performance. What we should insist on is flat-out unbreakable systems, and buy better silicon to get the performance back if we need it.

I'm reading Showstopper!, the story of the development of NT. It's a great example of why we need a different way of thinking about OS's. Silicon is going to make that happen, finally free us of the tyranny of CPU-as-precious-resource. A lot of programmers aren't going to like this.

John
#23
Intel details future Larrabee graphics chip
On a sunny day (Fri, 08 Aug 2008 08:54:36 -0700) it happened John Larkin
wrote: /One/ bottleneck is the cache-coherency system.

I think the trend is to have the cores surround a common shared cache; a little local memory (and cache, if the local memory is slower for some reason) per CPU wouldn't hurt. Cache coherency is simple if you don't insist on flat-out maximum performance. What we should insist on is flat-out unbreakable systems, and buy better silicon to get the performance back if we need it. I'm reading Showstopper!, the story of the development of NT. It's a great example of why we need a different way of thinking about OS's. Silicon is going to make that happen, finally free us of the tyranny of CPU-as-precious-resource. A lot of programmers aren't going to like this. John

John Lennon: 'You know I am a dreamer' ..... 'And I hope you join us someday' (well, what I remember of it).

You should REALLY try to program a Cell processor some day. Dunno what you have against programmers; there are programmers who are amazingly clever with hardware resources. I dunno about NT and MS, but IIRC MS plucked programmers from unis and sort of brainwashed them then... the result we all know.
#24
Intel details future Larrabee graphics chip
John Larkin wrote:
On Thu, 7 Aug 2008 07:44:19 -0700, "Chris M. Thomasson" wrote: "Chris M. Thomasson" wrote in message ... "John Larkin" wrote in message ... [...] Using multicore properly will require undoing about 60 years of thinking, 60 years of believing that CPUs are expensive.

The bottleneck is the cache-coherency system.

I meant to say: /One/ bottleneck is the cache-coherency system.

I think the trend is to have the cores surround a common shared cache; a little local memory (and cache, if the local memory is slower for some reason) per CPU wouldn't hurt.

For small N this can be made to work very nicely.

Cache coherency is simple if you don't insist on flat-out maximum performance. What we should insist on is flat-out unbreakable systems, and buy better silicon to get the performance back if we need it.

Existing cache hardware on Pentiums still isn't quite good enough. Try probing its memory with large power-of-two strides and you fall over a performance limitation caused by the cheap and cheerful way it uses lower address bits for cache associativity. See Steven Johnson's post in the FFT Timing thread.

I'm reading Showstopper!, the story of the development of NT. It's a great example of why we need a different way of thinking about OS's.

If it is anything like the development of OS/2 you get to see very bright guys reinvent things from scratch that were already known in the mini and mainframe world (sometimes with the same bugs and quirks as the first iteration of big iron code suffered from). NT 3.51 was a particularly good vintage. After that bloatware set in.

Silicon is going to make that happen, finally free us of the tyranny of CPU-as-precious-resource. A lot of programmers aren't going to like this.

CPU cycles are cheap and getting cheaper and human cycles are expensive and getting more expensive. But that also says that we should also be using better tools and languages to manage the hardware. Unfortunately time to market advantage tends to produce less than robust applications with pretty interfaces and fragile internals. You can after all send out code patches over the Internet all too easily ;-)

Since people buy the stuff (I would not wish Vista on my worst enemy by the way) even with all its faults the market rules, and market forces are never wrong...

Most of what you are claiming as advantages of separate CPUs can be achieved just as easily with hardware support for protected user memory and security privilege rings. It is more likely that virtualisation of single, dual or quad cores will become common in domestic PCs.

There was a Pentium exploit documented against some brands of Unix, e.g. http://www.ssi.gouv.fr/fr/sciences/f...006-duflot.pdf

Loads of physical CPUs just creates a different set of complexity problems. And they are a pig to program efficiently.

Regards, Martin Brown

** Posted from http://www.teranews.com **
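The power-of-two stride problem mentioned in this post can be illustrated with a little arithmetic. The sketch below is not a measurement of any Pentium: it assumes a made-up cache geometry (64-byte lines and 64 sets, as in a 32 KiB 8-way cache), the helper names are hypothetical, and it only counts how many different sets a strided walk touches when the set is picked from the low address bits.

```python
def cache_set_index(addr, line_size=64, num_sets=64):
    """Set chosen by the address bits just above the line offset (assumed geometry)."""
    return (addr // line_size) % num_sets


def distinct_sets(stride, accesses=64, line_size=64, num_sets=64):
    """Count how many different cache sets a strided walk over memory touches."""
    return len({cache_set_index(i * stride, line_size, num_sets)
                for i in range(accesses)})


power_of_two = distinct_sets(4096)      # stride is a large power of two
padded = distinct_sets(4096 + 64)       # same stride padded by one cache line

print("sets touched with stride 4096:", power_of_two)
print("sets touched with stride 4160:", padded)
```

Under these assumptions the 4096-byte stride sends every access to the same set, so an 8-way cache could hold only 8 of those lines at once, while padding the stride by one line spreads the walk across all 64 sets.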
#25
Intel details future Larrabee graphics chip
On Fri, 08 Aug 2008 18:03:09 +0100, Martin Brown
wrote: John Larkin wrote: On Thu, 7 Aug 2008 07:44:19 -0700, "Chris M. Thomasson" wrote: "Chris M. Thomasson" wrote in message ... "John Larkin" wrote in message ... [...] Using multicore properly will require undoing about 60 years of thinking, 60 years of believing that CPUs are expensive.

The bottleneck is the cache-coherency system.

I meant to say: /One/ bottleneck is the cache-coherency system.

I think the trend is to have the cores surround a common shared cache; a little local memory (and cache, if the local memory is slower for some reason) per CPU wouldn't hurt.

For small N this can be made to work very nicely.

Cache coherency is simple if you don't insist on flat-out maximum performance. What we should insist on is flat-out unbreakable systems, and buy better silicon to get the performance back if we need it.

Existing cache hardware on Pentiums still isn't quite good enough. Try probing its memory with large power-of-two strides and you fall over a performance limitation caused by the cheap and cheerful way it uses lower address bits for cache associativity. See Steven Johnson's post in the FFT Timing thread.

I'm reading Showstopper!, the story of the development of NT. It's a great example of why we need a different way of thinking about OS's.

If it is anything like the development of OS/2 you get to see very bright guys reinvent things from scratch that were already known in the mini and mainframe world (sometimes with the same bugs and quirks as the first iteration of big iron code suffered from).

Yes. Everybody thought they could write from scratch a better (whatever) than the other groups had already developed, and in a few weeks yet. There were "two inch pipes full of **** flowing in both directions" between graphics groups. Code reuse is not popular among people who live to write code.

NT 3.51 was a particularly good vintage. After that bloatware set in.

Silicon is going to make that happen, finally free us of the tyranny of CPU-as-precious-resource. A lot of programmers aren't going to like this.

CPU cycles are cheap and getting cheaper and human cycles are expensive and getting more expensive. But that also says that we should also be using better tools and languages to manage the hardware. Unfortunately time to market advantage tends to produce less than robust applications with pretty interfaces and fragile internals. You can after all send out code patches over the Internet all too easily ;-)

NT followed the classic methodology: code fast, build the OS, test/test/test looking for bugs. I think there were 2000 known bugs in the first developer's release. There must have been ballpark 100K bugs created and fixed during development.

Since people buy the stuff (I would not wish Vista on my worst enemy by the way) even with all its faults the market rules, and market forces are never wrong...

Most of what you are claiming as advantages of separate CPUs can be achieved just as easily with hardware support for protected user memory and security privilege rings. It is more likely that virtualisation of single, dual or quad cores will become common in domestic PCs.

Intel was criminally negligent in not providing better hardware protections, and Microsoft a co-criminal in not using what little was available. Microsoft has never seen data that it didn't want to execute. I ran PDP-11 timeshare systems that couldn't be crashed by hostile users, and ran for months between power failures.

There was a Pentium exploit documented against some brands of Unix, e.g. http://www.ssi.gouv.fr/fr/sciences/f...006-duflot.pdf

Loads of physical CPUs just creates a different set of complexity problems. And they are a pig to program efficiently.

So program them inefficiently. Stop thinking about CPU cycles as precious resources, and start thinking that users matter more. I have personally spent far more time recovering from Windows crashes and stupidities than I've spent waiting for compute-bound stuff to run.

If the OS runs alone on one CPU, totally hardware protected from all other processes, totally in control, that's not complex. As transistors get smaller and cheaper, and cores multiply into the hundreds, the limiting resource will become power dissipation. So if every process gets its own CPU, and idle CPUs power down, and there's no context switching overhead, the multi-CPU system is net better off. What else are we gonna do with 1024 cores? We'll probably see it on Linux first.

John
#26
Intel details future Larrabee graphics chip
John Larkin wrote:
I have personally spent far more time recovering from Windows crashes and stupidities than I've spent waiting for compute-bound stuff to run. If the OS runs alone on one CPU, totally hardware protected from all other processes, totally in control, that's not complex. As transistors get smaller and cheaper, and cores multiply into the hundreds, the limiting resource will become power dissipation. So if every process gets its own CPU, and idle CPUs power down, and there's no context switching overhead, the multi-CPU system is net better off. What else are we gonna do with 1024 cores? We'll probably see it on Linux first.

I was doing/learning all this stuff 30 years ago. We even developed a loosely coupled multi-uP system where each module had a comms processor, an apps processor and an OS processor. Back then all these problems had already been analysed to death, and solutions found (if they existed). The future of Intel/MS R&D ought to be reading IEEE papers from the 60s/70s.

-- Dirk http://www.transcendence.me.uk/ - Transcendence UK http://www.theconsensus.org/ - A UK political party http://www.onetribe.me.uk/wordpress/?cat=5 - Our podcasts on weird stuff
#27
Intel details future Larrabee graphics chip
"John Larkin" wrote in message
I have personally spent far more time recovering from Windows crashes and stupidities than I've spent waiting for compute-bound stuff to run. If the OS runs alone on one CPU, totally hardware protected from all other processes, totally in control, that's not complex. As transistors get smaller and cheaper, and cores multiply into the hundreds, the limiting resource will become power dissipation. So if every process gets its own CPU, and idle CPUs power down, and there's no context switching overhead, the multi-CPU system is net better off. What else are we gonna do with 1024 cores? We'll probably see it on Linux first. One point: RCU can scale to thousands of cores; Linux uses that algorithm in its kernel today. |
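For readers who haven't met RCU (read-copy-update), here is a toy sketch of the idea in Python. It is not the Linux kernel API: the class name `RcuCell` and its methods are made up for illustration, and Python's garbage collector stands in for the grace-period machinery a real implementation needs before it can free an old version. It also relies on a reference load or store being a single indivisible step, which holds in CPython.

```python
import threading


class RcuCell:
    """Toy read-copy-update cell: readers never lock, writers copy then publish."""

    def __init__(self, value):
        self._current = value                 # treated as immutable once published
        self._write_lock = threading.Lock()   # serialises writers only

    def read(self):
        # Readers take no lock: they just pick up the currently published version.
        return self._current

    def update(self, fn):
        with self._write_lock:
            new_version = fn(dict(self._current))  # work on a private copy
            self._current = new_version            # publish with one reference store


cell = RcuCell({"route": "eth0"})
snapshot = cell.read()                          # a reader holding the old version
cell.update(lambda d: {**d, "route": "eth1"})   # writer publishes a new version

print(snapshot["route"], cell.read()["route"])
```

The reader that took `snapshot` before the update keeps a consistent old version, while later readers see the new one; nobody on the read side ever waits for a lock, which is the property that lets the approach scale with core count.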
#28
Intel details future Larrabee graphics chip
On Tue, 05 Aug 2008 08:24:04 -0700, John Larkin
wrote: On Tue, 5 Aug 2008 13:30:52 +0200, "Skybuck Flying" wrote: As the number of cores goes up, the watt requirements go up too?

Not necessarily, if the technology progresses and the clock rates are kept reasonable. And one can always throttle down the CPUs that aren't busy.

Will we need a zillion watts of power soon? Bye, Skybuck.

I saw suggestions of something like 60 cores, 240 threads in the reasonable future. This has got to affect OS design. John

This won't bother *nix class OS's. They have been scaled past 10 thousand cores already. Other OS's are on their own.
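The point about keeping clock rates reasonable can be put in rough numbers with the standard first-order model for CMOS switching power, P ≈ C·V²·f per core. The voltages and frequencies below are illustrative assumptions, not figures for any real chip, and the function name is hypothetical.

```python
def dynamic_power(cores, volts, hertz, capacitance=1.0):
    """First-order CMOS switching power: cores * C * V^2 * f, in arbitrary units."""
    return cores * capacitance * volts ** 2 * hertz


# Assumed operating points: one fast core versus two slower, lower-voltage cores
# that together deliver the same total clock rate (2 x 1.5 GHz = 3 GHz).
single = dynamic_power(cores=1, volts=1.2, hertz=3.0e9)
dual = dynamic_power(cores=2, volts=0.9, hertz=1.5e9)

print(f"two slow cores use {dual / single:.0%} of the single fast core's power")
```

With these assumed numbers the two slower cores come out at a little over half the power of the single fast one, because the supply voltage enters squared. Leakage and everything outside the cores are ignored here.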
#29
Intel details future Larrabee graphics chip
On Tue, 5 Aug 2008 12:54:14 -0700, "Chris M. Thomasson"
wrote: "John Larkin" wrote in message ... On Tue, 5 Aug 2008 13:30:52 +0200, "Skybuck Flying" wrote: As the number of cores goes up, the watt requirements go up too?

Not necessarily, if the technology progresses and the clock rates are kept reasonable. And one can always throttle down the CPUs that aren't busy.

Will we need a zillion watts of power soon? Bye, Skybuck.

I saw suggestions of something like 60 cores, 240 threads in the reasonable future.

I can see it now... A mega-core GPU chip that can dedicate 1 core per pixel. lol.

At that point you should integrate them directly into the display. Then you could get to giga-core systems.

This has got to affect OS design.

They need to completely rethink their multi-threaded synchronization algorithms. I have a feeling that efficient distributed non-blocking algorithms, which are comfortable running under a very weak cache coherency model, will be all the rage. Getting rid of atomic RMW or StoreLoad style memory barriers is the first step.

That reminds me of an article / paper I once read about Cache Only Memory Architecture (COMA). Only they did seem to be able to get it to work though.
#30
Intel details future Larrabee graphics chip
On Wed, 06 Aug 2008 19:57:23 -0700, John Larkin
wrote: On Tue, 5 Aug 2008 12:54:14 -0700, "Chris M. Thomasson" wrote: "John Larkin" wrote in message ... On Tue, 5 Aug 2008 13:30:52 +0200, "Skybuck Flying" wrote: As the number of cores goes up, the watt requirements go up too?

Not necessarily, if the technology progresses and the clock rates are kept reasonable. And one can always throttle down the CPUs that aren't busy.

Will we need a zillion watts of power soon? Bye, Skybuck.

I saw suggestions of something like 60 cores, 240 threads in the reasonable future.

I can see it now... A mega-core GPU chip that can dedicate 1 core per pixel. lol.

This has got to affect OS design.

They need to completely rethink their multi-threaded synchronization algorithms. I have a feeling that efficient distributed non-blocking algorithms, which are comfortable running under a very weak cache coherency model, will be all the rage. Getting rid of atomic RMW or StoreLoad style memory barriers is the first step.

Run one process per CPU. Run the OS kernel, and nothing else, on one CPU. Never context switch. Never swap. Never crash. John

OK. How do you deal with I/O devices, user input and hot swap?
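One way to picture the "one process per CPU" scheme is as a static assignment table. The sketch below is only that: `plan_pinning` and `pin_current_process` are hypothetical names, the process names are invented, and on Linux `os.sched_setaffinity` merely restricts where a user process may run; it does not by itself reserve a core for the kernel, and it says nothing about the I/O question raised above.

```python
import os


def plan_pinning(process_names, cpu_ids, os_cpu=0):
    """Give each process its own CPU, keeping os_cpu free of user processes."""
    free = [c for c in cpu_ids if c != os_cpu]
    if len(process_names) > len(free):
        # With fewer CPUs than processes, something would have to context switch.
        raise ValueError("more processes than free CPUs")
    return dict(zip(process_names, free))


def pin_current_process(cpu):
    """Restrict the calling process to a single CPU (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity is not available on this platform")
    os.sched_setaffinity(0, {cpu})   # pid 0 means the calling process


plan = plan_pinning(["editor", "browser", "compiler"], cpu_ids=range(8))
print(plan)
```

Each worker would then call `pin_current_process(plan[name])` at start-up. The plan refuses to oversubscribe, which is the "never context switch" rule expressed as a check.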