#1
speed up p4 16-bit
I have a P4 3.00E with an 865PE, running XP Pro SP2.

I have a RAD development system which has a 32-bit compiler and a 16-bit source code generator. The 16-bit system is a dog, about 2x a Celeron 400 MHz. The compiler is fine, about 15x. How can I tweak my hardware to speed up 16-bit programs?

Thanks.

Philip
#2
Philip Prohm wrote:
> I have a P4 3.00E with an 865PE, running XP Pro SP2.
> I have a RAD development system which has a 32-bit compiler and a 16-bit source code generator. The 16-bit system is a dog, about 2x a Celeron 400 MHz. The compiler is fine, about 15x. How can I tweak my hardware to speed up 16-bit programs? Thanks.

What exactly is slow? The compiler, or the program created by the compiler?

Yousuf Khan
#3
Yousuf Khan wrote:
> Philip Prohm wrote:
>> I have a P4 3.00E with an 865PE, running XP Pro SP2.
>> I have a RAD development system which has a 32-bit compiler and a 16-bit source code generator. The 16-bit system is a dog, about 2x a Celeron 400 MHz. The compiler is fine, about 15x. How can I tweak my hardware to speed up 16-bit programs? Thanks.
> What exactly is slow? The compiler, or the program created by the compiler?

The 32-bit compiler and the 32-bit program created by the compiler are both fine. The dog is the 16-bit source code generator.

The generator starts with its templates, adds my tweaks and hand code, then from all those generates a bunch of source code files (which are then compiled in a conventional fashion). This generation phase is very slow, only twice the speed of the Celeron, whereas the compilation phase is fifteen times the speed.

I'm happy with the 32-bit performance. I would like to improve the 16-bit performance.

Philip
#4
Philip Prohm wrote:
> The 32-bit compiler and the 32-bit program created by the compiler are both fine. The dog is the 16-bit source code generator. The generator starts with its templates, adds my tweaks and hand code, then from all those generates a bunch of source code files (which are then compiled in a conventional fashion). This generation phase is very slow, only twice the speed of the Celeron, whereas the compilation phase is fifteen times the speed. I'm happy with the 32-bit performance. I would like to improve the 16-bit performance.

Well, it doesn't sound like it's a 16 vs. 32-bit issue, it just sounds like the source code generator is very I/O bound. It sounds like it opens up a lot of files simultaneously. That's the domain of the disk subsystem. You could try to implement a RAID striping system and/or increase memory to accommodate additional disk caching. Your mileage may vary.

Yousuf Khan
#5
Yousuf Khan wrote:
> Philip Prohm wrote:
>> The 32-bit compiler and the 32-bit program created by the compiler are both fine. The dog is the 16-bit source code generator. The generator starts with its templates, adds my tweaks and hand code, then from all those generates a bunch of source code files (which are then compiled in a conventional fashion). This generation phase is very slow, only twice the speed of the Celeron, whereas the compilation phase is fifteen times the speed. I'm happy with the 32-bit performance. I would like to improve the 16-bit performance.
> Well, it doesn't sound like it's a 16 vs. 32-bit issue, it just sounds like the source code generator is very I/O bound. It sounds like it opens up a lot of files simultaneously. That's the domain of the disk subsystem. You could try to implement a RAID striping system and/or increase memory to accommodate additional disk caching. Your mileage may vary.

Interesting suggestions, Yousuf. There's certainly a few files being opened at generation time but I don't know precisely how many. I'm not sure I would have picked the problem as being I/O bound. Maybe I was just fixated on 16 bits.

Does FILES= do anything useful in XP?

Does XP have a disc I/O monitor? W98's sysmon.exe was great for this sort of monitoring. Just tried sysmon but it won't run, and not properly in W98 compatibility mode.

I have 1GB RAM so any ideas what I can do with some of it? Don't know much about RAID; how many discs would I need?

I tried the Intel Application Accelerator (which I get the impression helps I/O) but it wouldn't install - incompatible with my chipset or something, I forget now.

Philip
#6
Philip Prohm wrote:
> Does FILES= do anything useful in XP? Does XP have a disc I/O monitor? W98's sysmon.exe was great for this sort of monitoring. Just tried sysmon but it won't run, and not properly in W98 compatibility mode.

Found perfmon.msc.

There are so many things to choose from. I would welcome any suggestions as to which counters to monitor in order to nail this I/O bottleneck theory, so I don't waste too much time experimenting to come up with a useful monitor.

Thanks

Philip
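A rough way to test the I/O bottleneck theory, independent of which counters are picked, is to compare CPU time with elapsed time for the slow phase. In Performance Monitor the usual starting points are `\Processor(_Total)\% Processor Time`, `\PhysicalDisk(_Total)\% Disk Time` and `\PhysicalDisk(_Total)\Avg. Disk Queue Length`. The Python sketch below shows the same idea in miniature; the function names and the 0.8 threshold are illustrative assumptions, and `measure` only times work done inside the Python process itself, not an external generator.

```python
import time


def cpu_fraction(cpu_seconds, wall_seconds):
    """Share of elapsed time that was spent executing on a CPU."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


def classify(cpu_seconds, wall_seconds, threshold=0.8):
    """Label a run as CPU-bound when it computed for most of its elapsed time.

    The threshold is an arbitrary illustrative cut-off, not a standard value.
    """
    if cpu_fraction(cpu_seconds, wall_seconds) >= threshold:
        return "cpu-bound"
    return "waiting (I/O or other)"


def measure(workload):
    """Run workload() and return (cpu_seconds, wall_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    workload()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


if __name__ == "__main__":
    # A sleep stands in for a job that spends its time waiting, not computing.
    cpu, wall = measure(lambda: time.sleep(0.2))
    print(classify(cpu, wall))
```

If the slow phase turns out to be busy on the CPU for most of its elapsed time, faster disks or more cache are unlikely to help much.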
#7
Philip Prohm wrote:
> Yousuf Khan wrote:
>> Philip Prohm wrote:
>>> The 32-bit compiler and the 32-bit program created by the compiler are both fine. The dog is the 16-bit source code generator. The generator starts with its templates, adds my tweaks and hand code, then from all those generates a bunch of source code files (which are then compiled in a conventional fashion). This generation phase is very slow, only twice the speed of the Celeron, whereas the compilation phase is fifteen times the speed. I'm happy with the 32-bit performance. I would like to improve the 16-bit performance.
>> Well, it doesn't sound like it's a 16 vs. 32-bit issue, it just sounds like the source code generator is very I/O bound. It sounds like it opens up a lot of files simultaneously. That's the domain of the disk subsystem. You could try to implement a RAID striping system and/or increase memory to accommodate additional disk caching. Your mileage may vary.
> Interesting suggestions, Yousuf. There's certainly a few files being opened at generation time but I don't know precisely how many. I'm not sure I would have picked the problem as being I/O bound. Maybe I was just fixated on 16 bits.
> Does FILES= do anything useful in XP?
> Does XP have a disc I/O monitor? W98's sysmon.exe was great for this sort of monitoring. Just tried sysmon but it won't run, and not properly in W98 compatibility mode.
> I have 1GB RAM so any ideas what I can do with some of it? Don't know much about RAID; how many discs would I need?
> I tried the Intel Application Accelerator (which I get the impression helps I/O) but it wouldn't install - incompatible with my chipset or something, I forget now.
> Philip

1GB of RAM? If you are generating 16-bit code, it is unlikely that you will need more than a handful of megabytes. Set aside 128MB as a RAM disk. Copy all your files to it, and tell the code generator that you want to use drive Z: (or whatever the RAM disk is assigned to) as your work area.

RAM should be about a thousand times faster than disk if the limitation truly is the I/O. If you do not see an improvement, then you can be assured the limitation is not I/O in nature. I can't tell you exactly how to make a RAM disk in XP, but there should be some tool available to you for that purpose.

Alex
--
My words are my own. They represent no other; they belong to no other. Don't read anything into them or you may be required to compensate me for violation of copyright. (I do not speak for my employer.)
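The RAM disk experiment described above is an A/B test: run the same file workload against two locations and compare the times. A minimal Python sketch of that comparison follows. The function names, file count and file size are illustrative assumptions, and the workload is a stand-in for the real generator rather than a model of it.

```python
import os
import tempfile
import time


def time_file_writes(directory, n_files=20, size_bytes=64 * 1024):
    """Write n_files files of size_bytes each into directory; return elapsed seconds."""
    payload = b"x" * size_bytes
    paths = [os.path.join(directory, f"gen_{i:03d}.tmp") for i in range(n_files)]
    start = time.perf_counter()
    try:
        for path in paths:
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())  # ask the OS to push the data to the device
        elapsed = time.perf_counter() - start
    finally:
        # Clean up the scratch files whether or not the writes succeeded.
        for path in paths:
            if os.path.exists(path):
                os.remove(path)
    return elapsed


def compare(dir_a, dir_b, **kwargs):
    """Time the same write workload in two directories."""
    a = time_file_writes(dir_a, **kwargs)
    b = time_file_writes(dir_b, **kwargs)
    return {"a_seconds": a, "b_seconds": b, "a_over_b": a / b if b > 0 else float("inf")}


if __name__ == "__main__":
    # Two temporary directories stand in for "hard disk" and "RAM disk" here.
    with tempfile.TemporaryDirectory() as disk_dir, tempfile.TemporaryDirectory() as ram_dir:
        print(compare(disk_dir, ram_dir))
```

In practice `dir_a` would be a folder on the hard disk and `dir_b` a folder on the RAM disk (for example on Z:, if that is where it is mounted). A ratio close to 1 would suggest the disk is not the limiting factor.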
#8
Philip Prohm wrote:
> Interesting suggestions, Yousuf. There's certainly a few files being opened at generation time but I don't know precisely how many. I'm not sure I would have picked the problem as being I/O bound. Maybe I was just fixated on 16 bits.

Yeah, these days with the rapid changes in technology, it gets very easy to get fixated on a big complex explanation, when it could be a much simpler explanation.

> Does FILES= do anything useful in XP?

No, that's an old DOS configuration parameter which set the number of file handles available to programs running under DOS. The maximum limit in DOS was 255 (which most people probably opted for), but you could save some memory by limiting it to a smaller value. Back in those days, every bit of memory counted. Starting with Windows NT and going into XP, I believe the total file handle limit is into the millions, making DOS's limits laughable.

> I have 1GB RAM so any ideas what I can do with some of it? Don't know much about RAID; how many discs would I need?

For RAID striping, all you would need is two disks of identical size.

> I tried the Intel Application Accelerator (which I get the impression helps I/O) but it wouldn't install - incompatible with my chipset or something, I forget now.

No, IAA was simply Intel's specially designed device drivers for Windows. Generic disk device drivers already exist out of the box in Windows, courtesy of Microsoft. However, the idea of IAA was that they performed better with Intel chipsets, as they were specially designed for the Intel chipsets, as opposed to the generic Microsoft ones. Most other manufacturers also provide their own specialized drivers for optional installation. You either use the generic Microsoft ones, or you use the specialized chipset maker drivers.

Yousuf Khan
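For readers unfamiliar with RAID striping (RAID 0), the idea is that consecutive chunks of data alternate between the disks, so large transfers can be served by both at once. The sketch below shows the address mapping only. The function name and the stripe size of 16 blocks are illustrative assumptions, not the behaviour of any particular controller.

```python
def stripe_location(logical_block, n_disks=2, stripe_blocks=16):
    """Map a logical block number to (disk_index, block_on_that_disk) under RAID 0.

    stripe_blocks is the number of consecutive blocks placed on one disk
    before the layout moves on to the next disk.
    """
    if n_disks < 2:
        raise ValueError("striping needs at least two disks")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    stripe_index, offset = divmod(logical_block, stripe_blocks)
    disk = stripe_index % n_disks   # stripes rotate across the disks
    row = stripe_index // n_disks   # how many stripes precede this one on that disk
    return disk, row * stripe_blocks + offset


if __name__ == "__main__":
    for block in (0, 15, 16, 31, 32):
        print(block, stripe_location(block))
```

With two disks, blocks 0 to 15 land on the first disk, 16 to 31 on the second, 32 to 47 on the first again, and so on.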
#9
Yousuf Khan wrote:
> Philip Prohm wrote:
>> Interesting suggestions, Yousuf. There's certainly a few files being opened at generation time but I don't know precisely how many. I'm not sure I would have picked the problem as being I/O bound. Maybe I was just fixated on 16 bits.
> Yeah, these days with the rapid changes in technology, it gets very easy to get fixated on a big complex explanation, when it could be a much simpler explanation.

I've been doing some monitoring with Performance Monitor and the I/O is not the problem - the compilation phase does much more I/O than the generation phase, especially reading. The generator has a write burst when moving from one output source file to the next (there are about 80) and in between does essentially no reading or writing whatsoever.

This makes sense in hindsight because the system takes ages to initialise; it does a lot of work and uses a lot of memory (120MB). Perhaps it is "pre-reading" everything into memory.

Anyway, during generation, which takes 4-5 minutes, the CPU sits on about 50% and the disc does very little apart from the write bursts I mentioned.

So I'm back to my 16-bit instruction hypothesis. The conventional wisdom amongst my colleagues is that newish AMD chips do 16-bit much better than the Intel P4 (almost an order of magnitude).

> Generic disk device drivers already exist out of the box in Windows, courtesy of Microsoft. However, the idea of IAA was that they performed better with Intel chipsets, as they were specially designed for the Intel chipsets, as opposed to the generic Microsoft ones. Most other manufacturers also provide their own specialized drivers for optional installation. You either use the generic Microsoft ones, or you use the specialized chipset maker drivers.

I'm guessing the performance advantage of an optimised driver over the Microsoft driver would be 5-10%?

Philip
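One possible reading of the 50% CPU figure, assuming Hyper-Threading is enabled so that Windows reports two logical processors (the thread does not confirm this): a single-threaded program that saturates one logical processor shows up as roughly 50% overall. The Python sketch below just spells out that arithmetic. The function names and the tolerance are hypothetical, and this is an interpretation aid, not a diagnosis.

```python
def busy_logical_cpus(total_cpu_percent, logical_cpus):
    """Convert an overall CPU percentage into 'number of logical CPUs kept busy'."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return total_cpu_percent / 100.0 * logical_cpus


def looks_single_thread_bound(total_cpu_percent, logical_cpus, tolerance=0.1):
    """True when the load is equivalent to about one fully busy logical CPU."""
    return abs(busy_logical_cpus(total_cpu_percent, logical_cpus) - 1.0) <= tolerance


if __name__ == "__main__":
    # Assumed: 2 logical processors (Hyper-Threading on), 50% reported load.
    print(busy_logical_cpus(50, 2))
    print(looks_single_thread_bound(50, 2))
```

Under that assumption the generator would be CPU-bound on a single thread, which is consistent with the near-absence of disk activity between write bursts.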
#10
Philip Prohm wrote:
> I've been doing some monitoring with Performance Monitor and the I/O is not the problem - the compilation phase does much more I/O than the generation phase, especially reading. The generator has a write burst when moving from one output source file to the next (there are about 80) and in between does essentially no reading or writing whatsoever. This makes sense in hindsight because the system takes ages to initialise; it does a lot of work and uses a lot of memory (120MB). Perhaps it is "pre-reading" everything into memory. Anyway, during generation, which takes 4-5 minutes, the CPU sits on about 50% and the disc does very little apart from the write bursts I mentioned. So I'm back to my 16-bit instruction hypothesis. The conventional wisdom amongst my colleagues is that newish AMD chips do 16-bit much better than the Intel P4 (almost an order of magnitude).

I still don't think it's necessarily a 16-bit vs. 32-bit problem here. It's probably true that Athlons run all 16-bit (and most 32-bit) code faster than the P4, but I wouldn't think the difference would be this large. It sounds like perhaps, since the P4 is hitting 50% busy, the code branches a lot, and that takes a huge toll on a P4. Athlons are just much better at handling code branching, because they assume the worst-case code, whereas P4s assume the best-case code. But only you can find out; you'll probably have to get yourself one and run your own programs on it.

>> Generic disk device drivers already exist out of the box in Windows, courtesy of Microsoft. However, the idea of IAA was that they performed better with Intel chipsets, as they were specially designed for the Intel chipsets, as opposed to the generic Microsoft ones. Most other manufacturers also provide their own specialized drivers for optional installation. You either use the generic Microsoft ones, or you use the specialized chipset maker drivers.
> I'm guessing the performance advantage of an optimised driver over the Microsoft driver would be 5-10%?

Possibly, but you have to weigh that against the loss of stability, which isn't really quantifiable. Most people have no trouble with IAA, but a small number seem to have horrible stability problems afterwards. It's an either-or situation: you either have problems or you don't, nothing in between.

Yousuf Khan
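The branching argument above can be made concrete with a textbook-style cost model: every mispredicted branch costs a pipeline refill, so a CPU with a larger misprediction penalty loses more on branch-heavy code. The Python sketch below is only that model. The function name and every number in it (instruction count, branch share, misprediction rate, penalties of 30 and 12 cycles) are illustrative assumptions, not measurements of any real P4 or Athlon.

```python
def estimated_cycles(instructions, base_cpi, branch_fraction,
                     mispredict_rate, penalty_cycles):
    """Rough cycle estimate: base work plus the cost of mispredicted branches."""
    branches = instructions * branch_fraction
    return instructions * base_cpi + branches * mispredict_rate * penalty_cycles


if __name__ == "__main__":
    # Same hypothetical workload on two hypothetical CPUs that differ only
    # in how many cycles a branch misprediction costs.
    workload = dict(instructions=1_000_000, base_cpi=1.0,
                    branch_fraction=0.2, mispredict_rate=0.1)
    long_pipeline = estimated_cycles(penalty_cycles=30, **workload)
    short_pipeline = estimated_cycles(penalty_cycles=12, **workload)
    print(long_pipeline / short_pipeline)
```

A model like this only shows the direction of the effect. As the post says, the real difference has to be measured by running the actual generator on both machines.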