A computer components & hardware forum. HardwareBanter


speed up p4 16-bit



 
 
  #1  
Old January 27th 05, 11:33 AM
Philip Prohm
external usenet poster
 
Posts: n/a
Default speed up p4 16-bit

I have a p4 3.00E with an 865PE, running XPPro SP2

I have a RAD development system which has a 32-bit compiler and a 16-bit
source code generator. The 16-bit system is a dog, about 2x a celeron
400MHZ. The compiler is fine, about 15x

How can I tweak my hardware to speed up 16-bit programs? Thanks.

Philip
  #2  
Old January 28th 05, 12:49 AM
Yousuf Khan

Philip Prohm wrote:
I have a p4 3.00E with an 865PE, running XPPro SP2

I have a RAD development system which has a 32-bit compiler and a 16-bit
source code generator. The 16-bit system is a dog, about 2x a celeron
400MHZ. The compiler is fine, about 15x

How can I tweak my hardware to speed up 16-bit programs? Thanks.


What exactly is slow? The compiler, or the program created by the compiler?

Yousuf Khan
  #3  
Old January 28th 05, 08:14 AM
Philip Prohm

Yousuf Khan wrote:
Philip Prohm wrote:

I have a p4 3.00E with an 865PE, running XPPro SP2

I have a RAD development system which has a 32-bit compiler and a
16-bit source code generator. The 16-bit system is a dog, about 2x a
celeron 400MHZ. The compiler is fine, about 15x

How can I tweak my hardware to speed up 16-bit programs? Thanks.



What exactly is slow? The compiler, or the program created by the compiler?


The 32-bit compiler and the 32-bit program created by the compiler are
both fine. The dog is the 16-bit source code generator.

The generator starts with its templates, adds my tweaks and hand code,
then from all those generates a bunch of source code files (which are
then compiled in a conventional fashion). This generation phase is very
slow, only twice the speed of the celeron whereas the compilation phase
is fifteen times the speed.

I'm happy with the 32-bit performance
I would like to improve the 16-bit performance

Philip
  #4  
Old January 28th 05, 12:37 PM
Yousuf Khan

Philip Prohm wrote:
The 32-bit compiler and the 32-bit program created by the compiler are
both fine. The dog is the 16-bit source code generator.

The generator starts with its templates, adds my tweaks and hand code,
then from all those generates a bunch of source code files (which are
then compiled in a conventional fashion). This generation phase is very
slow, only twice the speed of the celeron whereas the compilation phase
is fifteen times the speed.

I'm happy with the 32-bit performance
I would like to improve the 16-bit performance


Well, it doesn't sound like it's a 16 vs. 32-bit issue, it just sounds
like the source code generator is very i/o bound. It sounds like it
opens up a lot of files simultaneously. That's the domain of the disk
subsystem. You could try to implement a RAID striping system and/or
increase memory to accommodate additional disk caching. Your mileage may
vary.
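
A rough way to test the I/O-bound idea before buying disks is to compare elapsed (wall-clock) time with CPU time for the same piece of work. The sketch below is only an illustration: the function names and the 0.5 threshold are made up here, and `measure` times work done inside the Python process itself. For an external tool such as the generator, the same two figures would have to be read from Task Manager or Performance Monitor instead.

```python
import time

def cpu_share(wall_seconds, cpu_seconds):
    """Fraction of elapsed time that was spent executing on the CPU."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds

def classify(wall_seconds, cpu_seconds, threshold=0.5):
    """Rough label: mostly computing (cpu-bound) or mostly waiting (io-bound).

    The threshold is an arbitrary illustrative cut-off, not a standard value.
    """
    if cpu_share(wall_seconds, cpu_seconds) >= threshold:
        return "cpu-bound"
    return "io-bound"

def measure(work):
    """Run work() and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu
```

If a run takes 10 seconds of wall time but only 1 second of CPU time, `classify(10.0, 1.0)` labels it "io-bound"; if nearly all of the elapsed time is CPU time, faster disks are unlikely to help much.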

Yousuf Khan
  #5  
Old January 28th 05, 01:32 PM
Philip Prohm

Yousuf Khan wrote:
Philip Prohm wrote:

The 32-bit compiler and the 32-bit program created by the compiler are
both fine. The dog is the 16-bit source code generator.

The generator starts with its templates, adds my tweaks and hand code,
then from all those generates a bunch of source code files (which are
then compiled in a conventional fashion). This generation phase is
very slow, only twice the speed of the celeron whereas the compilation
phase is fifteen times the speed.

I'm happy with the 32-bit performance
I would like to improve the 16-bit performance



Well, it doesn't sound like it's a 16 vs. 32-bit issue, it just sounds
like the source code generator is very i/o bound. It sounds like it
opens up a lot of files simultaneously. That's the domain of the disk
subsystem. You could try to implement a RAID striping system and/or
increase memory to accommodate additional disk caching. Your mileage may
vary.


Interesting suggestions, Yousuf. There's certainly a few files being
opened at generation time but I don't know precisely how many. I'm not
sure I would have picked the problem as being i/o bound. Maybe I was
just fixated on 16 bits

Does FILES= do anything useful in XP? Does XP have a disc i/o monitor?
W98's sysmon.exe was great for this sort of monitoring. Just tried
sysmon but it won't run, and not properly in w98 compatibility mode

I have 1GB ram so any ideas what I can do with some of it? Don't know
much about RAID; how many discs would I need?

I tried the Intel Application Accelerator (which I get the impression
helps i/o) but it wouldn't install - incompatible with my chipset or
something, I forget now.

Philip
  #6  
Old January 28th 05, 01:57 PM
Philip Prohm

Philip Prohm wrote:
Does FILES= do anything useful in XP? Does XP have a disc i/o monitor?
W98's sysmon.exe was great for this sort of monitoring. Just tried
sysmon but it won't run, and not properly in w98 compatibility mode


found perfmon.msc

There are so many things to choose from I would welcome any suggestions
as to which bits to monitor in order to nail this i/o bottleneck theory,
and not waste too much time experimenting in order to come up with a
useful monitor. Thanks
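
For what it's worth, counters commonly used for this kind of check are `\Processor(_Total)\% Processor Time`, `\PhysicalDisk(_Total)\% Disk Time` and `\PhysicalDisk(_Total)\Avg. Disk Queue Length`. If the counters are logged to a CSV file, a small script can average each column over the generation run. The sketch below assumes the log has a timestamp in the first column and one numeric column per counter; the sample data, the host name "PC" and the function name are invented for illustration.

```python
import csv
import io

def average_counters(csv_text):
    """Return {column_header: mean value} for a counter log in CSV form.

    Assumes the first column is a timestamp and the remaining columns hold
    numeric samples; blank or non-numeric cells are skipped.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, samples = rows[0], rows[1:]
    averages = {}
    for col in range(1, len(header)):
        values = []
        for row in samples:
            try:
                values.append(float(row[col]))
            except (ValueError, IndexError):
                continue  # skip missing or unparseable samples
        if values:
            averages[header[col]] = sum(values) / len(values)
    return averages

# Invented sample in the general shape of a CSV counter log.
SAMPLE = r'''"(PDH-CSV 4.0)","\\PC\Processor(_Total)\% Processor Time","\\PC\PhysicalDisk(_Total)\% Disk Time"
"01/28/2005 14:00:01.000","49.5","2.0"
"01/28/2005 14:00:02.000","50.5","4.0"
'''

if __name__ == "__main__":
    for name, mean in average_counters(SAMPLE).items():
        print(name, mean)
```

High processor time together with low disk time during the slow phase would point away from the disk; the reverse would support the I/O theory.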

Philip
  #7  
Old January 28th 05, 03:37 PM
Alex Johnson

Philip Prohm wrote:
Yousuf Khan wrote:

Philip Prohm wrote:

The 32-bit compiler and the 32-bit program created by the compiler
are both fine. The dog is the 16-bit source code generator.

The generator starts with its templates, adds my tweaks and hand
code, then from all those generates a bunch of source code files
(which are then compiled in a conventional fashion). This generation
phase is very slow, only twice the speed of the celeron whereas the
compilation phase is fifteen times the speed.

I'm happy with the 32-bit performance
I would like to improve the 16-bit performance




Well, it doesn't sound like it's a 16 vs. 32-bit issue, it just sounds
like the source code generator is very i/o bound. It sounds like it
opens up a lot of files simultaneously. That's the domain of the disk
subsystem. You could try to implement a RAID striping system and/or
increase memory to accommodate additional disk caching. Your mileage
may vary.



Interesting suggestions, Yousuf. There's certainly a few files being
opened at generation time but I don't know precisely how many. I'm not
sure I would have picked the problem as being i/o bound. Maybe I was
just fixated on 16 bits

Does FILES= do anything useful in XP? Does XP have a disc i/o monitor?
W98's sysmon.exe was great for this sort of monitoring. Just tried
sysmon but it won't run, and not properly in w98 compatibility mode

I have 1GB ram so any ideas what I can do with some of it? Don't know
much about RAID; how many discs would I need?

I tried the Intel Application Accelerator (which I get the impression
helps i/o) but it wouldn't install - incompatible with my chipset or
something, I forget now.

Philip


1GB of RAM? If you are generating 16-bit code, it is unlikely that you
will need more than a handful of megabytes. Set aside 128MB as a RAM
disk. Copy all your files to it, and tell the code generator that you
want to use drive Z: (or whatever the RAM disk is assigned to) as your
work area. RAM should be about a thousand times faster than compiling
to disk if the limitation truly is the I/O. If you do not see an
improvement, then you can be assured the limitation is not I/O in
nature. I can't tell you exactly how to make a RAM disk in XP, but
there should be some tool available to you for that purpose.
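
The logic of the RAM disk test can be put as an Amdahl-style estimate: only the share of the run that is actually spent on disk I/O gets faster. This is a sketch with made-up numbers; `io_fraction` and `io_speedup` are hypothetical inputs, not measurements from this system.

```python
def overall_speedup(io_fraction, io_speedup):
    """Estimate whole-run speedup when only the I/O share gets faster.

    io_fraction: share of the original run time spent on disk I/O (0..1)
    io_speedup:  how many times faster that I/O becomes (e.g. on a RAM disk)
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    if io_speedup <= 0:
        raise ValueError("io_speedup must be positive")
    return 1.0 / ((1.0 - io_fraction) + io_fraction / io_speedup)

if __name__ == "__main__":
    # Mostly I/O: a large gain is expected from the RAM disk.
    print(overall_speedup(0.90, 1000))
    # Almost no I/O: the RAM disk barely changes the total time.
    print(overall_speedup(0.05, 1000))
```

With 90% of the time in I/O, even a huge I/O speedup gives roughly a tenfold gain overall; with 5% in I/O the gain is only a few percent, which is the "no improvement" outcome that would rule out the disk.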

Alex

--
My words are my own. They represent no other; they belong to no other.
Don't read anything into them or you may be required to compensate me
for violation of copyright. (I do not speak for my employer.)
  #8  
Old January 29th 05, 07:11 AM
Yousuf Khan

Philip Prohm wrote:
Interesting suggestions, Yousuf. There's certainly a few files being
opened at generation time but I don't know precisely how many. I'm not
sure I would have picked the problem as being i/o bound. Maybe I was
just fixated on 16 bits


Yeah, these days with the rapid changes in technology, it gets very easy
to get fixated on a big complex explanation, when it could be a much
simpler explanation.

Does FILES= do anything useful in XP?


No, that's an old DOS configuration parameter which set the number of
file handles available to programs running under DOS. The maximum limit
in DOS was 255 (which most people probably opted for), but you could
save some memory by limiting it to a smaller value. Back in those days,
every bit of memory counted.

Starting with Windows NT and going into XP, I believe the total file
handle limit is into the millions, making DOS's limits laughable.

I have 1GB ram so any ideas what I can do with some of it? Don't know
much about RAID; how many discs would I need?


For RAID striping, all you would need is two disks of identical size.

I tried the Intel Application Accelerator (which I get the impression
helps i/o) but it wouldn't install - incompatible with my chipset or
something, I forget now.


No, that IAA was simply Intel's specially designed device drivers for
Windows. Generic disk device drivers already exist out of the box in
Windows, courtesy of Microsoft. However, the idea of IAA was that they
performed better with Intel chipsets, as they were specially designed
for the Intel chipsets, as opposed to the generic Microsoft ones. Most
other manufacturers also provide their own specialized drivers for
optional installation. You either use the generic Microsoft ones, or you
use the specialized chipset maker drivers.

Yousuf Khan
  #9  
Old January 29th 05, 09:55 AM
Philip Prohm

Yousuf Khan wrote:
Philip Prohm wrote:

Interesting suggestions, Yousuf. There's certainly a few files being
opened at generation time but I don't know precisely how many. I'm not
sure I would have picked the problem as being i/o bound. Maybe I was
just fixated on 16 bits



Yeah, these days with the rapid changes in technology, it gets very easy
to get fixated on a big complex explanation, when it could be a much
simpler explanation.


I've been doing some monitoring with Performance Monitor and the i/o is
not the problem - the compilation phase does much more i/o than the
generation phase, especially reading. The generator has a write burst
when moving from one output source file to the next (there are about 80)
and in between does essentially no reading or writing whatsoever.

This makes sense in hindsight because the system takes ages to
initialise; it does a lot of work and uses a lot of memory (120MB).
Perhaps it is "pre-reading" everything into memory. Anyway, during
generation, which takes 4-5 minutes, the cpu sits on about 50% and the
disc does very little apart from the write bursts I mentioned.
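
One possible reading of the 50% figure, assuming Hyper-Threading is enabled so that XP reports two logical processors (an assumption; it isn't stated in this thread): a single-threaded program that saturates one logical processor shows up as about 50% of the total. The helper names below are invented for this sketch.

```python
def max_single_cpu_load(total_percent, logical_cpus):
    """Highest load one logical CPU could be carrying, given the average
    load reported across all logical CPUs as a single percentage."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return min(100.0, total_percent * logical_cpus)

def seconds_per_file(total_minutes, file_count):
    """Average generation time per output file."""
    return total_minutes * 60.0 / file_count

if __name__ == "__main__":
    # 50% total on two logical CPUs is consistent with one of them at 100%.
    print(max_single_cpu_load(50.0, 2))
    # Roughly 4.5 minutes spread over about 80 files.
    print(seconds_per_file(4.5, 80))
```

Under that assumption the generator would be CPU-bound on a single thread, spending a few seconds of pure computation per output file, which fits the observation that the disc is mostly idle.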

So I'm back to my 16-bit instruction hypothesis. The conventional wisdom
amongst my colleagues is that newish AMD chips do 16-bit much better
than Intel P4 (almost an order of magnitude)


Windows, courtesy of Microsoft. However, the idea of IAA was that they
performed better with Intel chipsets, as they were specially designed
for the Intel chipsets, as opposed to the generic Microsoft ones. Most
other manufacturers also provide their own specialized drivers for
optional installation. You either use the generic Microsoft ones, or you
use the specialized chipset maker drivers.


I'm guessing the performance advantage of optimised driver over
Microsoft driver would be 5-10%?


Philip
  #10  
Old January 29th 05, 05:18 PM
Yousuf Khan

Philip Prohm wrote:
I've been doing some monitoring with Performance Monitor and the i/o is
not the problem - the compilation phase does much more i/o than the
generation phase, especially reading. The generator has a write burst
when moving from one output source file to the next (there are about 80)
and in between does essentially no reading or writing whatsoever.

This makes sense in hindsight because the system takes ages to
initialise; it does a lot of work and uses a lot of memory (120MB).
Perhaps it is "pre-reading" everything into memory. Anyway, during
generation, which takes 4-5 minutes, the cpu sits on about 50% and the
disc does very little apart from the write bursts I mentioned.

So I'm back to my 16-bit instruction hypothesis. The conventional wisdom
amongst my colleagues is that newish AMD chips do 16-bit much better
than Intel P4 (almost an order of magnitude)


I still don't think it's necessarily the 16-bit vs. 32-bit problem here.
It's probably true that Athlons run all 16-bit (and most 32-bit) code
faster than a P4, but I wouldn't expect the gap to be this large. Since
the P4 is sitting at about 50% busy, it sounds like the code probably
branches a lot, and that takes a huge toll on a P4. Athlons are just
much better at handling code branching, because they assume the
worst-case code, whereas P4s assume the best-case code.
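
The branching argument can be made concrete with a simple cost model: each mispredicted branch throws away roughly a pipeline's worth of work, and the P4's pipeline is considerably deeper than the Athlon's. The figures below (branch share, misprediction rate, penalty cycles) are illustrative assumptions, not measurements of either chip.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction once branch mispredictions are charged."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles

def run_seconds(instructions, cpi, clock_hz):
    """Run time for a given instruction count, CPI and clock rate."""
    return instructions * cpi / clock_hz

# Illustrative assumptions only: 20% of instructions are branches, 10% of
# those are mispredicted, and a flush costs 30 cycles on a deep pipeline
# versus 12 cycles on a shorter one.
deep = effective_cpi(1.0, 0.20, 0.10, 30)
short = effective_cpi(1.0, 0.20, 0.10, 12)

if __name__ == "__main__":
    print("deep pipeline CPI:", deep)
    print("short pipeline CPI:", short)
```

In this toy model the deeper pipeline pays noticeably more per instruction on branchy code, which a higher clock only partly offsets. It doesn't by itself explain anything close to an order of magnitude, so testing on real hardware is still the only way to know.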

But only you can find out, you'll probably have to get yourself one, and
run your own programs on it.

Windows, courtesy of Microsoft. However, the idea of IAA was that they
performed better with Intel chipsets, as they were specially designed
for the Intel chipsets, as opposed to the generic Microsoft ones.
Most other manufacturers also provide their own specialized drivers
for optional installation. You either use the generic Microsoft ones,
or you use the specialized chipset maker drivers.



I'm guessing the performance advantage of optimised driver over
Microsoft driver would be 5-10%?


Possibly, but you have to weigh that against the loss of stability, which
isn't really quantifiable. Most people have no trouble with IAA, but a
small number seem to have horrible stability problems afterwards. It's
an either-or situation: you either have problems or you don't, with
nothing in between.

Yousuf Khan
 






All times are GMT +1.

