#341
"bill davidsen" wrote in message ...

| And how exactly do you plan to create an OS that is this "decent"? Every
| OS eventually comes across this problem. Look at the Unixes and their
| load average statistics. A load average of 1.00 or less on a
| single-processor system means that the system is keeping up with its
| processes, whereas a load average above 1.00 means that there are more
| requests for time slices than there are available time slices on that
| same system. You can often see some systems running at 2.00 or 5.00 or
| higher.

The load average is the average number of processes on the run queue. Depending on the UNIX version, that may include some processes which are waiting on a semaphore, on swap, etc. AIX is nice and responsive even at a high load average; I've been happily editing a text file with an editor and not noticed that the load average was 100+ until the alarm went off. Try that in a graphical OS, where echoing a keypress in the editor requires several processes to cooperate, and not all of them are interactive.

And, of course, this argument doesn't apply if you care about *server* performance. This isn't a "not enough CPU" problem; it's a "CPU gets stuck" problem. It may be due to OS problems, it may be due to hardware issues, but every OS on PC hardware suffers from it.

But systems which are usable for the desktop, such as Linux, may actually have a lot of processes and still be able to give the CPU to the one with the human attached.

Not when you're in X and there is no "one" with the human attached. And, of course, this doesn't help for the server case.

As noted elsewhere, a slow machine is nicer to use with Linux than with Windows; the memory use seems better.

Even under comparable usage, with a graphical environment? Linux is definitely nicer to use in text mode than Windows is on comparably slow hardware. However, you are definitely right that Windows memory management is pure crap.
That said, making best use of memory means that unused parts of processes do get swapped, and changing virtual desktops often takes 400-800 ms. Of course, Windows doesn't *have* virtual desktops in the same way, so there's no way to compare.

I think low memory is a different problem than CPU latency due to ambush. Trying to talk about both at the same time, just because you generally encounter both on low-end hardware, obscures things if you're talking about system sizing.

DS
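The run-queue arithmetic described above can be sketched in a few lines. This is an illustrative model, not any particular kernel's code: UNIX kernels derive the load average as an exponentially damped moving average of the run-queue length, sampled every few seconds. The sampling interval and window constants below are assumptions for the sketch.

```python
import math

def damped_load_avg(run_queue_samples, interval=5.0, window=60.0):
    """Exponentially damped moving average of the run-queue length,
    roughly how UNIX kernels derive the 1-minute load average.
    `interval` is the sampling period in seconds; both constants
    are illustrative, not any particular kernel's values."""
    decay = math.exp(-interval / window)
    load = 0.0
    for runnable in run_queue_samples:
        load = load * decay + runnable * (1.0 - decay)
    return load

# A single-CPU box with exactly one always-runnable process
# converges toward the break-even load average of 1.00.
print(round(damped_load_avg([1] * 100), 2))  # → 1.0
```

This also shows why a load average of 100+ is consistent with a responsive editor: the figure counts runnable (and, on some systems, swap- or semaphore-blocked) processes, and says nothing about how quickly the scheduler hands the CPU to the one interactive process.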
#342
On Mon, 29 Sep 2003 21:55:41 -0400, "Bill Todd"
wrote:

snip

So databases aren't candidates for being rewritten (according to your original suggestion) to leverage SMT's potential to achieve greater per-core throughput - because they're *already* multi-threaded for other existing (SMP and I/O) reasons.

Now I understand the disconnect. The post in which I stated (reiterated, actually) that I thought multi-threading was underutilized follows:

RM> Processors do the best job on single-threaded applications because
RM> that's what people know how to write without getting themselves into a
RM> big muddle. People program in a single-threaded style because there is
RM> no incentive for them to do otherwise. If they try, they risk getting
RM> themselves into a muddle with very little prospect of a payoff. No
RM> market, no software. No software, no market.
RM>
RM> Single-threaded software and processors geared to support
RM> single-threaded software are self-reinforcing habits. I know of very
RM> few problems that don't exhibit a significant degree of exploitable
RM> parallelism. The problem is finding the parallelism at the granularity
RM> that the processor supports efficiently and writing software to
RM> support it. That sounds too much like work and, except in the HPC
RM> world, the OS kernel world, the enterprise computing world, and
RM> increasingly the world of games, it doesn't get done.

You apparently interpret that (or something else I said, but I don't know what) as meaning that I think all software needs to be rewritten to exploit multi-threading. Most PC users only benefit from having more than one processor (real or virtual) available if they are trying to do more than one thing at a time, because most PC software isn't written to exploit multiple processors. That situation isn't likely to change, for the reasons mentioned in my post. HPC, OS kernels, and enterprise computing are another matter, and I mentioned such performance-critical applications as places where multi-threading is already used.
I didn't state, and I didn't intend to imply, that it is underutilized in those areas, and I would include OLTP workloads in what I meant by enterprise computing. In other words, I had no intention of implying that multi-threading is underutilized for OLTP.

The only question is whether it would make any sense to over-subscribe each SMT core with *more* threads than it can execute concurrently, to attempt to further leverage available memory bandwidth: my suspicion is that the answer is "No", because of the increased level of multi-programming and resulting inter-thread run-time contention that would occur, for what would likely be only a marginal throughput increase in the *absence* of such considerations; and I suggest that the dramatically sub-linear increase in throughput reported in the paper you cited tends to support that suspicion (though with only a single data point one can only suspect, rather than assume, that the improvement was rapidly approaching an asymptote).

Nor did I intend to propose aggressively oversubscribing processors. I only meant to reiterate a point that I thought had been discussed and agreed upon; viz., that SMT is one of the very few ways you can hide the effects of cache-miss stalls for OLTP workloads. As discussed elsewhere, SMT probably doesn't help the P4 much, because it doesn't have the resources to take advantage of it, but (and I really think we are just struggling to agree on something we already agreed upon) a processor with sufficient computational resources could benefit from SMT for OLTP workloads.

While you (and Jon Forrest and many others) seem to feel that PCs are plenty powerful enough, that isn't my experience of them. I'd love to have a multi-threaded grep and a multi-threaded gcc, but I don't expect them to appear any time soon.
Single-threaded programming is so deeply entrenched that I don't expect any significant change at any time in the foreseeable future, but other programming paradigms are possible and would be more useful than most people seem to think. That's all I was trying to say.

RM
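For what it's worth, the multi-threaded grep wished for above is easy to sketch; the hard part, as the post says, is that nobody bothers. A minimal sketch with one task per file (the pattern, file list, and worker count are hypothetical; note that in CPython the GIL limits the regex matching itself to one core, so these threads mostly overlap I/O):

```python
import re
import sys
from concurrent.futures import ThreadPoolExecutor

def grep_file(pattern, path):
    """Collect matching lines from one file (one task per file)."""
    regex = re.compile(pattern)
    hits = []
    try:
        with open(path, errors="replace") as f:
            for lineno, line in enumerate(f, 1):
                if regex.search(line):
                    hits.append("%s:%d:%s" % (path, lineno, line.rstrip()))
    except OSError:
        pass  # skip unreadable files, as grep -s would
    return hits

def mtgrep(pattern, paths, workers=4):
    """Farm the files out to a thread pool and print merged results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for hits in pool.map(lambda p: grep_file(pattern, p), paths):
            for hit in hits:
                print(hit)

if __name__ == "__main__" and len(sys.argv) > 2:
    mtgrep(sys.argv[1], sys.argv[2:])
```

The design choice illustrates the granularity problem from the quoted post: per-file tasks parallelize trivially, but splitting a single large file across threads (what a real multi-threaded grep would need) means carving the file at line boundaries, which is exactly the kind of muddle people avoid.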
#343
On Tue, 30 Sep 2003 03:59:08 -0400, "Bill Todd"
wrote:

"Robert Myers" wrote in message ...

... I only meant to reiterate a point that I thought had been discussed and agreed upon; viz., that SMT was one of the very few ways you could hide the effects of cache-miss stalls for OLTP workloads.

Perhaps it's mostly just a difference in viewpoint, but I see nothing about SMT (or CMP) that hides the effects of cache-miss stalls: each individual thread still takes just as long to execute as ever. What SMT, CMP, and for that matter plain old SMP do is allow more parallel use of memory bandwidth by multiple threads (plus, in the case of SMT, somewhat more efficient/flexible utilization of fine-grained processor resources) *in cases where the workload otherwise lends itself to multiple concurrent threads of execution* (either within a single process or between multiple processes).

No matter how you say it, Itanium is a lot of watts, a lot of transistors, and a lot of real estate on a motherboard to leave sitting idle while waiting for a cache line to fill.

Back to SuperDome and its scaling problems (HP doesn't like it when I refer to "scaling problems", but I can't remember the alternative language they wanted me to use). One approach: turn up the heat on the engineers to design better/faster crossbar circuitry (a program probably already underway). Another approach (and probably the direction the industry is headed in general): stop trying to hook up so many separate chips, and get a single chip to process more threads one way or another. Haggling over the names and the details of what resources to share, and how, is left to other readers and posters.

snip

Single-threaded programming is so deeply entrenched that I don't expect any significant change at any time in the foreseeable future, but other programming paradigms are possible and would be more useful than most people seem to think.

That's where we largely part company, ...
herculean efforts to parallelize memory accesses (in situations where there are no *other* factors that would benefit from such parallelization) just likely aren't normally justifiable. As I said earlier, it may come to pass that *compilers* will start doing transparent tricks to speed up execution of individual threads by concurrent execute-ahead mechanisms in separate helper threads (though they'll need to be careful not to squander processing resources that could be used more effectively by other, independent threads), but the idea that any significant amount of software will be developed (or rewritten) simply to take advantage of some potential increase in CPU parallelism just doesn't seem realistic (because CPUs are *already* fast enough for the vast majority of the work that they do - some of your work may be an exception to that, but if so it's likely a *rare* exception).

As I've already said elsewhere, I expect CPUs to be spinning off threads without human intervention in the not-too-distant future (not a very bold prediction). A much bolder prediction: the search for executable threads will go higher than the low-hanging fruit already identified--fork on call, simple run-ahead, and helper threads--and the search will be successful.

...For that matter, this is something akin to a universal truth in software: it's seldom worth expending major effort on performance optimization outside of a few very carefully selected critical areas - otherwise, just let hardware advances solve any problem that may exist.

Unless you work on problems that simply cannot be done without massive parallelism, in which case you are constantly seeking new ways of looking at the same old problems.

RM
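The latency-hiding disagreement above can be illustrated with a toy model of a single SMT core: each thread alternates a burst of compute with a cache-miss stall, and the core can issue from another thread while one is stalled. Individual threads take just as long as ever (Todd's point), yet aggregate throughput rises with thread count until issue bandwidth saturates, after which extra contexts add nothing (the sub-linear returns from oversubscription). The 1:3 compute-to-stall ratio is an illustrative assumption, not a measured number for any real processor:

```python
def smt_throughput(threads, compute=1.0, stall=3.0):
    """Toy model of one SMT core.  Each hardware thread repeats a
    cycle of `compute` busy cycles followed by `stall` cycles spent
    waiting on a cache miss.  The core can issue from other threads
    while one is stalled, so utilization equals the total compute
    demand, capped at 100% of issue bandwidth.  The constants are
    illustrative assumptions only."""
    period = compute + stall
    demand = threads * compute / period  # fraction of cycles requested
    return min(demand, 1.0)              # a core cannot exceed full issue

for n in range(1, 7):
    print(n, smt_throughput(n))  # rises to 1.0 at 4 threads, flat after
```

With these toy numbers, throughput climbs linearly (0.25, 0.5, 0.75) until four contexts cover every stall; a fifth and sixth thread gain nothing while still paying the contention costs the posts discuss.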