L1/L2 cache question

#1 January 14th 04, 06:08 AM

Hi,
An L1 cache is split (I, D) or unified. Most (all?) of the time an L2
cache is unified. Are there any processors out there which use a
split/unified L1 cache backed by a *unified* L2 cache?

Is there any motivation in doing this?

I am confused since, IMHO, all the instruction opcodes ought to fit in
an L1 I-cache (say 16KB). Are there instruction sets that necessitate
an L2 I-cache (either split or as part of a unified L2 cache)?

I ask this because I am doing simulation research for a 2-level cache
and am thinking of hardcoding the L2 cache to be a unified cache. In
any case, even though it is unified, it is essentially an L2 D-cache. I
think the L1 is big enough for all the instruction opcodes. That's what
I feel.

Also, what is the LARGEST (2*split or unified) L1 cache nowadays?
512KB? Or have they pushed it to 1GB?

Love to hear about this....

Thanks and have a great 2004,
v796

#2 January 14th 04, 12:48 PM

> An L1 cache is split (I, D) or unified. Most (all?) of the time an L2
> cache is unified. Are there any processors out there which use a
> split/unified L1 cache backed by a *unified* L2 cache?

Most do, including P4/Athlon.

> Is there any motivation in doing this?

L1, traditionally, is split, because you get better performance if you
can fetch an instruction and some data in the same cycle. It's probably
easier/faster to have split caches than a multi-ported cache.
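
A minimal sketch of that argument, not a model of any real pipeline. It assumes every access hits in L1, each cache array has a single port serving one access per cycle, and one instruction in three touches data. The function name and its parameters are made up for illustration.

```python
def cycles_needed(num_instructions, data_access_every, split_l1):
    """Count cycles to issue `num_instructions`, assuming all L1 hits.

    Assumption: each cache array has one port (one access per cycle).
    Every instruction needs one fetch; every `data_access_every`-th
    instruction also needs one data access.
    """
    cycles = 0
    for i in range(num_instructions):
        cycles += 1  # the instruction fetch itself
        needs_data = (i % data_access_every == 0)
        if needs_data and not split_l1:
            # Unified single-ported L1: the data access needs the port
            # for a cycle of its own, so the next fetch has to wait.
            cycles += 1
        # Split L1: the D-cache serves the data access in the same cycle
        # as the I-cache serves the fetch, so nothing extra is added.
    return cycles


unified = cycles_needed(1000, 3, split_l1=False)
split = cycles_needed(1000, 3, split_l1=True)
print("unified, single port:", unified)
print("split I/D:           ", split)
```

A dual-ported unified cache would close the gap in this toy model, which is the trade-off mentioned above: two small single-ported arrays versus one multi-ported one.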

> I am confused since, IMHO, all the instruction opcodes ought to fit in
> an L1 I-cache (say 16KB).

There are hardly any applications that are only 16KB code.

> Are there instruction sets that necessitate
> an L2 I-cache (either split or as part of a unified L2
> cache)?

Instruction sets don't require any cache at all. The only reason to
add a cache is to increase performance.

> I ask this because I am doing simulation research for a 2-level cache
> and am thinking of hardcoding the L2 cache to be a unified cache. In
> any case, even though it is unified, it is essentially an L2 D-cache. I
> think the L1 is big enough for all the instruction opcodes. That's what
> I feel.

Maybe for a microcontroller type application. Not for anything bigger.
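
As a rough illustration for the simulator (a sketch under assumptions, not a measurement): take a direct-mapped 16KB L1 I-cache with 64-byte lines backed by a 512KB unified L2, and loop over 64KB of code ten times. The sizes, the helper names `make_cache` and `lookup`, and the access pattern are all hypothetical.

```python
def make_cache(size_bytes, line_bytes=64):
    """Describe a direct-mapped cache as a plain dict (illustrative only)."""
    return {"lines": size_bytes // line_bytes,
            "line_bytes": line_bytes,
            "tags": {}}


def lookup(cache, addr):
    """Return True on a hit; on a miss, install the line and return False."""
    line_addr = addr // cache["line_bytes"]
    index = line_addr % cache["lines"]
    hit = cache["tags"].get(index) == line_addr
    cache["tags"][index] = line_addr
    return hit


l1i = make_cache(16 * 1024)    # split L1, instruction side only
l2 = make_cache(512 * 1024)    # unified L2

CODE_BYTES = 64 * 1024         # assumed code footprint, 4x the L1 I-cache
PASSES = 10

l1i_misses = 0
l2_ifetch_hits = 0
for _ in range(PASSES):
    # One fetch per 64-byte line, for brevity.
    for addr in range(0, CODE_BYTES, 64):
        if not lookup(l1i, addr):
            l1i_misses += 1
            if lookup(l2, addr):
                l2_ifetch_hits += 1

print("L1 I-cache misses:", l1i_misses)
print("instruction fetches served by L2:", l2_ifetch_hits)
```

With these made-up numbers every fetch misses L1 and, after the first pass, is served by the L2, so the unified L2 is doing real instruction work. Real code with a hot inner loop will look far better than this cyclic worst case.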

> Also, what is the LARGEST (2*split or unified) L1 cache nowadays?
> 512KB? Or have they pushed it to 1GB?

For L1, 64KBx2 is about as large as I've seen.

Cheers,
JonB

#3 January 14th 04, 12:55 PM

Jon Beniston writes:
| An L1 cache is split(I,D) or unified. Most(All?) of the time an L2
| cache is unified. Are there any processors out there which use a
| split/unified L1 cache backed by a *unified* L2 cache?
|
| Most do, including P4/Athlon.

He probably meant the converse.

| Is there any motivation in doing this?
|
| L1, traditionally, is split, because you get better performance if you
| can fetch an instruction and some data in the same cycle. It's probably
| easier/faster to have split caches than a multi-ported cache.

Also, instruction caches are typically read-only, and that makes a
slight difference to their optimal implementation, which is more
important at L1 than L2.

| Are there instruction sets that necessitate a
| need for an L2 I-cache (either as split or part of a unified L2
| cache)?
|
| Instruction sets don't require any cache at all. The only reason to
| add a cache is to increase performance.

There are ISAs where the cache is semantically visible to even
unprivileged programs and, in some of those, a cache can be needed
for functionality. I don't know of any that need a separate L1 and
L2 cache, though.

Regards,
Nick Maclaren.

#4 January 14th 04, 01:31 PM

> L1, traditionally, is split, because you get better performance if you
> can fetch an instruction and some data in the same cycle. It's probably
> easier/faster to have split caches than a multi-ported cache.

Also, an I cache has different properties: not only can it be read-only,
it also might want to store additional information, such as predecode bits
or - gasp! - be in a totally different format a la P4 trace cache.

Another observation: there are several implementations on the market that
will fetch some data from L2 only (FP, in the cases I know), and one could
imagine situations where there is no L1 D cache.

Jan

#5 January 14th 04, 03:02 PM

"v796" wrote in message
> I am confused since, IMHO, all the instruction opcodes ought to fit in
> an L1 I-cache (say 16KB).

All the "opcodes" for OS/390 do not fit in a 16KB I-cache (or even a
larger one). QED.

del cecchi

ps windows XP either

#6 January 14th 04, 09:21 PM

Hi,
Sorry, there was a mistake in my first question. The corrected question is below.

An L1 cache is split (I, D) or unified. Most (all?) of the time an L2
cache is unified. Are there any processors out there which use a
split/unified L1 cache backed by a --split-- L2 cache?

It should be split instead of unified. Sorry, I get confused easily.

Keep your comments coming. Thanks.

Sincerely,
v796.

#7 January 14th 04, 09:39 PM

| L1, traditionally, is split, because you get better performance if you
| can fetch an instruction and some data in the same cycle. It's probably
| easier/faster to have split caches than a multi-ported cache.

> Also, instruction caches are typically read-only.

Ahh, non-refillable caches, you might want to get a patent on that.

> and that makes a
> slight difference to their optimal implementation, which is more
> important at L1 than L2.

| Are there instruction sets that necessitate a
| need for an L2 I-cache (either as split or part of a unified L2
| cache)?
|
| Instruction sets don't require any cache at all. The only reason to
| add a cache is to increase performance.

> There are ISAs where the cache is semantically visible to even
> unprivileged programs and, in some of those, a cache can be needed
> for functionality. I don't know of any that need a separate L1 and
> L2 cache, though.

Indeed. I implemented such an architecture a few years ago. There are
significant performance increases to be had, as well as deterministic
performance for multi-threaded programs:

http://www.cs.bris.ac.uk/Publication...le%20Computing

I wasn't aware that there were many commercial architectures offering
such control. Any references?

Cheers,
JonB

#8 January 15th 04, 12:46 AM

On 14 Jan 2004 13:21:51 -0800, v796 wrote:

> Hi,
> Sorry, there was a mistake in my first question. The corrected question is below.
>
> An L1 cache is split (I, D) or unified. Most (all?) of the time an L2
> cache is unified. Are there any processors out there which use a
> split/unified L1 cache backed by a --split-- L2 cache?

I.e. something far closer to a classic Harvard architecture than we
typically see!

While I certainly am no expert at processor/memory designs, I know of
no current implementation that does this, for fairly good reason:

The cache-to-main-memory interface is quite expensive in terms of
components. If you can size the L1 I-cache so that the working set fits
(and, btw, ignore the remarks about code not fitting in the small L1
cache: you don't want either the whole program or the whole instruction
set in the cache, just the main working set), your I-cache misses will
be proportionately few compared to your D-cache misses. Thus spending
the money on a separate L2/memory interface just for the code is hard
to justify compared to simply expanding the L2. The key is that most
software has pretty good locality of reference (at least, assuming the
compiler is competent), so each I-cache line contains a high proportion
of code that *will* be required, whereas a D-cache line frequently
does not.
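
A toy sketch of the working-set point, with every number assumed for illustration: a 4KB hot loop run 100 times through a direct-mapped 16KB I-cache, next to a deliberately unfriendly data stream that touches a fresh 64-byte line on every access. The class and the addresses are hypothetical, and the data pattern is a worst case, not typical code.

```python
class DirectMappedCache:
    """Direct-mapped cache that only tracks tags and hit/miss counts."""

    def __init__(self, size_bytes, line_bytes):
        self.line_bytes = line_bytes
        self.num_lines = size_bytes // line_bytes
        self.tags = [None] * self.num_lines
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        line_addr = addr // self.line_bytes
        index = line_addr % self.num_lines
        if self.tags[index] == line_addr:
            self.hits += 1
        else:
            self.misses += 1
            self.tags[index] = line_addr  # refill on miss

    def miss_rate(self):
        return self.misses / (self.hits + self.misses)


icache = DirectMappedCache(16 * 1024, 64)
dcache = DirectMappedCache(16 * 1024, 64)

LOOP_BASE = 0x100000     # assumed address of the hot loop
LOOP_INSTRS = 1024       # 1024 four-byte instructions = 4KB working set
ITERATIONS = 100

for it in range(ITERATIONS):
    for k in range(LOOP_INSTRS):
        icache.access(LOOP_BASE + 4 * k)
        # Streaming data: each access lands in a line never seen before.
        dcache.access((it * LOOP_INSTRS + k) * 64)

print("I-cache misses:", icache.misses, "miss rate:", icache.miss_rate())
print("D-cache misses:", dcache.misses, "miss rate:", dcache.miss_rate())
```

The I-cache only misses while the loop is first being loaded; after that the whole working set is resident, no matter how large the rest of the program is.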

Oh, and the P4 Xeon has a 1MB L2, and the Itanium-2 has up to a 6MB
L3, which looks like an L2 from many angles: 16K I + 16K D L1, 256K
L2, 6MB L3.

Malc.

#9 January 15th 04, 02:00 AM

Jan> Other observation: There are several implementations on the market that
Jan> will fetch some data from L2 only (FP, in the cases I know)

Which implementations? I know of

MIPS R8000
Itanium (#1, that I know of)

Do you know of others?

I'm curious because all this basic cache talk has motivated me to write
a Wikipedia page about CPU caches. It'd be nice to get history like this
right.

#10 January 15th 04, 01:27 PM

Jon Beniston wrote:
| L1, traditionally, is split, because you get better performance if you
| can fetch an instruction and some data in the same cycle. It's probably
| easier/faster to have split caches than a multi-ported cache.

>> Also, instruction caches are typically read-only.

> Ahh, non-refillable caches, you might want to get a patent on that..

Sounds like the mask-programmed ROM on old microcontrollers.
It's been over 20 years; you can use the idea for free.

--
John Carr
