OT, But interesting

Tim Schmidt computer_holic@hotmail.com
Mon, 07 Aug 2000 02:07:55 GMT


----
MHz != RL performance increase. More precisely, I haven't seen
chip clocking give a straight ratio of MHz increase to RL performance; it's
usually a curve, and much lower than the percentage of overclocked speed.
Then again, I don't necessarily trust benchmarks either; I have seen WAY too
many slanted test results. And manufacturers are very likely to squash
your site if you do post benchmarks.
----

No, but under Q3 with a GF2, the Duron 600 -> 950 ran ~40% faster.  
Real-world stuff there.
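
For what it's worth, here is a quick sketch of the arithmetic on those numbers. The function name is mine and purely illustrative; the 600, 950, and ~40% figures are the ones quoted above, and the ~40% is an approximate observation, not a controlled benchmark.

```python
def scaling_efficiency(old_mhz, new_mhz, speedup_pct):
    """Return (clock gain in %, fraction of that gain seen as real speedup)."""
    clock_gain_pct = (new_mhz - old_mhz) / old_mhz * 100.0
    return clock_gain_pct, speedup_pct / clock_gain_pct

# Duron 600 -> 950 with roughly 40% more Q3 performance (figures from this post)
clock_gain, efficiency = scaling_efficiency(600, 950, 40.0)
print(f"clock gain: {clock_gain:.1f}%, efficiency: {efficiency:.2f}")
```

So the clock went up by a bit under 60% and a bit over two thirds of that showed up in the game: not a straight ratio, but far from nothing.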

----
I haven't really read up on DDR memory. Is DDR _kind of like_ RAID striping
for memory? *needs to go visit Tom's* Or does it actually use 2x the bytes in
the bus width? (Well, it wouldn't be _quite_ 2x, I wouldn't think, because you
have 2 memory addresses.) I mean, your memory is only 100MHz, you can't go
faster than that, but you can read/write ~2x as fast if you split it and
write to two simultaneously and 2x the bus width, but that still doesn't
improve the seek time for the actual memory itself. And you still end up
with a timing/processor latency problem.

Apple tried something similar to this around '95-'96. I forget what they
called it, but you put matched pairs of DIMMs in the corresponding
A/B slots and you got about a 3-5% performance boost, but it didn't
increase the width of the bus.
----

No, DDR pushes 2x the bytes through the bus. Here's how it works:

Normal SDRAM transmits on the rising edge of the clock signal; DDR transmits 
on both the rising and falling edges, doubling the bandwidth.  The clock is 
still 100MHz, but data moves at an effective 200MHz.
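
A rough sketch of the peak-bandwidth arithmetic for the double-pumping described above. The 64-bit data bus is my assumption (it is the usual DIMM width); the function name is made up for illustration, and these are theoretical peaks, not sustained figures.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_width_bits, transfers_per_clock):
    """Peak theoretical bandwidth in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer

# Assumed: 100 MHz clock, 64-bit wide data bus
sdr = peak_bandwidth_mb_s(100, 64, 1)  # one transfer per clock (rising edge)
ddr = peak_bandwidth_mb_s(100, 64, 2)  # two transfers per clock (both edges)
print(sdr, ddr)
```

Same clock, same bus width, twice the transfers. Latency is untouched, which is the part of the question above that DDR does not help with.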

Some PC motherboards now offer memory interleaving to also increase the 
bandwidth (but if you have a double-sided DIMM, you only need one).  
Some implementations work better than others.  Some companies are still 
working out bugs, so your mileage may vary.

----
The whole Pentium line is based on a RISC-esque approach; basically the
Pentium series was a RISC core surrounded by a pre-processor that broke
the CISC instructions down into RISC instructions for processing. Which
may explain the internal core running at 100MHz and the outer core of the
processor running at a GHz *ponders*
----

I meant RISC-esque in that the engineers did their best to decrease the 
complexity/transistor count of the chip.  And no, the core is what's running 
at GHz levels; the I/O (system bus, FSB, whatever your favorite term is) runs 
at 66-200MHz depending on the chip.

--
But if you're doing super-scalar design you would WANT to use RISC-esque
instructions, wouldn't you?
--

Not necessarily; super-scalar design only refers to the execution of 
instructions in parallel by different units within the processor.

---
Everyone else in the industry uses RISC processing: Cray,
Apple, Sun, IBM, HP, Digital, etc.
--

Just as everyone is using CISC, which has actually become a sort of RISC/CISC 
hybrid.  As you have said, CISC processors have not been truly CISC since 
the 586 era (specifically, NexGen, a company AMD purchased, pioneered the 
CISC->RISC translator technique, which is why the K6 series of chips had 
such stellar performance on everything but FP).


--
Pipelining is faster, since if you're trying to guess the next instruction
you have fewer to pick from, thus increasing the odds you are going to
guess the correct one.
--

Yes, pipelining is faster, to a certain extent.  Current-gen processors 
execute things speculatively and out of order (which goes hand in hand with 
super-scalar design), and because of this, if they guess the outcome of a 
branch wrong, the entire pipeline has to be cleared and restarted.  If you 
have a 2-stage pipeline, no prob.  Even a 5 or 10-stage unit isn't that bad, 
but a 20-stage unit getting cleared and restarted all the time has the 
potential to severely cripple a chip.  We'd better hope that Intel has 
engineered some damn good branch prediction algorithms.
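
To put rough numbers on the pipeline-flush cost, here is a back-of-the-envelope model. Everything in it is an assumption for illustration: the flush penalty is taken to be about one cycle per pipeline stage, and the 20% branch fraction and 10% mispredict rate are guesses of mine, not measurements of any real chip.

```python
def effective_cpi(flush_penalty_cycles, branch_fraction=0.2,
                  mispredict_rate=0.1, base_cpi=1.0):
    """Average cycles per instruction once misprediction flushes are counted.

    Assumes each mispredicted branch costs `flush_penalty_cycles` extra cycles.
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles

# Assumed: flush penalty is roughly equal to the pipeline depth
for stages in (2, 5, 10, 20):
    print(stages, round(effective_cpi(stages), 2))
```

Under these made-up rates the 20-stage pipeline loses about ten times as many cycles to flushes as the 2-stage one, so the deeper the pipe, the more the predictor's accuracy decides whether the extra clock speed pays off.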


---
RISC instructions are equal in size, making it more efficient (i.e. they are
all xx-bit instructions, not 8, 16 or 32 bit like CISC or x86
instructions). Which really doesn't come into play much in superscalar
processing, but it does come into play when you're trying to get
the maximum utilization of your available bandwidth. (8-bit hunks of data
take the same bandwidth as 64-bit instructions unless you're stacking
8 8-bit chunks together down the bus. I don't think anyone does that
though.)
----

Yes, x86 chips can do that but it's still slightly slower.

--
The most important aspect of borrowing from RISC for superscalar design is
that the instructions all execute in one clock cycle. You don't have the CISC
approach of one instruction taking multiple clock cycles. Even if you
have variable-length instructions, like 8, 16, 32 bit instructions, with
super-scalar processing you can line them up and execute them
simultaneously. This is exactly what the G4 does (borrowed slightly from
Cray via SGI) with the AltiVec instruction set. It will process 4 32-bit
instructions or 16 8-bit instructions simultaneously.
--

Yeah, x86 chips were doing this years ago with MMX, 3DNow!, and SSE.  The 
AltiVec unit does it better, but that's to be expected since it's a more 
recent design.
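
For anyone who hasn't seen packed (SIMD) operations, here is a toy emulation of a lane-wise add on a 128-bit register: 4 lanes of 32 bits or 16 lanes of 8 bits. Strictly speaking the lanes hold data elements, not instructions. The helper names are hypothetical, and the loop only models the result; real hardware handles all the lanes with a single instruction.

```python
def pack(values, lane_bits):
    """Pack a list of small integers into one big integer, lane 0 lowest."""
    mask = (1 << lane_bits) - 1
    reg = 0
    for i, v in enumerate(values):
        reg |= (v & mask) << (i * lane_bits)
    return reg

def unpack(reg, lane_bits, total_bits=128):
    """Split a packed register back into its lanes."""
    mask = (1 << lane_bits) - 1
    return [(reg >> s) & mask for s in range(0, total_bits, lane_bits)]

def packed_add(a, b, lane_bits, total_bits=128):
    """Add lane by lane; each lane wraps on overflow without carrying over."""
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, total_bits, lane_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift
    return result

# Four 32-bit lanes
a32 = pack([1, 2, 3, 4], 32)
b32 = pack([10, 20, 30, 40], 32)
print(unpack(packed_add(a32, b32, 32), 32))

# Sixteen 8-bit lanes, with wraparound inside each lane
a8 = pack(list(range(16)), 8)
b8 = pack([250] * 16, 8)
print(unpack(packed_add(a8, b8, 8), 8))
```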



In short, I am not knocking RISC; --ALL-- recent processors of --ANY-- type 
use RISC technologies heavily.  However, Intel's implementation in this 
instance is not likely to prove anything but good marketing.  I will say, 
however, that CISC/RISC chips have consistently out-performed their Apple 
cousins.  MHz for MHz, the Apple will win any day, but when Apple's selling 
G4s @ 500MHz and AMD's selling Thunderbirds at 1.1GHz, I'm betting on the 
T-Bird.  I would guess that the 1GHz T-Bird might beat a 500MHz G4 by 
something like 30-50%.  Also, RISC programs are significantly larger and 
somewhat harder to write than their CISC equivalents (at least in 
assembler).  RISC has many advantages, but it also has many downsides.  The 
only really 'perfect' solution is a hybrid.  Also, Apple's G4 should no 
longer be considered truly RISC, since AltiVec added how many 
instructions???  The G4 has nearly as many as an Athlon w/ MMX and 3DNow!

I am quite familiar with all current desktop-class processors, including the 
G4, and although on paper it appears to be better than the PIII or Athlon, 
real-world performance/price says otherwise.

One final note: the PIII core is nearing its 6th year of service.  The PIII 
Coppermine is a derivative of the PIII, which is much the same as the PII, 
which goes back to the original Pentium Pro core.  Yes, folks with Coppermines 
out there, you're running massively tweaked PPros.  On a related note, the 
Athlon core is about a year old.

--Tim

