Athlon, Core 2 architectural efficiencies compared - TechAmok
[hardware] 05:49 PM EDT - Oct,29 2008
If you go shopping for sub-$100 processors right now, you'll be faced with
two main contenders: Core 2-based Intel chips and AMD Athlon X2s. The Athlons
typically have higher clock speeds for the price, but their Intel rivals often
perform better. Why?
Real World Technologies has published an article that takes an in-depth
look at the Athlon and Core 2 designs. Rather than delve into the obscure
architectural details of the two offerings, RWT used apps called VTune and
CodeAnalyst to poke around under the hood and get hard numbers for things like
instructions per clock, branch predictor accuracy, and how the chips handle
their L1 and L2 caches:
Along the way, we learned several other lessons. First of all, performance
analysis tools can be very tricky - when we tried to measure instruction cache
accesses we got inconsistent results between different tools. This suggests
that perhaps the two tools were measuring slightly different events. On top of
that, we also got inconsistent results between different runs with the same tool
- which just suggests that the tool in question is flaky.
This brings us back to the original point of the article - explaining the
disparity in performance between the two contenders for mainstream client
systems. The first thing that leaps out is the huge difference in branch
prediction accuracy. The K8 mispredicts twice as often - and each mispredict
probably ends up squashing around 50-72 instructions (depending on the
occupancy of the re-order buffer). So for every 1000 instructions retired, the
K8 ends up squashing around 450-600 instructions due to branch mispredicts (9
MPKI). In contrast, the Core 2 is much much more efficient, squashing between
280-400 instructions for every thousand (re-order window is probably between
70-96 with 4 MPKI). The impact on performance is huge, since each time the
pipeline is cleared, somewhere around 50-100 cycles' worth of work is wasted -
that translates to 20-100ns per mispredict. The energy costs are just as
substantial - each extra second that the CPU is active consumes around 60W.
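The squashed-instruction figures above appear to be the product of the mispredict rate (MPKI) and the estimated re-order window occupancy. The following is a minimal sketch of that arithmetic, not part of the RWT analysis; the function name is hypothetical and the inputs are simply the estimates quoted above.

```python
# Back-of-the-envelope check of the squashed-instruction estimates.
# Inputs are the article's own figures; squashed_per_kilo is a
# hypothetical helper introduced only for this illustration.

def squashed_per_kilo(mpki, window_lo, window_hi):
    """Return (low, high) instructions squashed per 1000 retired.

    mpki      -- branch mispredicts per thousand instructions
    window_lo -- low estimate of instructions squashed per mispredict
    window_hi -- high estimate of instructions squashed per mispredict
    """
    return mpki * window_lo, mpki * window_hi

# K8: 9 MPKI, roughly 50-72 instructions squashed per mispredict
k8 = squashed_per_kilo(9, 50, 72)

# Core 2: 4 MPKI, re-order window occupancy of roughly 70-96
core2 = squashed_per_kilo(4, 70, 96)

print("K8 squashed per 1000 retired:    ", k8)
print("Core 2 squashed per 1000 retired:", core2)
```

The straight products come out to 450-648 for the K8 and 280-384 for the Core 2, which lines up with the rounded ranges of 450-600 and 280-400 given in the text.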
The other likely performance culprit is accesses which miss in the L2 cache.
While memory accesses are rare - around 2 MPKI for Core 2 and 4 MPKI for the K8,
the latency is huge, between 120-200 cycles (or higher if there is lots of
contention between pending requests). Unlike a branch mispredict, this latency
can be hidden to some extent - for instance, often times multiple cache accesses
will be initiated in parallel, or other instructions that are independent of the
cache miss can be executed, or even a cache miss could occur during a branch
mispredict - then when the CPU has started executing again, the data has arrived
already. Even assuming optimistically that half the memory access latency can be
hidden, that still leaves 60-100 cycles of stalls per cache miss.
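The stall estimate above can be extended to a per-thousand-instructions figure by multiplying by the L2 miss rates. The sketch below is only illustrative: it assumes the same 120-200 cycle latency range and the same 50% hidden fraction for both chips, which is a simplification, and the helper names are hypothetical.

```python
# Rough estimate of exposed stall cycles caused by L2 misses.
# Assumptions (labelled, not measured): both chips see the same
# 120-200 cycle miss latency and hide half of it.

def exposed_stall(latency_cycles, hidden_fraction=0.5):
    """Cycles of stall left after part of the latency is hidden."""
    return latency_cycles * (1 - hidden_fraction)

# Exposed stall per miss for the low and high latency estimates
per_miss = (exposed_stall(120), exposed_stall(200))

def stalls_per_kilo(mpki):
    """Exposed stall cycles per 1000 instructions retired (low, high)."""
    return tuple(mpki * s for s in per_miss)

core2_stalls = stalls_per_kilo(2)  # Core 2: about 2 L2 misses per 1000
k8_stalls = stalls_per_kilo(4)     # K8: about 4 L2 misses per 1000

print("Exposed stall per miss:  ", per_miss)
print("Core 2 stalls per 1000:  ", core2_stalls)
print("K8 stalls per 1000:      ", k8_stalls)
```

Under these assumptions the K8 would lose about twice as many cycles to L2 misses per thousand instructions as the Core 2, before accounting for any difference in memory latency between the two platforms.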
One of the interesting factors is the substantial difference in miss rates
between the two cache designs, which is influenced by the underlying memory
subsystems. Intel's unloaded memory latency is around 55-60ns, while AMD's is
closer to 40ns and should also scale much better under load. Unfortunately,
there is no data available on the loaded latency for the respective CPUs, but a
reasonable guess would be that Intel's loaded latency is 40-70% higher. Given
that guess, we can come close to estimating the average latency contribution
from L2 misses. Intel has half the number of misses (2 vs. 4) per thousand
instructions retired, but 40% higher latency. That implies that Intel's average
memory latency contribution from L2 misses is 75% of AMD's (or 80% if we assume
Intel's L2 latency is 70% higher). Of course, this is only looking at one
aspect of the situation - it ignores the impact of the L1 caches, where AMD
tends to have an advantage due to larger capacity. But it's certainly an area
that could contribute to the performance difference between the K8 and the Core
2 and definitely does contribute to the power differences.
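The relative latency contribution discussed above can be approximated as the ratio of miss rates multiplied by the ratio of loaded latencies. This is a naive sketch under that assumption only; the function name is hypothetical, and RWT may have used a different method to arrive at its 75% and 80% figures.

```python
# Naive model: contribution is proportional to (misses) x (latency).
# The 40% and 70% latency penalties are the article's guesses, not data.

def relative_contribution(miss_ratio, latency_ratio):
    """Intel's L2-miss latency contribution as a fraction of AMD's."""
    return miss_ratio * latency_ratio

# Intel has 2 misses per 1000 instructions versus AMD's 4
miss_ratio = 2 / 4

low = relative_contribution(miss_ratio, 1.40)   # Intel latency 40% higher
high = relative_contribution(miss_ratio, 1.70)  # Intel latency 70% higher

print(f"Intel vs. AMD contribution: {low:.2f} to {high:.2f}")
```

The simple product gives roughly 0.70 and 0.85, bracketing the 75-80% range quoted in the text; either way, Intel's lower miss rate outweighs its assumed latency disadvantage.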