Power consumption compared

As many readers of the aceshardware.com forums have already read, I have measured the power consumption of a number of recent CPUs, plus a few not-so-recent ones for historical reference. This time I collected the more interesting data into easily understandable graphs, unlike the first report, which was interesting reading for electrical engineers only, or something like that.

In the following graphs I have used somewhat more “technical” names for AMD CPUs, because I didn’t have all the interesting chips available to me. The only K8 chips currently available to me are an Athlon 64 2800+ with the original C0 revision Clawhammer core (AP suffix), Sempron 3100+ chips based on the Newcastle core (AX suffix) and a Sempron 3000+ presumably based on a Winchester core (BA suffix). So I made the bold assumption that all K8 chips based on the same core consume roughly equal amounts of power at the same frequency and voltage, and I adjusted the Sempron 3100+ voltage to match that of actual A64 chips in order to extrapolate their consumption.

So I ran my Sempron 3100+, which is meant to run at a 1.4V core voltage, at 1.5V as well to get an idea of what an A64 2800+ CG S754 might consume. Then I increased the core frequency to 2.0GHz and 2.2GHz to get possible approximations of what 3000+ and 3200+ S754 chips could consume. Yes, I know the Sempron has half of its L2 cache disabled, and no, I do not know how that affects power consumption. Somehow I don’t really believe they can manage to cut power to the disabled half of the cache. This belief may be wrong, and I am not going to force anyone to believe anything they don’t want to. I will certainly test true A64 chips when I get hold of some, and I will then update this report.
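
For readers who want a rough sanity check on this kind of overvolt-and-overclock extrapolation, the usual first-order CMOS rule of thumb says dynamic power scales with frequency and with the square of the core voltage. The sketch below is only that rule of thumb, not the method used for the graphs (those are measurements); it ignores leakage entirely, and the function name and the 30W reference figure are made up for illustration.

```python
def scale_dynamic_power(p_ref, v_ref, f_ref, v_new, f_new):
    """Estimate dynamic power at a new voltage/frequency point.

    First-order model: P_dyn is proportional to V^2 * f.
    Leakage (static) power is NOT modelled here.
    """
    return p_ref * (v_new / v_ref) ** 2 * (f_new / f_ref)


# Hypothetical reference point: 30 W at 1.4 V and 1.8 GHz (not measured data).
estimate = scale_dynamic_power(30.0, 1.4, 1.8, 1.5, 2.0)
print(f"estimated dynamic power at 1.5 V, 2.0 GHz: {estimate:.1f} W")
```

A large gap between such an estimate and a real measurement would suggest that leakage, which this model leaves out, is a significant part of the total.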

K7-TbredB 1.6GHz 1.5V is actually a Duron 1600 running at its stock voltage and frequency.

K7-TbredB 1.8GHz 1.6V is unfortunately a half-assed attempt to get an idea of what a TbredB-based Athlon XP2200 would consume. The problem was again that I don’t have an Athlon XP2200. Actually, I don’t have any Thoroughbreds rated for running above 1.5V near me. So I overvolted and overclocked the one I did have. Fortunately, TbredBs were unlocked, so the only thing keeping me from getting an almost real AXP2200 was that my motherboard (KT3-Ultra) just wouldn’t let me raise the core voltage more than 0.1V above stock. So I was limited to 1.6V even though true AXP2200 chips ran at 1.65V. On the other hand, I think it is a safe bet that for all of the later TbredB batches AMD could have lowered the stock voltage to 1.6V if they had wanted to. In other words, I think this AXP2200 “experiment” is indicative of what AMD’s bulk 130nm process was capable of with a K7 core.

K7-Palo 1.4GHz 1.75V is a genuine Athlon XP 1600+ Palomino.

P4B 2.66GHz 533MHz FSB is of course a Northwood; the P4E and CeleronD chips are Prescotts.

I listed all K7 parts twice because they can be used in two different modes when idle. The blue bars represent what the chips were designed to consume when idle in Windows. In this mode the CPU enters a “stop grant” state when the Windows idle process issues a HLT instruction. The state has sometimes been called “bus disconnect”. The problem seems to have been that most mobos were designed so that their CPU power regulator was fed from the PSU’s +5V rail, and most PSUs and mobos couldn’t take the wild swings in power consumption on this +5V rail (check out the max power consumption of a Palomino and see the huge difference). It was sometimes also said that earlier K7 chips had a somehow “buggy” “bus disconnect” mode and that mobos could only enable “bus disconnect” with Bartons. IMHO that was just hogwash: when Bartons became commonplace, bus disconnect was still not implemented on boards. I.e. I don’t think the “bug” was real; I think the problem was only the wild fluctuations, which plagued Bartons just as much. The red bars represent K7 behaviour with the usual default installations of Windows, Linux etc., in which case the CPU is not allowed to enter the stop grant state at idle unless that is enabled manually.

The small “2.2W” bars added to the K8 chips represent the power that is supposedly consumed by the integrated northbridge. I have not measured this. It is mighty difficult to measure without doing some serious exercises on a mobo with a soldering iron, because this power is drawn from the +3.3V rail along with all the other chips present on the motherboard. The number has been taken from AMD datasheets and, for once, I tend to believe that it might actually be somewhat close to the truth.

One interesting thing somewhat surprised me when testing K8 and P4 boards: P4 chipsets actually consume the least power. VIA chipsets, which have two chips besides the integrated northbridge, come a very close second, but most surprisingly the Nvidia nForce 3 chipset, with its one sole chip beside the integrated northbridge, actually consumes a whole lot more than all the others, especially on boards which use linear regulators to feed the northbridge. My own Chaintech VNF3-250 with a low-power video card consumes almost 50% more from the 3.3V rail than an Intel-branded i865 board with integrated video (which consumed 10W from the 3.3V rail using a single 512MB DS stick). An MSI-branded KM800 board with integrated video came very close to the Intel board, though (also with a single 512MB DS stick).

I generated the load using the BurnK7 program. The reason was that it generated the highest load of the programs that I could easily take with me to other people’s computers. Some people have suggested Prime95 for generating a load, but I would be very surprised if it made the CPU consume any more than BurnK7 does. BurnK7 is part of the BurnCPU package I got from the net. BurnP6 did not produce as high a result, but was very close.

It is an interesting exercise to compare these numbers to published TDP figures. P4E and Palomino chips seem to come extremely close to their advertised TDP figures, while K8 chips actually consume WAY below their advertised TDP. It seems as if the manufacturers sell CPUs that consume as much as their advertised TDP only when they are in deep trouble as far as power consumption goes. AMD was in its deepest trouble in the Athlon Thunderbird 1.4GHz and Palomino era. Intel is now.

Still, I think AMD has gone way overboard in overspecifying their TDP.

A thought about AMD SOI transition

There has been talk about whether AMD’s process transition to SOI was “worth it”. Getting a good perspective on that was actually the reason why I very much wanted to have a 1.8GHz TbredB AXP2200+ in this comparison. Right now practically all of the AMD Newcastle chips seem to be able to do 1.8GHz using just 1.4V. I think that represents a mature 130nm SOI process. The Athlon XP1800+ Thoroughbred B chip I used to estimate AXP2200+ power consumption was a JIUHB stepping if I recall correctly, one that was highly regarded by overclockers. Thus I think it is good for representing a mature AMD 130nm bulk process. I was also forced to run it at a slightly lower voltage than AMD spec sheets specify for 1.8GHz AXP parts, but I think it is a good guess that AMD would happily have sold them with a 1.6V stock voltage at 1.8GHz if they had seen any reason to go through the trouble of updating all the spec sheets yet again.

Anyway, with the change from a 1.8GHz K7 to a 1.8GHz K8, leakage seems to have remained largely the same if we compare the 1.6V K7 to the 1.4V K8. Load (i.e. dynamic) power consumption seems to have gone down by about 10% even if we compare it to the 1.5V 1.8GHz K8 chips (essentially comparing mature bulk to immature SOI), and it has gone down A LOT (46%) if we compare mature bulk to mature SOI.

Performance went up considerably at the same time: the ratings increased from 2200+ to 2800+. Whether the increase is justifiable I leave for others to judge, but my personal opinion is that yes, K8 performs a LOT better, easily enough to justify this increase in ratings.

So, power went down a lot and performance increased a lot. That sounds like almost exactly why SOI was pursued in the first place. Yes, I think SOI was well worth it.

A thought about 90nm process and leakage

I was very eager to find out the leakage currents of AMD’s 90nm SOI process. Many people have argued that 90nm must leak more than a 130nm process. It now seems that they are actually remarkably similar: at 1.1V the Sempron 3000+ seems to leak ever so slightly less than a Sempron 3100+ at 1.1V, while at 1.4V it is the other way around. In both cases the difference is really small. And the 90nm K8 seems to consume considerably less dynamic power, as evidenced by its lower power consumption under load.

A thought about passive cooling

At 1GHz and 1.1V the new 90nm Semprons (and probably other chips with the Winchester core too) consume remarkably little power. Under the highest load I was able to muster, the one generated by the BurnK7 program, the chip consumes and dissipates only 12.5W, and that already includes the on-die memory controller. With real-life loads such as video encoding the consumption is roughly 10W. Better still, the chips are actually capable of running at quite a bit lower voltages than the P-states specify, which brings the power consumption lower still.

It should be very easy to cool such a chip passively. The biggest problem is that there are no commercial passive heatsinks for A64 chips, and making one by hand is more hassle than most people are willing to put up with. Just removing the fan from a cheap commercial cooler isn’t good enough, because the very closely spaced fins don’t let air move between them unless it is forced to. Besides, the fins usually run in the wrong direction when the mobo is upright in the case. If the fins were vertical, even a cheap heatsink without a fan would probably be enough. The coolers I have seen have horizontal fins, however (assuming the mobo is mounted vertically).

As for speed: this Sempron 3000+ running at just 1GHz actually encodes XviD movies at roughly the same speed as my old XP1600+ did on a KT266A board (but with 1/10 to 1/6 of the power consumption!). With playback it is the same story. It is certainly MUCH faster than those 1GHz Athlons and P3s were, thanks to the on-die memory controller and the fast PC3200 memory with 1T command timings we have now.

A few words about measurement technique

I made myself a modified PSU in which I measure the DC voltage drop across the DC resistance of the smoothing inductors that are already in place in the PSU. I use RC circuits to smooth out the voltage ripple, so I can measure a plain DC voltage drop without annoying my millivoltmeter with AC components.

I then measured the DC resistance of those inductors by placing various loads on the three power rails and measuring the voltage drop across the inductors. I then calculated the resistance from the known load resistance and the measured voltage drop.
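
The calibration is just Ohm’s law applied twice, and can be sketched in a few lines of Python. This is a minimal illustration, assuming the known load is the only thing drawing current from the rail and that the voltage across the load is measured as well; the function names and the numbers are invented for the example and are not my actual measurements.

```python
def inductor_resistance(v_load, r_load, v_drop):
    """DC resistance of a smoothing inductor from one calibration point.

    v_load: voltage measured across the known load resistor (V)
    r_load: resistance of the known load (ohm)
    v_drop: DC voltage drop measured across the inductor (V)
    """
    current = v_load / r_load   # Ohm's law: current through the load
    return v_drop / current     # the same current flows through the inductor


def mean_resistance(points):
    """Average the estimate over several (v_load, r_load, v_drop) points."""
    values = [inductor_resistance(*p) for p in points]
    return sum(values) / len(values)


def rail_current(v_drop, r_inductor):
    """Current drawn from a rail, from the drop across its inductor."""
    return v_drop / r_inductor


# Made-up calibration points for a +12V rail (illustration only).
calibration = [(12.0, 6.0, 0.010), (12.0, 3.0, 0.020)]
r_ind = mean_resistance(calibration)
print(f"inductor resistance: {r_ind * 1000:.2f} mOhm")
print(f"rail current at 25 mV drop: {rail_current(0.025, r_ind):.2f} A")
```

Once the inductor resistance is known, every later millivolt reading converts directly into a rail current with `rail_current`.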

After that I measured all the other important bits that consume current from the rails that interested me most: +12V and +5V. The +5V rail was only interesting with K7 boards, because my K7 boards took the CPU power from the +5V rail. In these cases I only had to measure what the HDD consumed from the +5V rail. As long as I didn’t have PCI cards in the system, nothing else seemed to consume anything significant from the +5V rail. To single out the current going to the CPU from the +12V rail, I also had to measure the current the HDD consumed from the +12V rail, and that of all the fans. The fans were different in each system, so I measured them separately every time I measured the other stuff: I just pulled out the CPU cooler fan to see how much the consumption dropped, then plugged it right back in. The fan in the PSU itself I had of course measured beforehand.

After subtracting all this “other stuff” from the currents I measured on the PSU rails, I also had to take into account the efficiency of the CPU power regulator. I assumed it to be 84% efficient. No, I have no real experimental basis for this, only a lot of reading of a few power regulator datasheets. I think it is quite a good guess. If the efficiency is actually higher, then my measured results should be corrected upwards slightly, but more so for the load power figures.
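
Put together, the arithmetic looks roughly like the sketch below. It assumes the regulator is fed from a single rail, and the function name, the currents and the second efficiency value are invented purely for illustration; only the 84% figure comes from the text above.

```python
def cpu_power(rail_voltage, rail_current, other_currents, efficiency=0.84):
    """Estimate the power delivered to the CPU core.

    rail_voltage:   voltage of the rail feeding the CPU regulator (V)
    rail_current:   total current measured on that rail (A)
    other_currents: currents of everything else on the rail, e.g. HDD, fans (A)
    efficiency:     assumed regulator efficiency (0.84 is a guess, not measured)
    """
    cpu_current = rail_current - sum(other_currents)  # what is left for the regulator
    input_power = rail_voltage * cpu_current          # power going into the regulator
    return input_power * efficiency                   # power coming out to the CPU


# Made-up example: +12V rail, 5.0 A total, HDD plus two fans subtracted.
others = [0.40, 0.15, 0.20]
p_84 = cpu_power(12.0, 5.0, others)
p_90 = cpu_power(12.0, 5.0, others, efficiency=0.90)
print(f"CPU power at 84% efficiency: {p_84:.2f} W")
print(f"CPU power at 90% efficiency: {p_90:.2f} W")
```

Because the efficiency is a plain multiplier, a higher real efficiency shifts the load figures by more watts than the idle figures, which is why the correction matters more under load.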