Wednesday, February 07, 2007

Intel's 2007 Processors: Mist In The Morning Sun

After more than two months of speculation about Intel's 2007 processor offerings it seems that the mist is finally clearing. However, what is left behind is much less than was suggested. It is now clear that the speculation greatly overestimated what Intel would be capable of delivering.

Most of the talk about Intel's 2007 processor offerings grew out of two sources: the VR-Zone roadmaps and glowing reports about Intel's 45nm process. According to VR-Zone, Intel was to release the 45nm quad core Yorkfield chips in Q3 07 along with the 45nm dual core Wolfdale. As seen in the chart, Yorkfield would clock from 3.46Ghz to as high as 3.73Ghz. Likewise, Wolfdale would clock from 3.5Ghz to as high as 4.0Ghz. Clock speeds that high would have surpassed anything AMD could have offered and given Intel a tremendous lead. Any scepticism about these clock numbers was quickly brushed aside as people waxed almost poetic about Intel's 45nm process. Supposedly it would have allowed huge increases in clock speed while also drawing less power. The typically stated cornerstone of Intel's 45nm process was high-K, or at least, this was the only item that Intel had that AMD did not. Adding to this, AMD's 65nm Brisbane chips were not readily available in the channel in Q4 06. When they did become available in the channel, they were at lower clocks than the existing 90nm chips. So, the chatter immediately moved to talk about problems with AMD's 65nm process and even speculation that Barcelona wouldn't arrive until 2008. Not surprisingly, when Intel recently announced that it had actual working chips with high-K, the press reports were overflowing with praise. The second half of 2007 looked to many like it would be a landslide success for Intel with performance even further ahead of Barcelona than C2D was ahead of K8.

However, in the latest roadmaps, all of that is gone; Intel's amazing 45nm speed has completely vanished. The highest clocking Yorkfield is now only 2.4Ghz and Wolfdale tops out at 3.0Ghz. Worse still, even with these reduced speeds, 45nm has been pushed back to Q1 08. And, adding insult to injury, even Anandtech now agrees that AMD's Barcelona quad core will begin production in Q3 07 at 2.5Ghz. This makes sense for a lot of reasons; however, it seems that history and common sense are often casualties of processor enthusiasm. For example, when Core 2 Duo arrived, it probably seemed that Intel's Midas Touch had returned. The problem with this view, however, was that C2D's improvement was not unique. AMD had a similar increase in IPC with K8 and it now appears that Barcelona will have a similar increase in IPC over K8. SSE was the only area where C2D saw phenomenal increases in speed and this was done by doubling the width of all the necessary hardware. However, Barcelona's FP/SSE hardware will receive a similar stretch in width. If Barcelona is any slower than C2D at the same clock, it won't be by much. Rather than just being a simple upgrade, it has now been indicated that K8 was completely resurveyed and many improvements were made all over the architecture including branch speculation, out of order loading, stack operations, bus access scheduling, prefetch, and instruction decoding. Barcelona is therefore obviously a new core in the same way that K8 was. Presumably, Barcelona would then be the K9 or K10 core but AMD has not indicated this yet.

So, 2007 is now shaping up to be pretty much even in terms of performance between Intel and AMD. And, 2008 shouldn't be much different since IBM/AMD also have high-K and will use it on 45nm just as Intel will. The talk about what Intel would offer in 2007 looked like giants in the mist. And, now that the mist is clearing we can see that Intel's processors don't stand any taller than AMD's. It looks like we are going to have some very nice processors from both AMD and Intel in late 2007 and early 2008. It looks like both will have good Integer and SSE performance with very similar IPC and power consumption. This should be great for customers of both companies. Nevertheless, this still seems to be unacceptable to a lot of enthusiasts. Perhaps this is why the talk now seems to be about AMD's supposed poor finances and speculation about bankruptcy. I'm sorry but AMD is not going to go bankrupt while introducing a new architecture, new mobile chipset, and being able to compete better in the commercial segment with its desktop, integrated chipsets.

An interesting final note is how popular this blog is with Intel employees. For example, AMD employees are no more likely to read my blog than employees of Pratt & Whitney or General Electric. However, I get 20 times as much traffic from Intel employees. In fact, Intel employees are the single largest group of readers. I've seen this blog dismissed a number of times as blatantly pro-AMD FUD by various people (such as the Tommies at ForumZ). However, I do have to wonder then, if this blog contains so little valid information, why readers from Intel keep coming back.

224 comments:

Ho Ho said...

thekhalif
"Physics says that as you increase speed you increase heat."

So it is. But why do you think Intel rates its 1.6-2.4GHz quads at same TDP?

thekhalif
"There is a link to some seti numbers using what I assume is a Barcelona 2P system.

It basically destroys a 2.66GHz Clovertown."


People, please learn to read the data before you say something like that. Do you have any idea how long this machine has to be running to get that amount of credit?

Christian H. said...

FAB 36's 65nm capacity will be absorbed by mid range Brisbane X2's although Chartered will help with this; and with 65nm mobile which is desperately needed. Chartered may build some of these too but it is not likely at all that Chartered will build K10 in 2007.

I expect AMD to cover the server segment fairly well with K10; think in terms of 25-50% coverage rather than robo's ridiculous 1% estimate. I'm also certain that AMD will cover FX since it is very small volume, however I doubt K10 will be available for much on the desktop except at the high end. AMD won't start catching up with K10 production until Q2 08. By catching up I'm referring to K10's being available in the desktop midrange.

It wouldn't make sense for 45nm production to start on FAB 38 since the tooling wouldn't even be qualified on 65nm at that point. 45nm production will start on FAB 36 with FAB 38 qualifying its tooling in early 2008 to replace 90nm production with Brisbane versions of single core Athlon 64 and Sempron. This would then allow the last of the 90nm tooling to be removed. FAB 38 should be ramping capacity until end of 2009.



I have to disagree. If AMD takes any of Fab 36 offline it would be a bad move. They have already said that Fab 38 is beginning its conversion to 300mm 90nm. That will cover 2/3 of Fab 30's output.

With Fab 36 online that means they can supply the same amount of chips. I see the next move as taking another 1/3 offline for 300mm 65nm.
By then Chartered will be shipping 65nm for revenue. That will provide nearly twice the chips they are producing now.

Once 65nm comes online it would be much more efficient to take 1/3 of Fab 38 to 45nm equipment. I mean they can take down part of Fab 36 for 45nm, but why do that as you're ramping 65nm?

All of the equipment in Fab 38 has to go anyway. Why not just use that advantage and move the 45nm there first?

I also don't see them waiting for 08 to ramp Kuma, especially since Kuma is half the size of Barcelona. Mind share is nearly as important as market share.

Of course I am just speculating but I seem to figure them out.

Barcelona SPEC numbers next month is my bet.

Christian H. said...

"Physics says that as you increase speed you increase heat."

So it is. But why do you think Intel rates its 1.6-2.4GHz quads at same TDP?


People, please learn to read the data before you say something like that. Do you have any idea how long this machine has to be running to get that amount of credit?



Process tweaks allow similar envelopes as you increase clock speed. The point still is that just clocking up to 3GHz will drive up power somewhat.

I doubt if they can keep the 800MHz range as you go higher.

As far as the SETI numbers, I would admit that I have no idea how they rate those scores but it was interesting.
I guess you're then saying that the site that posted those wanted AMD's chip to have an unfair advantage by running longer?

I would say those are the first signs of the Beast.

Ho Ho said...

About those results: here are links to those two systems. If you compare their daily average credit numbers to Core 2 Quad systems, you can see the latter are about three times as high:

unknown 4-core AMD at 1,377 credits per day
qx6700 at 4,419 credits per day.

Those AMDs are probably just 4x4 systems with some weird settings so that Boinc can't read the information correctly. Also please note the date when those two AMD owners started contributing to SETI.

Ho Ho said...

thekhalif
"Process tweaks allow similar envelopes as you increase clock speed"

Yes, that's why Intel is able to release a 50W TDP 1.6GHz quadcore this quarter. What makes you think they can't put that upcoming 3GHz quad at the same level as the current 2.66GHz quad?

Here are the other computers from one of the unknown AMD owners. That unknown PC has earned a total of around 150k credits. At 38.5k per month it takes nearly four months to get that amount of credits. I'm not sure what kind of Opterons there are but they are surely rather old.
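As a rough check of that arithmetic, here is a minimal sketch; the function name is made up for illustration and the inputs are just the rounded figures quoted above:

```python
def months_to_reach(total_credits: float, credits_per_month: float) -> float:
    """Months of continuous crunching needed to accumulate total_credits."""
    if credits_per_month <= 0:
        raise ValueError("credits_per_month must be positive")
    return total_credits / credits_per_month


# Rounded figures from the comment: ~150k total credits at ~38.5k per month.
months = months_to_reach(150_000, 38_500)
print(f"{months:.1f} months")  # prints "3.9 months"
```

So the host has to have been crunching for close to four months, whatever CPU is inside it.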

Also you can see that this should really be a real 4P machine. For dualcores they say 1(2), that means one socket, two cores. For that AMD there is simply 4.

Christian H. said...

Yes, that's why Intel is able to release a 50W TDP 1.6GHz quadcore this quarter. What makes you think they can't put that upcoming 3GHz quad at the same level as the current 2.66GHz quad?

Here are the other computers from one of the unknown AMD owners. That unknown PC has earned a total of around 150k credits. At 38.5k per month it takes nearly four months to get that amount of credits. I'm not sure what kind of Opterons there are but they are surely rather old.

Also you can see that this should really be a real 4P machine. For dualcores they say 1(2), that means one socket, two cores. For that AMD there is



But remember that those kinds of tweaks increase the price equally. So if they do release a 3GHz C2Q at less than 150W, it will carry a premium and WON'T be a high volume part.

If it was that easy they would have done it already.

Ho Ho said...

thekhalif
"So if they do release a 3GHz C2Q at less than 150W, it will carry a premium and WON'T be a high volume part."

Highest end CPUs are never high volume. Same goes for energy efficient models.

Intel has released lots of Xeons.

First, there were dualcores. From 1.6-3GHz, all but the highest performing one were 65W and the 3GHz one was 80W. Later there were LV versions, 2.33GHz at 40W.

Next came quadcores. From 1.6-2.66. 1.6-2.4 were 80W and 2.66GHz was 120W. That means that for 2.33GHz quadcore, two 65W CPUs together was rated at 120W (E5345), not 130W and those weren't even LV versions. Another interesting Quadcore is X3220. 2.4GHz and 105W. That is a bit faster than that E5345 but with even lower TDP (2x65=105?).

That is why I'm confident that 3GHz quadcore won't have massively increased TDP.

thekhalif
"If it was that easy they would have done it already."

Why would they need that 3GHz quadcore? They are already dominating the 1P/2P market. That quadcore will probably come just before Barcelona becomes available. Before that all they are "fighting" are themselves, AMD dualcores and 4P machines. The latter are quite a bit more expensive than Intel quadcore 2P's.

Unknown said...

I'm sure scientia would be posting this here soon anyway, but from AMDZONE, we have

http://www.boincstats.com/stats/host_cpu_stats.php?pr=sah&st=0&or=10

This shows average output per processor and is rated by that number. As we can see, the Unknown Opteron beats the closest competing xeon by about 25%.

The question is, what type of operation does a Fast Fourier transform classify as? I don't program, so I have no idea if it's Integer, Floating Point, etc.

Unknown said...

Messed up some numbers there. The processor at the top of the linked page is performing on a per core basis, better than the xeon 5350 @ 2.66ghz, a quad core part. I'm assuming this comparison is at a per core basis, the only way an fx70 would get up to the top. Also, the fx 74 is only 13% faster than said fx70, so I'm doubting it's the unknown opteron at the top there, seeing as that is 31% faster than the fx70.

I'm gonna have to say this can only either be barcelona, or someone who got really lucky and managed to run a 3600+ 65nm proc at 3.8ghz-ish (since those seem to be the highest clocking AMD procs yet, but would still be ridiculous). I guess you get to pick what you want to believe.

Unknown said...

*correction, these scores are based on a per processor basis, not per core.

Ho Ho said...

greg, please read some of my posts a few posts up. That thing is 4P box with singlecore opterons inside and actually performs 3x slower than 2.66GHz Core2Quad. Also that unknown Opteron has gathered those results for almost four months.

Unknown said...

Hoho, the total credit and average credit don't add up.

However, I'm starting to see that I'm probably misinterpreting some of that data. But, I really can't figure out how to interpret it anymore, being as the only way that makes perfect sense doesn't add up.

Let's just wait for scientia on this one. It's from his site, and there is likely good reason for it being there.

Ho Ho said...

I gathered a bit of information on one of the unknown AMD's into this post.

Do you really think that core-to-core Barcelona at 2.6GHz is only 16% faster than CoreDuo (not core2) in MacMini?

Scientia from AMDZone said...

robo

Okay, your less than 1% number for servers is for the entire year. However, this is still too low.

The figure I gave, 25-50%, is production volume in that segment by year's end. So, in other words, 1/4 to 1/2 of all server chips manufactured will be K10 by year's end. This would correspond to 1/16 - 1/8 of all server chips in 2007, about 6 - 12%.
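One way to reproduce that conversion is sketched below. It assumes, purely for illustration, that the K10 share of server production climbs linearly from zero at mid-year to the year-end figure while total server output stays flat; neither assumption is spelled out above, and the function name is hypothetical.

```python
def yearly_share_from_linear_ramp(year_end_share: float,
                                  ramp_fraction_of_year: float = 0.5) -> float:
    """Share of the whole year's output under a linear ramp.

    Assumption (illustrative only): the share rises in a straight line from 0
    to year_end_share over the last ramp_fraction_of_year of the year, with
    constant total production. The result is the area of that triangle.
    """
    if not 0.0 <= year_end_share <= 1.0:
        raise ValueError("year_end_share must be between 0 and 1")
    if not 0.0 < ramp_fraction_of_year <= 1.0:
        raise ValueError("ramp_fraction_of_year must be in (0, 1]")
    return 0.5 * ramp_fraction_of_year * year_end_share


for year_end in (0.25, 0.50):
    share = yearly_share_from_linear_ramp(year_end)
    print(f"{year_end:.0%} at year's end -> {share:.2%} of the year's chips")
```

Under those assumptions the yearly share is simply one quarter of the year-end share, so 25% becomes 1/16 and 50% becomes 1/8.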

Intel's Woodcrest volume by year's end was around 70% so my estimate is conservative rather than optimistic.

Robo, your estimate of less than 1% would correspond to about 4% total production volume by year's end. AMD could hit this target if they didn't start making Barcelonas until October (instead of now as is actually the case). I'm sorry but your estimate is way too low.

Scientia from AMDZone said...

enumae

I'm sorry but you are confusing me.

Northbridge - memory controller plus direct access to video. AMD doesn't have this because the memory controller is on the CPU and access to video is via HT.

Southbridge - floppy, CDROM, and harddrive interfaces; USB and firewire ports; IDE, PCI, PCIe expansion slots. All motherboards (AMD and Intel) have this. The only exception would be some embedded applications where limited I/O is on the CPU.

Scientia from AMDZone said...

This is what I said on AMDZ about the seti numbers.

I've looked over the numbers several times. The simplest explanation would be that the AMD number is invalid for some reason (seti has been hacked before). However, if we assume that it is valid then we basically have:

K8 overclocked to 3.8Ghz. As far as I know no one has ever done this (Ghost?). And, given the running time, I think we can rule out exotics like dry ice. The well known K8 "cold bug" would seem to rule out things like LN2.

K10 clocked to 2.6Ghz showing a 43% increase in IPC.

K10 clocked to 2.9Ghz showing a 29% increase in IPC.

K10 overclocked to a modest 3.1Ghz from the stock 2.9Ghz.

A 2.9Ghz K10 would still have to be considered an exceptional sample since these aren't due out until Q3.

Ho Ho said...

Here is a WU that has been calculated by one of the unknown AMDs and a regular X2:
http://setiathome.berkeley.edu/workunit.php?wuid=114756486

Results from each:
x2@2GHz
http://setiathome.berkeley.edu/result.php?resultid=479218991
unknown@2.45GHz
http://setiathome.berkeley.edu/result.php?resultid=479892924

Here is the host information for both:
x2:
http://setiathome.berkeley.edu/show_host_detail.php?hostid=2548994
unknown:
http://setiathome.berkeley.edu/show_host_detail.php?hostid=2877462

When I compare the time it took for both to crunch through the WU, the measured integer and FP performance, and the clock speed, it seems like that unknown thing is just a regular K8. The reason why it is unknown is probably that it is a 4x4 system and Boinc just can't recognise it.

Also remember that Boinc isn't multithreaded, number of cores has no effect on single WU.

The reason why everyone got so excited about the results was that most people just can't interpret the numbers on that site. In fact, on that boincstats page, there are no numbers you could use to compare the CPUs. You must start comparing specific work units to get any kind of useful information on how CPUs perform.

Ho Ho said...

Btw, those two machines have (FP perf)/(integer perf) difference of about 1%.

Scientia from AMDZone said...

ho ho

Okay, so you are assuming the indicated clock speed is wrong as part of the cpu misidentification.

The numbers would be proportionate if the unknown is clocked at 2.8-3.0 Ghz.

enumae said...

Scientia
I'm sorry but you are confusing me.

I am not sure what to tell you except that the new AMD 690G chipset has both a North Bridge and South Bridge.

Tech Report

Ho Ho said...

scientia
"The numbers would be proportionate if the unknown is clocked at 2.8-3.0 Ghz."

... or when the unknown has much better RAM compared to the slower one.

You have the links, you can search for other WU's and compare with the other CPUs listed there. I've just picked some random results.


Boinc gives credit for the amount of FLOPS the WU requires to be computed. WUs that require more FLOPS give more credit. How fast you compute the result is irrelevant. A P2 crunching for days and a C2D doing the same WU in a couple of hours will both get the same amount of credits.

Now, the WU I gave as an example was ~62.4 credits and that unknown machine computed that in ~13.6k seconds.

Here is another WU with about the same number of credits. This WU is also computed on my 2.93GHz e4300. Thing is, my CPU took only 8.5k seconds to crunch through the same amount of FLOPS. That means my C2D @2.93Ghz is around 60% faster than that unknown AMD.
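A quick sketch of that comparison follows. It only holds because both work units carry roughly the same credit, and so roughly the same FLOPS; the function name is invented for illustration and the times are the rounded ones quoted above.

```python
def relative_speed(slower_seconds: float, faster_seconds: float) -> float:
    """How many times faster the second machine finished equal-sized work."""
    if slower_seconds <= 0 or faster_seconds <= 0:
        raise ValueError("run times must be positive")
    return slower_seconds / faster_seconds


# ~62.4-credit work units: unknown AMD took ~13.6k s, the 2.93GHz C2D ~8.5k s.
ratio = relative_speed(13_600, 8_500)
print(f"C2D is about {(ratio - 1) * 100:.0f}% faster")  # about 60% faster
```

Comparing total or daily credit instead would mix in how long and how many hours per day each host has been running, which is why per-WU times are the safer measure.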

So, does anyone still believe it is Barcelona they see there?


As for the northbridge, the main difference is that in AMD's case it doesn't connect to RAM and it doesn't have the MC inside it. It connects the southbridge and GPU to each other and is connected to the CPU via an HT link.

Scientia from AMDZone said...

enumae
I am not sure what to tell you except that the new AMD 690G chipset has both a North Bridge and South Bridge.


I'm sorry but those are not northbridges. The 690G is a GPU and has nothing to do with accessing memory. Yes, I know that TR uses the term, "NorthBridge" but they are using the term incorrectly. Presumably, the author is just used to the idea that a chipset is composed of two chips: a North and SouthBridge.

It is possible that Intel has some chipsets which use a combined NorthBridge/GPU but this would be completely different as the cost and level of complexity for such a combined chip would be significantly greater.

Christian H. said...

I am not sure what to tell you except that the new AMD 690G chipset has both a North Bridge and South Bridge.



It's actually very simple. The North Bridge connects to the PCIe bus. Just look at a diagram.

abinstein said...

ho ho: "When I compare the time it took for both to crunch through the WU, the measured integer and FP performance and clock speed it seems like that unknown thing is just a regular K8. Reason why it is unknown is probably that it is an 4x4 system and Boinc just can't recognise it.

Also remember that Boinc isn't multithreaded, number of cores has no effect on single WU."


Good one. I was highly suspicious of the K10 credit claim and you make the important point that Boinc isn't multithreaded (but can very well be multiprocessed). I think most who thought this is K10 basically had a mental bias against 4x4 and ruled out the possibility that it would perform better than C2Q.
