Friday, February 29, 2008

2008: AMD Still Trailing

Intel is still moving along about the same as it has been, making slow, incremental progress since C2D was launched in 2006. It is clear that the increase in speed from Penryn doesn't match the early Intel hype, but nevertheless any increase in speed only widens Intel's lead over AMD. Likewise, the tiny speed increases from 3.0 Ghz to 3.16 Ghz (quad core) and 3.4 Ghz (dual core) are no doubt frustrating for Intel fans who would like more speed. On the other hand, AMD currently has nothing even close.

AMD will probably deliver 2.6 Ghz common chips in Q2. This chart at Computerbase claims AMD will release an FX chip in Q3. I'm not so sure about this because everything would suggest an FX of only 2.8 Ghz. This is probably the lowest clock that AMD could possibly get by with on the FX brand. There is no doubt that AMD needs a quad FX because people who bought FX in 2006 were promised upgrades and none have been forthcoming. Such a 2.8 Ghz FX would probably be clockable to 3.0 - 3.1 Ghz (with premium air cooling) based on what I've seen of the B3 stepping. This is probably the best AMD can do for now as I haven't seen anything that would suggest that B3 can deliver 2.8 Ghz in common volume. This means that the poor man's version of FX, Black Edition, will probably bump up to 2.6 Ghz as well. Intel seems to be somewhat behind in terms of 45nm but this hardly matters since their G0 stepping of 65nm works so well. But there is no doubt that AMD will be facing more 45nm Penryns in Q2. The shortages of chips have shielded AMD somewhat from increased pressure from Intel during Q4 and Q1 (although with Barcelona delayed server share may take another hit in Q1). However, as Q2 is the lowest volume quarter of the year AMD will have to be aggressive to avoid a volume share drop during that quarter.

Fuad is probably closer to the truth in saying Q3 for FX:

The Deneb FX and Deneb cores, both 45nm quad-cores, are the first on the list. If they execute well we should see Deneb quad-core with shared L3 cache and 95W TDB in Q3. If not, the new hope will slip into Q4.

The timeline for FX being Q3 or maybe Q4 is not surprising at all. What is surprising is the idea that AMD's first new FX chip would be 45nm. If this is true then this would support the notion that AMD has suspended development of 65nm. But it would be surprising if 45nm could ramp that quickly.

The question then is what will happen in Q3 as AMD faces a steadily increasing volume of Penryn chips. The rumors suggest that AMD will not try to release a 65nm 2.8 Ghz Phenom. I'm not sure if this would then indicate that the 65nm process would hit a ceiling or whether this is to suggest that AMD will pursue these speeds with 45nm Shanghai. Another question is what 9850 might be. 9750 is supposed to be 2.4 Ghz while 9950 has been suggested to be 2.6 Ghz. So, would 9850 be 2.5 Ghz perhaps? The topping out of the naming scheme does lend some credibility to the idea that AMD will suspend 65nm development and try to move to Shanghai as quickly as possible. Nevertheless, there is a big, big question of whether AMD could really deliver a 2.8 Ghz 45nm Shanghai in, say, Q3. Ever since the release of 130nm SOI, AMD's initial clock speeds on the new process have always been lower so there is a lot of doubt that AMD could reach 2.8 Ghz on 45nm any sooner than Q4 2008. Nehalem will almost certainly be too low in volume in Q4 to be much of a factor. So, it looks like AMD's goal is to somehow get clock speed up and this seems even less likely with a mid-year switch in process unless with 45nm AMD exceeds all past SOI efforts.

Early 2009 looks pretty good for Intel since it will not only have Penryn and Dunnington but increasing volumes of Nehalem. It still remains to be seen if Intel really will give up its lucrative chipset business on the desktop with Nehalem. It certainly seems that it wouldn't take much effort to modify an AMD HT based chipset to work with Intel's CSI interface. That would seem to remove a lot of Intel's current proprietary FSB advantage. On the other hand, with ATI out of the way this would seem to be the best time for Intel to face more competition in chipsets. Still, this does leave things a bit up in the air. If it becomes easier and cheaper to design chipsets as it surely would be if CSI is similar to HT then VIA might become more competitive. For AMD's part there seems little they can do in 2009 except try to ramp the clock speeds on Shanghai.

We have three other issues to talk about: one immediate and two long-term. The immediate issue is Hester's interview at Hexus where he mentions the slow clock speeds of K10. Basically, Hester says that the 65nm process is fine; it is a matter of adjusting some critical paths. I've seen this statement heckled by some who insist that you can't separate process from design. Curiously, these are the same people who also insisted that Intel's 90nm process was fine and that it was only a poor design with Prescott that was the problem. Anyway, this statement by Hester actually seems quite accurate to me. It was my impression that AMD had intended K10 to run at lower voltage which would have allowed higher clocks. This again seems to fit what we've seen with K10 being limited by TDP. The reason for the higher voltage seems to be that the transistors don't quite switch fast enough and this causes some of the "critical paths" that Hester talked about to get out of sync. You could fix this at a low level by improving the transistors to get them back into spec with the design. Or, you could relax the timing on these critical paths which would get the design on spec with the transistors. Because 45nm is right around the corner it appears that AMD has decided to not expend more resources on 65nm improvement and will instead relax the timing. AMD's work on 45nm transistors will theoretically migrate down to 65nm; at least this is the theory of AMD's CTI (Continuous Transistor Improvement) program. However, we may now be entering a new era where improvements are so specialized that they may be unable to cross process boundaries as they used to and we may see AMD following Intel's lead. This would mean tighter design at the beginning of each process node and less reliance on later improvements.

The two long-term issues concern the possibility of a New York FAB for AMD and the announcement on EUV. There are three questions about a NY FAB: Does AMD need it? Can they afford it? And, why NY instead of Dresden where FAB 30 and 36 are now? Need is most obvious because without a new FAB AMD's capacity will top out by mid to late 2010 unless the FAB 38 ramp is slower than expected. Affording is a big question but one that AMD can leave aside for now hoping that their cash situation will improve. The question of location is a curious one. One suggestion was that NY simply offered more incentives than Dresden but this by itself seems unlikely. In every case in the past Germany has shown itself more than willing to contribute money for AMD's FABs. So, the real reason for the NY location may have more to do with other factors. In fact, we even seem to have some evidence of this from the EUV announcement.

"The AMD test chip first went through processing at AMD’s Fab 36 in Dresden, Germany, using 193 nm immersion lithography, the most advanced lithography tools in high volume production today. The test chip wafers were then shipped to IBM’s Research Facility at the College of Nanoscale Science and Engineering (CNSE) in Albany, New York where AMD, IBM and their partners used an ASML EUV lithography scanner installed in Albany through a partnership with ASML, IBM and CNSE, to pattern the first layer of metal interconnects between the transistors built in Germany."

Secondly, we need to remember that AMD only fell behind on process technology when it moved to 130nm in 2002. Prior to this AMD was doing pretty well. Although things seemed to improve after AMD's rocky transition to 130nm SOI, AMD now seems to be falling behind again at 45nm. AMD used to operate its Submicron Development Center (SDC) in Sunnyvale, California. This facility was leading edge back in 1999. It surely is not lost on AMD that they have now surpassed IBM. Back in 2002 AMD only had a 200mm FAB while IBM had a more modern 300mm FAB as well as more capacity. AMD today has caught up in terms of FAB technology and has passed IBM in terms of capacity. The big question for AMD has to be how badly IBM needs leading edge process technology and for how long. Robust server and mainframe chips need reliability more than top speed. In addition, IBM has been steadily divesting hardware so one has to wonder when the processor division might become a target. Notice that in the above announcement the wafers had to be flown from FAB 36 in Dresden to New York. Given these facts I think it is possible that AMD wants to create another research facility in New York. I think this could serve both to tweak processes faster and optimize them better for AMD's needs as well as pick up any slack if research at IBM falls off. There has been no indication of this but it does seem plausible.

The recent EUV announcement is incomplete however. If we look at an IBM article on EUV in EETimes from February 23, 2007 we see that IBM very much wanted EUV for 22nm but figured that it wouldn't be ready in time for early development work.

The industry hopes EUV will make it into production sooner than latter, but the technology must reach certain milestones. ''I think the next 9 to 12 months are very critical to achieve this,'' said George Gomba, IBM distinguished engineer and director of lithography technology development at the company.

Twelve months from February 2007 would be now. So, what is missing from the EUV announcement is whether or not this recent test puts EUV on track for IBM for 22nm or whether it will have to wait for 16nm. A second question is why the test wafer was made at Dresden by AMD. If IBM had already tested its own wafers then why didn't it announce earlier? This could mean that AMD has decided to try to hit the 22nm node for EUV but that IBM has decided to wait until 16nm. If this is a more aggressive stance for AMD then it could mean that AMD will rely less on IBM for process technology for 22nm. This again would support the idea that AMD wants a new design center in NY. I think it is entirely plausible that AMD could surpass IBM to become the senior partner in process development over the next few years.

Friday, February 15, 2008

Bandwagon Journalism (AMD is sooo last year)

Much like wearing Prada, it has become fashionable to attack AMD. Often it seems that web authors say negative things less because there is any valid reason and more because they simply want to be part of the crowd. A good example of this type of trash journalism is this Extreme Tech piece by Joel Durham. Durham and many others suggest that the evidence is everywhere that AMD became lazy and stopped innovating. The reality is that there is no such evidence.

Let's start with the argument that AMD has been generations behind Intel in terms of process technology. In early 2003, Intel's Northwood P4 was at 3.2Ghz while AMD's Barton K7 was at 2.2Ghz. Both were using a 130nm process.

Intel P4
2003, 3.2 Ghz - Q4 2006, 3.8 Ghz, 19% increase in 14 quarters

AMD K8
2003, 1.8 Ghz - Q4 2006, 2.8 Ghz, 56% increase in 14 quarters
Equal to K7, 2.0 Ghz - 2.8 Ghz, 40% increase
Equal to P4, 2.2 Ghz - 2.8 Ghz, 27% increase

After changing processes twice P4 topped out at 3.8 Ghz on 65nm, a very modest 19% increase in clock. AMD increased clock by 56% in the same period of time. Of course, it could be argued that K8's initial 1.8 Ghz was slower than the fastest 2.2 Ghz Barton at the time. However, looking at either the 2.0 Ghz point where K8 matched the fastest K7 or the 2.2 Ghz point where K8 was faster than the fastest P4 we still see that AMD increased clock more than Intel over the same period of time.
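The percentages quoted above are simple ratios of ending clock to starting clock. As a quick check, here is a minimal Python sketch that recomputes them from the clock speeds given in this post; the function name `pct_increase` and the labels are just illustrative choices, not anything from AMD or Intel.

```python
def pct_increase(start_ghz: float, end_ghz: float) -> float:
    """Percentage clock increase going from start_ghz to end_ghz."""
    return (end_ghz - start_ghz) / start_ghz * 100.0


# (label, starting clock in Ghz, ending clock in Ghz), all taken from this post
ranges = [
    ("P4, 3.2 to 3.8", 3.2, 3.8),
    ("K8, 1.8 to 2.8", 1.8, 2.8),
    ("K8 from the K7 parity point, 2.0 to 2.8", 2.0, 2.8),
    ("K8 from the P4 parity point, 2.2 to 2.8", 2.2, 2.8),
]

for label, start, end in ranges:
    # Round to whole percent, as the figures in the post are
    print(f"{label}: {pct_increase(start, end):.0f}%")
```

Whichever of the three starting points is used for K8, the result stays above the P4 figure, which is the point of the comparison.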

The second argument is that AMD has been doing much worse with 65nm than it has before with process technology and is way behind where it should be. This is not exactly true when compared to AMD's previous track record with 130nm SOI. It took AMD about a year to match K7's old process speed of 2.2 Ghz and deliver 2.2 Ghz K8s in volume in early 2004. We see an almost identical scenario with 2.8 Ghz 65nm K8s now arriving about a year after their 90nm counterparts.

The third argument is that AMD's 65nm process is broken. The supposed evidence of this is that K8 hit 3.2 Ghz on 90nm while 65nm is only now at 2.8 Ghz. This may sound good but it isn't a fair comparison. AMD stopped developing the process used for K7 because K7 was on the old socket A and therefore had a limited lifespan. If K7 had continued to be developed it very likely too would have been at a higher speed in early 2004 when 2.2 Ghz K8 arrived. We could easily have been comparing 2.2 Ghz K8 to 2.4-2.6 Ghz K7 much as we see today. In fact, this very thing did happen to Intel with P4. Intel continued to develop the Northwood core on the old 130nm process and reached 3.46 Ghz which exceeded the initial 90nm clock speeds. In fact, if we use Intel's highest 130nm clock then Intel's efforts look particularly poor as we only then see a tiny 10% increase in clock speed to 3.8 Ghz on 65nm in the next 3 years. By the logic of the bandwagon analysts Intel's 90 and 65nm processes must have been broken.

However, the reality is quite a bit different from such a superficial view. Between 2003 and 2006 both companies shifted to dual core on P4 and K8, which slowed clock increases. We really can't compare one to one a dual core Tulsa at 3.8 Ghz to a single core Northwood Xeon at 3.46 Ghz. We clearly saw that even though AMD's 90nm process was mature by 2005 the initial clock speeds for X2 were 400 Mhz slower than the single core speeds. Adding in the core doubling factor we can see that the actual clock increases were greater than the apparent increases. Similarly today we see speeds being held back because of a shift from dual core to quad core.

It is clear then that K8's clock speeds advanced at a normal pace and that 65nm matches the rate of development of AMD's 130nm SOI process. This leaves the question of why the notion that AMD became arrogant and lazy has become so pervasive since 2006. There is no doubt that Intel scored a big win both by introducing an architecture with increased IPC and increasing clock at the same time. This is similar to Intel's introduction of Northwood P4 where the IPC increased over Willamette and the improved 130nm process allowed a faster clock. Compared to this AMD's necessary shift to revision F for DDR-2 seemed very disappointing. Thus at the end of 2006 Intel was at 3.0 Ghz on a 65nm process with quad core compared to AMD which was just introducing 65nm and was only at 2.8 Ghz with 90nm dual core. Some have tried to claim that AMD should have moved to 65nm earlier but FAB 30 was not capable of 65nm production and any money spent on outdated 200mm tooling for upgrades would have been wasted. AMD had to wait on FAB 36 and the 65nm ramp there seems in line with industry expectations.

So, in looking at AMD and Intel more closely we simply don't find the arrogance, laziness, or lack of innovation that it has become so chic lately to attribute (with airy wave of hand) to AMD. The change to a DDR-2 controller no doubt consumed development resources but added no speed to the processor itself. The bottom line was that Intel's 65nm process was mature when C2D arrived because it had already been wrung out with Presler and Yonah and there was simply no possible way that K8 with internal 64 bit buses was going to compete with C2D with new 128 bit buses. Intel basically got lucky with quad core since the shared FSB architecture was adaptable to this. I saw a lot of people claim that AMD should have done an MCM type design with K8 but I still haven't figured out how well this would have worked with a single memory controller feeding two chips and the second chip being fed via an HT link. Presumably the performance would have been very similar to dual socket with only one chip having a connection to memory and these only showed 50% performance for the second chip. I still have doubts that this at 2.8 Ghz would have had much effect in late 2006 and it seems that the memory bottleneck would simply have gotten worse as the speeds increased to 3.0 and 3.2 Ghz. Rather than laziness it is clear that 2006 found AMD with very few options to respond to Intel's successes.

I've seen comment after comment claiming that the purchase of ATI was a huge mistake. I'll admit that it cost a lot of money when AMD had none to spare but what exactly was the alternative? If AMD had not purchased ATI the five quarters worth of losses would have been the same. There was nothing about the ATI purchase that affected AMD's processor schedule. I've also seen claims that AMD overpaid for ATI and the supposed proof of this is the $1.5 billion loss of goodwill charged in Q4 07. The problem with this idea is that the purchase price had to include ATI's prospects including business from Intel. Naturally, ATI lost this business when it was purchased by AMD. Since the loss of Intel business was a direct result of AMD's purchase this loss of value at ATI was inevitable. However, on the positive side AMD acquired the 690G chipset which remained the most popular chipset for AMD systems through 2007. Likewise it is a certainty that the 790 chipset will be in 2008. AMD also gained a purpose designed mobile chipset. The lack of such a chipset prevented the superior Turion processor from matching Pentium M for the past several years. The importance of gaining this chipset is difficult to overestimate. This also puts AMD in a much more competitive position with Fusion. There is no doubt that AMD has troubles but I can't see any which were caused by the ATI purchase. Without the purchase AMD would have more money but its competitive position would be worse.

I've unfortunately found that when people state my position they usually only get it right about 1/3rd of the time. So, I'll try to state this clearly. It is obvious that AMD is behind and the clearest indication of this is the lack of FX chips. The Black Edition doesn't count since this is actually a volume chip, more like a poor man's version of FX. True FX chips are at the bin limits and therefore only available in limited quantities. The fact that FX chips have been replaced with Black Edition shows that even AMD knows that it is behind. AMD's official highest clock on 65nm for X2 is 2.8 Ghz. This X2 4000+ review at OC Inside shows 2.93 Ghz at stock voltage. This 200 Mhz margin is the difference between common and low volume. In other words AMD should therefore be capable of delivering FX 65nm X2 chips at 3.0 Ghz. Of course, there would be no reason to since these would not be competitive. However, using the pattern of X2 we would assume that X4 would be 2.4 Ghz common volume and 2.6 Ghz FX volume. Again, a 2.6 Ghz X4 would not be competitive as an FX chip so there are none.

These clocks match closely with what we've actually seen. I have seen accounts of 2.7 and 2.8Ghz on Phenom X4 with stock voltage. This of course would contrast sharply with suggestions from places like Fudzilla that Phenom will top out at 2.6Ghz since one would assume that another quarter or so would give 2.8Ghz common volume for X4 in Q3. This would then seem to allow 3.0Ghz at FX volumes. These are both good questions: whether AMD could truly deliver 2.8Ghz in Q3 and whether AMD would consider 3.0Ghz fast enough for an FX branded chip. I have seen suggestions that AMD will abandon 65nm in favor of 45nm at mid year. However, this would not seem to match AMD's previous behavior since 65nm chips use the same socket and therefore would not be end of life as K7 was in 2003. It would seem more likely that AMD would continue to rely on 65nm during 2008 for the bulk of its chips and highest speeds and that 45nm even if at reasonable volumes in Q4 will not reach competitive speeds until early 2009. In other words, barring a big process leap for 45nm I would expect AMD's best in 2008 to be 65nm. I don't suppose we will get any real idea of AMD's 45nm process until someone gets ahold of some 45nm ES chips and that probably won't happen any earlier than late Q2.