IDF, AMD And Intel In Perspective
Intel currently has the leading performance in desktop hardware. AMD is woefully uncompetitive against Intel's E6600, E6700 and X6800. The Core 2 Duo architecture demotes AMD's top FX chip from a front-line enthusiast processor to one merely competitive with Intel's third-highest speed grade. Essentially, AMD's best processor is mid-range for Intel's Core 2 Duo. Being not one but two full speed grades ahead puts Intel in the performance lead more solidly than it has been since 2002. If this were the only thing that mattered, AMD's position would be rather desperate today. However, AMD has chips that are competitive with the E6300 and E6400, and it has the X2 3800+, which is cheaper than the E6300. Likewise, Intel has no advantage in the Celeron range, which is matched by Sempron. In servers, Intel's Woodcrest is roughly a match for Opteron in single- or dual-socket systems. Woodcrest cannot do 4-way, and in this area the older Prescott-based cores just don't match up to Opteron.
The information from IDF is mixed. When Intel feels pressed it tends to toss out future technologies with abandon, and the show ends up looking like Futurama II from the 1964 New York World's Fair. Unfortunately, people may see these technologies as real rather than as the very forward-looking demonstrations that they are, and forget that Intel has a habit of canceling and scaling back far-reaching projects. It takes a special type of mental contortion or extreme forgetfulness to suggest that the same company that recently delivered a scaled-back Montecito late will be delivering processors based on photonics anytime soon.
Although Core 2 Duo is a success, it also has to be seen as Intel's failure. Until recently, AMD has been too small to work on multiple processor projects. So K8 was a one-size-fits-all strategy, using the same core for servers, desktops, and notebooks, while Intel supported three entirely different architectures. That Intel has switched to a similar one-size-fits-all strategy and is now using C2D for everything, while also scaling back work on Itanium, shows that Intel was unable to adequately manage all of those projects. Its project management load has now essentially been cut in half: it has scaled back from three separate architectures to about one and a half. This is even true of server chips, where Xeon used to have many modifications compared to P4. Xeon had 36-bit addressing extensions, plus L3 cache, plus multi-chip support. However, Woodcrest in servers today is much more like Conroe on the desktop than Xeon was like P4. This again suggests that Intel's success with C2D is primarily a function of putting most of its engineering effort into a single project.
Curiously, this happens at a time when AMD is splitting its line into two separate cores, one for notebooks and one for desktops and servers. If AMD truly gains an advantage from this, Intel will have no choice but to match it with a specialized notebook core of its own to protect its current Centrino market. Unless Intel can do something to overhaul its bureaucracy, this would put it back into the same sinking boat it just jumped out of. Similarly, Intel would like to avoid mentioning that AMD's new instructions like POPCNT are intended specifically to compete with Itanium in the 4-way and up market. Scaling back on Itanium design doesn't really fit with this reality, especially at a time when Woodcrest is limited to 2-way and Presler-cored Xeons are outdated. This means that Intel's entire current server strategy is either due for scale-back (Itanium) or end of life (Presler and the 5000 chipset).
Apparently, lightning can strike twice, as Intel has essentially just repeated the failure of RDRAM with FBDIMM. With Intel's massive pullback of FBDIMM and AMD's cancellation of future plans to support it, FBDIMM is essentially end of life upon release. In some ways this is good, because it is primarily FBDIMM that is holding Woodcrest back in terms of power draw and performance versus Opteron. Without FBDIMM, Woodcrest should pull somewhat ahead. However, this means scrapping Intel's brand new 5000 series chipset, and Intel will have to scramble to come up with a replacement. Apparently, Intel's real strategy will start in Q2 07 with the Bearlake replacement chipset for 5000. This will put a dent in Intel's bottom line because it has spent a lot of money on this technology and will have to continue to subsidize it somewhat for the customers who are buying into it now.
Intel's information at this IDF has been significant both for what was said and for what wasn't said. There was no mention of CSI (Common System Interface). CSI was supposed to have been Intel's answer to AMD's HyperTransport. It had been suggested that CSI would be released in 2008 or even as early as 2H 07. Given the extremely long-range technologies that were mentioned, like photonics, the silence on CSI is surprising to say the least. If we combine the lack of information about a CSI release with the new initiative to license the Intel FSB (Front Side Bus) and the Geneseo upgrade to PCI-E, we wind up with a picture without CSI. This suggests that Intel's next core release in 2008 will use the current FSB and will not have CSI. More than anything else, this leaves Intel without a solid architectural foundation.
It is true that, on the face of it, the FSB licensing and the Geneseo PCI-E standard would be similar to AMD's Torrenza. The proposed speed for PCI-E 2.0 is similar to HT 3.0 and HTX. However, these two are quite different. If a coprocessor plugs into an AMD socket it can communicate with the processor using the same protocol, HT, that it would use if it plugged into an HTX slot. However, there is no similarity between Intel's FSB and PCI-E. Manufacturers can create enhanced PCI-E products but these would not be adaptable to the FSB. It appears that PCI-E will not include a cache coherency protocol whereas HTX could. PCI-E also suffers latency because it has to jump through a PCI-E hub of some sort whereas HTX can connect directly to the processor. Another big difference is that HT 3.0 and HTX will be in systems before Geneseo is even finalized. Also, if Intel were releasing CSI in 2008 then it would have made sense to have folded CSI into the Geneseo initiative and skip the FSB licensing. Geneseo with CSI would be very competitive with Torrenza. Geneseo without CSI is little more than a PCI-E upgrade and not true competition for Torrenza.
It is clear that Intel will not abandon the FSB, because there would be no point in licensing an FSB that was going to be dropped and no reason for anyone to waste money developing products for an FSB that would be gone by 2008. Therefore, we must assume that Intel is not dropping the FSB. This also has to mean that Intel is not releasing CSI in 2008 and is not following AMD's lead to an onboard memory controller. In fact, the announcement of a GPU built into the Northbridge seems to further bolster this assumption. All of the evidence points to the conclusion that Intel's FSB will be around for several more years.
The big question is why. Obviously, if Intel can build a memory controller as part of the Northbridge then it could certainly include it on the CPU die. In fact, Intel's Bearlake chipset shows that it has a DDR3 controller ready to go. Using an on-die memory controller requires a separate bus for I/O and interprocessor communication. Presumably, with Intel's experience with PCI-E, it has most of the necessary protocol down, as CSI is primarily an upgrade of PCI-E. Intel would also need to use a coherency protocol like MOESI instead of the MESI it uses now. However, AMD made this change in 2002 on the Athlon MP and there is no reason why Intel could not follow. That leaves only a few possible explanations. First, IBM made a substantial investment in a scalable Northbridge for Intel processors, and it may be that Intel wants to maintain compatibility. Second, Intel blundered with RDRAM and has now blundered with FBDIMM, but it will be able to fix this by releasing another chipset fairly soon. If the memory controller were on the die this would take longer to fix. Third, and perhaps the biggest reason, Intel may see a limitation in speed with an onboard memory controller. If two channels are the practical limit for an onboard controller, then the bandwidth could potentially be increased by increasing FSB speed and using more than two controllers on the Northbridge.
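The bandwidth argument can be made concrete with some peak-rate arithmetic. What follows is only a rough sketch of my own, not anything Intel or AMD has published: it assumes 64-bit (8 byte) data paths for both a DDR2 channel and the FSB, and the function names and the faster FSB speed are hypothetical. The point it illustrates is that a Northbridge controller is capped by the FSB no matter how many channels sit behind it, so extra channels only pay off if the FSB speed rises as well.

```python
# Peak-rate arithmetic only; sustained throughput on real hardware is lower.
# Assumption: both a DDR2 channel and the FSB carry 8 bytes per transfer.
BUS_WIDTH_BYTES = 8


def peak_gb_per_s(mega_transfers_per_s: float) -> float:
    """Peak bandwidth in GB/s (10^9 bytes/s) of one 64-bit bus."""
    return mega_transfers_per_s * BUS_WIDTH_BYTES / 1000.0


def on_die_bandwidth(channels: int, dimm_mt_s: float) -> float:
    """On-die controller: the CPU sees the sum of its memory channels."""
    return channels * peak_gb_per_s(dimm_mt_s)


def northbridge_bandwidth(channels: int, dimm_mt_s: float, fsb_mt_s: float) -> float:
    """Northbridge controller: all memory traffic must also cross the FSB."""
    return min(channels * peak_gb_per_s(dimm_mt_s), peak_gb_per_s(fsb_mt_s))


if __name__ == "__main__":
    # Two DDR2-800 channels attached directly to the processor.
    print(f"on-die, 2 channels:      {on_die_bandwidth(2, 800):.1f} GB/s")
    # The same two channels behind a 1066 MT/s FSB.
    print(f"FSB 1066, 2 channels:    {northbridge_bandwidth(2, 800, 1066):.1f} GB/s")
    # Hypothetical: four channels behind a much faster FSB.
    print(f"FSB 3200, 4 channels:    {northbridge_bandwidth(4, 800, 3200):.1f} GB/s")
```

With these inputs, two DDR2-800 channels on the die peak at 12.8 GB/s, while the same two channels behind a 1066 MT/s FSB are held to about 8.5 GB/s. Only the hypothetical third case, with both more channels and a far faster FSB, gets past the on-die figure, which is the trade Intel may be betting on.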
The TeraFlop chip was not particularly impressive. These were essentially lightweight processors, and IBM has already seen a limit in using Cell. A machine based solely on Cell would be woefully inadequate for general processing as well as for any type of computation that exceeded the small memory space contained in Cell. Consequently, IBM used a hybrid design with both Cell and Opteron so that Opteron could handle both the general processing and the complex computation loads. Cell is used for small, parallel computation. Pursuing a processor design that is even more lightweight than Cell is unlikely to create anything beyond specialized coprocessor technology. It is also not clear who Intel would partner with for this technology, since it no longer builds supercomputers of its own. The only two obvious partners, IBM and Cray, are currently pursuing other directions. It is possible that Intel only intends this to be used as a coprocessor with its own processors. However, even at that, the time frame to deliver seems too long, as there will likely be several AMD-compatible coprocessors in use by then.
What was not explicitly mentioned in the photonic description was that the only way to multiplex fiber optics is to use separate colors of lasers. This means that having eight channels would require eight separate lasers. There are problems with discriminating the channels while maintaining enough light intensity to carry the signal, as well as problems with having true, monochromatic light from diode lasers. This is not really a near term technology.
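To make the one-laser-per-channel point concrete, here is a small sketch of how a wavelength-multiplexed link would be laid out. Every number in it (starting wavelength, channel spacing, per-channel rate) is an illustrative assumption of mine and not anything Intel presented, and the names are hypothetical. It shows that aggregate bandwidth scales only with the number of distinct lasers, and that tighter spacing is what makes the channels harder to tell apart.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Channel:
    index: int
    wavelength_nm: float  # each channel needs its own laser at this color


def plan_channels(count: int, start_nm: float, spacing_nm: float) -> List[Channel]:
    """Assign one wavelength per channel on an evenly spaced grid."""
    if count < 1 or spacing_nm <= 0:
        raise ValueError("need at least one channel and a positive spacing")
    return [Channel(i, start_nm + i * spacing_nm) for i in range(count)]


def lasers_required(channels: List[Channel]) -> int:
    """One laser per distinct wavelength."""
    return len({c.wavelength_nm for c in channels})


def aggregate_gbit_per_s(channels: List[Channel], per_channel_gbit_s: float) -> float:
    """Total link rate is simply channels times the per-channel rate."""
    return len(channels) * per_channel_gbit_s


if __name__ == "__main__":
    # Hypothetical plan: eight channels starting at 1550 nm, 0.8 nm apart.
    plan = plan_channels(count=8, start_nm=1550.0, spacing_nm=0.8)
    print("lasers required:", lasers_required(plan))
    print("aggregate Gbit/s:", aggregate_gbit_per_s(plan, per_channel_gbit_s=10.0))
```

Shrinking `spacing_nm` packs more channels into the same band, but it also narrows the gap the receiver has to resolve, which is the discrimination problem described above.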
Intel showed technology that will not be ready for two or more years. Meanwhile, it seems to be abandoning any attempt to create an onboard memory controller with distributed memory and a separate point-to-point interface. AMD has plans to move to Direct Connect Architecture 2.0, which will enhance its current lead in 4-way and beyond configurations. Intel's direction is very puzzling, as it appears to leave Intel with no hope of catching up to AMD in the 4-way and above market. IBM does have a linkable Northbridge that can use the current Intel FSB, but this would seem to leave Dell and Gateway with no choice but to use more AMD servers for 4-way and higher. However, the current memory situation is not ideal for AMD. AMD needs to have low-latency memory, and unfortunately latencies have increased with DDR2 and will increase again with DDR3. This does not hurt Intel as much, since having a large cache can somewhat offset the effect of increased latency.
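The way a large cache softens a latency increase follows from the standard average memory access time formula: hit time plus miss rate times the cost of going to memory. The sketch below uses that formula with made-up hit times, miss rates, and DRAM latencies. None of these figures are measurements of any AMD or Intel part, and the function name is my own.

```python
def average_access_ns(hit_ns: float, miss_rate: float, memory_ns: float) -> float:
    """Average memory access time: hit time plus the expected miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_ns + miss_rate * memory_ns


def latency_increase_cost(hit_ns: float, miss_rate: float,
                          old_memory_ns: float, new_memory_ns: float) -> float:
    """How much the average access time grows when DRAM latency rises."""
    return (average_access_ns(hit_ns, miss_rate, new_memory_ns)
            - average_access_ns(hit_ns, miss_rate, old_memory_ns))


if __name__ == "__main__":
    # Hypothetical: DRAM latency rises from 60 ns to 70 ns.
    small_cache = latency_increase_cost(5.0, miss_rate=0.10,
                                        old_memory_ns=60.0, new_memory_ns=70.0)
    large_cache = latency_increase_cost(5.0, miss_rate=0.05,
                                        old_memory_ns=60.0, new_memory_ns=70.0)
    print(f"smaller cache pays {small_cache:.2f} ns more per access")
    print(f"larger cache pays  {large_cache:.2f} ns more per access")
```

The added cost is just the miss rate times the latency increase, so halving the miss rate with a bigger cache halves the damage. A design that relies on reaching memory quickly rather than on avoiding it has less of a cushion.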
The current outlook for desktops is unchanged. The Intel FSB has enough capacity to handle single processor systems and even dual processor systems using a dual FSB Northbridge. It is not clear that Intel has a path to adequately reach quad processing systems; however, these will probably not be a factor on the desktop for several years. AMD's desktop outlook is good once K8L is released in 2007. AMD's 4-way and higher outlook is much better than Intel's due to the switch to Direct Connect Architecture 2.0 in 2008. It is not clear whether Intel is simply giving up on the 4-way and higher market or whether it only has plans to pursue this with the Itanium family. For single and 2-way systems there does not appear to be any current way for AMD to gain a lead over Intel. DIMM speed seems to be the limiting factor. For example, AMD's memory controller on Revision F is capable of handling DDR2 800 memory, but memory this fast is not yet available. It may be in AMD's interest to see about creating its own DIMM initiative, such as using TTRAM or TTRAM caching on top of DDR DIMMs. It would also be helpful for AMD to create a new DIMM standard such as the HTDIMM that I wrote about earlier. These two technologies are not mutually exclusive. HTDIMM is a configuration for communication and fanout on DIMMs, whereas TTRAM or TTRAM caching would be the underlying storage technology. In other words, TTRAM would be technology on the DIMM for storing and retrieving data faster, whereas HTDIMM would be technology for communicating with the processor faster.
It appears that AMD is likely to catch Intel with K8L both in terms of dual and quad core. In the near term, 4x4 should make AMD competitive again in the FX range. In the longer term AMD does not seem to be able to gain an advantage over Intel as long as DIMM speed is a factor. However, for servers, it appears that AMD will simply leave Intel behind on 4-way and higher. This IDF had to be very disappointing to anyone looking to Intel for future x86 based server technology.