Monday, October 02, 2006

IDF, AMD, and Intel in Perspective -- 2nd Look

After a lot of discussion and consideration since the last article, I've decided that FSB licensing does not mean that CSI has been scrapped. I had been under the assumption that CSI was essentially ready to go, so the FSB licensing didn't make a lot of sense. I've now decided that I was overestimating Intel and that FSB licensing and Geneseo both make a lot of sense. I believe that my assumption that CSI was ready to go was incorrect and that, in fact, CSI will be two years late. CSI was supposed to be released in 2007 but now I doubt it will be released on any Xeon chip until 2009. In light of this, the FSB licensing would be a necessary stopgap. However, I do disagree with some assessments of Geneseo. Geneseo looks to me very much like CSI without cache coherency. If it were released today it would be somewhat competitive with HTX with the HT 3.0 standard. Geneseo/CSI would be competitive in terms of speed and latency; however, HT 3.0 includes a distance mode that can reach one meter, a power saving mode that is good for notebooks or reducing total power draw when full speed isn't being used, and an unganged mode that splits a single 16 bit channel into two 8 bit channels, and HT 3.0 is hot pluggable. In addition, by the time Geneseo is actually available we will probably be looking at HT 4.0 and Geneseo will be further behind. Nevertheless, it is a reasonable upgrade path for CSI to eventually catch up to HT. With a big upgrade to the Geneseo/CSI standard in 2009, CSI might be able to close the gap by 2010 or 2011.

The second issue is attitude. Intel and AMD have been complete opposites. Intel talked about photonics and the TeraFlop chip. Yet, Geneseo won't be ready for 2 - 2 1/2 years, TeraFlop is 5 years at the earliest, and photonics are indefinite. This is the height of vaporware. TeraFlop is unlikely to ever be released because it will most likely be outdated in five years. Similarly, it is not clear that photonics will ever be used for on-die signaling. The most likely use for fiberoptics would be something like a more robust HyperTransport link for connecting multiple cpu boards. This could even be used to connect a memory board to the processor board. The use of fiber would remove the need to run serpentine data traces all the way across the motherboard to the DIMMs. Memory could be on its own board and just plug in with fiber. I/O could also be on its own board. This would make boards easier to design because the signal path would be much smaller and easier to control. It would be easier to design just a cpu board or just a memory board or just an I/O board. I could see this application in servers. Interestingly, this would move servers back toward the way they used to be as minicomputers when the cpu occupied half a dozen to a dozen separate boards.

If Intel has been talking about technology in the distant (or nonexistent) future, AMD has been amazingly silent. AMD has not been merely quiet; it has been so secretive that even though both socket AM2 and F have been released there are still no pinout diagrams available. Typically, these pinout diagrams would be available a little before official release because chipset designers and motherboard makers would already have them. Yet, a search of the sea of information on the internet gets no hits for pinouts for either socket. And, if AM2 and F are a secret then 4X4 would make the CIA and the US military envious. I've seen demonstrations of future weapons and spy devices that won't be released for two or three years. 4X4 is due within 3 months but the technical detail available when written down would still leave space free on the back of a postage stamp. There are rumors that it will be socket F or socket AM2. There are rumors that it will support a cache coherent HTX slot or it won't. There are rumors that it will be used for workstations and support registered memory for increased reliability or that it will only use unregistered DIMMs for higher speed. It still is not clear what will differentiate 4X4 and the new FX from the 2xxx series of Opterons. 4X4 could include 2 cache coherent HT links or maybe not.

One could take Intel's ramblings about future technology as a bluff because it doesn't want to admit the glaring gap in its server lineup for the next two and a half years. However, one could equally take AMD's deafening silence as an indication that it has nothing to compete with Conroe on the desktop for the indefinite future. However, I doubt everything is quite as bad as that. Intel does have a huge problem with Woodcrest. The current performance problem is small and will get fixed just as soon as Intel releases a registered DDR2 chipset because it is really FBDIMM that is making Woodcrest look bad. Dropping FBDIMM will improve latency with full DIMM sockets and reduce power draw. This should allow Woodcrest to pull noticeably ahead of Opteron. However, when it comes to 4-way, Woodcrest just isn't quite tall enough to reach the bar. This is a problem. However, IBM has the robust XA-64e (Hurricane) chipset for servers and this is probably enough to make a big tin server out of the currently wimpy Woodcrest. The only thing Woodcrest really needs is robust 4-way and 8-way connections. The need for this is even more desperate with Clovertown and its two dies crammed into one package. It should have this with XA-64e and I'm certain that IBM will be releasing heavy duty servers based on Woodcrest (Clovertown or Yorkfield), perhaps later in 2007. There is some possibility that this could even be intentional on Intel's part as Intel might wish to give Itanium some breathing room before X86 steals the markets above 4-way. Intel's statements suggest that it is all too aware that Itanium will get pushed higher and higher up the n-way ladder until it is no longer viable. The clock is ticking.

If IBM and DDR will be the saving grace for Woodcrest then K8L (Barcelona) would be the same for K8. By the time Intel gets a DDR chipset out the door for Woodcrest AMD will be releasing K8L. This processor should give K8 a huge push in FP/SSE performance and a smaller boost in Integer performance. If the rumors are true then AMD will be moving up the release of K8L Opteron to Q2 07 and the release of the desktop version to Q3 07. AMD knows that it will lose market share if it can't deliver something more competitive. While Woodcrest with DDR will leap ahead of Opteron it won't leap ahead of the dual core version of K8L Opteron, and Clovertown should find itself looking at taillights from quad core K8L. Opteron on HPC is where Intel wishes it were with Itanium or perhaps Woodcrest. AMD has three announced supercomputing wins (Sun, Cray, and IBM) for machines that will all be in the top 5 and has just upgraded the Oak Ridge machine. In contrast, there have been no new announcements for either Itanium or Woodcrest in the HPC arena. To add insult to injury, the purely Opteron Cray PetaFlop machine will probably rank 1st until it is displaced by the hybrid Cell/Opteron IBM machine. This would put Opterons in the number 1 and 2 systems with Sun's Opteron machine still in the top 5 (possibly even ranking 3rd). This isn't quite so bad for IBM because it has had the lead with Blue Gene L and will be putting its Cell machine into the number 1 spot and its current Blue Gene machines will still be in the top 5. The only machine competitive with these is the proposed 3 PetaFlop Fujitsu machine which won't be ready until 2010 or 2011. It could potentially make it to 2nd on the list but there may be others by then.

IBM's XA-64e should keep the Woodcrest/Clovertown lineage from falling off the map on quad core/quad socket but it will be nearly impossible for it to match Opteron through 2008. The extra instructions like PopCnt in 2007 will steal some of Itanium's thunder. In addition to the robust HT 3.0 added to K8L Opteron in 2007, AMD will bounce the technology even harder when it releases Direct Connect Architecture 2.0 in 2008 (Intel will still be waiting for Geneseo/CSI). This will give Opteron even greater connectivity with 4 HT links. This will be good for Sun and Dell (as well as HP and IBM if they pursue it) and will put pressure on the Intel based SGI, HP, and IBM offerings. However, IBM should sell enough Xeon systems to make a nice return on its XA-64e development. Likewise, the massive available memory on SGI's boxes should maintain a specialized market for their products. It would be nice to see how these products compare with proper load testing but I haven't seen this yet from any of the major review sites. It has yet to be done for even the dual core processors.

I think Intel's vaporware show and tell is a symptom of being ahead and trying very hard to make everyone believe that they have a solid strategy to stay ahead. The reality suggests that Intel has almost no hope of getting ahead in servers and is in serious danger of losing its advantage in notebooks. AMD now appears to be very serious about producing a proper low power chipset for notebooks with ATI. This could be a problem for Intel in 2008 when AMD moves to an integrated CPU/GPU for notebooks. Intel will need to work hard to keep from falling behind in power consumption, which has been the trademark of the Centrino line. However, pressure will come much sooner than 2008 as AMD qualifies multiple chipsets and wireless adapters for OEMs. This is a sharp contrast to Intel's all or nothing Centrino position.

I think AMD's silence is related to the fact that while everything seems to be improving in terms of servers, notebooks, and corporate customers the desktop is still a problem. Intel gained massively by both increasing IPC and increasing clock. It would be impossible for AMD to increase the clock quickly enough with K8 to offset the IPC gains that Intel has. AMD needs K8L to increase IPC and needs 65nm to continue increasing clock. This gives AMD a fighting chance on the desktop in 2007 but doesn't help today. Today, AMD is clearly behind on the desktop and is very much aware of this. I believe that AMD has been quiet and extremely secretive because they realize that they have to make a strong showing when they announce something new. People who were expecting a jump in performance with Revision F were disappointed; AMD doesn't want another disappointment. AMD only has three things it can show. It can show that the power draw is less with 65nm. This will be helpful for notebooks and with large scale servers to be sure, but won't do much on the desktop.

AMD has 4X4 which should regain the FX position by essentially selling two chips for the price of one. However, for this to have any effect, AMD will need demos or benchmarks that can take advantage of four cores. Most games and benchmarks do not. AMD is going to have to carefully plan a presentation for 4X4. I suppose it is possible that they are talking to one or more software companies for some kind of endorsement. We can only wonder. However, whatever 4X4 does, it will be compared to Kentsfield and it is unlikely that the major review sites will do anything but superficial testing. I wouldn't be surprised by claims that Kentsfield is faster in everything than 4X4 when common sense would indicate that Kentsfield will bog down on any memory intensive task. This simply indicates the very sad and unreliable state of big review sites today as I've already talked about in my articles on Tom's Hardware Guide and Anandtech. The final thing that AMD has to show is the demo of K8L. This should give some indication of where AMD will be in Q2 07. However, K8L is unlikely to have the same IPC increase as Conroe which means that AMD will need a higher clock to match Conroe's performance. Considering that AMD is currently trailing in clock this will be a challenge even with 65nm. So, Intel's banter is a company pretending that being ahead on the desktop means that it is ahead everywhere and that it will stay ahead. AMD's silence is a company that knows it is behind on the desktop and needs maximum effect to get any notice. Intel has a fighting chance to stay ahead on the desktop in 2007 but should be facing increasing competition in notebooks, corporate accounts, and servers. AMD seems to be improving in notebooks, servers, and corporate accounts, but is going to have to work hard to catch up on the desktop in 2007.

55 comments:

Erlindo said...

Scientia: Intel has a fighting chance to stay ahead on the desktop in 2007 but should be facing increasing competition in notebooks, corporate accounts, and servers...

So, do you think that K8L won't be competitive with conroe (call it dual or quad) on the desktop during the 2007 timeframe?
Is that what you are trying to say?

Scientia from AMDZone said...

K8L dual core should be much closer than K8. However, a lot of this will depend on what clock speeds are available. If Intel can quickly release a 3.2Ghz chip then AMD will be behind even with K8L dual core unless it can release a chip in the 3.2 - 3.4Ghz range.

K8L quad core should be better than Kentsfield however the desktop versions won't be out until Q3 07. Yorkfield may be out by then and it could be closer in performance to K8L.

Erlindo said...

yeah but, what about the new enhancements the new core (k8L) will have?

Wouldn't that be enough to counter conroe at lower clock speeds as K8 did against Prescott?

If K8L would be 40-50% better than K8, then that means that there's no need for AMD to clock their new cores higher since conroe is only 15-20% faster than K8 processors.

Anonymous said...

"If IBM and DDR..."
IMC?

http://www.hkepc.com/bbs/news.php?tid=679375&starttime=0&endtime=0
New AMD info! Rather unfortunately named Spica:(
Digitimes/HKEPC are reporting Q307 for the desktop parts, and theinquirer.net recently had 3 parts coming, with Q207 server, Q407 A64/12XX...

Weird, the 65nm parts are coming out at 2.6 max, and the [quad/dual]parts coming 3 quarters later are just 2.9.
http://www.aceshardware.com/forums/read_post.jsp?id=120067747&forumid=1
I'm expecting Intel to add more than just 45nm, SSE4, and DDR3 with Yorkfield..Possibly IMC/CSI? Charlie says Nehalem is Conroe with lots of integrated parts.

http://www.aceshardware.com/forums/read_post.jsp?id=120066879&forumid=1
And Charlie expects Conroe[not Yorkfield] to still be better than Rev H.

Anonymous said...

http://news.zdnet.com/2100-9584_22-6041120.html
Intel in the past claimed 20% over AMD, 40% over Pentium D.

http://www.xbitlabs.com/news/cpu/display/20060731233200.html
http://www.hkepc.com/bbs/news.php?tid=678736&starttime=0&endtime=0
AMD claimed 80% over Cinebench with 4x4, now 40% with K8L vs K8? They originally had 60%...
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/PhilHesterAMDAnalystDayV2.pdf
...
"performance per watt[already confusing enough to compare by clock]" increase. Also, all Altairs are 125W, vs non native Kentsfields at 50W/80W/120W. How do you expect Altair to perform?

Anonymous said...

http://www.vr-zone.com/?i=4109
Links from previous posts show AMD claiming 40% better than K8, which should be 17% better than Conroe? A 2.9 Altair should theoretically be at best case better than a 3.4 Conroe. Yorkfield should bring that down[even possibly gaining with all the new features], and VR-Zone reports that these will come in 3.46-3.73[nullifying the per clock gains of Altair] parts.
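
The arithmetic behind this estimate can be sketched as follows. It is only a back-of-the-envelope check, assuming the claimed 40% K8L gain over K8 and the roughly 20% Conroe gain over K8 cited in the earlier comment both apply uniformly per clock; the function names are illustrative and not taken from any source.

```python
def relative_advantage(gain_over_base: float, rival_gain_over_base: float) -> float:
    """Per-clock advantage over a rival when both are rated against the same baseline."""
    return (1.0 + gain_over_base) / (1.0 + rival_gain_over_base) - 1.0


def equivalent_clock(clock_ghz: float, advantage: float) -> float:
    """Clock the rival would need to match a chip running at clock_ghz."""
    return clock_ghz * (1.0 + advantage)


# Assumed inputs: K8L = K8 + 40%, Conroe = K8 + 20% (claims, not measurements).
k8l_vs_conroe = relative_advantage(0.40, 0.20)
print(f"K8L per-clock advantage over Conroe: {k8l_vs_conroe:.1%}")
print(f"Conroe clock matching a 2.9 part: {equivalent_clock(2.9, k8l_vs_conroe):.2f}")
```

Under those assumptions the advantage comes out just under 17% and the matching Conroe clock just under 3.4, which is where the figures in the comment come from. Any change to either input claim shifts the result directly.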

Anonymous said...

I see you denounce VR-Zone from AMDZone.

http://forums.vr-zone.com/showthread.php?t=97553

Written from an English deficient [probably from Taiwan?] administrator of the site.

Scientia from AMDZone said...

yeah but, what about the new enhancements the new core (k8L) will have?

Wouldn't that be enough to counter conroe at lower clock speeds as K8 did against Prescott?


The enhancements on K8L (Barcelona) are a big jump in SSE. This should put Barcelona in a very good position with SSE versus Conroe. However, I doubt that the improvements (dual L1 cache bus and doubled prefetch with enhanced reordering) will be enough to match the increase in Integer IPC that Conroe has. Yonah was roughly equal in IPC to K8. However, Conroe has twice the L1 bus width and 4 instruction issue. The macro ops fusion can sometimes allow 5 instructions. My best guess is that Barcelona will close half of the IPC gap. This would mean that Barcelona (K8L) would be slower in Integer at the same clock.

Scientia from AMDZone said...

Weird, the 65nm parts are coming out at 2.6 max, and the [quad/dual]parts coming 3 quarters later are just 2.9.
http://www.aceshardware.com/forums/read_post.jsp?id=120067747&forumid=1


AMD is using 65nm to fill the highest volume range. AMD can get 3.0Ghz on 90nm and should be at 3.1Ghz on dual core when Barcelona is released.

I'm expecting Intel to add more than just 45nm, SSE4, and DDR3 with Yorkfield..Possibly IMC/CSI? Charlie says Nehalem is Conroe with lots of integrated parts.

Sorry. CSI won't be out until 2009. DDR3 won't help much until DDR2 tops out at 800. AMD's current memory controller can handle this speed.

http://www.aceshardware.com/forums/read_post.jsp?id=120066879&forumid=1
And Charlie expects Conroe[not Yorkfield] to still be better than Rev H.


Well, Charlie has been wrong more often than right. However, my best guess is that Conroe will still be faster in integer at the same clock.

Scientia from AMDZone said...

I'm afraid that the benchmarks are confusing. Some of these are quad core compared with dual core and only represent an incremental increase in speed rather than an increase per clock.

Basically, any task that is memory intensive is going to bog down the processor. However, Kentsfield will bog down worse than Barcelona (K8L). Barcelona should be faster than Kentsfield on any quad threaded code or even dual threaded code that is memory intensive. Kentsfield will be faster on single threaded code.

I think dual core K8L could be faster than Conroe on memory intensive code if AMD can get its clock up to match. Basically, I think K8L will handle loads better. However, in any lightly loaded code (single or dual threaded) Conroe's higher IPC is going to put it ahead.

Neither Anandtech nor Tom's have done proper load testing. I am certain that Conroe's performance will drop drastically if the second core is loaded during the tests.

Anonymous said...

I'm sure they can go past 3GHZ on dual core, but on quad..at 2.9@125W, they're really reaching. And I don't think they would want 3GHZ duals competing with lower clocked quads.

And 45nm should at least equate to Core 2+, 65nm brought the cooler Pentium D 900s with more cache and clock and the dual core Core Duos. 3.46-3.73 seems very reachable when Conroes are overclocking without effort, Netbursts@3.8, Celeron D 365@3.6 for just $69.

Anonymous said...

Q306 Kentsfield worse than Q307 Altairs[geez AMD, pick a name already]? That's a surprise;)

Are you comparing Conroe vs K8 or Conroe vs K8L[or Deerhound, or Barcelona, or Altair]

Yorkfield will come Q307, same time as AMD quads[which has previously been reported to be Q407/Q108]. Using 40% gains over K8, that's 17% over Conroe? A 2.9 AMD quad should be[once again] equivalent to a 3.4 Conroe. But Yorkfield should reduce that lead, and with added features, be possibly faster, clock for clock. Even if it isn't, it'll still be clocked higher, for overall better performance.

Anonymous said...

I am getting tired of you constantly making excuses that tom and anand didnt do a good job reviewing conroe or kentfield and your assumptions about kentfield being bogged down with memory intensive stuff.

Please tell me of the 100 professional pc review websites on the net that show conroe whooping k8 is there ANY one of them that does a good review in YOUR OPINION and if so please post a link so we may all know how a proper review is done?

Do u have any link showing kentfield being bogged down in memory intensive application that us consumers will be using in our day to day lives or are you pulling this info from your A$$?

Anonymous said...

Scientia, you seem pretty certain about the expected performance of K8L. I am not saying you are incorrect in your estimates, but how accurate do you think the information about K8L is that we have seen currently?

Considering how secretive AMD has been lately, I can't help but wonder how much we really know about the real K8L. Thoughts?

Scientia from AMDZone said...

I am getting tired of you constantly making excuses that tom and anand didnt do a good job reviewing conroe or kentfield and your assumptions about kentfield being bogged down with memory intensive stuff.

Well, I don't know that the article on Anandtech is an excuse. It shows either incredible incompetence or a conscious effort to bias the tests. I've also shown that THG tosses in tests that are not linear and therefore useless in comparison. I'm also not sure how you would explain the anomalies in their own data. However, THG readily shows bias by the fact that AMD chips are NEVER overclocked against stock Intel chips, but Intel chips are routinely overclocked against stock AMD chips.

Please tell me of the 100 professional pc review websites on the net that show conroe whooping k8 is there ANY one of them that does a good review in YOUR OPINION and if so please post a link so we may all know how a proper review is done?

I would expect Conroe to beat K8 by a large margin in SSE and by a significant amount in Integer. The problem is that both Conroe and X2 are dual core. I need to see proper testing where the 2nd core is loaded during the benchmark runs on the 1st core. I haven't seen that. If you have seen that somewhere, please post a link.

Do u have any link showing kentfield being bogged down in memory intensive application that us consumers will be using in our day to day lives or are you pulling this info from your A$$?

This is actually pretty funny. In this same article I am simultaneously defending myself against AMD fans who think my estimates on K8L performance are too low and defending myself against Intel fans who think I'm being too critical of Conroe and Kentsfield. If both sides are criticizing you that probably indicates at least some objectivity. I guess I would wonder when the last time was that THG or Anandtech were criticized for being too tough on Intel.

Scientia from AMDZone said...

Are you comparing Conroe vs K8 or Conroe vs K8L[or Deerhound, or Barcelona, or Altair]

I'm comparing K8L quad core (Barcelona) against Kentsfield, Clovertown, and Yorkfield. I'm comparing K8L dual core against Conroe.

Yorkfield will come Q307, same time as AMD quads[which has previously been reported to be Q407/Q108].

No. K8L Opteron will be released Q2 07. The desktop version, K8L Athlon 64 will be released Q3 07.

Using 40% gains over K8, that's 17% over Conroe? A 2.9 AMD quad should be[once again] equivalent to a 3.4 Conroe. But Yorkfield should reduce that lead, and with added features, be possibly faster, clock for clock. Even if it isn't, it'll still be clocked higher, for overall better performance.

No. Conroe should be faster in Integer than K8L dual core at the same clock. The only reason why K8L will be faster than Kentsfield is due to bus contention. However, Yorkfield has shared cache so Yorkfield could be faster than K8L; it depends on the implementation. The reason I am not optimistic about Yorkfield has to do with 4 cores all trying to access L2 at the same time. Compare this with K8L where each core has its own L2 and the sharing is only at L3. To make this really work, it seems like Intel would need to double the L2 to L1 bus width. The cache scheduler would also need some serious improvement.

Scientia from AMDZone said...

Scientia, you seem pretty certain about the expected performance of K8L. I am not saying you are incorrect in your estimates, but how accurate do you think the information about K8L is that we have seen currently?

We know that a second FP pipeline has been added. It has been suggested that the FP execution units will be upgraded. This should make K8L very competitive in terms of FP. I also know that some of the new instructions (which have already been published) are to compete with Itanium. This shows how serious AMD is about HPC applications. Their recent wins in supercomputers tend to bolster this notion. There has been no indication of any change to the three decoders. This is the important point. We know that the L1 cache bus has been doubled but to really improve Integer speed AMD needs to either increase the number of decoders or upgrade the decoders in some fashion to increase instruction issue.

Considering how secretive AMD has been lately, I can't help but wonder how much we really know about the real K8L. Thoughts?

The problem is that AMD has suggested some increases in FP. I also have estimates from Cray for K8L FP performance in their next supercomputer contract so I'm pretty confident about that. However, AMD hasn't said much about Integer so I'm not expecting much there.

Erlindo said...

Scientia: ...The macro ops fusion can sometimes allow 5 instructions. My best guess is that Barcelona will close half of the IPC gap. This would mean that Barcelona (K8L) would be slower in Integer at the same clock.

OK, but you have to remember that the Core2 architecture suffers once running in native 64-bit:

All the claims of relatively low Core 2 Duo performance in 64-bit modes are based on two facts. According to some info confirmed by Intel representatives, there are two limitations imposed over the EM64T support in Core microarchitecture. Firstly, Core 2 Duo processors do not support Macrofusion technology in 64-bit mode. Secondly, the processor code decoding may slow down because of the instructions working with additional registers available only with EM64T enabled. Let’s try and get to the roots of these two problems.

As for the overall performance slowdown caused by instructions working with additional registers, it results from the single-byte REX prefix that is added for all 64-bit operations. This prefix probably affects the average length of instructions processed by the CPU in 64-bit modes. As a result, there may be fewer instructions within the 16-byte code sample from the L1 cache that is decoded in a single clock cycle. In other words, the average instruction length in x86 code is about 2.5-3.5 bytes, while in 64-bit mode it increases because of the REX prefix. When the average instruction length exceeds 4 bytes, the CPU may lose its ability to process 4 instructions per clock.

At the same time I would like to point out that it looks like Athlon 64 processors ensure higher performance increase when switching to 64-bit work mode. The average performance improvement we have seen from Athlon 64 FX-62 equaled 16%, while Core 2 Extreme X6800 demonstrated only 10% average performance boost.
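
The decode-width argument in the quoted passage reduces to simple division. Here is a rough sketch, assuming a 16-byte code sample per clock and a 4-wide decoder as the passage describes, and treating average instruction length as the only limit. Real decoders have alignment and pre-decode constraints that this ignores, and the constant and function names are illustrative.

```python
FETCH_BYTES_PER_CLOCK = 16.0  # code sample size cited in the quoted passage
DECODE_WIDTH = 4.0            # maximum instructions decoded per clock


def max_instructions_per_clock(avg_instruction_bytes: float) -> float:
    """Upper bound on decoded instructions per clock for a given average length."""
    fetched = FETCH_BYTES_PER_CLOCK / avg_instruction_bytes
    return min(DECODE_WIDTH, fetched)


# 2.5-3.5 bytes is the typical x86 range from the passage; longer lengths
# stand in for 64-bit code carrying REX prefixes.
for avg_len in (2.5, 3.5, 4.0, 4.5, 5.0):
    print(f"{avg_len:.1f} bytes/instruction -> "
          f"{max_instructions_per_clock(avg_len):.2f} instructions/clock")
```

In this simplified model the bound stays at 4 up to an average length of 4 bytes and falls below 4 beyond it, which matches the threshold the passage describes.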


As you see, intel will lose the 64-bit performance once K8L introduces new extensions to the AMD64 instructions set and I "do hope" for AMD to increase integer performance (or do something about it). Maybe AMD is keeping something for the final release. ;)

Here is the link: X-bit labs

Anonymous said...

Intel should also have dual core 45nm CPUs with SSE4, etc. to compete with dual core K8Ls.

http://www.overclockers.com/tips00993/
And that's AMD's record of roadmap vs deliverance for when to expect Rev H..Though Digitimes has reported that partners have received samples already, so they have a good chance of being on time.

Anonymous said...

"AMD is using 65nm to fill the highest volume range. AMD can get 3.0Ghz on 90nm and should be at 3.1Ghz on dual core when Barcelona is released."

This is one possibility - I think the other thing that has been overlooked is AMD's "CTI" (continuous transistor improvement) process. The initial 65nm transistor design is just a shrink of the 90nm design; and while smaller Lg helps on performance, if you do not adjust other aspects of the transistor you start running into short channel effect issues.

I'm wondering if the speeds being released are actually near current capability of the process at a reasonable bin split. Bin splits on 90nm which has been running for years (and on 200mm) is probably much better than current 65nm splits.

I believe many of the 65nm announcements (numerous strain techniques) have yet to be implemented - If I recall correctly AMD had 3-4 more steps in transistor iterations planned for 65nm when they showed their last technology update foil which showed CTI (I can't find the link readily)

This is in contrast to Intel which puts all of the major architectural changes in upfront at start of technology and then does minor tweaks over the course of the technology cycle. Just a different approach to technology development.

And BTW if you are pissing off BOTH Intel and AMD fans that just means you're probably closer to the truth than either of them or at least trying to look at things somewhat objectively!

Anonymous said...

"This is actually pretty funny. In this same article I am simultaneously defending myself against AMD fans who think my estimates on K8L performance are too low and defending myself against Intel fans who think I'm being too critical of Conroe and Kentsfield. If both sides are criticizing you that probably indicates at least some objectivity. I guess I would wonder when the last time was that THG or Anandtech were criticized for being too tough on Intel."

That's all well and good but you still haven't answered my question. The only knock against kentfield and conroe people seem to have is that it is theoretically bandwidth limited due to its implementation of FSB compared to the IMC of AMD. However you have yet to provide me with a link of any program that consumers use that is actually memory intensive enough to bog down conroes or kentfield. Since you talk about it with such confidence I am sure you have a link to verify your claims.

Anonymous said...

http://www.xbitlabs.com/news/cpu/display/20051025211140.html
"It is currently unknown whether the “dedicated high-speed interconnect” of the Caneland has anything to do with the common serial interconnect or is a more advanced version of the [dual] independent bus used in the Truland platforms (E8500 chipsets) available today and will be used with Bensley platforms (Blackford and Greencreek chipsets)."

http://www.aceshardware.com/forums/read_post.jsp?id=115161817&forumid=1
polopolo wrote: My understanding is that Caneland will make FSB effectively a point to point link. Of course, this is not an ideal solution. Intel really needs a new bus protocol to handle 8+ core chips..

Charlie: Yeah, point to point, but each core still has 1333/4 FSB. It is a simplification yes, but you get the idea. You can also look at it as 1333*4/16.

Stopgap for CSI?

Anonymous said...

The dual 2.6, 2.8, 3.0 4x4 is still worse than Kentsfield on single threaded.

2.66*1.2[Core 2 advantage over K8] 3.192 > 3.0 max.

On threaded apps, AMD has claimed 80% over dual core. 1*1.8=1.8
Intel has claimed 70% over dual core. 1*1.2[Core 2 over K8]*1.7=2.04
2.04 > 1.8
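
The comparison above can be restated as a small calculation. This sketch simply reuses the commenter's own inputs: a 20% per-clock Core 2 advantage over K8, AMD's claimed 80% scaling from dual to quad core, and Intel's claimed 70%. None of these figures are measured results, and the variable names are illustrative.

```python
CORE2_PER_CLOCK_ADVANTAGE = 1.2  # assumed Core 2 advantage over K8 per clock

# Single threaded: express a 2.66 Kentsfield as an equivalent K8 clock.
kentsfield_k8_equivalent = 2.66 * CORE2_PER_CLOCK_ADVANTAGE
fastest_4x4_clock = 3.0

# Multithreaded: a dual core K8 is the 1.0 baseline.
amd_quad_throughput = 1.0 * 1.8                                # AMD claim: +80%
intel_quad_throughput = 1.0 * CORE2_PER_CLOCK_ADVANTAGE * 1.7  # Intel claim: +70%

print(f"Kentsfield as K8-equivalent clock: {kentsfield_k8_equivalent:.3f} "
      f"vs 4X4 max {fastest_4x4_clock:.1f}")
print(f"Relative quad throughput: Intel {intel_quad_throughput:.2f} "
      f"vs AMD {amd_quad_throughput:.2f}")
```

The result depends entirely on the 20% per-clock figure and on both vendors' scaling claims holding for the same workload, which the comment does not establish.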

Anonymous said...

Conflict of interest?
http://forums.vr-zone.com/showthread.php?t=97656
They should become the new XS, leaking Yorkfield info and such;)
http://badhardware.blogspot.com/2006/10/amds-k8l-revealed-in-cray-rainier.html
According to that AMD fan blog, K8L is 15% faster ?clock for clock? but able to clock 25% higher..I guess that 40% advantage is comparing at similar price points?

Scientia from AMDZone said...

The only knock against kentfield and conroe people seem to have is that it is theoretically bandwidth limited due to its implementation of FSB compared to the IMC of AMD. However you have yet to provide me with a link of any program that consumers use that is actually memory intensive enough to bog down conroes or kentfield.

To some extent your question seems to be a bit like asking if an elephant will fit into a telephone booth. Then if I say, no, you insist that I can't prove it unless I have actually tried it or I can post a picture of someone attempting it.

I have no idea what your level of knowledge is for assembler, high level programming languages, cpu architecture and system architecture. So, I'll try to explain it as best I can.

The maximum usage of data by C2D is 32 bytes per clock per die. This would be 96 GB/sec for a 3.0 GHz clock. The maximum speed of memory with a 1333 MHz FSB is 10.7 GB/sec. This means that the processor can consume 9X as many bytes as memory can supply. Kentsfield doubles the demand to 192 GB/sec. If you can think of any possible way that a 10.7 GB/sec memory can keep up with a 192 GB/sec demand then please let me know.
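As a rough check on the figures above, here is a minimal sketch of the arithmetic. It assumes a 64-bit (8 byte) wide FSB and decimal gigabytes, neither of which is stated above; the names are illustrative only.

```python
# Peak data demand of the cores versus peak supply of the front side bus.
BYTES_PER_CLOCK_PER_DIE = 32     # peak usage per die, as stated above
CORE_CLOCK_HZ = 3.0e9            # 3.0 GHz
FSB_TRANSFERS_PER_SEC = 1333e6   # 1333 MHz effective FSB rate
FSB_WIDTH_BYTES = 8              # assumption: 64-bit data bus


def peak_demand_gb_s(dies: int) -> float:
    """Peak bytes the dies could consume per second, in GB/sec."""
    return dies * BYTES_PER_CLOCK_PER_DIE * CORE_CLOCK_HZ / 1e9


def fsb_supply_gb_s() -> float:
    """Peak bytes the FSB could deliver per second, in GB/sec."""
    return FSB_TRANSFERS_PER_SEC * FSB_WIDTH_BYTES / 1e9


for name, dies in (("Conroe", 1), ("Kentsfield", 2)):
    demand = peak_demand_gb_s(dies)
    supply = fsb_supply_gb_s()
    print(f"{name}: demand {demand:.0f} GB/sec, supply {supply:.1f} GB/sec, "
          f"ratio {demand / supply:.0f}X")
```

These are peak numbers only; caches absorb most of the demand in practice, which is why the ratio matters mainly for code and data that do not fit in L2.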

I assume from your comment about "programs that consumers actually use" that you believe that benchmarks that show Kentsfield running twice as fast as Conroe are "real" and anything else would be somehow contrived to make Kentsfield look bad. The truth is that standard benchmarks are not really representative of program code. The biggest source of bandwidth contention in regular software is mispredicted branches for short sequences of code and long sequences of code that can't fit into the L2 cache. Another source is simple data conversion requiring less than 16 operations per data unit. There are no benchmarks based on this. As soon as a benchmark saturates the memory bus there is nothing else that can be tested, so benchmarks are often written in a distorted fashion to be able to test something besides memory speed.

At some point, we should see 4X4 going up against Kentsfield. It doesn't take a genius to see that memory intensive applications would be completely lopsided since 4x4 would have double the memory bandwidth. I'm guessing therefore that most review sites will avoid any testing that would be memory intensive. If some review site does do testing like this then it will definitely show in the benchmarks.

Scientia from AMDZone said...

The dual 2.6, 2.8, 3.0 4x4 is still worse than Kentsfield on single threaded.

4X4 is just two sockets; it doesn't affect the speed of each processor. If Intel only delivers a 2.66 Ghz Kentsfield then the 3.0Ghz 4X4 FX should be pretty close.

On threaded apps, AMD has claimed 80% over dual core.

No. The scaling for AMD K8 dual socket processors has typically been 95% or higher. There would be some overhead from NUMA but it wouldn't be 20%.

Scientia from AMDZone said...

According to that AMD fan blog, K8L is 15% faster "clock for clock" but able to clock 25% higher. I guess that 40% advantage is comparing at similar price points?

K8's SSE engine is really out of date. So was Yonah's, but this was beefed up in Conroe. K8L should be much better in SSE than K8.

Integer is a bit different. I haven't seen anything yet that would cause a big jump in Integer speed. I'm estimating 7.5% but if it is higher that is fine. I'm not so sure about the HPEC clock speeds. Roughly speaking if AMD can do dual core at 3.0Ghz on 90nm then they should be able to do 2.8Ghz quad core on 90nm. 2.9Ghz doesn't sound like a bad introductory speed but I would expect a bump in Q4 07.

I'm a bit sceptical that Intel can go from 2.66 Ghz to 3.6 Ghz. I know that people on Forumz ascribe this to the supposedly magic 45nm process. I just don't have the same faith in Intel nor do I really think they can get out any volume of 45nm in Q3 07.

Anonymous said...

"Roughly speaking if AMD can do dual core at 3.0Ghz on 90nm then they should be able to do 2.8Ghz quad core on 90nm. 2.9Ghz doesn't sound like a bad introductory speed but I would expect a bump in Q4 07."

It's not going to happen. AMD has always had frequency issues; they haven't even broken the 3.0 GHz mark yet. How you conclude that their intro speed will be 2.9 GHz with such ease is ridiculous.

"I'm a bit sceptical that Intel can go from 2.66 Ghz to 3.6 Ghz. I know that people on Forumz ascribe this to the supposedly magic 45nm process."

Kinda like how you conclude that 65nm will give great yields for k8l at 2.9Ghz with its magical 65nm transition.

"I just don't have the same faith in Intel nor do I really think they can get out any volume of 45nm in Q3 07."

And AMD with K8L can, right???

Scientia from AMDZone said...

Kinda like how you conclude that 65nm will give great yields for k8l at 2.9Ghz with its magical 65nm transition.

Let me see if I understand what you are saying:

1. Intel is currently at 2.66 Ghz with Kentsfield on 65nm. Intel plans to release a 3.2 Ghz Conroe.

2. AMD is currently at 2.8Ghz on dual core and plans to release a 3.0Ghz dual core on 90nm.

3. At the introduction of 45nm, Intel will be able to do a 40% increase in clock on quad core, from 2.66 Ghz to 3.73 Ghz. This would be a 17% increase over dual core.

4. However, you are suggesting that AMD will not be able to manage a 3% reduction in clock on quad core, from 3.0 to 2.9 Ghz.

Well, I have to say that you do seem to have much more optimism for Intel than AMD. I guess I just don't have anything to suggest that this optimism is based on anything but hope.
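The percentages in points 3 and 4 can be checked with a short sketch. The clock speeds are the ones quoted above; the helper name is made up for illustration.

```python
def pct_change(old_ghz: float, new_ghz: float) -> float:
    """Percent change in clock speed going from old_ghz to new_ghz."""
    return (new_ghz - old_ghz) / old_ghz * 100.0


intel_quad = pct_change(2.66, 3.73)         # Kentsfield today vs. the 45nm quad target
intel_quad_vs_dual = pct_change(3.2, 3.73)  # planned 3.2 Ghz Conroe vs. the 45nm quad target
amd_quad_vs_dual = pct_change(3.0, 2.9)     # 3.0Ghz dual core vs. a 2.9Ghz quad core

print(f"Intel quad core:          {intel_quad:+.1f}%")
print(f"Intel quad vs. dual core: {intel_quad_vs_dual:+.1f}%")
print(f"AMD quad vs. dual core:   {amd_quad_vs_dual:+.1f}%")
```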

Anonymous said...

Your last post makes it seem like you're talking to one person. I believe it's 80% me and 20% someone else;)

I know that socket/core count doesn't matter in single threaded, but 2.66*1.2 > 3.0.

4x4 having double the bandwidth? AMD has been king in Sandra Memory, but that doesn't translate into real world performance. And THG's preview of Kentsfield showed that at 1333 vs 1066, games/encoding is within 0-2 seconds of tests, seems Kentsfield has enough.

Typically 95%? ;)..And law of diminishing returns, we keep hearing 80 percent about 4x4.
http://www.theinquirer.net/default.aspx?article=33081
http://www.xbitlabs.com/news/cpu/display/20060731233200.html
All saying AMD demoed Cinebench..and other heavily threaded apps..And they decide to use Cinebench as their best case scenario.

http://www.hkepc.com/bbs/itnews.php?tid=653100
"Facing Intel quad-core, AMD would only give 4x4 platform to share a similar performance, yet the cost is in great difference."
Just like Prescott Extreme Editions and the like theoretically could beat AMD for the crown, but it was pointless since it was so hot and all.
http://www.theinquirer.net/default.aspx?article=34918
I see Core 2s selling for less than Intel's pricing, so I don't believe that AMD/Intel's pricing is what they actually sell to folks, but MSRP?
2.6*2=$999
2.8*2=$1132
3.0*2=$1500
THG reports[and is often wrong, such as their prediction of an October Kentsfield] that Kentsfield will be $1200(likely for retail, but Intel's official pricing should be $999). $999 would be consistent with http://forums.vr-zone.com/showthread.php?t=96018
I wonder if those numbers are AMD's MSRP..
--Rumor mill--
http://forumz.tomshardware.com/hardware/AM2-Tested-800-Mhz-DDR2ftopic-178405-days0-orderasc-25.html
jkflipflop98, who works at Intel, said they were capable of binning 3.5 Conroes. Pointless though, already having 2.4, 2.66, 2.93 ahead of FX62. They had 3.2 for launch in mind and 3.33 for Q4, but 3.2 would've made 4 over FX62.. And Q3 time, they upped Kentsfield, so 3.2/3.33 dual core would've confused the market going up against a TDP maxxed 2.66 quad.
http://dailytech.com/article.aspx?newsid=2649
More than just a process shrink..
"..shifts away from Silicon Dioxide gate dielectrics - a process the company has used since the mid-90s - to High-k dielectrics. With new dielectric techniques, the company will also revamp its gate electrodes to metal instead of Polysilicon derivatives."
http://forumz.tomshardware.com/hardware/AMD-ROAD-AHEADftopic-180907-days0-orderasc-25.html
45nm NetBursts binned in March.
--End rumors--
http://www.dailytech.com/article.aspx?newsid=4334
D1D should be ready for Yorkfield, Fab 32 coming online just in time and Fab 28 coming 2008. Volume should be better than AMD's 65 nm rollout, not that they need any [volume], as Yorkfield and other 45 nm will come out just to be "Me first!".

"..Intel plans to release a 3.2 ghz Conroe."
We haven't heard on the 3.2 Conroe in a while. I believe it's canned.

"AMD is currently at 2.8Ghz on cual core and plans to release a 3.0Ghz dual core on 90nm."
FX62 is AMD's fastest and hottest CPU. The 3.0 should extend both leads;) Why not a 3.0 quad/dual for Altair, at least for good PR? Seems like they can't?

Fair point on 3/4, Intel seems to have extremely ambitious goals to meet:)

Scientia from AMDZone said...

Your last post makes it seem like you're talking to one person. I believe it's 80% me and 20% someone else;)

If you want to take credit for your posts you need a name other than anonymous. My ID shows what I post here, on Sharikou's blog, Sharikou180's blog, and on AMDZone.

2.66*1.2 > 3.0.

I generally figure Conroe as 15% faster. I don't tend to put a lot of faith in the testing at Tom's Hardware Guide; they have shown their lack of professionalism too many times. I suppose proper testing could show that it is 20% faster. And, I have to give you credit because many claim it is 25-40% faster. However, right now, I'm comfortable saying Conroe is at least 15% faster.

And THG's preview of Kentsfield showed that at 1333 vs 1066, games/encoding is within 0-2 seconds of tests, seems Kentsfield has enough.

THG's testing was baloney. Kentsfield should be nearly 2X faster for any application that is compute intensive but hardly any faster for any application that is memory intensive. THG hasn't done real testing for a long time; neither has Anandtech. For example, I ran the SuperPi benchmarks on my computer. After examining the benchmark algorithm I concluded that the results are consistent with a benchmark that does not stress the memory. It takes so many operations to get each decimal place that the processor has plenty of time to save the results to memory. On the other hand, the scaling factor is 2.25X, rather than 2X, when the number of decimal places is doubled. This pretty much destroys the value of SuperPi as a comparative benchmark. It is fundamentally impossible to compare benchmarks that work with logarithmic output. It has to be linear. Yet, THG is still using benchmarks with non-linear output. Therefore, THG is either about 1/10th as knowledgeable as I am about benchmarks, or is extremely sloppy in testing, or is trying to skew the results. Take your pick.
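To illustrate why the 2.25X figure makes raw times hard to compare across digit counts, here is a minimal sketch. It assumes run time follows a simple power law in the number of digits, which is an assumption for illustration and not something established about SuperPi; the function names are made up.

```python
import math


def scaling_exponent(time_ratio: float, size_ratio: float = 2.0) -> float:
    """Exponent k such that time grows as size**k, from one observed ratio."""
    return math.log(time_ratio) / math.log(size_ratio)


def normalized_time(seconds: float, digits: float, k: float) -> float:
    """Run time divided by digits**k, so runs at different sizes can be compared."""
    return seconds / digits ** k


# Doubling the digits multiplies the run time by 2.25 instead of 2.
k = scaling_exponent(2.25)
print(f"implied exponent: {k:.2f}")
```

An exponent above 1 means the time per digit rises as the run gets longer, so a result at one digit count cannot be scaled linearly to predict another.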

.And law of diminishing returns, we keep hearing 80 percent about 4x4.

No. You get pretty close to 100% when you do dual socket with K8. The law of diminishing returns will catch up to you if you keep increasing the number of sockets, not at two. Testing applications that do not rely entirely on the processor is not indicative of scaling.

Let's say that whatever price these are they will be lower volume. It appears that Kentsfield will be at most, 7% of the initial volume of Conroe but probably closer to 2%. And, FX's are typically 0.2% compared to Athlon 64. It will be interesting to see how Intel handles the 3.2Ghz Conroe versus Kentsfield. Right now, Intel can just wait since X6800 really has no competition, nor does E6700 or E6600.

http://dailytech.com/article.aspx?newsid=2649
More than just a process shrink..
"..shifts away from Silicon Dioxide gate dielectrics - a process the company has used since the mid-90s - to High-k dielectrics. With new dielectric techniques, the company will also revamp its gate electrodes to metal instead of Polysilicon derivatives."


Well, I hope Intel has managed to solve the bandgap problem because that problem has stopped every other large company cold that was working on high K. I can understand why Intel doesn't follow AMD's lead though because Intel has less experience with SOI. AMD is also switching to metal gate connectors. I have to wonder about this one because AMD switched to copper interconnects a full year ahead of Intel.

Volume should be better than AMD's 65 nm rollout

AMD's 65nm yield is fine; the FAB however was only at 1/4 capacity by mid 2006 and will only be at 30% 65nm output at 1/2 capacity by year's end. That's 15% of the eventual capacity. AMD will be at 70% 65nm by mid 2007.

FX62 is AMD's fastest and hottest CPU. The 3.0 should extend both leads;)

A 3.0 Ghz FX-64 would be a complete waste of time. The only way this will work for AMD is to sell these in pairs for 4X4 at basically two chips for the price of one FX-62. Released strictly as an Athlon 64 it would be competitive with E6600, still two speed grades down from X6800 and would have to be priced accordingly.

Why not a 3.0 quad/dual for Altair, at least for good PR?

I don't understand your question. 2.9Ghz at launch seems like a good clock for quad core. This is closer than Kentsfield will be with 2.66Ghz versus 2.93Ghz. However, if they can do 2.9Ghz on quad core they should be able to do 3.1Ghz dual core without a problem.

Let's say that I would have more faith in Intel's 45nm clock goals if they released 3.2 and 3.4Ghz Conroes on 65nm.

Anonymous said...

20% is the lower end of the spectrum, some also suggest 25%. With your 15% though, 2.66*1.15 is still more than 3.0.

You can claim that 95% is typically the improvement for single to dual socket;) AMD's best number is on the very synthetic, unreal world Cinebench.

And yes 3GHZ AMD now would be a waste of time, but you reiterated that AMD could do it.

But 3.0 has that magical PR buzz, 2.9 is so close..yet so far. Heat issues? As both AMD/Intel move from clock to cores, I don't think they want their dual cores competing with their quad cores on single threaded.

What reason would they have to release 3.2/3.5 Conroes? A 3.2 Conroe would be competitive with a 3.68[your 15%]/3.84 K8. A 3.5 Conroe would be competing with a 4.025/4.2 K8. Only those won't come anytime soon.
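The equivalent K8 clocks quoted above come from multiplying the Conroe clock by the assumed per-clock advantage. A minimal sketch, using the 15% and 20% figures from this thread (estimates, not measurements) and a made-up helper name:

```python
def equivalent_k8_ghz(conroe_ghz: float, advantage: float) -> float:
    """K8 clock needed to match a Conroe clock, given a per-clock advantage."""
    return conroe_ghz * (1.0 + advantage)


for conroe_ghz in (3.2, 3.5):
    low = equivalent_k8_ghz(conroe_ghz, 0.15)    # the 15% estimate
    high = equivalent_k8_ghz(conroe_ghz, 0.20)   # the 20% estimate
    print(f"{conroe_ghz} GHz Conroe ~ {low:.3f} to {high:.3f} GHz K8")
```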

Anonymous said...

xx
For 3.4, AMD would need a 3.91/4.08 K8 to compete. And a year from now, we'll have HKEPC's AMD numbers at 2.7-2.9 and Intel sponsored VR-Zone's numbers of 3.46-3.73.

Scientia from AMDZone said...

AMD will release 90nm 3.0Ghz FX 4x4. These will be low volume. They are basically two chips for the price of one.

I assume you aren't a programmer. 95% is good for one application with threading. You get nearly 100% for multiple applications. Kentsfield will not get nearly 100% for multiple applications. It should be interesting to eventually see Clovertown in genuine transaction benchmarks to see what it can do. I'm guessing it will bog down pretty badly. Presumably this is the reason for Yorkfield.

Anonymous said...

Yes I know that AMD will release 90 nm 3.0 FX 4x4 at low volume.. $1500 for two is not 'for the price of one' though.

Reason for Yorkfield? ..Don't both companies always have something better down the line? Should companies just rest with mediocre because they can?

If 95% is good[while you claimed before that 95% and higher was normal] then it isn't typical. Again, it is AMD that's giving us their best 'normal' number on the synthetic Cinebench.
http://www.theinquirer.net/default.aspx?article=33081
"..they saw an 80% average increase going from two to four cores, but exact testing methodology was not disclosed. Whether or not this carries over to current games is the hot topic of the moment, none of these benches were gaming benches."
http://www.xbitlabs.com/news/cpu/display/20060731233200.html
"..AMD demonstrated CineBench benchmark and said that the 4x4 platform provides 80% performance benefit compared to a similar one with a single dual-core microprocessor."
You can claim that AMD will have 95% scaling on dual socket, but AMD can only give us 80% behind closed doors.

ashenman said...

Why does AMD take so long to ramp their facilities? And why is Intel so quiet about how they run theirs? They say they use the Copy Exact system, but that's just a buzz word that means they try to make production independent of their fabs, which is pretty pointless if you ask me.

You still seem to be dodging the real app that uses lots of memory bandwidth question. I guess the question isn't, give us results showing the memory performance increases. We're asking how better memory performance affects us as everyday users. We already know (unless we haven't seen the technical specs of each processor) that AMD chips have superior memory performance in bandwidth and latency. However, video games, multimedia encoding, and general use don't seem to make use of this bandwidth. So what does? And if I'm mistaken about my other assumptions concerning applications' bandwidth usage, explain how.

Anonymous said...

"AMD will release 90nm 3.0Ghz FX 4x4. These will be low volume. They are basically two chips for the price of one."

Kinda funny how AMD zealots were laughing when one company, aka Intel, released their Prescott chip, and now when AMD pulls the same thing times two all is forgiven. If you thought Prescott was an oven, wait till you check out the thermal reactor which is the 4x4.

"I assume you aren't a programmer. 95% is good for one application with threading. You get nearly 100% for multiple applications. Kentsfield will not get nearly 100% for multiple applications. It should be interesting to eventually see Clovertown in genuine transaction benchmarks to see what it can do. I'm guessing it will bog down pretty badly. Presumably this is the reason for Yorkfield."

That's the point: the applications and benchmarks you are SPECULATING that Kentsfield will bog down on are useful neither for mainstream people nor for gamer enthusiasts. In fact, you have yet to point to any popular program that has been shown to saturate the bandwidth of Kentsfield, so I can assume most of those memory advantages the connect structure of K8 brings are theoretical and benchmark in nature, kind of like your complaints about THG and SuperPi. Or are your assumptions any different from those of the websites you criticize?

Intel plans to bring quad cores to mainstream prices by the time AMD rolls its K8L along as its new FX. And IF K8L DOES manage to outpace Kentsfield, it will not JUST be competing with that but with Intel's next evolution of their current microarchitecture. Either way it's an uphill battle for AMD.

Anonymous said...

The_Ghost

"the few benchmarks that i have seen on foreign sites , seems to suggest that brisbane is about 10% faster on some benchmarks then a 90nm cpu at the same mhz"

I'd like to see those also;)

Anonymous said...

Rahul Sood just got a Kentsfield chip and he ran quite a few torture tests with no instances of the FSB bogging down the system. Here is the link: http://voodoopc.blogspot.com/ I would be interested to know what you think about the tests Rahul Sood did and what your conclusions are about this.

Anonymous said...

http://voodoopc.blogspot.com/2006/10/intel-kentsfield-general-observations.html
Of course he isn't going to say anything bad about his partners..But it's pretty good considering how much he's hyped up 4x4 before;)
Good experience with Kentsfield compared to good expectations with 4x4..

http://techreport.com/onearticle.x/10091
http://www.theinquirer.net/default.aspx?article=34918
http://www.anandtech.com/tradeshows/showdoc.aspx?i=2770&p=2
Not supported by mainstream manufacturers, with power/cooling requirements too high for DIYers, and seems like a last minute rushed effort to compete with Intel.

"When Larry Clark at Intel insisted that Kentsfield would "shock us all" we went ahead and built a couple of platforms for fun.

Let's just say Larry wasn't lying."

"Our lead tech on the project decided to run four instances of Prime 95 in a “torture test” while running Farcry in a looping “max settings” demo. Still no trouble, so he ran four instances of Prime 95 while viewing a DVD with multiple applications sitting idle in the background to eat memory. The system still had 10% of its CPU resources left to spare.

"There is clearly something amazing about this processor - it easily takes place as my most wanted CPU for 2006."

To put this in perspective, usually two instances of Prime will cripple any dual core CPU at any level of performance. Not only did this machine run the through the most intense benchmark without breaking a sweat, it blew away our expectations with the DVD test."

"In fact, I would say based on the performance of the platforms we tested that it shouldn’t really matter about the architectural advantages or disadvantages – the proof is in the benchmarks."

Have they tested 4x4?
http://home.businesswire.com/portal/site/google/index.jsp?ndmViewId=news_view&newsId=20061009005755&newsLang=en
Alienware will demo it in a couple of days.

Anonymous said...

http://www.thechannelinsider.com/pages/article.aspx?articleid=191008&page=1&pagetype=article
Found most concise info on Barcelona yet. Not anything new I think, just more summarized.

http://www.xbitlabs.com/articles/cpu/display/core2duo-preview_9.html
Brain hurt. But there's Intel's specs also.

"so the CPU is still a three-issue architecture.."

Don't know what that means:p
Care to decode all that for us? And how about AMD's claim of 60% increase in performance per watt, HKEPC's 40%[clock/price?], and badhardware.blogspot.com AMD fan blog's claim of 15%?

Anonymous said...

"Well, I hope Intel has managed to solve the bandgap problem because that problem has stopped every other large company cold that was working on high K."

Are you talking about conduction/valence band offsets between high k and Si channel or Fermi level pinning? If Fermi level pinning, metal gate eliminates this issue. If you are talking about band offsets (which in turn impacts leakage) this differs depending on the actual high K material.

As you appear to have some knowledge on high K, could you elaborate on what you mean by "the bandgap problem"?

Anonymous said...

"AMD is also switching to metal gate connectors."

I assume by this you mean fully silicided gates? This is not the same as metal gate on high K.

If AMD does this it would be a bit surprising, as it would be a one generation solution, unless of course they are one generation behind on high K, which seems to be the case based on IBM's announcements that high K won't be ready until the 32nm node.

Darth Solarion said...

AMD just released some details on Barcelona and some were mentioned at Extremetech. Any thoughts?

Anonymous said...

http://www.vr-zone.com/?i=4140
Wizzow..
Wolfdale dual core is 57W and 4.0 compared to 89W and 2.9 Antares. Those Intel TDP numbers look fishy.. But they have magic 45 nm tricks:p

Erlindo said...

Hey Scientia:

What do you have to say about recent Barcelona info?

Anonymous said...

Recent Barcelona seems same before. PS, it's obvious you don't check this blog just once a week..And I'm just assuming;) Why should you be afraid of spammers/flamers? You can always delete/hide comment later.

ashenman said...

Though 45 nm would decrease power requirements somewhat, it wouldn't do magic. If you look at that number as Intel's "average power consumption" then ya, that kinda makes sense (though I'd still assume it would be in the low to mid 50's). Seeing as the Core 2s are rated by Intel at 65 and 75 watts and actually consume 80 and 90 watts (and those are conservative estimates, as it is likely they consume more), the Wolfdale is probably at 7x watts. I'm sure it's actually higher, though, because Intel is probably factoring in better power savings at idle in their processors, and we all know that idle power consumption matters a lot when you choose your heatsink.

ashenman said...

Also, about Rahul Sood, I don't get how he could run his procs at full blast and take the fan off the heatsink. THG (which has always been very Intel biased) has basically calculated (though arbitrarily) that the Kentsfield uses 100 watts more power than the 6800, meaning it's at about 190 watts, far more than any single unfanned heatsink could handle. He's either not telling us something about his amazing heatsink, or he's gone to the dark side and become biased. I'm not saying anything really though, I'm just very confused and unconvinced of his results.

Scientia from AMDZone said...

Recent Barcelona seems same before. PS, it's obvious you don't check this blog just once a week..And I'm just assuming;) Why should you be afraid of spammers/flamers? You can always delete/hide comment later.

I was away for a few days and that is why the comments didn't appear right away.

I've decided I would try it with moderation turned off. This means that any comments will appear immediately instead of my having to approve them.

However, I decided that with the relaxed standards it would be reasonable to require registration so that we can tell who says what. This means that there will no longer be anonymous comments.

I apologize if this makes it more difficult but at least it does seem to be easier to post with word verification turned off. Thank you.

Scientia from AMDZone said...

You still seem to be dodging the real app that uses lots of memory bandwidth question.

Folding applications, Prime95, and SuperPi are all compute intensive enough to not saturate the memory bandwidth. Real applications hit memory every time a branch is mispredicted. They also need to pull in code to be ready for predicted branches. This isn't much of a problem if the entire application is within 5X the cache size since only about 20% of application code is commonly used. If the application code is larger it will tend to overflow the cache and this requires memory access. Any application that uses few enough operations per data unit will saturate the bus.

In the THG benchmarks, the only two that show this are DivX6.22 and Photoshop.

Scientia from AMDZone said...

SOI isn't perfect but it has a high bandgap. It is my understanding that it is easy to find high K materials but the vast majority have a smaller bandgap than SOI and therefore have a leakage problem. If Intel is using a high K material they must have found one with a high bandgap. Otherwise, they would have to make the layer even thicker and that would remove the advantage of using a high K.

Anonymous said...

https://beta.blogger.com/comment.g?blogID=9020108538297955304&postID=8162126254282160886
" ashenman said...
Well, I'm anti-Intel, not necessarily pro-AMD,"

Same person is asking you to list apps that take advantage of K8's amazing Sandra memory scores. Apps aren't 'real' because they aren't 'memory intensive'?

And what about DivX6.22 and Photoshop? They weren't exactly the most impressive scores on that preview.

THG probably got one of the earlier Kentsfields.
http://www.xbitlabs.com/news/cpu/display/20060615113237.html
Whatever they did to future Pentium Ds, they have probably done to Kentsfield.
http://sg.vr-zone.com/?i=4098&s=6
http://dailytech.com/article.aspx?newsid=4317
Conflicting temps, but THG seems to be the exception.

Intel has stated that they'll deliver 1M quads before AMD. Assuming a June 2007 debut of AMD quads, Intel has 8 months to deliver 125K quads per month. They've delivered 5M in 60 days so far. Production is continuing to ramp, and the 1M before AMD quota would have quads at 5% of Core 2 production. That 5% would be even less of total production, and I believe they'll pick the coolest running parts for quads to achieve total desktop non meltage.
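The 125K per month and 5% figures can be reproduced with a short sketch. It uses only the numbers in this comment and assumes a 30 day month and a constant Core 2 shipping rate, both simplifications; the function names are made up.

```python
QUAD_TARGET_UNITS = 1_000_000   # quads Intel says it will deliver before AMD
MONTHS_AVAILABLE = 8            # assumes a June 2007 AMD quad debut
CORE2_UNITS_SHIPPED = 5_000_000
CORE2_SHIP_DAYS = 60
DAYS_PER_MONTH = 30             # assumption for converting days to months


def quads_per_month() -> float:
    """Average monthly quad volume needed to hit the target."""
    return QUAD_TARGET_UNITS / MONTHS_AVAILABLE


def core2_per_month() -> float:
    """Core 2 monthly volume, extrapolated from the first 60 days."""
    return CORE2_UNITS_SHIPPED * DAYS_PER_MONTH / CORE2_SHIP_DAYS


def quad_share_of_core2() -> float:
    """Quad volume as a fraction of Core 2 volume."""
    return quads_per_month() / core2_per_month()


print(f"{quads_per_month():,.0f} quads/month, "
      f"{quad_share_of_core2():.0%} of Core 2 volume")
```

If Core 2 production keeps ramping, the quad share needed to hit the target would be lower than this.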

I also doubt the 57W figure, but 3.5 under 100W is fine with me:)

Scientia from AMDZone said...

The THG testing is done so badly and so unprofessionally that it takes almost as long to figure out if it shows anything as it took them to generate it. They are either very incompetent or are actively trying to skew the results in Intel's favor; take your pick.

The problem with the benchmarks is that you can't tell which are threaded enough to take advantage of the extra two cores. Therefore, it is difficult to tell the difference between bandwidth stagnation and a lack of threading. I compared the benches with an earlier set that THG apparently "forgot" to include in the Kentsfield article. By using the earlier missing results you can tell which applications are thread limited and which are bandwidth limited. Most of the applications that don't show a big increase for Kentsfield are simply thread limited and don't show anything at all about the Kentsfield architecture.

Scientia from AMDZone said...

Erlindo and Darth.

As you probably are aware by now there seemed to be too much information for a simple reply so Barcelona is a new article.