Originally posted by: Idontcare
Originally posted by: Nemesis 1
If you open the link, the second point says this: '1.1x-1.25x performance increase in single-thread apps' = 10%-25% increase. Of course it's at the same clock. How else would Intel do it?
You are missing the point I was attempting to communicate.
If Intel sells a 3GHz Nehalem and you run a single-threaded application then the chip is going to automatically "turbo mode" the core running the single-threaded app (if you believe the marketing hype) to something >3GHz.
So if you compare single-threaded performance of a "3GHz" Bloomfield chip (albeit running 1 core at 3.5GHz and the remaining 3 cores at 2GHz to fit into its power envelope) to a 3GHz Yorkfield, are you comparing "clock to clock"? No, you aren't.
So my question is how much of that 15-25% single-threaded performance boost is from the CPU up-clocking the loaded core by 15-25% versus how much is actually going to come from IPC improvements?
And if that is the case, what happens if I load a Bloomfield with 4 instances of a single-threaded application and turn them all on at once?
Because of TDP restrictions the chip won't operate any of the cores in turbo mode (as they are all fully loaded with single-threaded apps), so will I still see a 15-25% performance boost in my single-threaded apps on Bloomfield relative to loading a Yorkfield in similar fashion?
Have I clearly communicated my question now?
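The question above comes down to simple arithmetic, which can be sketched as follows. This is a rough sketch, assuming performance scales as IPC x clock (it ignores memory-bound effects); the 3.5GHz turbo clock is the hypothetical figure from the post, and `implied_ipc_ratio` is a made-up helper name.

```python
def implied_ipc_ratio(total_speedup, base_clock_ghz, loaded_core_clock_ghz):
    """IPC ratio left over after dividing out the clock ratio.

    Assumes performance = IPC * clock, which is only an approximation.
    """
    clock_ratio = loaded_core_clock_ghz / base_clock_ghz
    return total_speedup / clock_ratio


if __name__ == "__main__":
    # Claimed single-thread range from the slide (1.10x to 1.25x), with the
    # hypothetical 3.5GHz turbo clock on a nominal 3GHz part.
    for claimed in (1.10, 1.25):
        with_turbo = implied_ipc_ratio(claimed, 3.0, 3.5)
        no_turbo = implied_ipc_ratio(claimed, 3.0, 3.0)
        print(f"{claimed:.2f}x claimed: IPC ratio {with_turbo:.3f} with turbo, "
              f"{no_turbo:.3f} without")
```

Under these assumptions, a 1.25x single-threaded gain measured with one core at 3.5GHz leaves only about 1.07x for IPC, and that is roughly what four fully loaded cores (no turbo) would then see.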
Originally posted by: Nemesis 1
Then you wrote this.
Because of hyperthreading you can get up to 2X more threads operating in parallel - so multi-threaded performance could potentially double on Bloomfield over Penryn simply because Bloomfield supports 8 threads and Penryn supports 4.
Exactly, and it scales from 1.2x-2x multi-threaded arch improvement. First point on slide 1 = 20%-100% improvement in multi-thread arch.
If the multi-threaded performance increase is "at best" a linear extrapolation of the number of threads on the chip despite the chip architecture changing from Yorkfield to Bloomfield then that further suggests there is little to no IPC improvement per thread.
If they doubled the threads AND increased the IPC per thread AND integrated the memory controller then I would expect the upper end of the improvement range to be >2X and not just simply listed as "up to" 2X.
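That reasoning can be put in numbers. A minimal sketch, assuming multi-threaded throughput is (number of threads) x (per-thread performance) with perfect scaling; `per_thread_ratio` is a hypothetical helper, and the 8 vs 4 thread counts are the ones quoted above.

```python
def per_thread_ratio(total_speedup, new_threads, old_threads):
    """Per-thread performance ratio implied by a total multi-threaded speedup.

    Assumes throughput = threads * per-thread performance (perfect scaling).
    """
    thread_ratio = new_threads / old_threads
    return total_speedup / thread_ratio


if __name__ == "__main__":
    # Slide range of 1.2x to 2x, Bloomfield (8 threads) vs Yorkfield (4 threads).
    for claimed in (1.2, 2.0):
        ratio = per_thread_ratio(claimed, 8, 4)
        print(f"{claimed:.1f}x total -> {ratio:.2f}x per thread")
```

Under this model the "up to 2X" ceiling leaves a per-thread ratio of 1.0 at best. Since the two threads on a core share its resources, that per-thread figure lumps together any IPC gain and any hyperthreading penalty, so it cannot separate the two.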
Originally posted by: Nemesis 1
Well, I don't care. I saw you quoted Munky. So explain what you meant by that video and how it had anything to do with my family's Xmas.
JAG87, I call BS. Just go to the Skulltrail thread. Anything and everything ya said has been proven wrong.
Originally posted by: Nemesis 1
It's too close to a server platform. dual socket means less cpu overclocking, and FBDIMM = expensive trash. plus I really doubt nvidia will be making a chipset for that, so no SLI.
frankly, if I needed skulltrail I could buy it today. just grab a supermicro or tyan board with dual sockets and a couple of Clovertown Xeons. 8 gigs of FBDIMM, and I'm ready to go. but do I need that? NO. Do I want it? NO. If I had a business that depended on multi threading and large memory amounts (such as graphic design) then of course. But I don't, I power my PC to surf the web, chat, watch audio/video, and games.
I want the best of everything, and skulltrail is definitely not it. Heck, AMD's quad father is a better solution, at least it uses the 680a chipset and supports SLI. but nvidia is too strict with their chipsets and their platforms. Yes you can make the argument that you can run crossfire with skulltrail, sure thing, but you're gonna spend all that money and still not have the best? Heck no. When they put 8 cores on 1 cpu and nvidia makes an SLI chipset for it, I will buy. Or the folks in Santa Clara could be less strict about their SLI license...
perhaps you did not understand what I said.
what is there in skull trail, that we cannot do already? pretty much nothing. you think that faster overclockable memory is going to bring tangible speed increases?
we can already build dual processor systems, and yes, even though they cannot be overclocked, believe me, skulltrail will not overclock well either. you have no idea what it means to run 8 CORES at high speed. First of all, you will generate so much heat that simple aircooling inside a desktop tower will not suffice. if you think watercooling is a better idea, well prepare to buy 2 cpu blocks and 2 rads. second of all, let's not forget that ALL 8 CORES need to be stable at the given speed. Being in 2 different CPUs, the cores would have very significant temperature differences, and very different voltage requirements to work at a certain speed. Yes, the chips are sold in pairs, but that doesn't guarantee they are identical. And if you decide to touch the FSB (despite the unlocked multiplier) you now have 2 sets of NB, SB and FSB voltages that need to be stable.
All I have to say is good luck. Multi processor systems are not meant for the enthusiast, they are meant for the servers that run 24/7 at stock speeds with no problems. Whether Intel makes the RAM overclockable, or unlocks the multiplier on the CPUs will make very little difference. You guys just don't see the troubles that will come with such a system. You just look at the potential.
- AMD 4x4 will be better
actually it is already out on the market, and it supports SLI on the 680a chipset. and when you drop in two Agena FX chips in there, you never know what might happen. So I don't see how you can make an argument that a "future" dual yorkfield machine is better than a dual K8 machine. gee I had to go to university to figure out that an architecture from 2005 wouldn't stand a chance against one from 2008.
JAG87, I took the liberty of bolding everything false you said, with the exception of the last one. 4x4 was a product, but AMD scrapped it after its miserable showing. And seeing the great performance of Skulltrail, there is more I could have bolded, but that would have been nitpicking.
Originally posted by: BrownTown
Well actually FWIW Skulltrail DOES have SLI, but it is still incredibly overpriced for only minimal performance gains
Does overclock decently though:
http://www.xtremesystems.org/f...howthread.php?t=169421
Originally posted by: jones377
Not wanting to get involved in your private forum war (which you really should take to PM!) but I gotta ask you JAG87, is your system stable 24/7?
Originally posted by: Nemesis 1
Originally posted by: BrownTown
Well actually FWIW Skulltrail DOES have SLI, but it is still incredibly overpriced for only minimal performance gains
Does overclock decently though:
http://www.xtremesystems.org/f...howthread.php?t=169421
Ya it does. It is over the top. But I can't think of one person that wouldn't shell out $2,000 for this if they could get it.
As far as performance, ya don't know; it depends on a lot of things.
Back to performance. Here's what movie man said, among many other things. You read the thread so you know.
It's also the WR in Cinebench by SO much it's not even funny!
Maybe a good 10,000 points over any other known machine.
Originally posted by: JAG87
wawaweewa, it's 10,000 points ahead in a multi threaded benchmark. wooooopie.
proof of concept, nothing more. unless you encode video 24/7. and if you do, why waste money on skulltrail, you might as well get a dual clovertown system or a future dual xeon yorkfield, which I'm sure will be cheaper than skulltrail.
I'm not sure how you can suggest that it will be able to run 2x the number of threads simultaneously, without increasing the execution units 2x.
Originally posted by: Idontcare
Increasing threads/core to >1 without inducing a thread performance penalty is not new. Niagara processors do it, Power6 as well.
I would expect POV-Ray (the multi-threaded beta) performance to scale linearly with the number of available threads.
So, if IPC per thread is not improved in Nehalem versus Penryn then I'd expect a Bloomfield to perform 2X as fast as Yorkfield "clock-for-clock" unless the new and improved hyperthreading in Bloomfield is in fact really crappy and does turn out to introduce a performance penalty to the 2nd concurrent thread running on a given core...
Originally posted by: Nemesis 1
I love what Swinburn wrote here.
Personally, after writing this I'm actually quite concerned about Intel's positioning - I'm worried that now Intel is the current preferred product over AMD, it'll use this leverage to try and suck out as much cash from enthusiasts as possible, and not have them overclock lower parts, like the E6300, Q6600 etc, to perform like £700 CPUs. At the same time, it's also potentially limiting the availability of multi-GPU by its competitors by forcing the separate north bridge, which offers better performance, to potentially only be available onto Bloomfield CPUs. It seems all the cards are in Intel's hands to deal precisely how it wishes.
I don't understand this kind of thinking. It looks to me like Intel is addressing every segment in a way that will be cost effective for all sectors: top, middle and bottom. It is effectively locking AMD and NV out of Intel's low-end parts, so Intel is making sure this is an Intel-chipset-only platform. I love it. In the midrange there is only one 16x PCI-E or two 8x PCI-E slots.
This is good for midrange. Anyone see a problem here? I sure don't.
Then the high end. I can't wait. I hope Larrabee works like I am hoping insofar as scaling. No way do I expect Larrabee to outperform NV or ATI top-end cards. But I am hoping that you can install 4 cards and get better scaling than either SLI or XF.
It is really going to be nice knowing that if you put out the $$$$ for the high-end desktop parts, the cheaper mid and lower-end parts won't be able to equal the performance. It's about time.
Really, what's it matter? As long as Intel's low end stomps on AMD's top end, it's all good.
Originally posted by: VirtualLarry
I'm not sure how you can suggest that it will be able to run 2x the number of threads simultaneously, without increasing the execution units 2x.
Multiple threads run on spare execution units. If one thread is running, then it is using at least 1 execution unit, possibly more. Thus, there is total execution units-1 (at best) available for processing the second thread. Thus there has to be at least some penalty for processing multiple threads at the same time. Execution units are limited and are not free.
I would expect at best a modest speedup due to SMT, perhaps on the order of 25%, much like netburst. Not something along the lines of 2x, that's impossible.
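One way to see where a figure like 25% or 2x would come from is a toy model. This is a sketch under loud assumptions: a core has a fixed pool of issue slots, one thread alone keeps a fraction `u` of them busy, and a second thread can fill whatever is left with no cache or front-end contention. `smt_speedup` and the utilization values are illustrative, not measured.

```python
def smt_speedup(single_thread_utilization, threads_per_core=2):
    """Throughput gain from SMT in a toy issue-slot model.

    One thread alone keeps a fraction `u` of the core's issue slots busy.
    With SMT, threads fill slots until the core saturates at 100%.
    Contention for caches and other shared resources is ignored.
    """
    u = single_thread_utilization
    if not 0.0 < u <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    combined = min(1.0, threads_per_core * u)
    return combined / u


if __name__ == "__main__":
    # Illustrative utilization values only.
    for u in (0.5, 0.8, 1.0):
        print(f"utilization {u:.0%}: SMT speedup {smt_speedup(u):.2f}x")
```

In this model a 25% gain corresponds to a single thread already using 80% of the core, while 2x needs a thread that leaves at least half the core idle.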
Originally posted by: dmens
you are correct, SMT is yet another attempt to utilize resources that would otherwise stay idle, but consider the possibility of false work on one thread preventing real work by the other thread. also note that nehalem has three levels of cache.
That was a very fun video. Thanks.
Originally posted by: Nemesis 1
Just to lighten things up a bit.
http://youtube.com/watch?v=cbjtuo0_rTg&feature=user
Originally posted by: VirtualLarry
Do you hate everyone, or just love Intel so much that you can't help but praise them when they plan on screwing everyone else?
That's taking "fanboy" to a new level of sincerity and sickness.
Originally posted by: VirtualLarry
I'm not sure how you can suggest that it will be able to run 2x the number of threads simultaneously, without increasing the execution units 2x.
Originally posted by: dmens
you are correct, SMT is yet another attempt to utilize resources that would otherwise stay idle, but consider the possibility of false work on one thread preventing real work by the other thread. also note that nehalem has three levels of cache.
Originally posted by: Nemesis 1
Originally posted by: dmens
you are correct, SMT is yet another attempt to utilize resources that would otherwise stay idle, but consider the possibility of false work on one thread preventing real work by the other thread. also note that nehalem has three levels of cache.
I'm not sure how Intel's H/T will work out. But I will make the mistake of assuming Intel has improved it considerably over NetBurst performance.
On the Nehalem three-level cache thing: from what I have seen and read, only the server parts will have L3. Could you expand on this? Also, if the server parts are the only ones with L3, does this statement stand up on the desktop: "Intel Nehalem will have multi-level shared cache"? Intel has done something with L1 if it is 0.5MB. I'm not sure L1 can be shared. But the cache size increase for L1 would lead one to believe something is going on. Either way that's a healthy increase in transistors on L1.
Originally posted by: Nemesis 1
I am at AT forums. Someone has finally associated Itanium with Intel's desktop processors.
Originally posted by: Nemesis 1
Intel calls this "coarse multithreading" to distinguish it from "hyperthreading technology"
