cbn
Lifer
- Mar 27, 2009
- 12,968
- 221
- 106
Very minor impact in overall performance per thread, huge impact in power savings and die level savings (cost).
That is a nice way of putting things. (makes sense to me).
An 8-core BD could well be smaller than a 4-core SB, since adding the four secondary cores apparently only increases size by 5%, and it doesn't have the IGP that SB does.
Speaking of harvested chips,
Does anyone have ideas or speculation on how the "shared components" in each BD module would affect this? (Both positively and adversely)
Presumably they'd cut their losses at the module level, losing two cores in the process. Consider that there is only 12.5% die-area within the core which is truly redundant from a core vs core segmentation. If the fault in the core's logic lies in the other 87.5% of the core's die-area then the module is dead anyways.
Applications are oblivious to the core count. They spawn threads (they don't request cores or anything), and the OS caters to their needs by scheduling those threads' time on the CPU. Now that CPUs have multiple cores, the threads are scheduled onto whichever cores are available. I can spawn 16 threads in a program I create, and those 16 threads will be handled by the OS even if I only have a quad core, or even a single core CPU. It's basic multi-tasking, and OSes have that down pat. But if I create a program that only ever uses one single thread, then my quad core will perform just as fast as if it were only a single core CPU.
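To make the "16 threads on a quad core" point concrete, here is a minimal sketch in Python. The function name `spawn_workers` is made up for illustration. The program never asks for a core; it just starts threads and lets the OS place them.

```python
import os
import threading


def spawn_workers(n_threads=16):
    """Start n_threads OS threads and return the sorted ids of those that ran."""
    finished = []
    lock = threading.Lock()

    def worker(idx):
        # The thread only records that it ran. When it runs, and on which
        # core, is decided by the OS scheduler, not by this program.
        with lock:
            finished.append(idx)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(finished)


if __name__ == "__main__":
    done = spawn_workers(16)
    print(f"cores reported by the OS: {os.cpu_count()}")
    print(f"threads that completed: {len(done)}")
```

All 16 threads complete no matter how many cores the machine reports. Note that in standard CPython the global interpreter lock keeps these threads from running Python bytecode in parallel, so this only illustrates scheduling, not a speedup.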
The problem is that applications don't request/spawn/need many threads at all when the processing needs are serial in nature. In fact, in such a scenario, they can only really use one. That is not Microsoft's fault.
It will be impossible for the OS to "multi-thread" an application that does not work on anything more than a single thread. The OS will have no way to transform a serial workload into a parallel workload, especially a program it knows nothing about. At least, not by non-magic means, and if actually done without magic, that would be a major breakthrough in parallel / multi-threaded programming. I would certainly want in on that, because as it is now, I have to go the painstaking route of optimizing my programs to use multiple threads, and it is no easy task figuring out the best way to parallelize as much as possible from what used to be, or easily are, serial programs (if at all possible - sometimes the program is simply 90% serial, and parallelizing it is impossible or impractical given the costs (code complexity, which affects costs related to development, debugging and maintenance) versus the gain).
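The "90% serial" case can be put into numbers with Amdahl's law, which the post does not name but which is the usual way to estimate this. A rough sketch, assuming the parallel part scales perfectly across cores (real programs usually do worse):

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only parallel_fraction of the work can be spread over cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # A program that is 90% serial has a parallel fraction of 0.10.
    for cores in (1, 2, 4, 8):
        print(cores, round(amdahl_speedup(0.10, cores), 3))
```

Under that assumption the program can never run more than about 1.11x faster, however many cores are added, which is why the extra code complexity is often not worth it.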
I do not know where to start here, and calling Intel's HT implementation a "quasi emulated core" just makes me wonder whether you actually understand the topic (and just like calling it what it isn't) or you actually don't (hence the nonsensical descriptions).
Sorry, perhaps I should not have brought it up. I hope you are not mad.
I give up on this topic. For one thing, all of what we are talking about now is actually off-topic. The real place for this is another thread (or, if #1, then that thread should be in Programming or OS subforums). So have your say if you please, then we'll let it go so as not to continue with the derailment.
You're still missing the point. 90% of the performance issues, as far as execution, user interaction and CPU utilization go, essentially come down to Windows being a pretty shitty OS. The CPU manufacturers are simply trying to work around this flaw.
Say it slow, say it fast. Windows drags CPU performance down.
No, he doesn't. What he illustrates is that you clearly have no clue about what the OS is in charge of and what the application is in charge of. The OS should NEVER try to change how an application behaves, EVER. And no OS on the market does this. Taking a single threaded application and trying to make it multithreaded is bad on so many levels that NO operating system for ANY system tries to do this.
You illustrate my point right here
Why does the application even care? It should simply make calls to the OS API to execute instructions. Thread generation should only occur in the OS.
No, thread generation should NOT only occur in the OS. (OK, the OS should be in charge of creating threads and managing WHEN they run, but not WHAT they run.) Figuring out what can and can't be locked is very much a consideration that application developers need to make, not OS writers. For an OS writer to make such a decision would mean that before each application starts to run, the OS would have to comb through the application, see what is running when, see if it could be split up, and see if there would be any race conditions. That is a HUGE problem to solve, and one that would really get you complaining about application launch speed if any OS ever even dared to attempt it.
THIS IS THE FATAL FLAW WITH WINDOWS
You are seriously a moron. When it comes to threading, APPLE DOES THINGS JUST LIKE WINDOWS.
It absolutely is. That's why we are being faced with ever more exotic solutions to problems that should not even exist. AMD, by the looks of things, is essentially dealing with a Windows-centric issue: application dependence. Why is the OS not doing its job? Before you say another word, go look at Apple's operating systems.
Wrong wrong wrong wrong wrong.
these guys do it with no problem.
http://www.google.com/url?sa=t&sour...fdta0G&usg=AFQjCNGZwUNnhrO3EI1y_LRQXT7XuoI55w
http://www.haiku-os.org/
Go tell them they can't do what they already have an OS doing.
I'll have a good laugh. They also have a very aggressive threading engine in the OS. Actually if they put more eye candy on it and get a slightly more modern looking GUI and they manage to get a functional version on modern hardware.
Umm. You are an idiot.
I have no idea what kind of code you write, but I work with low level stuff. You know, binary, assembler. I can assure you that
1. The way the threading engine works is vastly different.
2. Windows sucks at it.
3. The Apple implementation is vastly different.
4. It is not an x86 issue.
CPUs are vastly underutilized. Period. Especially where system responsiveness is concerned.
bullshit.
Nobody, nobody works with binary.
I think Cogman was a little too emotionally invested in his rebuttal. However, he has some solid arguments. Just read up on what is being talked about openly by Intel and AMD. They are saying they are running out of ways to optimize code execution on their cores and asking compiler writers and programmers to pick up the pace in brainstorming multithreaded techniques. So while Windows threading might not be ideal, it is not the only limiter and most likely not the biggest roadblock in utilizing the multicore beasts coming down the pipe.
01001001001000000111011101101111011100100110101100100000011101110110100101110100011010000010000001100010011010010110111001100001011100100111100100100001
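For anyone who doesn't read that natively: the string is plain 8-bit ASCII, and a short sketch like the one below decodes it to "I work with binary!". The helper names `decode_bits` and `encode_text` are made up for illustration.

```python
MESSAGE = (
    "01001001001000000111011101101111011100100110101100100000011101110110100101110100011010000010000001100010011010010110111001100001011100100111100100100001"
)


def decode_bits(bits):
    """Decode a string of 0s and 1s as consecutive 8-bit ASCII codes."""
    if len(bits) % 8 != 0:
        raise ValueError("expected a multiple of 8 bits")
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


def encode_text(text):
    """Inverse of decode_bits for ASCII text: one zero-padded 8-bit group per character."""
    return "".join(format(ord(ch), "08b") for ch in text)


if __name__ == "__main__":
    print(decode_bits(MESSAGE))
```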
Wow, this thread went to piss and vinegar fast!
Be careful, or someone's going to get a ripped pocket protector and broken slide rules!
Computers, as you could probably tell, are my passion.
Well, while we're off topic from the AMD BOBCAT/BULLDOZER discussion, I might as well add in my own $0.02.
Really? Given your profile pic, I'd imagine that a much simpler mechanical device (dating to the late 19th century) was your passion, and that you also had a penchant for trendy devices that failed to provide a concrete advantage. (Admittedly I like vintage pointless trendy crap too; in addition to several of those, I have a biometric chainring lying around somewhere.)
I know exactly what Cogman's Profile Pic is, does anybody else?
Or we might try to go back on topic. ;-)
Because of the shared resources in a module (eg decoder, FPU), I'm not sure if you can speak of 'physical cores' with Bulldozer, to be honest.
I think we can say this:
A Bulldozer module is similar to one physical core on a HT processor: It contains two logical cores.
Logical cores on Bulldozer and HT processors can be considered equivalent.
But I'm not sure what a 'physical core' would be for Bulldozer. I think perhaps we should not even try to define it, as it isn't very relevant.
But yes, I think AMD will be marketing it on their logical core count.
With OoO (Out of Order) execution you don't need to execute instructions one after the other in a given order.
Yes, you do. ((A*B)+C)/D cannot be made parallel, and must run in order. You must do the mul (A,B), then add (result,C), then div (result,D), waiting on the pipelines to go all the way through, each time. OOO helps when you have instructions that are not so dependent. Luckily, that's quite common.
The only similarity between HT and a BD module is that they each take up ~5-10% more die space. Obviously the proof will be in the pudding, but up to 80% extra performance on the 2nd core is much better than HT's 15%.
Careful there...
"AMD is also careful to mention that the integer throughput of one of these integer cores is greater than that of the Phenom II's integer units."
Problem is, each Phenom II core has 3 integer units (or well 3+3, if you break it down to ALU/AGU).
Making the statement a bit of a 'no shit, Sherlock'-one (two units better than one? really?)
Yea I know... Barcelona will also be 40% faster than Kentsfield.
While you're waiting for mul(A,B), why can't you schedule the next non-dependent uop, say like MUL(X,Y), and then do ADD (result, C), all back to back?
Cerb said: OOO helps when you have instructions that are not so dependent. Luckily, that's quite common.
You can, provided that is there, and needing execution.
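The exchange above can be sketched with a toy dependency scheduler. This is only an illustration under loud assumptions: every instruction takes one cycle, there are unlimited execution units, and the names `schedule`, `PROGRAM`, `mul_ab` and so on are made up. Real cores have per-operation latencies and a limited number of ports.

```python
def schedule(instrs, latency=1):
    """Return the cycle at which each instruction's result is ready.

    instrs maps an instruction name to the names it depends on.
    Toy model: fixed latency, unlimited execution units, no decode limits.
    """
    ready = {}

    def finish(name):
        if name not in ready:
            # An instruction can only start once all of its inputs are ready.
            start = max((finish(dep) for dep in instrs[name]), default=0)
            ready[name] = start + latency
        return ready[name]

    for name in instrs:
        finish(name)
    return ready


# ((A*B)+C)/D is a chain: each step needs the previous result.
# MUL(X,Y) depends on nothing, so it can overlap with the chain.
PROGRAM = {
    "mul_ab": [],
    "add_c": ["mul_ab"],
    "div_d": ["add_c"],
    "mul_xy": [],
}

if __name__ == "__main__":
    for name, cycle in schedule(PROGRAM).items():
        print(f"{name}: ready after cycle {cycle}")
```

In this model the chain still needs three steps no matter what, while the independent multiply finishes alongside the first one. That is the whole benefit of OoO here: it fills the waiting time, but only if independent work is there to be issued.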
