beggerking
Originally posted by: ElFenix
Originally posted by: apoppin
i.e. we will have very specialized platforms optimized just for gaming
*cough*xbox*cough*
lol
loved its hackability!
Originally posted by: josh6079
why does GPU need to communicate with CPU or vice versa?
I'm not sure of any reason for which the GPU would have to send instructions to the CPU, but I know that the CPU has to send its binary data to the GPU to process it into a picture. This act of CPU--->GPU sending could benefit from a more unified architecture such as a CGPU integration.
Originally posted by: beggerking
There is no reason for direct CPU->GPU communication. It's NOT a bottleneck of CPU--->GPU, it's a bottleneck of CPU -> memory -> GPU -> memory -> output.
Originally posted by: Aikouka
Originally posted by: beggerking
There is no reason for direct CPU->GPU communication. It's NOT a bottleneck of CPU--->GPU, it's a bottleneck of CPU -> memory -> GPU -> memory -> output.
Begger, I always believed that the CPU decoded the higher-level graphic instructions into the video card's "language" via the drivers. If this is so, then direct CPU to GPU communication could speed up the transfer of commands.
Originally posted by: beggerking
There is no reason for direct CPU->GPU communication. It's NOT a bottleneck of CPU--->GPU, it's a bottleneck of CPU -> memory -> GPU -> memory -> output.
Tell AMD that.
This picture, found here, gives a good preview of direct communication between a CPU and a GPU.
More information concerning HTX shows that:
Because of its flexibility, the HyperTransport protocol allows a multitude of co-processor designs that are already compatible with systems on other platforms. For example, with Torrenza, specialized co-processors are able to sit directly into an Opteron socket, and communicate directly with the entire system... Although AMD acknowledges many of these applications can run off PCIe and other connection technologies, Torrenza emphasizes HT-3 and HTX in particular... AMD representatives said that because of the architecture, Torrenza allows very low latency communication between chipset, main processor and co-processors.
With AMD opening up their HT architecture, quick communication between their processors and a HTX card or socket could allow for media encoder co-processors that can accelerate media encoding by up to 10-100 times... Either way, using HT links that allow for lower latency and high bandwidth access directly to the CPU looks like it could bring some exciting developments to PC platform processing power.
Obviously there is a bottleneck if this technology shows improvements of 10-100x over the current tech.
Originally posted by: beggerking
no "direct" communication. Processed data does not go straight from CPU to GPU.
We know. That's what AMD will be changing.
Originally posted by: beggerking
where on the picture does it show direct communication between CPU and GPU? do you even know how to show a direct relationship? please stop giving out false info. Thank you.
Picture now idiot-proof.
Originally posted by: beggerking
your socket co-processor != GPU
Talk about giving false information. And what do you have to support that claim?
Originally posted by: beggerking
...you can email AMD on that but they will tell you the same
So the support for your "co-processor != GPU" theory is me e-mailing AMD? That's your evidence? That they'll "tell me the same"?
Originally posted by: beggerking
...co-processors such as physics processor, math-coprocessors, but definitely not a GPU.
Except that the math co-processor was integrated into the core of the CPU back in the early 1990's, hinting that other co-processors will be integrated if they can be. That's why this GPU integration is interesting. It's one of the first co-processors planned to be integrated with the CPU in nearly 20 years.
Originally posted by: beggerking
Once again, please stop giving out false info.
Please give us any info detailing how the GPU is not by any means a co-processor. I gave links to DailyTech, HardOCP and Wikipedia. Where is the information to support your claim of it not being a co-processor?
Originally posted by: beggerking
Josh, please read this whole and complete "graphic adaptor myth demystified"
You have a habit of giving evidence that does not support your claims. In that FAQ, even ATI stated to:
Then add a Radeon CrossFire Edition co-processor board and plug in the external cable to unite multi-GPU power.
Now you better go correct Wikipedia and ATI.
Originally posted by: josh6079
This picture, found here, gives a good preview of direct communication between a CPU and a GPU.
umm.. where is the GPU? it says Opteron socket, not GPU. stop misinterpreting.
Originally posted by: Josh6079
You have a habit of giving evidence that does not support your claims. In that FAQ, even ATI stated to:
Then add a Radeon CrossFire Edition co-processor board and plug in the external cable to unite multi-GPU power.
Now you better go correct Wikipedia and ATI.
bios, memory, drivers
Josh:
Please give us any info detailing how the GPU is not by any means a co-processor. I gave links to DailyTech, HardOCP and Wikipedia. Where is the information to support your claim of it not being a co-processor?
Regardless of whether it is replaceable or integrated, the video adapter consists of three components:
A video chip set of some brand (ATI, Matrox, Nvidia, S3, Intel, to name some of the better known). The video chip creates the signals, which the screen must receive to form an image.
Some kind of RAM (EDO, SGRAM, or VRAM, which are all variations of the regular RAM). Memory is necessary, since the video card must be able to remember a complete screen image at any time. Using AGP, the video card may use the main memory of the motherboard.
A RAMDAC - a chip converting digital signals to analog. With flat panel monitors, you do not need the function of a RAMDAC.
Originally posted by: beggerking
umm.. where is the GPU? it says Opteron socket, not GPU. stop misinterpreting.
Have you not even been reading the HTX articles I've linked? HTX may replace PCI-E by creating a direct path from the HTX socket with the GPU attached to the Opteron socket.
Originally posted by: beggerking
you missed the part where it says its a "co-processor board"
What? So now the Crossfire motherboard is a co-processor in your mind? What else in a CF setup would they have been talking about? You have two GPUs and a motherboard; guess which one could possibly be a co-processor.
Originally posted by: beggerking
skipped your crap, because obviously you have no idea/knowledge of what you are talking about.
I'm echoing what HardOCP, DailyTech, and Wikipedia said. I guess you're just one really smart guy.
Originally posted by: beggerking
please read my link if you really don't know how a videocard works. You need bios, memory, drivers etc, not just a co-processor aka your GPU in co-processor theory.
This coming from the guy who said "SOFTWARE HAS NOTHING TO DO WITH HARDWARE!!!!"
Originally posted by: beggerking
again, you missed the most important part.. its a "co-processor board", very different from just a co-processor.
What is a "co-processor board?" Can you enlighten me? Specifically, what is a "Radeon CrossFire Edition co-processor board?" **I think the "Radeon" gives a big hint...**
Originally posted by: redbox
Originally posted by: beggerking
Regardless of whether it is replaceable or integrated, the video adapter consists of three components:
A video chip set of some brand (ATI, Matrox, Nvidia, S3, Intel, to name some of the better known). The video chip creates the signals, which the screen must receive to form an image.
Some kind of RAM (EDO, SGRAM, or VRAM, which are all variations of the regular RAM). Memory is necessary, since the video card must be able to remember a complete screen image at any time. Using AGP, the video card may use the main memory of the motherboard.
A RAMDAC - a chip converting digital signals to analog. With flat panel monitors, you do not need the function of a RAMDAC.
Looking at all of those components I don't see why it would be impossible to integrate the GPU into the die of the CPU.
Originally posted by: josh6079
Have you not even been reading the HTX articles I've linked? HTX may replace PCI-E by creating a direct path from the HTX socket with the GPU attached to the Opteron socket.
What is a "co-processor board?" Can you enlighten me? Specifically, what is a "Radeon CrossFire Edition co-processor board?" **I think the "Radeon" gives a big hint...**
Then add a Radeon CrossFire Edition co-processor board and plug in the external cable to unite multi-GPU power.
beggerking
again, you missed the most important part.. its a "co-processor board", very different from just a co-processor.
Originally posted by: beggerking
Originally posted by: redbox
Looking at all of those components I don't see why it would be impossible to integrate the GPU into the die of the CPU.
what chipset / corresponding driver would you be using?
what I/O, BIOS and/or RAMDAC are you going to use?
Originally posted by: redbox
Well the components you listed didn't have a chipset other than the GPU; it didn't talk about drivers in that component list either, or a BIOS. The RAMDAC could be implemented on the motherboard, couldn't it?
Originally posted by: beggerking
Originally posted by: redbox
Well the components you listed didn't have a chipset other than the GPU; it didn't talk about drivers in that component list either, or a BIOS. The RAMDAC could be implemented on the motherboard, couldn't it?
1. explain to me how you can translate data into pixels without a driver?
2. how do you even start your system without some type of video BIOS? you don't have input/output!!
3. the RAMDAC is chipset specific. It could be implemented on the motherboard, but how is your single co-processor GPU going to know about it and use it?
A physics processor is an add-on card; a graphics card is not. A graphics card is an output device that incidentally includes a GPU to help with processing, but a BIOS (Basic Input/Output System) is required.
It is possible to implement the GPU as a co-processor (as in the 3dfx era), but we would still need a videocard (which eliminates Josh's idea of not needing PCI-E slots). Performance will suffer, though, as it would not have its own dedicated memory (which contradicts Josh's idea that there will be a performance gain). With no performance gain and a videocard still required, implementing the GPU as a co-processor is pretty much useless.
It's not a bad idea to have co-processor sockets, but Josh is just engaging in wishful thinking and being argumentative without any qualification.
Originally posted by: redbox
I understand the need for a BIOS with a video card; it's just that the BIOS wasn't included in the list of what a video adapter consists of. As far as the RAMDAC is concerned, since AMD bought ATI I would imagine it wouldn't be that hard to put a chipset-specific RAMDAC on the motherboard.
Would you consider a sound processor a co-processor? It has basic ins and outs. What are your requirements for being a co-processor?
You say it is possible to implement the GPU as a co-processor, so why would we need a video card then? You would be right that performance would suffer if it had to use the onboard RAM we have right now. What about using an on-die cache system like CPUs use now? Would that speed up performance a bit? I will agree with you that the place this design is likely to incur bottlenecks is when it has to use main memory, but IMO caching could be used to circumvent this problem. Perhaps AMD is thinking about adding another hierarchy of cache onto this Fusion chip? Would that be a worthwhile road to go down? Granted, the size of the memory wouldn't be very high, but it would have the potential for high bandwidth.
Originally posted by: thilan29
Here we go again....
. . . this morning's comment from AMD's CTO, Phil Hester, fires a broadside squarely at Intel's design philosophy of multicore schemes in multiples of two:
"In this increasingly diverse x86 computing environment, simply adding more CPU cores to a baseline architecture will not be enough," said Hester. "As x86 scales from palmtops to petaFLOPS, modular processor designs leveraging both CPU and GPU compute capabilities will be essential in meeting the requirements of computing in 2008 and beyond."
In other words, AMD's future "Fusion" platform, which the company says could be ready by late 2008 or early 2009, won't just be an Athlon and a Radeon sharing the same silicon. The companies will truly be looking into the potential benefits of using multiple pipelines along with multiple cores, as a new approach to parallelism in computing tasks.
When the two companies first announced their merger intentions last July, although both sides were using strange new hybrid almost-terms like "GP-CPU" to describe the research they had read about -- if not actually participated in -- their willingness to head down this new architectural road seemed more tentative than definitive.
With three months having passed, and analysts throughout the duration of that time having discussed the real possibility of Intel's having recaptured the market share momentum, the new AMD seems more willing to embrace a development path that could more clearly distinguish itself from its core competitor.
Almost from the day that pipeline processing was added to the first hardware-assisted rendering systems for PCs, researchers (though mostly with universities, not manufacturers) have been exploring the notion of co-opting graphics processor-style parallelism for use with everyday tasks. Both multicore CPUs and modern GPUs use parallelism to divide and conquer complex tasks, although their approaches are radically different from one another. That difference, researchers say, could make them complementary rather than contradictory.
Multicore parallelism delegates discrete computing tasks among differentiated logic units. If you can imagine the CPU performing triage when it first examines a passage of object code, it delegates tasks into separate operating rooms where individual doctors, if you will, can concentrate upon each one exclusively.
The approach the CPU takes depends on both its own architecture and that of the program. In discrete parallelism, programs are compiled by software that already "knows" how tasks should be delegated, so the compilers "toe-tag" tasks in advance. The processor reads these tags and assigns the appropriate logic units, without much decision involved on the CPU's part.
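For readers who want to see the "triage" idea in code, here is a minimal sketch (my own illustration, not from the article) of discrete task parallelism: two unrelated subtasks are explicitly handed to separate workers, much like tagged tasks being sent to separate logic units. The two decode functions are hypothetical stand-ins.

```cpp
#include <cstdio>
#include <future>

// Hypothetical, independent subtasks standing in for the "operating rooms"
// the CPU's triage step delegates work into.
int decode_audio() { return 1; }
int decode_video() { return 2; }

int main() {
    // Each task is explicitly handed to its own worker thread; on a
    // multicore CPU the two run on separate cores.
    auto audio = std::async(std::launch::async, decode_audio);
    auto video = std::async(std::launch::async, decode_video);
    std::printf("both tasks done: %d\n", audio.get() + video.get());
}
```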
This is the kind of "discrete parallelism" approach that Intel's Itanium and Itanium 2 series (the IA-64 architecture) take. As you probably already know, the fact that Itaniums have not been hardware-compatible with x86 architecture has been one of Intel's historical detriments.
The other approach CPUs take is implied parallelism, where the processor analyzes a program that would otherwise run perfectly well in a single-core system, but divides the stream into tasks as it sees fit. This is how Intel's hyperthreading architecture worked during its short-lived, pre-multicore era; and it's how x86 programs can run in multicore x86 systems, running Core 2 Duo or Athlon processors, today.
But there are techniques programmers could use to help out the implied parallelism process, usually involving compiler switches, including on Intel's own compilers. It wouldn't change the program much, except for possibly a few extra pragmas and some tighter restrictions on types. Still, as Intel's own developers have noted, programmers aren't often willing to make these quick, subtle changes to their code, even when the opportunity is right in front of them.
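As a rough illustration of how little the source has to change, here is a sketch (mine, not the article's) assuming an OpenMP-capable compiler invoked with the appropriate switch, such as -fopenmp or /openmp:

```cpp
#include <vector>

// The single pragma is the programmer's only hint; with OpenMP enabled the
// compiler splits the loop iterations across the available cores, and
// without it the code still builds and runs serially, unchanged.
void scale(std::vector<double>& v, double k) {
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(v.size()); ++i)
        v[i] *= k;
}

int main() {
    std::vector<double> samples(1000000, 1.0);
    scale(samples, 0.5);
}
```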
Keep that in mind for a moment as we compare these to GPU-style parallelism. In a GPU, there aren't multiple cores or logic units. Instead, since their key purpose is often to render vast scenes using the same math over and over, often literally millions of times repeated, GPUs parallelize their data structures. They don't delegate multiple tasks; they simply make the same repetitive task more efficient by replicating it over multiple pipes simultaneously. It's as though the same doctor could perform the same operation through a monitor, on multiple patients with the same affliction simultaneously.
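To make the contrast concrete, here is the data-parallel counterpart of the loop above (again my own sketch): one trivial operation replicated across a large array. On a GPU, each element would map to its own thread on a pipe; C++17's parallel algorithms express the same pattern on the CPU.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<float> pixels(4000000, 0.5f);
    // Same instruction, millions of data elements: the par_unseq policy lets
    // the implementation vectorize and multithread the identical operation,
    // which is exactly the shape of work a GPU's pipes are built for.
    std::transform(std::execution::par_unseq, pixels.begin(), pixels.end(),
                   pixels.begin(), [](float p) { return p * 2.2f; });
}
```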
Although CPU manufacturers arguably pioneered the concept of Single Instruction/Multiple Data for use with the first graphics acceleration drivers in the late 1980s, it's GPU manufacturers such as ATI and Nvidia which took that ball and ran with it, to some extent clear out of the CPU companies' ballpark.
As university researchers have learned, co-opting the GPU for use in processing everyday, complex data structures, yields unprecedented processing speed, on a scale considered commensurate with supercomputers just a few years ago.
But unlike the case with implied parallelism in multicore CPUs, a shift to SIMD-style programming architecture would, by most analyses, mean a monumental transformation in how programs are written. We're not talking about compiler switches any more.
Several new questions and concerns emerge: How does the new AMD evangelize programmers into making the changes necessary to embrace this new platform, in only two years' time, while simultaneously (to borrow a phrase from parallelism) trumpeting "Fusion's" compatibility with x86 architecture? When Itanium weighed down Intel's momentum, AMD didn't pull any punches in pointing that out to everyone. It's made astonishing success by touting its fully-compatible, x86-oriented approach to 64-bit computing, at Intel's expense.
AMD cannot afford to make a U-turn in its philosophy now, not with Intel looking for every weakness it could possibly exploit. And if AMD succeeds in hybridizing an x86/Radeon platform while staying compatible and holding true to its word, if there are no automatic performance gains in the process of upgrading -- as there clearly are with multi-core -- will customers see the effort as a benefit to them?
We present a new method for rendering complex 3D scenes at reasonable frame-rates, targeting Global Illumination as provided by a Ray Tracing algorithm. Our approach is based on some new ideas on how to combine a CPU-based fast Ray Tracing algorithm with the capabilities of today's programmable GPUs and their powerful feed-forward rendering algorithm. We call this approach the CPU-GPU Hybrid Real Time Ray Tracing Framework. As we will show, a systematic analysis of the generations of rays of a Ray Tracer leads to different render passes which map either to the GPU or to the CPU. Indeed, all camera rays can be processed on the graphics card, and hardware-accelerated shadow mapping can be used as a pre-step in calculating precise shadow boundaries within a Ray Tracer. Our arrangement of the resulting five specialized render passes combines a fast Ray Tracer located on a multi-processing CPU with the capabilities of a modern graphics card in a new way and might be a starting point for further research.
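A rough structural sketch of the pass split the abstract describes (the pass names and the Frame type are my assumptions, not the paper's API): primary rays and a shadow-map pre-pass run on the GPU, the remaining ray generations on the CPU.

```cpp
// Skeleton only: each pass body is stubbed out.
struct Frame { /* G-buffer, shadow maps, queues of secondary rays ... */ };

void gpu_rasterize_primary(Frame&) {}   // camera rays via feed-forward rendering
void gpu_shadow_map_prepass(Frame&) {}  // hardware shadow maps bound the precise shadow work
void cpu_trace_secondary(Frame&) {}     // multithreaded ray tracer for the remaining rays
void composite_passes(Frame&) {}        // merge GPU and CPU results into the final image

void render_frame(Frame& frame) {
    gpu_rasterize_primary(frame);
    gpu_shadow_map_prepass(frame);
    cpu_trace_secondary(frame);
    composite_passes(frame);
}

int main() {
    Frame frame;
    render_frame(frame);
}
```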
We plan to design and evaluate hybrid and reconfigurable graphics rendering pipelines for parallel and distributed graphics on the CPU-GPU cluster with a tiled-display. The graphics pipeline can be divided into two stages -- geometry processing (transformations, clipping, and lighting) and raster processing (scan-conversion, visibility, shading, texture mapping, and compositing). One of the most interesting ways of looking at parallel graphics is to consider the stage at which the depth sorting of display primitives is carried out. While almost all parallel rendering approaches have relied on one sorting scheme, we plan to explore reconfigurable and hybrid sorting schemes that can dynamically change the type of sorting algorithms based on the available mix of primitives, pixels, computing, and bandwidth.
We plan to explore schemes that allow direct rendering from compressed data for cluster-based hybrid CPU-GPU rendering. Specifically, we plan to investigate strategies that will permit direct rendering from compressed data representing geometry, connectivity, illumination, and shading (including textures) information in an integrated framework.
We plan to exploit the high degree of coherence in visualization applications in developing efficient schemes for pre-fetching and caching. Unlike traditional applications where the operating system uses simple heuristics such as sequential pre-fetching, we have observed that visualization applications can gain significant performance gains by using the spatial correlation in the geometric data. We plan to use 3D spatial proximity heuristics to devise prefetching and caching schemes. Towards this end we will investigate schemes for arranging data so that spatially adjacent data lies on the same (or adjacent) block as far as possible. This work also involves devising data-structures that permit hierarchical storage of the view-dependent levels of detail so that they naturally lend themselves to pre-fetching in blocks over networks.
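One common way to get that "spatially adjacent data on the same block" layout (my example, not necessarily the scheme the authors have in mind) is to sort records by a Morton (Z-order) key, which interleaves the bits of the quantized x, y, z coordinates:

```cpp
#include <cstdint>

// Spread the low 21 bits of v so that each bit lands at every third position.
static uint64_t spread3(uint64_t v) {
    v &= 0x1fffff;
    v = (v | v << 32) & 0x001f00000000ffffULL;
    v = (v | v << 16) & 0x001f0000ff0000ffULL;
    v = (v | v << 8)  & 0x100f00f00f00f00fULL;
    v = (v | v << 4)  & 0x10c30c30c30c30c3ULL;
    v = (v | v << 2)  & 0x1249249249249249ULL;
    return v;
}

// Morton key for quantized 3D coordinates: records sorted by this key keep
// spatial neighbors close together on disk, so a block prefetch tends to
// pull in geometry that is likely to be needed next.
uint64_t morton3(uint32_t x, uint32_t y, uint32_t z) {
    return spread3(x) | (spread3(y) << 1) | (spread3(z) << 2);
}
```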
By batching LOS queries we are able to hide the latency of GPU occlusion queries and utilize the CPU and GPU simultaneously. While one batch is culled, the non-culled queries from the previous batch are processed by the CPU.
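A minimal sketch of that overlap in host code (the two helpers are stand-ins I made up, not the authors' API): while the GPU culls batch i, the CPU resolves the survivors of batch i-1. The sketch only shows the batch ordering; real code would issue the GPU occlusion queries asynchronously and poll for their results.

```cpp
#include <utility>
#include <vector>

struct LosQuery { float ox, oy, oz, tx, ty, tz; bool visible = false; };

// Stub for the GPU side: issue hardware occlusion queries for the batch and
// return the queries the GPU could not prove occluded.
std::vector<LosQuery> gpu_cull_batch(std::vector<LosQuery> batch) {
    return batch;                               // pretend nothing was culled
}

// Stub for the CPU side: exact line-of-sight test on the surviving queries.
void cpu_resolve(std::vector<LosQuery>& survivors) {
    for (auto& q : survivors) q.visible = true;
}

void process(const std::vector<std::vector<LosQuery>>& batches) {
    std::vector<LosQuery> pending;              // survivors of the previous batch
    for (const auto& batch : batches) {
        auto survivors = gpu_cull_batch(batch); // GPU culls batch i
        cpu_resolve(pending);                   // CPU resolves batch i-1 meanwhile
        pending = std::move(survivors);
    }
    cpu_resolve(pending);                       // drain the final batch
}

int main() {
    process({{LosQuery{}}, {LosQuery{}, LosQuery{}}});
}
```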
David Orton, currently the CEO of ATI, has expressed the vision that GPUs will become even more capable in the next few years. Orton, who will probably continue to manage the ATI division after the merger, has said that they are currently working to improve data bandwidth performance and add double-precision (64-bit) capabilities to these devices. He believes this can be done without sacrificing the efficiency of traditional graphics processing. By using the same semiconductor technologies that are driving price/performance in CPUs, GPUs will continue to improve. A 500-gigaflop processor is apparently already in the works.
Supercomputing users are envisioning CPU-GPU hybrid systems offering boatloads of vector performance at a reasonable price. Vendors like PANTA Systems are busy introducing such systems. PANTA's new platform, which can combine traditional Opteron modules with NVIDIA GPU modules, is profiled in this week's issue of HPCwire. On the software side, startups like PeakStream and RapidMind are building development platforms for general-purpose computing with GPUs (GPGPU), giving application programmers access to the raw computing power on these devices.
At this year's Supercomputing Conference (SC06) in November, there are a number of presentations that focus on using GPUs for high performance computing (at SC05, they were barely mentioned). GPGPU now vies with the Cell BE, FPGAs and multi-core processors as one of the hottest topics in HPC. We'll be reporting on GPGPU developments during our upcoming LIVEwire coverage of SC06, next month.
AMD seems to be out in front of the GPGPU wave. Whatever the company's plans are for the ATI devices, one can assume it involves making these devices even more commonplace across computing platforms. Whether they migrate from external devices to co-processors to on-chip cores remains to be seen. But however it plays out, AMD's plans to bring the GPU to mainstream computing is certainly a bold move.
...
In general though, AMD is sound financially and its strong product lineup will continue to vex Intel for the foreseeable future. The ATI merger is an additional annoyance, presenting Intel with an unusual, asymmetric challenge. It will be interesting to see how they respond. The fun has just begun.
