Jan 13, 2009
For a machine that will be primarily used for rendering/drafting and not gaming, which is more important, L2 or L3 cache, and why?
Originally posted by: Idontcare
(this is intentionally a very simplistic view of computer architecture)
Performance is basically the product of completed instructions per clock (IPC) and clock speed (GHz).
Performance = IPC x GHz
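To put some numbers on that formula (the IPC and clock values below are made up, not measurements of any real CPU), here is a quick Python sketch:

def relative_performance(ipc, clock_ghz):
    # Performance ~ completed instructions per clock x clock speed
    return ipc * clock_ghz

cpu_a = relative_performance(ipc=1.2, clock_ghz=3.0)   # hypothetical chip A: lower IPC, higher clock
cpu_b = relative_performance(ipc=1.5, clock_ghz=2.6)   # hypothetical chip B: higher IPC, lower clock
print(cpu_a)   # ~3.6 (arbitrary units)
print(cpu_b)   # ~3.9 (arbitrary units): chip B wins despite its lower clock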
Completed IPC is a function of many, many things, but one of those things is the efficiency of the underlying memory (cache) subsystem. The L1, L2, and L3 caches, as well as the RAM and even the hard drive, all matter when it comes to IPC.
Every architecture has a peak theoretical IPC based on the hypothetical scenario where the L1 cache always has exactly what it needs (no L1$ misses)...but this peak IPC is rarely reached in real-world applications because L1$ size is quite small.
This is where the L2$ and L3$ come in. They are successively larger (L3$ > L2$ > L1$) and slower (L3$ < L2$ < L1$ in speed), forming an intentional hierarchy designed to maximize practical/actual IPC when running real-world applications.
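To make the hierarchy idea concrete, here is a back-of-the-envelope sketch. The cycle counts and hit rates are invented for illustration, not taken from any real chip; the point is just that an L2$ miss the L3$ catches costs far less than one that falls all the way to RAM:

L1_HIT, L2_HIT, L3_HIT, RAM = 3, 12, 40, 200   # cycles to service a request at each level (invented)

def avg_access_cycles(l1_hit_rate, l2_hit_rate, l3_hit_rate):
    # Average cycles per memory access, walking down the hierarchy on misses.
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    l3_miss = 1.0 - l3_hit_rate
    return (l1_hit_rate * L1_HIT
            + l1_miss * (l2_hit_rate * L2_HIT
                         + l2_miss * (l3_hit_rate * L3_HIT
                                      + l3_miss * RAM)))

# Same L1$/L2$ behaviour, only the fraction of L2$ misses the L3$ catches changes:
print(avg_access_cycles(0.95, 0.80, 0.90))   # ~3.9 cycles on average (big, effective L3$)
print(avg_access_cycles(0.95, 0.80, 0.20))   # ~5.0 cycles on average (most L2$ misses go to RAM)

With those made-up numbers the big-L3$ case averages roughly 3.9 cycles per access versus about 5.0 without it, and that kind of gap is exactly what shows up as higher effective IPC.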
It is impossible for anyone to tell you, legocitytruck, which matters more in terms of keeping real-world IPC from degrading due to excessive L2$ or L3$ misses for your specific application unless you tell us the specific application...and even then we would mostly still be guessing unless your usage pattern was well characterized.
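One way to see why it is so application-dependent: it mostly hinges on where the app's hot working set lands in the hierarchy. A crude rule-of-thumb sketch (the cache sizes and working-set figures are hypothetical):

L2_SIZE_KB = 512          # hypothetical per-core L2$
L3_SIZE_KB = 6 * 1024     # hypothetical shared L3$

def which_cache_matters(working_set_kb):
    # Crude rule of thumb: whichever level the hot data spills into dominates.
    if working_set_kb <= L2_SIZE_KB:
        return "fits in L2$, so extra L3$ buys little"
    if working_set_kb <= L3_SIZE_KB:
        return "spills out of L2$ but fits in L3$, so L3$ size matters a lot"
    return "spills past L3$ too, so RAM latency/bandwidth dominates"

for ws_kb in (256, 2048, 32 * 1024):   # hypothetical working sets
    print(ws_kb, "KB:", which_cache_matters(ws_kb))

A renderer streaming a huge scene and a CAD app hammering a small geometry kernel can land in completely different buckets, which is why the specific application matters.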
Generally, L2$ and L3$ sizes and latencies are modeled in simulation during the design phase of a CPU architecture across a variety of applications and usage patterns, so that the resulting cache hierarchy represents a tradeoff between production cost and CPU performance (IPC). Only in rare instances do we get the chance to evaluate those design tradeoffs as they impact real-world applications of interest to us (the recent AMD Phenom II X2 versus Athlon II X2 is a practical example of this; not that you'd buy either one for rendering apps, since they are dual-core, but they still help answer the question from a purely computer-science point of view).
In reality, as vj8usa speaks to, the end-user is typically just interested in performance, not in where that performance comes from (L2$ hits, the L3$ backstopping L2$ misses, etc.).
Performance (which takes clock speed into account) is what you pay for, so the performance of your specific application is really the question to seek an answer to.
edit: grammar/spelling fixed, kinda.
Originally posted by: rogue1979
Originally posted by: Idontcare
Not trying to insult you, but that did absolutely nothing to answer the OP's question.
I certainly don't know the answer, but there has to be somebody who has benchmarked rendering and drafting programs across several different CPU architectures. It should be possible to get a general idea of what performs best in those specific types of applications.
Originally posted by: daw123
Originally posted by: rogue1979
I thought IDC answered the OP's question very well.
The OP wants to know which is more important (and why), L2 or L3 cache, for rendering/drafting. The first part of IDC's post was a general explanation of how CPU performance is related to the L1/L2/L3 caches and CPU speed. IDC then went on to explain that in order to answer the OP's question, we need to know the specific applications the OP will be using. Even then it will be a guess on our part as to whether more L2 or L3 will benefit the OP, because we are not privy to the results of the simulations run by the CPU manufacturer.
Edit: You are correct that benchmarks using chips with different cache sizes in the same system will help answer the OP, but there could be scenarios where the surrounding components (motherboard, RAM, etc.) 'skew' the results, i.e. it may be difficult to isolate the benchmark performance of the chip from the rest of the system.
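The cleanest comparison would be two chips of the same architecture at the same clock on the same board, so that any difference is (mostly) down to cache. A hypothetical sketch of that bookkeeping, with placeholder render times rather than real results:

# Hypothetical bookkeeping, not real benchmark data: same architecture, clock,
# board, RAM, and OS in both runs; only the CPU's cache differs.
def cache_speedup(render_time_small_cache_s, render_time_big_cache_s):
    # Relative speedup roughly attributable to the larger cache,
    # assuming everything else in the test system is held constant.
    return render_time_small_cache_s / render_time_big_cache_s

print(cache_speedup(render_time_small_cache_s=310.0,
                    render_time_big_cache_s=285.0))   # ~1.09x (placeholder numbers)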