CPU vs RAM

deveraux

Senior member
Mar 21, 2004
284
0
71
Hey guys,

I made a minimalistic 3D engine some time ago. It has been a while since I have done any serious programming, so instead of just grabbing the code again, I thought I'd just remake it. It's for fun anyway, so I have no time limits/restrictions. Anyway, my question is this: is there any rule of thumb, so to speak, on whether it's better to eat CPU cycles to save RAM, or to eat more RAM to save CPU cycles?

This is more of an opinion-based question. One of the issues I am having is that when I define an object, it is made up of surfaces which are themselves made up of vertices. These vertices can have duplicates in memory due to different surfaces using the same vertex. I could make it less memory intensive, but that will obviously eat CPU cycles by searching a list for the data.
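To be concrete about the shared-vertex option: the usual compromise is to pay the lookup cost once at load time, building a pool of unique vertices plus an index list, so each draw is just array indexing. A minimal sketch of what I mean (the `Vertex` type is a stand-in, not my actual code):

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical vertex: just an xyz position.
using Vertex = std::array<float, 3>;

// Compact storage: unique vertices once, plus an index per reference.
struct Mesh {
    std::vector<Vertex> pool;      // unique vertices only
    std::vector<uint32_t> indices; // one entry per original vertex reference
};

// Deduplicate at load time, so the per-frame cost is only an index lookup.
Mesh buildMesh(const std::vector<Vertex>& rawVertices) {
    Mesh mesh;
    std::map<Vertex, uint32_t> seen; // vertex -> index into pool
    for (const Vertex& v : rawVertices) {
        auto it = seen.find(v);
        if (it == seen.end()) {
            it = seen.emplace(v, static_cast<uint32_t>(mesh.pool.size())).first;
            mesh.pool.push_back(v);
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

Two triangles sharing an edge would then store 4 unique vertices instead of 6, and the search cost is paid once at load, not per frame.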

Another small issue concerns the offset. Most objects are obviously defined at the origin and then moved around through their offset. I can either bake the offset value into the object's coordinates or just leave it as a separate variable and recompute the final result every draw cycle.
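The separate-variable option would look roughly like this (a minimal sketch with a hypothetical `Object` type, not my real engine code): the model-space data never changes, and the offset is applied on the fly each draw.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vec3 = std::array<float, 3>;

// Keep model-space vertices immutable and apply the offset per draw.
// Costs one add per component per vertex per frame, but the original
// data never degrades.
struct Object {
    std::vector<Vec3> model;  // defined around the origin, never mutated
    Vec3 offset{0, 0, 0};     // current world position
};

Vec3 worldPosition(const Object& obj, std::size_t i) {
    const Vec3& v = obj.model[i];
    return {v[0] + obj.offset[0], v[1] + obj.offset[1], v[2] + obj.offset[2]};
}
```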

Any opinions would be appreciated, thanks. Just in case it makes a difference, this will be written in C++. Does language actually matter in the whole CPU vs RAM argument?
 

chronodekar

Senior member
Nov 2, 2008
721
1
0
I have no experience in 3D or game programming (map-making not included), but I do work with embedded systems.

Like you mention yourself, it's a question of opinion. I think you should just concentrate on the end-user. Ensure that they will be able to run your application without any compromises.

With that in mind, I think it's safe to be liberal with memory. It's cheap these days, so it shouldn't be considered a bottleneck. However, if you go crazy allocating stuff all over the place, you'll run into trouble.

That means finding a middle path somewhere.

Hmmm... that brings us back to where we started. I guess I haven't been too helpful this time, eh?
 

deveraux

Senior member
Mar 21, 2004
284
0
71
Well, the most important thing I took away from that is to focus on the end-user. Ironically though, I AM the end user. Maybe I'll just make a simple program to start with and push both methods to extremes to see the performance hit of each.

My main reason for asking is really just curiosity. I never had formal training, so I was really just wondering if there was a sort of standard, so to speak, for which route to take when presented with those options.
 

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,590
4,496
75
One thing to keep in mind is the size of the memory caches on modern CPUs. You can pretty safely assume 16kB fits in the L1 cache, so if your frequently used data structures in a section of code are up to about that size, that's no problem at all. You can also pretty safely assume 512kB to even a MB or two fits in the L2 cache, so it's relatively fast to access, but I wouldn't make a single lookup table that's bigger than the L1 cache without a good reason. Anything bigger than that will probably require a main-memory access, which will probably slow your algorithm down by more than the extra memory gains you in speed.
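The effect is easy to see with a toy sketch: walk the same matrix row by row (sequential, cache-friendly) and column by column (strided, mostly cache misses once the matrix outgrows the caches). Both loops do identical arithmetic; only the access pattern differs, yet on a large matrix the strided walk is typically several times slower.

```cpp
#include <cstddef>
#include <vector>

// Sum an N*N matrix stored row-major, walking it in two different orders.
// Same result either way; only memory locality differs.
long long sumRowMajor(const std::vector<int>& m, std::size_t N) {
    long long s = 0;
    for (std::size_t r = 0; r < N; ++r)
        for (std::size_t c = 0; c < N; ++c)
            s += m[r * N + c]; // consecutive addresses: cache lines fully used
    return s;
}

long long sumColMajor(const std::vector<int>& m, std::size_t N) {
    long long s = 0;
    for (std::size_t c = 0; c < N; ++c)
        for (std::size_t r = 0; r < N; ++r)
            s += m[r * N + c]; // stride of N ints: one element per cache line
    return s;
}
```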

I'm not that familiar with 3D engines specifically, but about the offset issue: are your offsets ints or floats? If they're floats, you might want to recompute every draw cycle, because if you don't, floats tend to introduce creeping errors. Things will look fine for a while, but eventually your shapes will break apart! For speed, you might find some compromise, e.g. recompute every 15 frames or so.

Language probably doesn't matter much, unless you're using a language like Javascript that treats all arrays as hashes, or something.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Ken g6
...a lot of things

For some reason, I find myself agreeing entirely with Ken g6's point of view these days.

To re-summarize, for no apparent reason:
1) Keep data structures smaller than CPU caches. Go crazy on computations that fit in cache. Fear computations that exceed cache size.
2) Fear floating precision issues.
3) Language matters. If your language is too far away from the metal, you have no idea what your actual data set sizes are, hence you'll have trouble doing #1.
 

chronodekar

Senior member
Nov 2, 2008
721
1
0
Originally posted by: Ken g6
One thing to keep in mind is the size of the memory caches on modern CPUs. You can pretty safely assume 16kB fits in the L1 cache, so if your frequently used data structures in a section of code are up to about that size, that's no problem at all. You can also pretty safely assume 512kB to even a MB or two fits in the L2 cache, so it's relatively fast to access, but I wouldn't make a single lookup table that's bigger than the L1 cache without a good reason. Anything bigger than that will probably require a main-memory access, which will probably slow your algorithm more than the extra memory gains in speed.

Where can I find some reading material online for this info? (What you say makes sense, I just want to go a bit deeper.)

Also, is this still valid in a managed environment like .NET? Or Java? Just thinking...
 

deveraux

Senior member
Mar 21, 2004
284
0
71
Originally posted by: degibson
Originally posted by: Ken g6
...a lot of things

For some reason, I find myself agreeing entirely with Ken g6's point of view these days.

To re-summarize, for no apparent reason:
1) Keep data structures smaller than CPU caches. Go crazy on computations that fit in cache. Fear computations that exceed cache size.
2) Fear floating precision issues.
3) Language matters. If your language is too far away from the metal, you have no idea what your actual data set sizes are, hence you'll have trouble doing #1.

No idea if there's any history between the two of you, but is agreeing entirely something bad? :p

I was not aware of creeping errors when storing/retrieving floating point numbers. I only knew there were errors when computing them. And for point 1, I doubt I have much control over how large the data structures will be.

Draw a million triangles and you can be certain it'll overflow into main memory. I suppose I could come up with tricks to try to manage that, but I don't foresee running into that problem any time soon. Would be good to keep that in mind however.

Thanks for the help guys!
 

recoil80

Member
Jan 16, 2009
178
0
0
I don't think there's a good answer to your question, because it really depends on the hardware.
I work on embedded systems, and every system has its own constraints.
Sometimes I have to cope with a small amount of memory, so I have to keep the data structures as tiny as I can. Other times memory is not the real issue, but the CPU is overloaded and I have to optimize the code while I have plenty of RAM to spare.

On a PC I'd save CPU cycles and use more RAM, because nowadays PCs have plenty of it. I think it's better to use less CPU, especially on laptops, so you can keep the frequency low and save battery.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: chronodekar
Also, is this still valid in a managed environment like .NET ? Or Java ? Just thinking...

The same principles hold, but you rely entirely on the runtime to make them happen -- it's a lot harder to reason about your own code that way.

Originally posted by: deveraux
Draw a million triangles and you can be certain it'll overflow into main memory.
Sure. 1 triangle ~= 3 xyz tuples, plus some shading data (maybe another 3 entries). Assume 32-bit floats. That's (3*3+3)*4 = 48 bytes per triangle. So, if you have, say, a 512 KB cache on your chip, draw your triangles ~10k at a time or so if there's any recurrence in the data stream. If there's no recurrence, just make sure you unroll your loops a bit.
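That back-of-envelope math as compile-time constants, so the batch size falls out automatically if the assumptions (48-byte triangles, 512 KB cache) change:

```cpp
#include <cstddef>

// 3 xyz vertices plus 3 shading values, all 32-bit floats.
constexpr std::size_t kFloatsPerTriangle = 3 * 3 + 3;              // 12 floats
constexpr std::size_t kBytesPerTriangle  = kFloatsPerTriangle * 4; // 48 bytes
constexpr std::size_t kCacheBytes        = 512 * 1024;             // assumed L2
constexpr std::size_t kTrianglesPerBatch = kCacheBytes / kBytesPerTriangle;
```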

Then again, if you're using 3D acceleration you're dumping across a PCI(e) bus anyway, so the memory overflow will matter less. If you're entirely in software, draw your triangles in small enough batches that once they're drawn you don't have to bring them back onto the chip -- you can still be cache-friendly that way.

Or better yet... fork a few threads.
 

Markbnj

Elite Member
Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
When I was messing around with 3D stuff years ago everyone used either their own, or someone else's, hand-optimized fixed-point library. That was initially because only boards with a coprocessor supported floating point in hardware, and later just to avoid the error issues mentioned here. Has fixed-point math gone out of style in the graphics space?
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Originally posted by: Markbnj
When I was messing around with 3D stuff years ago everyone used either their own, or someone else's, hand-optimized fixed-point library. That was initially because only boards with a coprocessor supported floating point in hardware, and later just to avoid the error issues mentioned here. Has fixed-point math gone out of style in the graphics space?

The latest-and-greatest stuff appears to be working on IEEE 32-bit floats for the most part, with some limited integer support. (I'm not sure about 64-bit float... it's probably there.)

I think the fixed point stuff has gone the way of the dodo -- graphics cards are fun like that, as they don't have to maintain backwards compatibility.
 

deveraux

Senior member
Mar 21, 2004
284
0
71
As I understand it, the only time I still use fixed point is for 2D projection in OpenGL. I think (and don't hold me to this) it was called orthographic projection.
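If I remember right, the orthographic mapping (what `glOrtho` sets up) is just a linear scale-and-shift of each axis into the [-1, 1] clip range, with no perspective divide, so size doesn't change with depth. A minimal 2D sketch of that mapping (my own stand-in type, not an OpenGL API):

```cpp
// Orthographic mapping reduced to 2D: each axis is linearly rescaled
// from [left, right] / [bottom, top] into [-1, 1] clip coordinates.
// No divide by depth, which is what makes it "2D-friendly".
struct Ortho2D {
    float left, right, bottom, top;

    float mapX(float x) const { return 2.0f * (x - left) / (right - left) - 1.0f; }
    float mapY(float y) const { return 2.0f * (y - bottom) / (top - bottom) - 1.0f; }
};
```

For an 800x600 window, `Ortho2D{0, 800, 0, 600}` sends pixel (0, 0) to clip (-1, -1) and the window center to (0, 0).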