How Are Objects Stored in RAM?

chrstrbrts · Dec 21, 2014

Hello,

In object oriented languages like Java, I have an idea of how primitives are stored in RAM.

My theory is that things like ints, chars, booleans, etc. are stored in latch circuits within RAM chips as simple binary along with some format bytes that tell the computer that the information is to be interpreted as a pure number or a unicode character, etc.

But, I have no idea whatsoever how objects are stored. What if I create an object called "car" from a Car class that I defined at some point in my program. Further, I give it attributes.

How is this information stored in RAM?

I know that everything is stored as 0's and 1's but what's the actual format for storing objects in RAM?

Thanks.

Berliner · Dec 21, 2014

A kind of pointer.

Ken g6 · Dec 21, 2014

chrstrbrts said:
My theory is that things like ints, chars, booleans, etc. are stored in latch circuits within RAM chips as simple binary along with some format bytes that tell the computer that the information is to be interpreted as a pure number or a unicode character, etc.

Close. In general, primitives are stored as a byte or sequence of bytes storing a number. It might be a character's number, but it's still a number. How the bytes are arranged is called their endianness. There are no format bytes for primitives - I believe that's all handled at compile time. In C or C++ there's no type checking, so you can do interesting tricks like sorting floating-point numbers as integers. (Which works, by the way.)

A C++ class is very similar to a C struct. In an instance, each primitive, or array of primitives, is entered, one after the other, in a block of RAM. The compiler basically turns element names into static offsets, like array offsets. Method names are handled only within the compiler; they're not stored in each class instance.

Now, Java has to be a little different, and I have to speculate some here. The first thing to realize is that when you declare an Object you get a reference to that object. A reference is like a pointer in C, except that you can't modify it. That reference points to something similar to a C struct or a C++ class. The only difference I know of is that Java has an instanceof operator. That means each Object instance has to know what Class it's an instance of. I would guess each instance has a pointer to its immediate Class definition, and that instanceof traverses a linked list of pointers to get its result. But I'm not 100% sure on that.

chrstrbrts · Dec 21, 2014

Berliner said:
A kind of pointer.

I appreciate your input, but can you elaborate a little more?

Ken g6 said:
Close. In general, primitives are stored as a byte or sequence of bytes storing a number. It might be a character's number, but it's still a number. How the bytes are arranged is called their endianness. There are no format bytes for primitives - I believe that's all handled at compile time. In C or C++ there's no type checking, so you can do interesting tricks like sorting floating-point numbers as integers. (Which works, by the way.)

A C++ class is very similar to a C struct. In an instance, each primitive, or array of primitives, is entered, one after the other, in a block of RAM. The compiler basically turns element names into static offsets, like array offsets. Method names are handled only within the compiler; they're not stored in each class instance.

Now, Java has to be a little different, and I have to speculate some here. The first thing to realize is that when you declare an Object you get a reference to that object. A reference is like a pointer in C, except that you can't modify it. That reference points to something similar to a C struct or a C++ class. The only difference I know of is that Java has an instanceof operator. That means each Object instance has to know what Class it's an instance of. I would guess each instance has a pointer to its immediate Class definition, and that instanceof traverses a linked list of pointers to get its result. But I'm not 100% sure on that.

Thanks. But, what I mean is, if I'm storing an int, boolean, etc. I'll simply store some number that translates naturally to whatever information I'm trying to store.

Example: If I want to store int i = 6, I'll go to a RAM address and store 110. If I want to store boolean x = true, I'll go to a RAM address and store 1.

But if I want to store Car car = new Car( arguments here ), I store a pointer in RAM that points to another location also in RAM that stores the actual object. But what is the form? How many bits long is it? What's the scheme?

6 equals 110 in binary. True is stored as 1 quite naturally. But what about objects? I don't see a natural binary form for something as abstract as an instance of a class.

Thanks.

Ken g6 · Dec 21, 2014

Technically an int is stored in 32 bits. As for objects:

http://en.m.wikipedia.org/wiki/C++_classes#Memory_consumption

Markbnj · Dec 21, 2014

chrstrbrts said:
6 equals 110 in binary. True is stored as 1 quite naturally. But what about objects? I don't see a natural binary form for something as abstract as an instance of a class.

Probably one of the simplest forms of this is a C struct. Suppose you have a struct like this:

Code:

struct myStruct {
    int a;
    int b;
    char c[5];
};

So you can see that this struct consists of three members which are all primitives, or in the latter case an array of a primitive. You can visualize how each of the primitives would be stored in ram, so what is the most likely conjecture you can come to about how this whole struct will be stored in ram?

Code:

myStruct* s = new myStruct();

You probably already know that new returns a pointer, in this case a pointer of type myStruct*. If you dereference that pointer and go look at the ram what would you expect to see? Take a guess.

chrstrbrts · Dec 22, 2014

Markbnj said:
Probably one of the simplest forms of this is a C struct. Suppose you have a struct like this:

Code:

struct myStruct {
    int a;
    int b;
    char c[5];
};

So you can see that this struct consists of three members which are all primitives, or in the latter case an array of a primitive. You can visualize how each of the primitives would be stored in ram, so what is the most likely conjecture you can come to about how this whole struct will be stored in ram?

Code:

myStruct* s = new myStruct();

You probably already know that new returns a pointer, in this case a pointer of type myStruct*. If you dereference that pointer and go look at the ram what would you expect to see? Take a guess.

My newbie guess is that the pointer holds within it an address that points to another RAM location where the first field is stored, in this case an int. Then, in addition to the int being stored, there must be some kind of linker to where the next field is stored and so on.

I don't see how you could smash all the field data into one RAM latch. My theory is that the computer has some kind of default data structure that it employs to link field data together into a bundle that constitutes the object state.

LOL......How wrong am I?

Cogman · Dec 22, 2014

chrstrbrts said:
My newbie guess is that the pointer holds within it an address that points to another RAM location where the first field is stored, in this case an int. Then, in addition to the int being stored, there must be some kind of linker to where the next field is stored and so on.

I don't see how you could smash all the field data into one RAM latch. My theory is that the computer has some kind of default data structure that it employs to link field data together into a bundle that constitutes the object state.

LOL......How wrong am I?

Semi wrong.

First, drop the notion of Latch. You are using it incorrectly here (ram isn't latches).

I won't discuss how RAM actually works, but suffice it to say it isn't a simple latch, it is (currently typically for DRAM) a capacitor. How that is changed from capacitance to bytes is something you simply aren't ready for yet. But as always you can find a decent description of RAM structure here http://en.wikipedia.org/wiki/Dynamic_random-access_memory .

Everything in your computer is stored in bytes. C and C++ are pretty good about telling you exactly how things will be stored in memory. An example of how it looks is something like this.

Let's say I have an object with two integer fields. The fields have the values 4 and 5. In memory it looks (not exactly like, but close enough) something like this:

00040005

Notice that there is padding in front of the 4 and the 5. This is so you can later say "Hey, I really want that first value to be 309." At that point the memory representation would look like this:

03090005

All of this information is stored at some place in memory. The memory address refers to where the object starts. When you ask "Hey, get me the second field," your compiler is doing the extra work to say "OK, there are 2 fields, the first field is 4 digits big, so the second field must start 4 digits from the start of the object."

Those are the basics. But things get much more complex when talking about languages like Java. In Java you have references, but the locations they point at may change at any moment. This is because Java is garbage collected, so stuff can, and will, be shifted around in memory. This is part of the reason Java doesn't really have the concept of a memory address or pointer: Java manages the memory.

Even the question "How big are objects?" is somewhat complex and involved with Java, because it isn't just storing the fields of an object. It includes extra metadata on each object to help with things like inheritance.

Even in C and C++ it isn't as straightforward as you might like. This is because of things like packing rules. What are those, you might ask? Well, the simple case is the boolean. You might ask, "How is a boolean stored?" To go back to my simple example above, if that first field was a boolean set to true instead of an integer, then the object might look like this in memory:

00010005

What, you might ask? 4 digits assigned to 1 value! Crazy! Why would they do that?

The answer is simple: CPUs never work against 1 bit. In fact, they pretty rarely work against a byte. A CPU will work against (at a minimum) its word size, or the size of its general registers. It does this for performance reasons. It takes the same amount of time for a CPU to load up 64 bits as it takes it to load 1 bit. In fact, when talking to memory, the CPU generally can't load single bits out of memory. It is only allowed to load things in chunks. This is done for performance reasons.

So how are objects stored in memory? Typically something like this
[Object metadata][field1][field2][fieldn]

Depending on the language, the Object metadata may or may not exist with the object data.

Spungo · Dec 22, 2014

chrstrbrts said:
Thanks. But, what I mean is, if I'm storing an int, boolean, etc. I'll simply store some number that translates naturally to whatever information I'm trying to store.

Nothing about computers is natural. You can't even tell it what a 3 is. It only understands 0 and 1. You could say it needs some kind of translator to store a 3. If it can store a 3 as a bunch of binary numbers, why not store letters and objects as binary numbers? That's not much different than converting your name from letters to numbers. As long as you know how the conversion was done, you can always convert it back. It's just like saying today's date is 2014-12-22. As long as you know what each number means, you can translate that into English or French or some other numbered notation.

Example: If I want to store int i = 6, I'll go to a RAM address and store 110. If I want to store boolean x = true, I'll go to a RAM address and store 1.

Then you would need some kind of way to know if something is a bool or an int. Is something "true" or is it "1"? Or does that number actually represent a letter?

Think of it like telling someone what your address is. You can't say "I live on 4." 4 what? Is that a street? Is that an apartment number? Maybe you live at memory address 4. You need some kind of way to identify what that number refers to. With that in mind, what difference is there between bool = 1, integer = 1, and object = 1? They're really not all that different.

How many bits long is it? What's the scheme?

Probably depends on the language and compiler. If you're making something for a Nintendo, it can't be larger than 8 bits. Today, I doubt any object is 8 bits. We're still in the win32 era, so it's probably 32 bits.

Merad · Dec 23, 2014

Spungo said:
Probably depends on the language and compiler. If you're making something for a Nintendo, it can't be larger than 8 bits. Today, I doubt any object is 8 bits. We're still in the win32 era, so it's probably 32 bits.

First, working on an 8 bit computer doesn't limit you to 8 bit types. Memory has no concept of the "bitness" of its processor. On an 8 bit machine the registers, ALU, etc are typically only 8 bits wide, but it's trivial to perform larger operations by chaining them together. Well, technically not trivial, because it causes all of your basic operations (add, subtract, etc) to have an O(n) complexity where n is the size in bytes. But it isn't hard to implement.

Second, in C/C++ land char types are always 1 byte, and a byte is 8 bits in 99.99% of modern compilers. A class that has no members and no virtual functions will also have a size of 1 byte (since it can't have size of 0 bytes), as well as a class that has a single char member and no virtual functions.

The thing about smaller types, especially when used in objects, is that the padding needed to meet alignment requirements usually destroys any space savings that you get from the smaller type. Consider

Code:

struct foo {
  char c;
  int i;
};

Assuming that int is 4 bytes, the compiler is going to effectively implement that struct as

Code:

struct foo {
  char c;
  char _pad0[3];
  int i;
};

because the int as well as the overall struct both need to align to 4 bytes. You could flip the order to place the int first, which would (IIRC) eliminate padding in the object itself, however if you have multiple foo's in contiguous memory (such as an array) you'll still end up with padding between the objects.

uclabachelor · Dec 23, 2014

chrstrbrts said:
Hello,

In object oriented languages like Java, I have an idea of how primitives are stored in RAM.

My theory is that things like ints, chars, booleans, etc. are stored in latch circuits within RAM chips as simple binary along with some format bytes that tell the computer that the information is to be interpreted as a pure number or a unicode character, etc.

But, I have no idea whatsoever how objects are stored. What if I create an object called "car" from a Car class that I defined at some point in my program. Further, I give it attributes.

How is this information stored in RAM?

I know that everything is stored as 0's and 1's but what's the actual format for storing objects in RAM?

Thanks.

Think of your city as the memory space of a PC. How do you get "data" from a certain location? Well, you would need the address of that location. The address can be something that you know off the top of your head, such as your home address, or an address that you need to look up based on a name (i.e., via a pointer).

Once you have the address then you can retrieve "data" or set new data, or do both.

What happens when you run out of addresses in a city? You can always create new addresses but you will run out of physical space as you can have a billion addresses created but they are not mapped to a physical space and are unusable.

The same concept applies to processors. Generally speaking, an 8-bit processor can address 2^8 locations, a 16-bit processor 2^16, and so on. These bits are your 0s and 1s.

Now you can define certain sections of the address space to have specific functionality, for example, addresses 0 - 1024 reserved for hardware registers, 1025 - 2048 for flash memory, etc etc. The RAM space fits somewhere in this address space and is usually defined by the processor's manufacturer if it's internal RAM, and defined by hardware configuration if it's external RAM.

As for compilers and languages, they are nothing more than translators that take human-readable language and convert it into machine language that processors can use. During this translation, the compiler also maps where the compiled code sits in the address space. Within that compiled code, there are blueprints for these so-called objects that you speak of. The size of each object is determined by how you defined it in the programming language, and is calculated by the compiler and linker scripts when you compile.

Spungo · Dec 23, 2014

Merad said:
First, working on an 8 bit computer doesn't limit you to 8 bit types. Memory has no concept of the "bitness" of its processor. On an 8 bit machine the registers, ALU, etc are typically only 8 bits wide, but it's trivial to perform larger operations by chaining them together. Well, technically not trivial, because it causes all of your basic operations (add, subtract, etc) to have an O(n) complexity where n is the size in bytes. But it isn't hard to implement.

I shouldn't have worded my post so strongly. What I mean is that one expects variables to be loosely based on available hardware. Is it possible to have a 256 bit integer? Absolutely, but I wouldn't expect to find that today. In the 8 bit NES days, you would expect to find a lot of 8 bit things. Letters were 8 bit. Colors were 8 bit. Sounds were 8 bit. Objects would probably be around 8 bits as well.

Second, in C/C++ land char types are always 1 byte, and a byte is 8 bits in 99.99% of modern compilers. A class that has no members and no virtual functions will also have a size of 1 byte (since it can't have size of 0 bytes), as well as a class that has a single char member and no virtual functions.

What happens if you try making 257 classes that have no members and no functions?

Merad · Dec 23, 2014

Objects would probably be around 8 bits as well.

I'm fairly certain that all NES games were written in assembly, so I doubt they were using "objects" in anything close to the modern sense. Conceptually, though, the point of objects is usually to bundle multiple related values together, so most objects will be >= 2 bytes.

What happens if you try making 257 classes that have no members and no functions?

You have 257 classes with no members and no functions? I don't really understand what you're asking...

Cogman · Dec 23, 2014

Spungo said:
What happens if you try making 257 classes that have no members and no functions?

Nothing.

That one byte minimum for an empty class does nothing for the class. It is simply there to ensure that the class occupies SOME memory so that C++ can guarantee that all objects have a unique memory address.

Ask the question "What should new EmptyObject return in C++?" Would you ever expect it to return the same address of an already allocated object?

The byte allocated has nothing to do with the object, it is essentially wasted space. You could have 1 billion empty objects and it wouldn't make any difference.

chrstrbrts · Dec 23, 2014

Guys, thanks for all your input, but I don't know if anyone really answered my question.

Well, half of it has been answered: I now know that objects are represented in RAM through a bundling of RAM locations each holding the value of the objects' instance variables. That is, an object is essentially a data structure.

But how do all of these RAM locations find each other?

Example:

If the word size of a RAM cell is 64 bits, and I want to save 2 pieces of information that are both 64 bits together in an object, then they would both take up the entirety of a RAM cell and have nothing left over to point to the next cell.

The way I see it, to implement a data structure in RAM, in each RAM cell you need space for the information itself and a pointer that holds the address of the next item in the structure.

But if the information is too big, it squeezes out the pointer.

Unless, you purposely only store half the information in a RAM cell and save the rest for a pointer that points to the second half.

Then, in the second half of the first item, hold a pointer to the first half of the next item and so on.

Merad · Dec 23, 2014

If the word size of a RAM cell is 64 bits, and I want to save 2 pieces of information that are both 64 bits together in an object, then they would both take up the entirety of a RAM cell and have nothing left over to point to the next cell.

You're talking computer engineering now, not software. Software does not know or care how data is stored in RAM cells. Software knows that you have N bytes of byte-addressable RAM, so you can ask to read any address between 0 and N (exclusive). How the CPU and memory controller access the data is not relevant to us. In reality the system is even more abstract because of the virtual memory system. Say, a 32-bit program may be running on a system that has only 512 MB of physical RAM, but (on Windows) the program can actually use up to 2 GB of "memory".

The details of how all this works are decidedly non-trivial and are enough material to fill several advanced college courses.

Edit:

The way I see it, to implement a data structure in RAM, in each RAM cell you need space for the information itself and a pointer that holds the address of the next item in the structure.

Objects are referenced either by a pointer directly to the object, or an offset from a known location. Member variables within an object are an offset from the starting address of the object. Items in an array are an offset from the start of the array. Local variables on the stack are usually an offset from the function's frame pointer. Etc. All of these details are handled by the compiler.

chrstrbrts · Dec 23, 2014

Merad said:
Objects are referenced either by a pointer directly to the object, or an offset from a known location. Member variables within an object are an offset from the starting address of the object. Items in an array are an offset from the start of the array. Local variables on the stack are usually an offset from the function's frame pointer. Etc. All of these details are handled by the compiler.

OK. So, instead of pointers to everything, we have one pointer to a starting point and then use some kind of algorithm to place the other items in relation to that starting location.

Cogman · Dec 23, 2014

chrstrbrts said:
Guys, thanks for all your input, but I don't know if anyone really answered my question.

Well, half of it has been answered: I now know that objects are represented in RAM through a bundling of RAM locations each holding the value of the objects' instance variables. That is, an object is essentially a data structure.

But how do all of these RAM locations find each other?

Example:

If the word size of a RAM cell is 64 bits, and I want to save 2 pieces of information that are both 64 bits together in an object, then they would both take up the entirety of a RAM cell and have nothing left over to point to the next cell.

The way I see it, to implement a data structure in RAM, in each RAM cell you need space for the information itself and a pointer that holds the address of the next item in the structure.

But if the information is too big, it squeezes out the pointer.

Unless, you purposely only store half the information in a RAM cell and save the rest for a pointer that points to the second half.

Then, in the second half of the first item, hold a pointer to the first half of the next item and so on.

Merad is right, you are trying to pull out how hardware does stuff, and that is a complex and ever changing topic which is really not all that important for a newbie to understand.

Suffice it to say, giving a memory address to the memory controller will result in it going out to memory and retrieving the data at that address (as well as all the data around that address). How that address is changed from 0x00123 to Memory cell 2, row 5, block 6 is a really complex topic. The CPU and the OS are doing a lot with memory addresses in the program to make sure that two applications running on the same machine aren't able to access each other's data. But beyond that, they are handling things like "What happens when the ram fills up? Can I move some of this data on the hard drive? Should I just claim this block for the application even though it hasn't requested it yet?" All this and more are being done behind the scenes from the application, and you don't and shouldn't worry about it (unless your end goal is to write OSes or design memory controllers).

So with that in mind. When you make an object in C++, it is simply saying to the OS/memory controller "Hey, I need to allocate x bytes of data". When it does this, it gets back a pointer to the first byte of the object. When you say "object.field7" the compiler is doing all of the math for you. It has already allocated all of the memory you need for object x in a contiguous (maybe... virtual tables are wonderful) block of memory. So when you say "object.field3" the compiler says "Ok, field1 was 3 bytes long and field2 was 6 bytes long, and I added 2 bytes of padding in front of field1 because I hate you, so field3 must be at the object's starting address + 11".

So the program constructs this magic address "object starting address + 11". It hands it off to the CPU which hands it off to the memory controller, which may hand it back to the OS which would then hand it back to the memory controller, which hands it off to the ram controller, which does the actual physical lookup in the ram to find out where the address given to it actually exists. At every one of these steps, the memory address is changed to represent a new memory address that gets closer and closer to the actual memory address where the ram holds the data. Each layer is reinterpreting the address in some form or another to correctly locate the actual data.

Now, you don't need to know any of what I just said here. I only know this because I majored in computer engineering and have an interest in how computers work. On the job, I work as a java programmer and have never used one bit of the information I just told you. It is, for the most part, worthless to me as a programmer.

Honestly, I've barely scratched the surface of what is going on here. If you really want to know what is going on I would suggest just sitting down and reading through a good operating system book.

Here is a free one
http://pages.cs.wisc.edu/~remzi/OSTEP/

and here is the one I used for college
http://www.amazon.com/Operating-Syst...=UTF8&sr=&qid=

Either should give you a good understanding on how memory is managed in a modern machine (and then some).

Cogman · Dec 23, 2014

chrstrbrts said:
OK. So, instead of pointers to everything, we have one pointer to a starting point and then use some kind of algorithm to place the other items in relation to that starting location.

See my above post. But basically it is starting address + offset. The offset is computed by the compiler. It can do this because it knows how big each of the fields are.

Merad · Dec 23, 2014

If you're really interested in the hardware and implementation details I strongly recommend this text: http://www.amazon.com/Computer-Syste.../dp/0136108040

Spungo · Dec 23, 2014

chrstrbrts said:
Well, half of it has been answered: I now know that objects are represented in RAM through a bundling of RAM locations each holding the value of the objects' instance variables. That is, an object is essentially a data structure.

I want to say yes, but you can't use the term data structure. A data structure is something that actually stores data. An object is a reference type. It stores references to data.

The object is like a treasure map saying where to find a chest full of gold. The gold is very heavy, but the map itself is very light. Objects are small, but they can point to very large things.

Then, in the second half of the first item, hold a pointer to the first half of the next item and so on.

You mean this?

Markbnj · Dec 23, 2014

Spungo said:
I want to say yes, but you can't use the term data structure. A data structure is something that actually stores data. An object is a reference type. It stores references to data.

I think you're confusing objects and references, and in any case I don't think the distinction is terribly useful at OP's stage of the game. At a pretty basic level it's just necessary to understand how primitives can be grouped together into higher level abstractions, and that those abstractions are a view of regions of memory that contain the values of those primitives (plus packing, plus vtable pointers, plus whatever other also distracting stuff is found there).

Markbnj · Dec 23, 2014

The first part of this paper has a great explanation of how DRAM works, if OP is still interested in that level.

https://www.ece.cmu.edu/~safari/pubs/kim-isca14.pdf

Its conclusions aren't that comforting, however.

Spungo · Dec 23, 2014

I probably am confusing the two. I know I've already confused memory address with the size of the thing that address is pointing to. I'm trying to remember what a C# book said about the data/reference matter. There was a very clear distinction between data types and reference types. "struct" was a data type while "class" was a reference type.

Suppose you have struct1 and struct2. And you do something like:
struct2 = struct1;
Now both are exactly the same, but they're completely independent. They have their own data. Changing one does not change the other.

Now suppose you have object1 and object2 and you do this:
object2 = object1.
Now they're exactly the same, but they're actually linked together. Changing the value of object2.someString will also change the value of object1.someString.

Think of it like taking gold to a vault and the vault guy gives you a receipt. The gold is the data. The receipt is the reference. Or you could say "receipt" is a class, and this particular receipt is an object of that class. Imagine you photocopy the receipt. Now you have two receipts, but they both point to the same piece of gold with the same serial number. You didn't actually copy the gold.

Schmide · Dec 23, 2014

I'm pretty sure there is no 1-byte lower limit for classes. You can have a zero size object that is nothing more than static maps to functions. I do it all the time.
