question about RAM and word size

Special K

Diamond Member
Jun 18, 2000
7,098
0
76
I was reading in a logic design book that a word is the smallest collection of bits that is treated as a single unit. It said that in RAM, a word is the smallest unit of data represented by a single address in memory. I thought that the current x86 CPUs define a 32 bit word, yet I also thought that each address in RAM refers to 1 byte. Which is correct?
 

kt

Diamond Member
Apr 1, 2000
6,032
1,348
136
This question should be in the Highly Technical forum. I am sure the guys over there will give you a better, more helpful answer.
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Each memory access has to be to a full memory page; on 32-bit x86 this is 4K. Memory accesses have to be aligned to 4K page boundaries, so if you want to see just the 3rd byte, the CPU reads the whole page anyway. Most processors enforce this in hardware by generating an exception and letting the OS deal with it however it wants (normally by adjusting the access to be aligned properly, which is very slow), but I'm not sure about x86.

If you ever run Linux on Alpha you'll see some 'unaligned trap' messages from apps that are only built and tested on x86 boxes and don't align correctly on 64-bit ones.
 

Special K

Diamond Member
Jun 18, 2000
7,098
0
76
If I create an array of chars (1 byte each), the address of each element will be 1 higher than the one before it. Are these addresses not the same addresses that the book is referring to, since they are 1 byte apart, but according to word size, they should be 4 bytes apart?
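One way to check this directly is a small C sketch like the one below. The helper names (char_stride, int_stride, print_char_addresses) are made up for illustration, and the addresses printed are whatever the program itself sees, which may or may not be what the book means by an address.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: distance in bytes between two neighbouring chars. */
ptrdiff_t char_stride(void)
{
    char letters[4] = { 'a', 'b', 'c', 'd' };
    return (char *)&letters[1] - (char *)&letters[0];
}

/* Hypothetical helper: distance in bytes between two neighbouring ints. */
ptrdiff_t int_stride(void)
{
    int numbers[4] = { 1, 2, 3, 4 };
    return (char *)&numbers[1] - (char *)&numbers[0];
}

/* Print the address of every element; %p expects a pointer to void. */
void print_char_addresses(void)
{
    char letters[4] = { 'a', 'b', 'c', 'd' };
    for (size_t i = 0; i < sizeof letters; i++)
        printf("&letters[%zu] = %p\n", i, (void *)&letters[i]);
}
```

char_stride() returns 1 because sizeof(char) is 1 by definition in C, while int_stride() returns sizeof(int), which is 4 on typical 32-bit x86 compilers.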
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
Technically those addresses could be anywhere in memory. Just because the OS allocated you X amount of virtually contiguous memory doesn't mean it's physically contiguous (and on some platforms like NUMA it might not even be on the same host).
 

KillerCow

Member
Jun 25, 2001
142
0
0
No, RAM is logically divided into words, which on 32-bit architectures are 32 bits.

The reason for this is that RAM is used to store program data AND the program itself. The program is composed of machine code instructions which are executed by the CPU, and these instructions are 32 bits wide... to facilitate this, all data transport and storage systems are 32 bits wide. In addition, the registers in the CPU are 32 bits wide, so when you do a LOAD or STORE operation, the entire register is copied to RAM... so it makes sense to use 32-bit partitions. There are no 8 or 16 bit registers.

You are correct that each address in RAM refers to 1byte... that is that each address is 1byte apart, BUT you will note that whenever you use the address, it is a multiple of 4 (1byte * 4 = 32bits). I believe that if you try to use an address that is not a multiple of 4, you will get some kind of error... I can't remember what it is called at the moment.

If you read some of the assembly code in your design book you will see these concepts in practice. Take a close look at the JUMP operations in relation to the Program Counter (PC), and also the LOAD LOW and LOAD HIGH operations... as these demonstrate limitations of a 32bit command word with 32bit registers.

If I create an array of chars (1 byte each), the address of each element will be 1 higher than the one before it. Are these addresses not the same addresses that the book is referring to, since they are 1 byte apart, but according to word size, they should be 4 bytes apart?

That is an entirely different can of worms. It depends on how your compiler and OS handle arrays and memory allocation. Arrays do not exist in assembly language; physical processors are only designed to run machine code.

When you load one of your chars into a register... it will take up all of the register's 32 bits, and when you store it, it is easiest for the compiler to just take up 32 bits in RAM. Imagine trying to do it with 4 chars stored in each word. When you did a STORE, it would have to load all 4 bytes into one register, calculate the segment that it is replacing, mask it with an AND, shift the value of the char that it wants to write to the correct location, merge it with an OR, then write it. That is at least 6 extra operations... assuming that it only takes one to make the mask.
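The mask, shift and merge sequence described above can be sketched in C roughly as follows. This assumes a machine that can only load and store whole 32-bit words; store_byte is a hypothetical helper name, and byte 0 is taken to be the least significant byte, which is an arbitrary choice for the sketch.

```c
#include <stdint.h>

/* Hypothetical helper: replace byte number `index` (0 = least significant)
   inside a 32-bit word that has already been loaded into a register. */
uint32_t store_byte(uint32_t word, unsigned index, uint8_t value)
{
    unsigned shift = (index & 3u) * 8u;         /* bit position of the target byte */
    uint32_t mask  = (uint32_t)0xFFu << shift;  /* ones over the byte being replaced */

    word &= ~mask;                              /* clear the old byte with an AND */
    word |= (uint32_t)value << shift;           /* merge the new byte with an OR */
    return word;                                /* caller writes the word back */
}
```

For example, store_byte(0x11223344, 1, 0xAA) gives 0x1122AA44. Whether a real compiler has to emit this sequence depends on whether the target CPU has byte-wide store instructions.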

In Java, an array of BYTES (8bits) takes up the same amount of space as an array of INTS (32bits).
 

Special K

Diamond Member
Jun 18, 2000
7,098
0
76
You are correct that each address in RAM refers to 1byte... that is that each address is 1byte apart, BUT you will note that whenever you use the address, it is a multiple of 4 (1byte * 4 = 32bits). I believe that if you try to use an address that is not a multiple of 4, you will get some kind of error... I can't remember what it is called at the moment.

Could you explain what you mean by this? I seem to remember writing programs that could increment pointers and give you addresses that were not a multiple of 4. I think that is what is confusing me: what is the difference between the logical address (which must refer to a 32 bit word) and the addresses that a program works with - for example, the one that would be displayed if I did the following:

char x;
printf("The address of x is %p\n", (void *)&x);

where the address of x as reported by the program would only refer to 1 byte of memory, not 32 bits. And if I declared another char right after the first one, its address would most likely be one byte higher than the one before it.
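To make the "multiple of 4" question concrete, here is a rough C sketch that steps a char pointer through a small buffer and counts how many of the resulting addresses land on a 4-byte boundary. The helper names are hypothetical, the buffer is forced onto a 4-byte boundary with C11 _Alignas, and the sketch assumes that converting a pointer to uintptr_t yields the numeric address, which holds on mainstream platforms but is implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the address is a multiple of 4. */
int is_word_aligned(const void *p)
{
    return ((uintptr_t)p % 4u) == 0;
}

/* Walk an 8-byte buffer one char at a time and count the addresses
   that fall on a 4-byte boundary. */
int aligned_bytes_in_buffer(void)
{
    _Alignas(4) unsigned char buffer[8] = { 0 };  /* starts on a 4-byte boundary */
    int count = 0;

    for (size_t i = 0; i < sizeof buffer; i++) {
        const unsigned char *p = buffer + i;      /* pointer advances by 1 byte */
        if (is_word_aligned(p))
            count++;
    }
    return count;
}
```

aligned_bytes_in_buffer() returns 2: only buffer + 0 and buffer + 4 are multiples of 4, and the other six pointers are still perfectly usable char addresses from the program's point of view.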
 

Nothinman

Elite Member
Sep 14, 2001
30,672
0
0
As a userland application, all you get to see are virtual addresses; you can never see the physical address in memory.
 

KillerCow

Member
Jun 25, 2001
142
0
0
This is operating at a much higher level than assembly language. You don't know how the OS maps addresses. The address that you see will not be the real one (the one in physical memory).

Have a look at this program. It uses an array and offsets, and is well documented. Notice this line: addi r3, r3, 4 ; increment the string pointer

I am not a C expert, so I cannot speak to how it allocates memory.
 

Special K

Diamond Member
Jun 18, 2000
7,098
0
76
So the logical address is something entirely different than the address I see in a program, and even though a char appears to take up 1 byte, it takes up 32 bits in the hardware itself. Thanks for clarifying that. The book (so far) has not specifically pointed out that the addresses they were referring to (logical) were not the same as the ones that a program would use.