Writing Dynamic Storage Allocator

Gamingphreek · Nov 7, 2009

*This is homework. Please do not provide source code*

In my Operating Systems class, our most recent project is to write an allocator.

I'm in the process of reading my notes and the book before I start, but I was wondering if anyone here had any tips, suggestions, or interesting features that I should look to implement.

Thanks
-Kevin (Gamingphreek)

Oyster · Nov 7, 2009

More details... kind of data, has to be transactional, etc.

Gamingphreek · Nov 7, 2009

Well I am implementing malloc, free, and realloc, so there really isn't any wiggle room on the parameters they take 😉

I never found a concrete answer, but xmalloc() is simply a malloc that incorporates the case where malloc == 0 internally right?

-Kevin

xtknight · Nov 7, 2009

A little confused. When you say you are "implementing" malloc, free, and realloc, do you mean you are using these functions or replacing them? If you are replacing them...wait, how do you that?? :O

Gamingphreek · Nov 9, 2009

xtknight said:
A little confused. When you say you are "implementing" malloc, free, and realloc, do you mean you are using these functions or replacing them? If you are replacing them...wait, how do you that?? :O

We are replacing them with our own versions. My professor supplied us 2 header files and a very very basic skeleton, we have to implement the rest.

-Kevin

degibson · Nov 9, 2009

Gamingphreek said:
We are replacing them with our own versions. My professor supplied us 2 header files and a very very basic skeleton, we have to implement the rest.

-Kevin

I'm not sure how far your professor is having you 'implement' you allocator, but I'd recommend you use mmap() to map anonymous memory instead of sbrk(), as it is more portable and knowing mmap() will serve you better in the long run.

Edit: Also, take some time to google around for heap managers as examples. Start simple -- start with 'works correctly', then add features.

Chris27 · Nov 9, 2009

Use sbrk() to extend the heap. It is now the job of your memory allocator to carve out chunks from this block of memory. Basically, what is typically done is to make a linked list that spans the heap. Each element of this linked list has a header, footer, and a payload. The payload is the chunk of memory the user of malloc() gets. The header and footer carry information such as the size of the payload and whether its in use or not.

Some terms that you may want to google:
implicit free list
explicit free list
segregated free list
coalescing
splitting
external fragmentation
internal fragmentation
byte alignment

I have done a near identical assignment before. I have code for some naive implementations if that would help you get started.

Gamingphreek · Nov 9, 2009

Chris27 said:
Use sbrk() to extend the heap. It is now the job of your memory allocator to carve out chunks from this block of memory. Basically, what is typically done is to make a linked list that spans the heap. Each element of this linked list has a header, footer, and a payload. The payload is the chunk of memory the user of malloc() gets. The header and footer carry information such as the size of the payload and whether its in use or not.

Some terms that you may want to google:
implicit free list
explicit free list
segregated free list
coalescing
splitting
external fragmentation
internal fragmentation
byte alignment

I have done a near identical assignment before. I have code for some naive implementations if that would help you get started.

I'll definitely read up on that - thank you so much.

I will say that, for byte alignment, we have been directed to align to 8 byte boundaries (I'm not sure why 8 byte as opposed to 4 byte boundaries)

@Degibson - the header files we have been directed to use are the mmap headers 🙂 - Looks like I'll be using that implementation.

Can anyone comment on what xmalloc officially does though? If my assumption earlier was correct, I think that would be very useful for me to include in my allocator implementation.

The one thing, I really want to know is, what are the shortcomings of current allocators. I hope to address some problems with existing allocators in my implementation.

As of right now, I have only missed an 'A' on a programming project 1x (In which case I got a 'B-') - I'd like to keep it that way 😉

-Kevin

Apathetic · Nov 9, 2009

Gamingphreek said:
I'll definitely read up on that - thank you so much.

I will say that, for byte alignment, we have been directed to align to 8 byte boundaries (I'm not sure why 8 byte as opposed to 4 byte boundaries)

@Degibson - the header files we have been directed to use are the mmap headers 🙂 - Looks like I'll be using that implementation.

Can anyone comment on what xmalloc officially does though? If my assumption earlier was correct, I think that would be very useful for me to include in my allocator implementation.

The one thing, I really want to know is, what are the shortcomings of current allocators. I hope to address some problems with existing allocators in my implementation.

As of right now, I have only missed an 'A' on a programming project 1x (In which case I got a 'B-') - I'd like to keep it that way 😉

-Kevin

Since 8 bytes = 64 bits, I'm guessing he's having you optimize your code for a 64 bit CPU.

Your allocator will be going through some type of list to find a block of memory to fill the current request. The two simplest algorithms are generally called "first fit" (find the first block big enough to be allocated) and "best fit" (find the smallest block which meets the need) - each with their own adantages and disadvantages. I'm sure you can find lots of examples via Google. One small hint: if you make this list a doubly-linked list, merging of free blocks (coalescing) becomes pretty trivial.

As others have suggested: Make it work PROPERLY first and THEN add additional features!

Dave

degibson · Nov 10, 2009

Can anyone comment on what xmalloc officially does though? If my assumption earlier was correct, I think that would be very useful for me to include in my allocator implementation.

http://publib.boulder.ibm.com/infoc...bm.aix.kerneltechref/doc/ktechrf1/xmalloc.htm

Seems pretty self-explanatory to me.

EDIT: WTF? [Q] and [/Q] don't work anymore??

brandonb · Nov 10, 2009

If its operating system class, I believe you will want to do more with the VirtualAlloc, etc.

Such as if something hasn't been used for awhile, move it over to the pagefile, etc. 😉

http://msdn.microsoft.com/en-us/library/aa366781(VS.85).aspx

degibson · Nov 10, 2009

Gamingphreek said:
The one thing, I really want to know is, what are the shortcomings of current allocators. I hope to address some problems with existing allocators in my implementation.

I can't emphasize enough that you should get it correct before you start worrying about expanding the state of the art (after all, doing so would require original research).

Allocators in general have two shortcomings: they can be slow, and they can waste memory. However, there are implementations of allocators that are fast, and don't waste memory, so there might not be a lot of room left to improve. If you're still interested, read on.

In allocators, it all boils down to speed. They really don't do a heck of a lot. malloc() and free(), in various flavors. But the whole point is to go fast -- feature richness is fine, so long as its fast and requires no changes to the interface.

Allocators can be slow in many ways. They can take a long time to do an individual malloc() or free(), which is bad, or they can lay out memory in strange ways which can screw up locality in caches at various levels, which is way worse. For the latter, maintain alignment whenever possible, watch for thread affinity, try to organize similar sized allocations into semi-contiguous memory, and keep allocator metadata out of small allocations if possible.

The other major shortcoming of bad allocators is wasted space. Allocators waste space in two basic ways. Fragmentation arises because allocators don't allocate and recycle space efficiently and cannot or do not reorganize space into contiguous chunks (if you use mmap(), you can! -- arguably harder with sbrk()). If your allocator itself is a memory hog, your allocator might use too much metadata, the bane of allocators everywhere.

As a reference, tcmalloc is pretty fast, and quite popular. Once your allocator works, you should go read about it. If you think you can improve on it, give it a try. And post here on your progress.

heymrdj · Nov 10, 2009

degibson said:
http://publib.boulder.ibm.com/infoc...bm.aix.kerneltechref/doc/ktechrf1/xmalloc.htm

Seems pretty self-explanatory to me.

EDIT: WTF? [Q] and [/Q] don't work anymore??

No, it's QUOTE and /QUOTE in brackets as almost every other forum out there except puketalk used/uses.

😱 :hmm:

degibson · Nov 10, 2009

heymrdj said:
No, it's QUOTE and /QUOTE in brackets as almost every other forum out there except puketalk used/uses.

😱 :hmm:

I noticed, but I grew used to manually inserting [q]'s and [/q]'s. Bah.

Gamingphreek · Nov 18, 2009

Well given the time crunch, we decided that besting the Professor's score was not really feasible.

The best score in our performance benchmarks that were given was achieved from 2 people at CMU who, instead of using a linked list, used a R/B Tree (<-I don't know how this differs from a B or B+ tree).

Furthermore, because of the time crunch, we found that there was an acceptable performance loss in using a single free list instead of a segregated list. In planning, it wasn't implementing the segregated list that we thought would present the problem, it was coalescing between lists and the case that a coalesce may need to move to one of the larger lists.

I am wondering if it would be any faster to thread the program. While it would require more memory, if I spawned n threads (n being the number of cores on a given system) and had them iterate through separate sets of blocks in the list, would this be at all faster?

For instance, assume n = 2:
Thread1------------------>MedianValue<---------------------Thread2

Outside of guarding against the obvious race conditions, would this provide any performance benefit?

Our header and footer are already at the absolute minimum size that is humanly possible. (2 list pointers, and a size_t which, when using bitmasking, also gives the status since the heap is 8-byte aligned).

-Kevin

Markbnj · Nov 18, 2009

Pretty good description here of red/black trees and comparisons to B-trees and other structures...

http://en.wikipedia.org/wiki/Red-black_tree

Gamingphreek · Nov 25, 2009

Ok I E-Mailed my teacher and he didn't see anything obvious; however, when I implement realloc() it fails. If I leave realloc as the default procedure (free, malloc) it runs fine.

Since this is homework, please no answers; however, does anyone have time to look over my code and find out why in the world it is failing.

I've double and triple checked my pointer arithmetic and everything is correct.

Thanks,
-GP

degibson · Dec 16, 2009

Found this. It was a pretty good read... if anyone is still interested in this thread.
http://www.ibm.com/developerworks/linux/library/l-memory/

tfinch2 · Dec 16, 2009

degibson said:
Found this. It was a pretty good read... if anyone is still interested in this thread.
http://www.ibm.com/developerworks/linux/library/l-memory/

I read that article some time ago for no good reason. It is a good read.

Apathetic · Dec 17, 2009

degibson said:
Found this. It was a pretty good read... if anyone is still interested in this thread.
http://www.ibm.com/developerworks/linux/library/l-memory/

Thanks for the link!

Dave

Writing Dynamic Storage Allocator

Lifer

Member

Lifer

Elite Member

Lifer

Golden Member

Member

Lifer

Platinum Member

Golden Member

Diamond Member

Golden Member

Diamond Member

Golden Member

Lifer

Elite Member <br>Moderator Emeritus

Lifer

Golden Member

Lifer

Platinum Member