• We’re currently investigating an issue related to the forum theme and styling that is impacting page layout and visual formatting. The problem has been identified, and we are actively working on a resolution. There is no impact to user data or functionality, this is strictly a front-end display issue. We’ll post an update once the fix has been deployed. Thanks for your patience while we get this sorted.

Writing Dynamic Storage Allocator

*This is homework. Please do not provide source code*

In my Operating Systems class, our most recent project is to write an allocator.

I'm in the process of reading my notes and the book before I start, but I was wondering if anyone here had any tips, suggestions, or interesting features that I should look to implement.

Thanks
-Kevin (Gamingphreek)
 
Well I am implementing malloc, free, and realloc, so there really isn't any wiggle room on the parameters they take 😉

I never found a concrete answer, but xmalloc() is simply a malloc that incorporates the case where malloc == 0 internally right?

-Kevin
 
A little confused. When you say you are "implementing" malloc, free, and realloc, do you mean you are using these functions or replacing them? If you are replacing them...wait, how do you that?? :O
 
A little confused. When you say you are "implementing" malloc, free, and realloc, do you mean you are using these functions or replacing them? If you are replacing them...wait, how do you that?? :O

We are replacing them with our own versions. My professor supplied us 2 header files and a very very basic skeleton, we have to implement the rest.

-Kevin
 
We are replacing them with our own versions. My professor supplied us 2 header files and a very very basic skeleton, we have to implement the rest.

-Kevin

I'm not sure how far your professor is having you 'implement' you allocator, but I'd recommend you use mmap() to map anonymous memory instead of sbrk(), as it is more portable and knowing mmap() will serve you better in the long run.

Edit: Also, take some time to google around for heap managers as examples. Start simple -- start with 'works correctly', then add features.
 
Use sbrk() to extend the heap. It is now the job of your memory allocator to carve out chunks from this block of memory. Basically, what is typically done is to make a linked list that spans the heap. Each element of this linked list has a header, footer, and a payload. The payload is the chunk of memory the user of malloc() gets. The header and footer carry information such as the size of the payload and whether its in use or not.

Some terms that you may want to google:
implicit free list
explicit free list
segregated free list
coalescing
splitting
external fragmentation
internal fragmentation
byte alignment

I have done a near identical assignment before. I have code for some naive implementations if that would help you get started.
 
Last edited:
Use sbrk() to extend the heap. It is now the job of your memory allocator to carve out chunks from this block of memory. Basically, what is typically done is to make a linked list that spans the heap. Each element of this linked list has a header, footer, and a payload. The payload is the chunk of memory the user of malloc() gets. The header and footer carry information such as the size of the payload and whether its in use or not.

Some terms that you may want to google:
implicit free list
explicit free list
segregated free list
coalescing
splitting
external fragmentation
internal fragmentation
byte alignment

I have done a near identical assignment before. I have code for some naive implementations if that would help you get started.

I'll definitely read up on that - thank you so much.

I will say that, for byte alignment, we have been directed to align to 8 byte boundaries (I'm not sure why 8 byte as opposed to 4 byte boundaries)

@Degibson - the header files we have been directed to use are the mmap headers 🙂 - Looks like I'll be using that implementation.

Can anyone comment on what xmalloc officially does though? If my assumption earlier was correct, I think that would be very useful for me to include in my allocator implementation.

The one thing, I really want to know is, what are the shortcomings of current allocators. I hope to address some problems with existing allocators in my implementation.

As of right now, I have only missed an 'A' on a programming project 1x (In which case I got a 'B-') - I'd like to keep it that way 😉

-Kevin
 
I'll definitely read up on that - thank you so much.

I will say that, for byte alignment, we have been directed to align to 8 byte boundaries (I'm not sure why 8 byte as opposed to 4 byte boundaries)

@Degibson - the header files we have been directed to use are the mmap headers 🙂 - Looks like I'll be using that implementation.

Can anyone comment on what xmalloc officially does though? If my assumption earlier was correct, I think that would be very useful for me to include in my allocator implementation.

The one thing, I really want to know is, what are the shortcomings of current allocators. I hope to address some problems with existing allocators in my implementation.

As of right now, I have only missed an 'A' on a programming project 1x (In which case I got a 'B-') - I'd like to keep it that way 😉

-Kevin

Since 8 bytes = 64 bits, I'm guessing he's having you optimize your code for a 64 bit CPU.

Your allocator will be going through some type of list to find a block of memory to fill the current request. The two simplest algorithms are generally called "first fit" (find the first block big enough to be allocated) and "best fit" (find the smallest block which meets the need) - each with their own adantages and disadvantages. I'm sure you can find lots of examples via Google. One small hint: if you make this list a doubly-linked list, merging of free blocks (coalescing) becomes pretty trivial.

As others have suggested: Make it work PROPERLY first and THEN add additional features!

Dave
 
The one thing, I really want to know is, what are the shortcomings of current allocators. I hope to address some problems with existing allocators in my implementation.

I can't emphasize enough that you should get it correct before you start worrying about expanding the state of the art (after all, doing so would require original research).

Allocators in general have two shortcomings: they can be slow, and they can waste memory. However, there are implementations of allocators that are fast, and don't waste memory, so there might not be a lot of room left to improve. If you're still interested, read on.

In allocators, it all boils down to speed. They really don't do a heck of a lot. malloc() and free(), in various flavors. But the whole point is to go fast -- feature richness is fine, so long as its fast and requires no changes to the interface.

Allocators can be slow in many ways. They can take a long time to do an individual malloc() or free(), which is bad, or they can lay out memory in strange ways which can screw up locality in caches at various levels, which is way worse. For the latter, maintain alignment whenever possible, watch for thread affinity, try to organize similar sized allocations into semi-contiguous memory, and keep allocator metadata out of small allocations if possible.

The other major shortcoming of bad allocators is wasted space. Allocators waste space in two basic ways. Fragmentation arises because allocators don't allocate and recycle space efficiently and cannot or do not reorganize space into contiguous chunks (if you use mmap(), you can! -- arguably harder with sbrk()). If your allocator itself is a memory hog, your allocator might use too much metadata, the bane of allocators everywhere.

As a reference, tcmalloc is pretty fast, and quite popular. Once your allocator works, you should go read about it. If you think you can improve on it, give it a try. And post here on your progress.
 
Well given the time crunch, we decided that besting the Professor's score was not really feasible.

The best score in our performance benchmarks that were given was achieved from 2 people at CMU who, instead of using a linked list, used a R/B Tree (<-I don't know how this differs from a B or B+ tree).

Furthermore, because of the time crunch, we found that there was an acceptable performance loss in using a single free list instead of a segregated list. In planning, it wasn't implementing the segregated list that we thought would present the problem, it was coalescing between lists and the case that a coalesce may need to move to one of the larger lists.

I am wondering if it would be any faster to thread the program. While it would require more memory, if I spawned n threads (n being the number of cores on a given system) and had them iterate through separate sets of blocks in the list, would this be at all faster?

For instance, assume n = 2:
Thread1------------------>MedianValue<---------------------Thread2

Outside of guarding against the obvious race conditions, would this provide any performance benefit?

Our header and footer are already at the absolute minimum size that is humanly possible. (2 list pointers, and a size_t which, when using bitmasking, also gives the status since the heap is 8-byte aligned).

-Kevin
 
Ok I E-Mailed my teacher and he didn't see anything obvious; however, when I implement realloc() it fails. If I leave realloc as the default procedure (free, malloc) it runs fine.

Since this is homework, please no answers; however, does anyone have time to look over my code and find out why in the world it is failing.

I've double and triple checked my pointer arithmetic and everything is correct.

Thanks,
-GP
 
Back
Top