The one thing I really want to know is: what are the shortcomings of current allocators? I hope to address some problems with existing allocators in my implementation.
I can't emphasize enough that you should get your allocator correct before you start worrying about advancing the state of the art (after all, doing so would require original research).
Allocators in general have two shortcomings: they can be slow, and they can waste memory. However, there are already allocator implementations that are fast and don't waste much memory, so there might not be a lot of room left to improve. If you're still interested, read on.
In allocators, it all boils down to speed. They really don't do a heck of a lot: malloc() and free(), in various flavors. But the whole point is to go fast. Feature richness is fine, so long as it's fast and requires no changes to the interface.
Allocators can be slow in many ways. They can take a long time to do an individual malloc() or free(), which is bad, or they can lay out memory in strange ways that hurt locality in caches at various levels, which is way worse. For the latter, maintain alignment whenever possible, watch for thread affinity, try to group similarly sized allocations into semi-contiguous memory, and keep allocator metadata out of small allocations if possible.
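To make the last two pieces of advice concrete, here is a minimal sketch of a fixed-size pool, assuming a single 32-byte size class and a single statically sized slab. The names (pool, pool_init, pool_alloc, pool_free, BLOCK_SIZE, BLOCK_COUNT) are made up for illustration and are not taken from any real allocator. Free blocks store the free-list link inside their own bytes, so a live allocation carries no header, and all blocks of this size sit next to each other in memory.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  32u   /* hypothetical size class: every block is 32 bytes */
#define BLOCK_COUNT 64u   /* hypothetical number of blocks in the slab */

/* A block is either on the free list (next is valid) or handed out to the
   caller (payload is valid). Reusing the same bytes for both means an
   allocated block has no per-allocation metadata. */
typedef union block {
    union block *next;
    unsigned char payload[BLOCK_SIZE];
} block;

typedef struct {
    block slab[BLOCK_COUNT];  /* same-sized blocks, contiguous in memory */
    block *free_list;         /* head of the intrusive free list */
} pool;

static void pool_init(pool *p) {
    p->free_list = NULL;
    /* Push blocks in reverse so the first allocation returns slab[0]. */
    for (size_t i = BLOCK_COUNT; i-- > 0; ) {
        p->slab[i].next = p->free_list;
        p->free_list = &p->slab[i];
    }
}

/* O(1): pop the head of the free list. */
static void *pool_alloc(pool *p) {
    block *b = p->free_list;
    if (b == NULL) {
        return NULL;  /* slab exhausted; a real allocator would get more memory */
    }
    p->free_list = b->next;
    return b;
}

/* O(1): push the block back onto the free list. */
static void pool_free(pool *p, void *ptr) {
    block *b = ptr;
    assert(b >= p->slab && b < p->slab + BLOCK_COUNT);  /* must come from this pool */
    b->next = p->free_list;
    p->free_list = b;
}
```

Both operations are a couple of pointer moves, and a freed block is the next one handed out, so it is likely still warm in cache. This sketch ignores threads entirely; one common approach is to give each thread its own pool so the fast path needs no lock.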
The other major shortcoming of bad allocators is wasted space. Allocators waste space in two basic ways.
The first is fragmentation, which arises because allocators don't allocate and recycle space efficiently and cannot or do not reorganize space into contiguous chunks (if you use mmap(), you can! It's arguably harder with sbrk()). The second is the allocator's own overhead: if your allocator itself is a memory hog, it might use too much metadata, the bane of allocators everywhere.
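To see why metadata hurts small allocations the most, here is a rough back-of-the-envelope sketch. It assumes a design with a 16-byte header in front of every allocation and blocks rounded up to 16 bytes; both numbers and the function names (footprint, overhead) are hypothetical, picked only to make the arithmetic easy to follow.

```c
#include <assert.h>
#include <stddef.h>

#define HEADER_SIZE 16u  /* hypothetical per-allocation header */
#define ALIGNMENT   16u  /* hypothetical block granularity; must be a power of two */

/* Bytes actually consumed by a request of n bytes: header plus payload,
   rounded up to the next multiple of ALIGNMENT. */
static size_t footprint(size_t n) {
    assert(n <= (size_t)-1 - HEADER_SIZE - ALIGNMENT);  /* guard against overflow */
    return (n + HEADER_SIZE + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
}

/* Bytes consumed beyond what the caller asked for. */
static size_t overhead(size_t n) {
    return footprint(n) - n;
}
```

Under these assumptions an 8-byte request occupies 32 bytes, so three quarters of the block is header and padding, while a 1000-byte request occupies 1024 bytes, which is only about 2% extra. That is why keeping headers out of small blocks matters so much more than trimming them from large ones.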
For reference, tcmalloc is pretty fast and quite popular. Once your allocator works, you should go read about it. If you think you can improve on it, give it a try, and post here on your progress.