To understand how GLibc manages memory internally, a few key concepts/data structures should be introduced.
Arena: Arena is the top level memory management entity. There are two types of arenas. Main arena covers the traditional heap area: the space between start_brk and brk for a process from kernel point of view, only one main arena exists for a process. Non-main arena manages the memory fetched from kernel via mmap() system call, there could be 0 to 2*(number of cpu cores) such arenas based on process threads usage.
chunk: chunk is the bottom entity in the management hierarchy. It could have different sizes. Roughly there are 4 types of chunk: regular chunk under bin, top chunk directly controlled by arena, last remainder chunk which is actually stored in unordered bin and is a technique to improve cache hit rate, and mmapped chunk which bypass bin and arena and is standalone.
Bin: For regular chunks in free state, they are stored in linked list and managed via bin. We’ll explore how fast bins, unsorted bin, small bins and large bins are used in more detail a little later.
Sub heap: For non-main arena, each time it gets memory from kernel in block of 1M bytes on 32bit system (HEAP_MAX_SIZE), and the block is called sub heap. Note that one non-main arena could have many number of sub heaps.
With the management hierarchy in mind, let’s first check how arena data structure is defined:
As expected, it needs to support multi-threaded application, and has “mutex” to serialize the access. The “flags” indicates whether this arena is contiguous or not and the availability of fast bin chunks, non-main arena is composed of sub-heaps and always has the NONCONTIGUOUS_BIT set. References to top and last_remainder chunks and fast/normal bins could be found in the structure. It is worth mentioning that with “next” pointer, all arenas are linked together, while “next_free” is for linking all currently idle arenas together so a thread in need may quickly fetch one.
Chunk data structure itself looks simple, but it is probably the most flexible one in malloc function and has a very subtle design in the representation of each field under different states.
Chunk is at least 8 bytes aligned, it leaves us the three lowest significant bits in “size” field as states flag. Bit 0 (P) indicates whether the previous chunk is in use. If bit 1 (M)is set, this chunk is allocated from mmap region directly instead of arena. Bit 2 (A) indicates which type of arena the chunk belongs, 1: non-main arena, 0:main arena.
`boundary tag’ method is being used by chunk structure. Example: the prev_size is only meaningful when the previous chunk is free (bit 0 unset) and the prev_size field is located at the end of previous chunk memory region. The prev_size field should be accessed by chunk structure when bit 0 is set, it could already have real user data owned by the previous chunk at that address. Similar arrangement for fd and bk fields, they are valid only when current chunk is in free state. While size is the only field in chunk structure that matters when chunk is in allocated state.
Bin: The free chunks are stored with bins which are pointers to malloc_chunk linked lists. Top chunk is the only exception which is under arena directly. Fast bins woks as cache, it has default max size of 64 bytes starting at 16 bytes with stride of 8 bytes, and the max size could extend to 88 bytes in maximum. Whenever a free memory api is called and the memory block size is within that range, the memory block is put into fast bin. For malloc, fast bins are also the first choice. Fast bins works in last in first out mode and CAS atomic operation is used for the linked list push/pop processing. unsorted bin is the second one to be checked for free and malloc processing, it doesn’t have size constraint, memory blocks are put there and rearranged into other normal sized bins or distributed to application directly in different scenarios.
sub-heap: In malloc function, it is represented by heap_info data structure. The comments with it should be clear enough.