In this pwn challenge, we get a binary,
children_tcache, and the remote libc. The description reads:
Try more tcache :)
This is related to another challenge, Baby Tcache, which is almost the same binary, but provides no easy way to leak addresses, so I decided to start with this one.
Where is the bug?
Analysing the binary, we see that it is a typical heap challenge. We can
malloc chunks of any size up to
free them and
show their contents using
puts. Chunk contents can only be written right after allocation, which makes things a little more difficult. All hardening is enabled -
In the program itself, pointers in the global allocations array are properly zeroed, so we don’t have UAF. What’s more, the authors, expressing their appreciation for Russian, decided to fill the memory (using
memset) on every deallocation with
0xda bytes instead of plain zeroes.
The default installation of libc 2.27 from Ubuntu 18.04 is used. For us, this means that tcache is present and a lot of security checks are enabled. For debugging, I quickly spun up a VM with the right environment.
Let’s look a bit closer at
new_heap to find our bug:
read_into reads up to
size bytes of our input. The stack buffer is zeroed, so we don’t have an arbitrary overflow. However,
strcpy always appends a null byte at the end, so when our input fills a chunk completely, the terminator lands one byte past it. This lets us zero the least significant byte of the next chunk’s
size field and unset the
PREV_INUSE bit. We also control the previous qword, so we can set
prev_size. See here for a visual guide to the heap layout.
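To make the effect of the stray terminator concrete, here is a minimal sketch that models a tiny slice of heap as a byte array. The offsets and sizes (a 0x20 chunk followed by a 0x100 chunk) are made up for illustration and are not taken from the binary; the strcpy helper only imitates the C function’s copy-then-terminate behaviour.

```python
import struct

PREV_INUSE = 0x1

def strcpy(mem: bytearray, dst: int, src: bytes) -> None:
    """C-like strcpy: copy up to the first null byte, then append a terminator."""
    s = src.split(b"\x00", 1)[0]
    mem[dst:dst + len(s)] = s
    mem[dst + len(s)] = 0

# Hypothetical layout: chunk B at 0x00 (size 0x20), next chunk at 0x20 (size 0x100).
heap = bytearray(0x40)
struct.pack_into("<Q", heap, 0x08, 0x20 | PREV_INUSE)   # B.size
struct.pack_into("<Q", heap, 0x28, 0x100 | PREV_INUSE)  # next.size

# B's user data starts at 0x10 and is 0x18 bytes long; its last qword doubles
# as next.prev_size. Filling all 0x18 bytes pushes the terminator onto the
# least significant byte of next.size.
strcpy(heap, 0x10, b"B" * 0x18)

next_size = struct.unpack_from("<Q", heap, 0x28)[0]
print(hex(next_size), bool(next_size & PREV_INUSE))
```

Because the size is stored little-endian, the first byte after the user data is the low byte of the size field, which is exactly where the flag bits live.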
To exploit the null overflow, we create this heap layout:
using the following code:
Some things to note:
- The maximum size of the free chunk list in any tcache bin is 7. I fill the tcache for SIZE_A to put valid bk pointers into A1. I thought this was necessary for backwards consolidation to work, but it seems it is not.
- Because of the above, there are a bunch of tcached chunks on the heap after A2 (not shown). This is good, because we don’t want it to merge into top.
- To deal with the 0xda bytes (which would make A2->prev_size incorrect), we can recreate B several times, decreasing the size by one byte each time. This will zero the region. A2->PREV_INUSE not being set is not a problem, because B goes into tcache immediately and doesn’t check A2. The sizes of B are chosen so that they all fall within the same bin due to alignment.
- The user size of 0xf8, resulting in a chunk size of 0x100, which has a null byte at the end. Thanks to this, the null overflow doesn’t change the chunk size of
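The size arithmetic and the 0xda clean-up from the notes above can be checked with a small simulation. This is only a sketch under stated assumptions: S and WANTED_PREV_SIZE are hypothetical values, the helper names are mine, and free_and_fill assumes the binary fills exactly as many bytes as were requested for that allocation. request2size mirrors the glibc macro of the same name for x86-64.

```python
import struct

SIZE_SZ = 8               # size of a size_t on x86-64
MALLOC_ALIGN_MASK = 0xF   # chunks are 16-byte aligned
MINSIZE = 0x20
FILL = 0xDA               # byte the binary writes over freed data

def request2size(req: int) -> int:
    """Chunk size glibc uses for a request of `req` user bytes (x86-64)."""
    return max((req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK, MINSIZE)

def alloc_and_write(mem: bytearray, data: bytes) -> None:
    """strcpy semantics: copy non-null bytes, then append one null terminator."""
    assert 0 not in data
    mem[:len(data)] = data
    mem[len(data)] = 0

def free_and_fill(mem: bytearray, size: int) -> None:
    """ASSUMPTION: the fill covers exactly the `size` bytes that were requested."""
    mem[:size] = bytes([FILL]) * size

S = 0x88                  # hypothetical user size of B
WANTED_PREV_SIZE = 0x190  # hypothetical distance from A2 back to A1

# B's user data, followed by A2's size field (0x100 | PREV_INUSE).
mem = bytearray(S) + bytearray(struct.pack("<Q", 0x101))

alloc_and_write(mem, b"B" * S)        # null overflow clears A2's PREV_INUSE
free_and_fill(mem, S)                 # ...but A2->prev_size is now all 0xda

for size in range(S - 1, S - 7, -1):  # S-1 ... S-6: zero one byte per round
    alloc_and_write(mem, b"B" * size)
    free_and_fill(mem, size)

# Final allocation: padding plus the two non-null low bytes of prev_size.
alloc_and_write(mem, b"B" * (S - 8) + struct.pack("<H", WANTED_PREV_SIZE))

prev_size, a2_size = struct.unpack_from("<QQ", mem, S - 8)
same_bin = {request2size(S - i) for i in range(7)}
print(hex(prev_size), hex(a2_size), [hex(s) for s in same_bin])
```

Each round leaves one more zero byte behind, because the following fill is one byte shorter and never reaches it. All seven request sizes round up to the same chunk size, so every re-allocation lands on the same tcache entry.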
Having forged a layout where
A2->prev_size goes all the way back to
A1, we can free
A2, triggering backwards consolidation:
and have the new, free chunk in a smallbin overlap
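The arithmetic behind backwards consolidation is short enough to sketch. The model below only reproduces the address and size computation that free performs when PREV_INUSE is clear; it ignores the unlink step and its checks. The chunk addresses and sizes are hypothetical, chosen only so the numbers are easy to follow.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    addr: int                # offset of the chunk header on the heap
    size: int                # chunk size without flag bits
    prev_size: int = 0
    prev_inuse: bool = True

def consolidate_backward(p: Chunk) -> Chunk:
    """Mimic the size/address arithmetic free() does when PREV_INUSE is clear."""
    if p.prev_inuse:
        return p
    return Chunk(addr=p.addr - p.prev_size, size=p.size + p.prev_size)

def overlaps(outer: Chunk, inner: Chunk) -> bool:
    """True if `inner` lies completely inside `outer`."""
    return (outer.addr <= inner.addr
            and inner.addr + inner.size <= outer.addr + outer.size)

# Hypothetical layout: A1 | B | A2, with A2's header forged via the null overflow.
a1 = Chunk(addr=0x000, size=0x100)
b = Chunk(addr=0x100, size=0x090)
a2 = Chunk(addr=0x190, size=0x100, prev_size=0x190, prev_inuse=False)

merged = consolidate_backward(a2)
print(hex(merged.addr), hex(merged.size), overlaps(merged, b))
```

The allocator trusts prev_size to find the start of the previous chunk, so a forged value that reaches back to A1 makes the merged free chunk swallow B even though B is still in use.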
Here comes the cool part. We don’t know any addresses (
PIE, remember), so we need a way to print one. The large chunk is free, so we can retrieve it. Since
B is still allocated and available in slot
0, we can
show its contents.
My first idea was to overflow enough bytes from the large chunk to reach
B interpreted as a tcache entry. Freeing
B would make
B->next point to the heap - and then print the large chunk. Unfortunately, this overwrites the size of
B and is generally unwieldy.
A much better idea is to split the large chunk in two at a precisely calculated boundary by allocating part of it. Because it’s in a smallbin, the split will put
bk pointers into the remaining part, and since
libc prefers reusing the latest remainder chunk (due to data locality), it will be linked directly to
main_arena, which we can leak with
Here’s another illustration for your viewing pleasure:
and the code to create the corresponding heap state:
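Turning the printed bytes into a libc base is simple arithmetic. The following is a minimal sketch, not the original exploit code: the helper names are mine, and the offsets are assumptions for the stock Ubuntu 18.04 build of libc 2.27 that should be verified against the libc shipped with the challenge. It also assumes the leaked pointer is the unsorted bin head inside main_arena.

```python
import struct

# ASSUMPTION: offsets for the stock Ubuntu 18.04 libc 2.27 build.
MAIN_ARENA_OFFSET = 0x3EBC40
UNSORTED_BIN_OFFSET = MAIN_ARENA_OFFSET + 96  # bin head a lone free chunk points to

def parse_puts_leak(line: bytes) -> int:
    """puts stops at the first null byte, so pad the address back to 8 bytes."""
    if line.endswith(b"\n"):
        line = line[:-1]
    return struct.unpack("<Q", line[:8].ljust(8, b"\x00"))[0]

def libc_base_from_leak(leak: int) -> int:
    """Subtract the assumed offset and sanity-check the page alignment."""
    base = leak - UNSORTED_BIN_OFFSET
    if base & 0xFFF:
        raise ValueError("base is not page-aligned; offset or leak is wrong")
    return base

# Usage with a made-up address, shaped like the output of show on slot 0.
fake_base = 0x7FFFF79E4000
line = struct.pack("<Q", fake_base + UNSORTED_BIN_OFFSET)[:6] + b"\n"
print(hex(libc_base_from_leak(parse_puts_leak(line))))
```

The alignment check is cheap insurance: a libc base that is not a multiple of the page size almost always means the wrong offset or a truncated leak.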
After leaking the libc base, we can perform a tcache poisoning attack. Put simply, we trick
malloc into returning a pointer to an arbitrary address, in this case
__malloc_hook. By overwriting it with a shell gadget, we make the next
malloc invocation spawn a shell.
Obtaining arbitrary pointers from smallbins is generally quite difficult and requires forging fake chunks. Luckily for us,
glibc developers like their code fast, not secure: the tcache implementation in glibc 2.27 has no double-free detection and does not validate the next pointer.
Each tcache bin is a singly linked list (of length 7 at most), with
tcache_entry::next residing where
free_chunk::fd would be in a free smallbin chunk. By overwriting this address and leveraging a double free (using the split remainder chunk, which starts at the same point as
B), we can put arbitrary data in the list and have it returned later.
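The mechanics are easier to see in a toy model. The class below imitates a single tcache bin as a LIFO singly linked list with no checks, which matches the behaviour of glibc 2.27 in spirit; it is a simulation with made-up addresses, not exploit code, and the class and variable names are mine.

```python
class TcacheBin:
    """Toy model of one tcache bin: a LIFO singly linked list, no checks."""

    def __init__(self) -> None:
        self.memory = {}   # address -> first qword of that chunk's user data
        self.head = 0

    def free(self, addr: int) -> None:
        self.memory[addr] = self.head   # e->next = head
        self.head = addr                # head = e

    def malloc(self) -> int:
        entry = self.head
        self.head = self.memory.get(entry, 0)  # head = e->next, unchecked
        return entry

    def write(self, addr: int, qword: int) -> None:
        self.memory[addr] = qword       # user data overlaps the next pointer

CHUNK = 0x555555757360    # hypothetical address shared by B and the remainder
TARGET = 0x7FFFF7DCFC30   # hypothetical address we want malloc to return

bin_ = TcacheBin()
bin_.free(CHUNK)          # free through one slot
bin_.free(CHUNK)          # free again through the overlapping slot

first = bin_.malloc()     # returns CHUNK, which is still at the head of the bin
bin_.write(first, TARGET) # overwrite next with the target address
second = bin_.malloc()    # returns CHUNK again, head becomes TARGET
third = bin_.malloc()     # returns TARGET
print(hex(first), hex(second), hex(third))
```

After the double free the chunk points to itself, so the first allocation hands it out while leaving it on the list. Whatever is written into its first qword then becomes the next entry the allocator returns.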
Finally, the poisoned tcache will give us
&__malloc_hook as a new chunk, which we can overwrite with a gadget.
I mentioned in the beginning that Baby Tcache is very similar. Indeed, the only differences are that
show doesn’t exist and that for reading user data into new chunks
read is used instead of
buf[size] is always zeroed, but it doesn’t matter too much). The former means that to do anything useful we probably have to do a partial overwrite of an existing libc pointer. The latter means that we can read in null bytes and perform partial overwrites! So it seems the plan could work. The existing exploit largely works with some small changes - exercise for you :) - until the leak part. My idea was to similarly use a pointer to
main_arena and partially overwrite it into a
__malloc_hook address. Unfortunately, when allocating
B in my PoC,
malloc clears the first qword, zeroing the precious pointer. I tried to work around this, but ran out of time - so I’m hoping for writeups from teams who did solve it!
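For reference, the partial overwrite idea itself can be sketched in a few lines. The offsets are assumptions taken from the stock Ubuntu 18.04 libc 2.27 build and may not match the Baby Tcache libc; with them, the unsorted bin head and __malloc_hook differ only in their lowest byte, so a one-byte overwrite would be enough and would not depend on ASLR.

```python
import struct

# ASSUMPTION: offsets for the stock Ubuntu 18.04 libc 2.27 build.
UNSORTED_BIN_OFFSET = 0x3EBCA0  # main_arena + 96
MALLOC_HOOK_OFFSET = 0x3EBC30   # __malloc_hook

def partial_overwrite(ptr: int, low_bytes: bytes) -> int:
    """Replace the lowest len(low_bytes) bytes of a little-endian pointer."""
    raw = bytearray(struct.pack("<Q", ptr))
    raw[:len(low_bytes)] = low_bytes
    return struct.unpack("<Q", raw)[0]

# The base is unknown to the attacker; it is only used here to check the result.
base = 0x7FFFF79E4000
arena_ptr = base + UNSORTED_BIN_OFFSET
hook_ptr = partial_overwrite(arena_ptr, bytes([MALLOC_HOOK_OFFSET & 0xFF]))
print(hex(hook_ptr), hook_ptr == base + MALLOC_HOOK_OFFSET)
```

Since the libc base is page-aligned, the low 12 bits of every libc address are fixed, which is why overwriting a single byte gives a deterministic result.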