Yesterday, I encountered a very subtle bug that did not make sense and obviously was an effect of a memory fault somewhere else in my code.
rlnc_multipath: src/buffer_pkt.hpp:62: void buffer_pkt::reset(): Assertion `&m_buffer[0]' failed.
How can a reference to the first element of a std::vector be NULL? Symptoms derived from unrelated bugs are very hard to debug. I turned to one of my newest tools in the box: Address Sanitizer.
Address Sanitizer (from now on: asan) shadows every byte in your process memory with a bit in a dedicated shadow memory, and keeps track of your allocated memory and your access to the same. Should you try to access memory not yet allocated or already freed, it will punch you in the head with a very descriptive report.
Running an application with asan will slow it down to approximately half speed, and use approximately double the memory, so do this only when debugging. However, the price is cheap compared to tools like valgrind, that will slow your application even more.
Using Asan
Adding asan to your application is dead simple - just add a few compile flags:
$ clang++ -g -fsanitize=address -O1 -fno-omit-frame-pointer my_program.cpp -o my_program
Or, if you are using a Makefile:
$ CXX=clang++ CXXFLAGS="-g -fsanitize=address -O1 -fno-omit-frame-pointer" make
Now you just run the application and wait for the memory corruption. Once this happens, you will get a report similar to this:
=================================================================
==24557==ERROR: AddressSanitizer: heap-use-after-free on address 0x61a00001d4e0 at pc 0x4b6d4a bp 0x7fff593464b0 sp 0x7fff593464a8
WRITE of size 1350 at 0x61a00001d4e0 thread T0
#0 0x4b6d49 (rlnc_multipath+0x4b6d49)
#1 0x4ac406 (rlnc_multipath+0x4ac406)
#2 0x4ad9a3 (rlnc_multipath+0x4ad9a3)
#3 0x4ad645 (rlnc_multipath+0x4ad645)
#4 0x4a8a70 (rlnc_multipath+0x4a8a70)
#5 0x4a8699 (rlnc_multipath+0x4a8699)
#6 0x4a8443 (rlnc_multipath+0x4a8443)
#7 0x4a7ef2 (rlnc_multipath+0x4a7ef2)
#8 0x7f5a44c40bc4 (/usr/lib/libc-2.18.so+0x21bc4)
#9 0x4a79cc (rlnc_multipath+0x4a79cc)
0x61a00001d9f8 is located 0 bytes to the right of 1400-byte region [0x61a00001d480,0x61a00001d9f8)
freed by thread T0 here:
#0 0x499eb4 (rlnc_multipath+0x499eb4)
#1 0x4c0550 (rlnc_multipath+0x4c0550)
#2 0x4c0397 (rlnc_multipath+0x4c0397)
#3 0x4c0317 (rlnc_multipath+0x4c0317)
#4 0x4b77a1 (rlnc_multipath+0x4b77a1)
#5 0x4b676f (rlnc_multipath+0x4b676f)
#6 0x4ac406 (rlnc_multipath+0x4ac406)
#7 0x4ad9a3 (rlnc_multipath+0x4ad9a3)
#8 0x4ad645 (rlnc_multipath+0x4ad645)
#9 0x4a8a70 (rlnc_multipath+0x4a8a70)
#10 0x4a8699 (rlnc_multipath+0x4a8699)
#11 0x4a8443 (rlnc_multipath+0x4a8443)
#12 0x4a7ef2 (rlnc_multipath+0x4a7ef2)
#13 0x7f5a44c40bc4 (/usr/lib/libc-2.18.so+0x21bc4)
previously allocated by thread T0 here:
#0 0x499cf4 (rlnc_multipath+0x499cf4)
#1 0x4b3e31 (rlnc_multipath+0x4b3e31)
#2 0x4b3bdf (rlnc_multipath+0x4b3bdf)
#3 0x4b3ba1 (rlnc_multipath+0x4b3ba1)
#4 0x4b3929 (rlnc_multipath+0x4b3929)
#5 0x4c0331 (rlnc_multipath+0x4c0331)
#6 0x4b77a1 (rlnc_multipath+0x4b77a1)
#7 0x4c42c8 (rlnc_multipath+0x4c42c8)
#8 0x4ad9a3 (rlnc_multipath+0x4ad9a3)
#9 0x4ad645 (rlnc_multipath+0x4ad645)
#10 0x4a8a70 (rlnc_multipath+0x4a8a70)
#11 0x4a8699 (rlnc_multipath+0x4a8699)
#12 0x4a8443 (rlnc_multipath+0x4a8443)
#13 0x4a7ef2 (rlnc_multipath+0x4a7ef2)
#14 0x7f5a44c40bc4 (/usr/lib/libc-2.18.so+0x21bc4)
Shadow bytes around the buggy address:
0x0c347fffba40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c347fffba90: fd fd fd fd fd fd fd fd fd fd fd fd[fd]fd fd fd
0x0c347fffbaa0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbab0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbac0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbae0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
ASan internal: fe
==24557==ABORTING
I am actually not sure why asan doen't symbolize the backtraces, since I do have llvm-symbolizer in my path, but I found this neat little python script to do the job for me: asan_symbolize.py.
So if I run the report through the python script like this:
$ python2 asan_symbolize.py < asan_report.txt
I will get the following symbolized output:
=================================================================
==24557==ERROR: AddressSanitizer: heap-use-after-free on address 0x61a00001d4e0 at pc 0x4b6d4a bp 0x7fff593464b0 sp 0x7fff593464a8
WRITE of size 1350 at 0x61a00001d4e0 thread T0
#0 0x4b6d49 in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE16rlnc_enc_processERSt10shared_ptrIS4_Ei rlnc_multipath.cpp:352
#1 0x4ac406 in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE9recv_peerEi rlnc_multipath.cpp:404
#2 0x4ad9a3 in _ZNSt5_BindIFSt7_Mem_fnIM5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEEFviEEPSC_St12_PlaceholderILi1EEEEclIJiEvEET0_DpOT_ include/c++/4.8.2/functional:1354
#3 0x4ad645 in _ZNSt17_Function_handlerIFviESt5_BindIFSt7_Mem_fnIM5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEEFviEEPSE_St12_PlaceholderILi1EEEEE9_M_invokeERKSt9_Any_datai include/c++/4.8.2/functional:2072
#4 0x4a8a70 in _ZNKSt8functionIFviEEclEi include/c++/4.8.2/functional:2464
#5 0x4a8699 in _ZN2io4waitEi io.hpp:183
#6 0x4a8443 in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE3runEm rlnc_multipath.cpp:504
#7 0x4a7ef2 in main rlnc_multipath.cpp:632
#8 0x7f5a44c40bc4 in __libc_start_main ??:?
#9 0x4a79cc in _start ??:?
0x61a00001d9f8 is located 0 bytes to the right of 1400-byte region [0x61a00001d480,0x61a00001d9f8)
freed by thread T0 here:
#0 0x499eb4 in _ZdlPv ??:?
#1 0x4c0550 in _ZNSt6vectorIhSaIhEE17_M_default_appendEm include/c++/4.8.2/bits/vector.tcc:570
#2 0x4c0397 in _ZN10buffer_pkt5resetEm buffer_pkt.hpp:75
#3 0x4c0317 in _ZN11buffer_poolI10buffer_pkt11final_layerE4pool3getEm buffer_pool.hpp:41
#4 0x4b77a1 in _ZN11buffer_poolI10buffer_pkt11final_layerE6bufferEm buffer_pool.hpp:64
#5 0x4b676f in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE16rlnc_enc_processERSt10shared_ptrIS4_Ei rlnc_multipath.cpp:316
#6 0x4ac406 in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE9recv_peerEi rlnc_multipath.cpp:404
#7 0x4ad9a3 in _ZNSt5_BindIFSt7_Mem_fnIM5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEEFviEEPSC_St12_PlaceholderILi1EEEEclIJiEvEET0_DpOT_ include/c++/4.8.2/functional:1354
#8 0x4ad645 in _ZNSt17_Function_handlerIFviESt5_BindIFSt7_Mem_fnIM5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEEFviEEPSE_St12_PlaceholderILi1EEEEE9_M_invokeERKSt9_Any_datai include/c++/4.8.2/functional:2072
#9 0x4a8a70 in _ZNKSt8functionIFviEEclEi include/c++/4.8.2/functional:2464
#10 0x4a8699 in _ZN2io4waitEi io.hpp:183
#11 0x4a8443 in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE3runEm rlnc_multipath.cpp:504
#12 0x4a7ef2 in main rlnc_multipath.cpp:632
#13 0x7f5a44c40bc4 in __libc_start_main ??:?
previously allocated by thread T0 here:
#0 0x499cf4 in _Znwm ??:?
#1 0x4b3e31 in _ZNSt12_Vector_baseIhSaIhEE17_M_create_storageEm include/c++/4.8.2/bits/stl_vector.h:181
#2 0x4b3bdf in _Vector_base include/c++/4.8.2/bits/stl_vector.h:136
#3 0x4b3ba1 in vector include/c++/4.8.2/bits/stl_vector.h:271
#4 0x4b3929 in buffer_pkt buffer_pkt.hpp:25
#5 0x4c0331 in _ZN11buffer_poolI10buffer_pkt11final_layerE4pool3getEm buffer_pool.hpp:37
#6 0x4b77a1 in _ZN11buffer_poolI10buffer_pkt11final_layerE6bufferEm buffer_pool.hpp:64
#7 0x4c42c8 in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE8recv_tunEi rlnc_multipath.cpp:381
#8 0x4ad9a3 in _ZNSt5_BindIFSt7_Mem_fnIM5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEEFviEEPSC_St12_PlaceholderILi1EEEEclIJiEvEET0_DpOT_ include/c++/4.8.2/functional:1354
#9 0x4ad645 in _ZNSt17_Function_handlerIFviESt5_BindIFSt7_Mem_fnIM5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEEFviEEPSE_St12_PlaceholderILi1EEEEE9_M_invokeERKSt9_Any_datai include/c++/4.8.2/functional:2072
#10 0x4a8a70 in _ZNKSt8functionIFviEEclEi include/c++/4.8.2/functional:2464
#11 0x4a8699 in _ZN2io4waitEi io.hpp:183
#12 0x4a8443 in _ZN5coderI8countersI7tcp_hdrI15tcp_sock_clientI11buffer_poolI10buffer_pkt11final_layerEEEEE3runEm rlnc_multipath.cpp:504
#13 0x4a7ef2 in main rlnc_multipath.cpp:632
#14 0x7f5a44c40bc4 in __libc_start_main ??:?
Shadow bytes around the buggy address:
0x0c347fffba40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba50: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c347fffba80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0c347fffba90: fd fd fd fd fd fd fd fd fd fd fd fd[fd]fd fd fd
0x0c347fffbaa0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbab0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbac0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbad0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c347fffbae0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
ASan internal: fe
==24557==ABORTING
This tells me that in rlnc_multipath.cpp:352
I write to a memory address that was allocated in buffer_pkt.hpp:25
, but also freed later in buffer_pkt.hpp:75
.
From this it was actually quite easy to see, that I made the rookie-mistake of keep using pointers to a std::vector that was reallocated due to a resize. Whoops...