What do most programmers do when they find out that their program is leaking memory? Nothing, let the user buy more RAM. I would venture that they take a reliable, time-tested tool such as valgrind or libasan, run the program under it, and read the report. The report usually says that objects created on such-and-such a line of such-and-such a file were never released. But why not? The report does not say.
This post focuses on the topleaked leak finder, the underlying statistical analysis concept, and how it can be applied.
I have already written about topleaked on Habr, but I will repeat the main idea in general terms. If some objects are never freed, they accumulate in memory, so the dump contains many identical or similar byte sequences. If the leaked objects outnumber those actually in use, the most frequent sequences are parts of leaked objects. In C++ programs, objects with virtual functions typically contain a pointer to their class's vtbl, so the most frequent values tell us what type of objects we forget to free. Of course, the top of the list contains plenty of garbage, and valgrind will tell us what leaked and where much better. But topleaked was never meant to compete with technologies refined over the years. It was conceived as a tool for a problem that nothing else solves: the analysis of non-reproducible leaks. If you cannot reproduce the problem in a test environment, dynamic analysis is useless. If the error occurs only in production, and intermittently at that, the most we can get is logs and a memory dump. That dump can be analyzed with topleaked.
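The counting idea can be sketched in a few lines of C++. This is only an illustration of the statistics, not topleaked's actual code: the function name most_frequent_words is hypothetical, and it assumes the dump has already been read into memory as raw bytes and is scanned in aligned 8-byte words.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (an assumption for illustration, not topleaked's API):
// count every aligned 8-byte word in `data` and return the `top_n` most
// frequent (value, count) pairs, most frequent first.
std::vector<std::pair<std::uint64_t, std::size_t>>
most_frequent_words(const std::vector<unsigned char>& data, std::size_t top_n) {
    std::unordered_map<std::uint64_t, std::size_t> counts;
    for (std::size_t pos = 0; pos + sizeof(std::uint64_t) <= data.size();
         pos += sizeof(std::uint64_t)) {
        std::uint64_t word;
        // memcpy avoids alignment and aliasing problems when reading raw bytes.
        std::memcpy(&word, data.data() + pos, sizeof word);
        ++counts[word];
    }

    std::vector<std::pair<std::uint64_t, std::size_t>> result(counts.begin(),
                                                              counts.end());
    // Sort by descending count; break ties by value so the order is stable.
    std::sort(result.begin(), result.end(), [](const auto& a, const auto& b) {
        return a.second != b.second ? a.second > b.second : a.first < b.first;
    });
    if (result.size() > top_n) {
        result.resize(top_n);
    }
    return result;
}
```

If leaked objects dominate the dump, their vtbl pointers and constant fields should rise to the top of such a list, alongside common noise values such as zero.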
Consider a simple C++ program that leaks memory and then calls abort() so that a core dump is produced:
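#include <iostream>
#include <cstddef>   // size_t
#include <cstdlib>   // abort
#include <unistd.h>  // getpid

class A {
    size_t val = 12345678910;  // recognizable marker value (0x2dfdc1c3e)
    virtual ~A() {}            // a virtual member gives A a vtbl pointer
};

int main() {
    // Leak one million objects of type A.
    for (size_t i = 0; i < 1000000; i++) {
        new A();
    }
    std::cout << getpid() << std::endl;
    abort();  // terminate abnormally so that a core dump can be written
}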
Run topleaked on the resulting dump:
./topleaked leak.core
The output:
0x0000000000000000 : 1050347
0x0000000000000021 : 1000003
0x00000002dfdc1c3e : 1000000
0x0000558087922d90 : 1000000
0x0000000000000002 : 198
0x0000000000000001 : 180
0x00007f4247c6a000 : 164
0x0000000000000008 : 160
0x00007f4247c5c438 : 153
0xffffffffffffffff : 141
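Here 0x2dfdc1c3e is 12345678910 in hexadecimal, the value stored in the val field of every leaked object. To find out what the other values mean, the results can be handed to gdb. With the -ogdb option, topleaked prints its output in a form that can be piped straight into gdb.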
$ ./topleaked -n10 -ogdb /home/core/leak.1002.core | gdb leak /home/core/leak.1002.core
...< gdb >
#0 0x00007f424784e6f4 in __GI___nanosleep (requested_time=requested_time@entry=0x7ffcfffedb50, remaining=remaining@entry=0x7ffcfffedb50) at ../sysdeps/unix/sysv/linux/nanosleep.c:28
28 ../sysdeps/unix/sysv/linux/nanosleep.c: No such file or directory.
(gdb) $1 = 1050347
(gdb) 0x0: Cannot access memory at address 0x0
(gdb) No symbol matches 0x0000000000000000.
(gdb) $2 = 1000003
(gdb) 0x21: Cannot access memory at address 0x21
(gdb) No symbol matches 0x0000000000000021.
(gdb) $3 = 1000000
(gdb) 0x2dfdc1c3e: Cannot access memory at address 0x2dfdc1c3e
(gdb) No symbol matches 0x00000002dfdc1c3e.
(gdb) $4 = 1000000
(gdb) 0x558087922d90 <_ZTV1A+16>: 0x87721bfa
(gdb) vtable for A + 16 in section .data.rel.ro of /home/g.smorkalov/dlang/topleaked/leak
(gdb) $5 = 198
(gdb) 0x2: Cannot access memory at address 0x2
(gdb) No symbol matches 0x0000000000000002.
(gdb) $6 = 180
(gdb) 0x1: Cannot access memory at address 0x1
(gdb) No symbol matches 0x0000000000000001.
(gdb) $7 = 164
(gdb) 0x7f4247c6a000: 0x47ae6000
(gdb) No symbol matches 0x00007f4247c6a000.
(gdb) $8 = 160
(gdb) 0x8: Cannot access memory at address 0x8
(gdb) No symbol matches 0x0000000000000008.
(gdb) $9 = 153
(gdb) 0x7f4247c5c438 <_ZTVN10__cxxabiv120__si_class_type_infoE+16>: 0x47b79660
(gdb) vtable for __cxxabiv1::__si_class_type_info + 16 in section .data.rel.ro of /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) $10 = 141
(gdb) 0xffffffffffffffff: Cannot access memory at address 0xffffffffffffffff
(gdb) No symbol matches 0xffffffffffffffff.
(gdb) quit
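For each value, gdb prints the number of occurrences followed by the results of the x and info symbol commands. The interesting entry is $4 = 1000000: info symbol reports vtable for A, so the leaked objects are instances of class A.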
. , , 15. , .
, ?
— ? , , . topleaked . , , , . ? , , . , , .
…
. , . — . 3 . , . 3 , - . , 2-3 — . . , — , . C++ . , . C, D, Rust, Go NodeJS. , js .
. , , , , close. , . ( ), , fd (512000 ) . . . , , .
topleaked — . , , . , , . : . state — enum, . : , , websocket, . , , .
. Topleaked , , 8 8- . - , , , - . - , . , vtbl, . , , “ ”. vtbl - state. , . .
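C++ has no standardized ABI, so in general the layout of an object is implementation-defined; only POD (trivial) types are laid out as in C. In practice, on Linux with gcc, the vtbl pointer is the first field of the object, and the offset of the field of interest can be obtained with offsetof(state). A small example: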
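#include <cstddef>   // offsetof, size_t
#include <cstdlib>   // abort
#include <iostream>

using namespace std;

struct Base {
    virtual void foo() = 0;
};

struct Der : Base {
    size_t a = 15;
    void foo() override {
    }
};

int main()
{
    // Leak 10000 objects of type Der.
    for (size_t i = 0; i < 10000; ++i) {
        new Der;
    }
    auto d = new Der;
    // offsetof on a polymorphic type is only conditionally supported; gcc accepts it with a warning.
    cout << offsetof(Der, a) << endl;
    abort();  // terminate abnormally so that a core dump can be written
    return 0;
}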
The program prints the offsetof of Der::a and "leaks" 10000 objects. Run topleaked on the dump:
topleaked my_core.core
0x0000000000000000 : 50124
0x000000000000000f : 10005
0x0000000000000021 : 10004
0x000055697c45cd78 : 10002
0x0000000000000002 : 195
0x0000000000000001 : 182
0x00007fe9cbd6c000 : 167
0x0000000000000008 : 161
0x00007fe9cbd5e438 : 154
0x0000000000001000 : 112
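0x000055697c45cd78 is the vtbl pointer of Der. offsetof printed 8, so the field lies 8 bytes past the vtbl pointer. topleaked has options for exactly this kind of search: -f sets the pattern to look for, --memberOffset is the offset of the field relative to where the -f pattern was found, and --memberType is the type of the field. The supported types are uint8, uint16, uint32 and uint64.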
topleaked my_core.core -f0x55697c45cd78 --memberOffset=8 --memberType=uint64
Result:
0x000000000000000f : 10001
0x000055697ccaa080 : 1
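As expected, the field holds 0x0f (15, the initial value of Der::a) in the 10000 leaked objects; one more match comes from the object d allocated after the loop.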
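The filtered search can be sketched in C++ as follows. This is a sketch under stated assumptions, not topleaked's implementation: the function name count_member_values is hypothetical, the dump is assumed to be already loaded as raw bytes, and the pattern is assumed to sit on an 8-byte boundary.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical helper (an assumption for illustration): for every aligned
// 8-byte word equal to `pattern`, read a field of type T located
// `member_offset` bytes after the start of the match and count how often
// each field value occurs.
template <typename T>
std::map<T, std::size_t> count_member_values(
        const std::vector<unsigned char>& data,
        std::uint64_t pattern,
        std::size_t member_offset) {
    std::map<T, std::size_t> counts;
    for (std::size_t pos = 0; pos + sizeof(std::uint64_t) <= data.size();
         pos += sizeof(std::uint64_t)) {
        std::uint64_t word;
        std::memcpy(&word, data.data() + pos, sizeof word);
        if (word != pattern) {
            continue;  // not the start of an object we are interested in
        }
        // Skip matches whose field would lie past the end of the buffer.
        if (pos + member_offset + sizeof(T) > data.size()) {
            continue;
        }
        T member;
        std::memcpy(&member, data.data() + pos + member_offset, sizeof member);
        ++counts[member];
    }
    return counts;
}
```

With the values from the example above, a call such as count_member_values<std::uint64_t>(dump, 0x55697c45cd78, 8) would correspond to -f0x55697c45cd78 --memberOffset=8 --memberType=uint64.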
Happy End
. , , , . , . , . , , . , TCP, — websocket upgrade, - . . — , . , (, ) TCP. , , . . , , — TCP Keep Alive. https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
, . , . websocket . , .
D
readFile(name, offset, limit)
.findMember!uint64_t(pattern, memberOffset)
.findMostFrequent(size).printResult(format);
findMember — , , findMostFrequent — , . (ranges) . , , , .
P.S. Crazy Panda , . , topleaked.