Finding memory leaks

One of our programs was leaking memory. Not much, but enough that Tech Ops were not going to allow us to put it into production. Fair enough, I wouldn’t allow it either, if I were on-call.

So I did the obvious: started looking for the leak. This is not as easy as I’d like.

First I tried Test::LeakTrace, which gives lots of information, but:

  1. It gives too much information
  2. It slows things down unbearably

As an example of the slowness: a test that usually runs in less than a minute took about a week when run under Test::LeakTrace. Since I planned to run several tests multiple times, it was clearly not a viable option.

Second thing I tried: look at /proc/self/stat to see how much memory the process is using. The plan of attack was:

  1. Run some test code 10 times
  2. Measure memory
  3. Run some test code 20 times
  4. Measure memory
  5. Etc…

This did not work: I was expecting to see a linear increase of used memory, but in fact I saw random numbers. Perl’s allocator is clever, and the kernel’s allocator is clever, and I’m not clever enough to figure out what they’re doing.
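For reference, the measurement loop described above can be sketched roughly like this. It is a minimal sketch, assuming Linux (for /proc/self/stat); `proc_memory` and `test_code` are names made up for illustration, and `test_code` is a deliberately leaky stand-in for the real test.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Return (vsize in bytes, rss in pages) for the current process,
# read from /proc/self/stat (Linux only).
sub proc_memory {
    open my $fh, '<', '/proc/self/stat'
        or die "cannot read /proc/self/stat: $!";
    my $line = <$fh>;
    close $fh;
    # The second field (the command name) may contain spaces, so drop
    # everything up to the last ')' before splitting on whitespace.
    $line =~ s/^.*\)\s+//;
    my @f = split ' ', $line;
    return @f[20, 21];    # fields 23 (vsize) and 24 (rss) of the full line
}

# Hypothetical stand-in for the real test code: keeps 1 KB per call.
my @leak;
sub test_code { push @leak, 'x' x 1024 }

my $total = 0;
for my $count (10, 20, 30, 40) {
    test_code() for 1 .. $count;
    $total += $count;
    my ($vsize, $rss) = proc_memory();
    printf "%4d iterations so far: vsize=%d bytes, rss=%d pages\n",
        $total, $vsize, $rss;
}
```

The numbers reported this way describe what the kernel has handed to the process, not what Perl is actually using, which is why they tend to move in jumps instead of following the leak.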

So I started looking at perlguts, perldebguts, perlhacktips, and other scary documentation files. They talk about “SV allocation logging”, “memory profiling”, and so on. But getting those requires recompiling Perl. Was I brave enough?

Well, normally I wouldn’t be, but PerlBrew makes compiling a Perl almost easy. I’ll save you the three failed attempts (I found the configuration switches difficult to understand), and show a compressed version of the script I ended up using:

#!/bin/bash

perlbrew switch perl-5.14.2
perlbrew uninstall debug-perl
perlbrew install perl-5.14.2 -n -j5 --as debug-perl \
   -DDEBUGGING -DPERL_MEM_LOG -DDEBUG_LEAKING_SCALARS \
   -Dusedebugging -Dusemymalloc
perlbrew switch debug-perl

perlbrew install-cpanm

cpanm -n <<EOF
Acme::MetaSyntactic
Alien::ActiveMQ
App::Ack
…
parent
true
version
EOF

cd /tmp
rm -rf Data-Rx*
tar zxvf ~/src/CPAN/Data-Rx-0.007.tar.gz
cd Data-Rx*
patch -p1 < ~/src/CPAN_distroprefs/Data-Rx-0.007.patch
perl Makefile.PL
make install

cd ~/src/catalyst-engine-stomp/
perl Makefile.PL
make install

cd ~/src/Data-MultiValued/
dzil install

# etc etc, for our in-house modules

cd

This allowed me to have a working Perl with all the dependencies I needed. Still, things like PERL_MEM_LOG were not working, and the values returned by Devel::Peek were not exactly clear to me.

Asking on #london.pm revealed that the memory logging facilities were removed from Perl a long time ago, and that nobody knows how to properly read the values from Devel::Peek. So I took some guesses, and wrote this program:

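#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';   # the code below uses say()
use Devel::Peek;     # mstats_fillhash(); only works when perl uses its own malloc (-Dusemymalloc)
use MyTest;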

{
# pre-alloc some memory
my %report;my @diffs=(100)x100;
sub measure {
    my (%args) = @_;
    my $code = $args{code} // sub {};
    my $cleanup = $args{cleanup} // sub {};
    my $loops = $args{loops} // [1];

    $code->();

    mstats_fillhash(%report);
    $diffs[0]=$report{total}-$report{totfree};

    keys @$loops;   # reset the each() iterator on the array
    while (my ($i,$count) = each @$loops) {
        say "$i: looping $count times";

        $code->() for 1..$count;
        $cleanup->();

        mstats_fillhash(%report);
        $diffs[$i+1]=$report{total}-$report{totfree};

        say " diff: ",$diffs[$i+1]-$diffs[$i];
        say '';
    }

    for my $i (1..@$loops) {
        printf "% 3d (% 5d times): % 10d % 10.1f\n",
            $i,$loops->[$i-1],
            $diffs[$i]-$diffs[$i-1],
            ($diffs[$i]-$diffs[$i-1])/$loops->[$i-1];
    }
}
}

measure
    code => sub {
      MyTest->test_it;
    },
    loops => [ 10, 20, 30, 40 ];

This, finally, got me a roughly linear increase in memory usage. Then, it was a matter of bisecting the code paths inside the test, checking which changes made the diffs go to 0.
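To make “roughly linear” concrete, here is a small sketch of the arithmetic behind the last column of that report: the growth between two readings divided by the number of iterations in between. The function name `per_iteration_growth` and the readings are made up for illustration; they are not output from the real program.

```perl
use strict;
use warnings;

# Given the loop counts, plus the memory-in-use reading taken before the
# first batch and after each batch, return the growth per iteration for
# every batch.
sub per_iteration_growth {
    my ($loops, $usage) = @_;
    die "need one more reading than batches\n"
        unless @$usage == @$loops + 1;
    return map { ($usage->[$_ + 1] - $usage->[$_]) / $loops->[$_] }
        0 .. $#$loops;
}

# Made-up readings (bytes in use), purely illustrative: a leak of
# 512 bytes per iteration on top of a 1_000_000 byte baseline.
my @loops = (10, 20, 30, 40);
my @usage = (1_000_000, 1_005_120, 1_015_360, 1_030_720, 1_051_200);

my @growth = per_iteration_growth(\@loops, \@usage);
printf "%5d iterations: %8.1f bytes/iteration\n", $loops[$_], $growth[$_]
    for 0 .. $#loops;
```

A constant per-iteration figure across batches points to a steady leak. When bisecting, the goal is the change that brings that figure down to 0.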

In the end, it was Benchmark::Timer that was allocating memory. Yes, I know, it’s designed to work that way, and I have no-one to blame but myself for using a library without reading all its code.
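The general pattern is easy to reproduce. The sketch below is a hypothetical stand-in, not Benchmark::Timer's actual code (I have not copied its internals): a timer that keeps every sample so it can report statistics later. In a long-running process that never reads or resets the results, the stored samples just keep growing.

```perl
use strict;
use warnings;

# Hypothetical stand-in, NOT Benchmark::Timer's real implementation:
# a timer that remembers every sample it has ever taken.
package SampleTimer;
use Time::HiRes qw(time);

sub new   { return bless { started => undef, samples => [] }, shift }
sub start { $_[0]{started} = time; return }
sub stop  {
    my ($self) = @_;
    # One more element per timed call, never released.
    push @{ $self->{samples} }, time - $self->{started};
    return;
}
sub count { return scalar @{ $_[0]{samples} } }

package main;

my $timer = SampleTimer->new;
for my $requests (10, 20, 30) {
    for (1 .. $requests) {
        $timer->start;
        # ... the work being timed would go here ...
        $timer->stop;
    }
    printf "after %d more requests: %d samples held\n",
        $requests, $timer->count;
}
```

From the outside this looks exactly like a leak that grows linearly with the number of calls, even though every allocation is still reachable and perfectly intentional.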

Anyway, I’ve removed Benchmark::Timer from the code (I wasn’t using its results), and now the program can go to production. It only took me a week…

This entry was posted in Perl, Software Engineering by dakkar.

About dakkar

Gianni is a Perl Architect at NAP. His code from previous lives runs in universities administration software, inside ask.com news system, and even in Antarctica. He's currently busy writing libraries to make "the right thing" be "the easy thing".

One thought on “Finding memory leaks”

  1. To the contrary -DPERL_MEM_LOG works fine, and was recently improved by Jim Cromie. See e.g.
    git log -p -S PERL_MEM_LOG util.c
    => 1cd8acb500c6fd96bf025feb0647211c271b7e2e

    But since you only want to check simple leaks, DEBUGGING is enough.
    DEBUG_LEAKING_SCALARS gives you the source who allocated a SV, which you do not need.

    Without DEBUGGING and without recompiling perl, you can just use valgrind to check for leaks in production code (10-20x slower). With a recent clang you can compile it with msan (-fsanitize=memory) and -O2, which is much faster.
