Valgrind

Valgrind is our tool of choice for hunting nasty memory errors and/or profiling. Then again, if you just want to compare the performance of your cells using callgrind, ecto provides runtime statistics as explained in Command line helpers: getting stats, GUIS, a shell and more automagically !.

Suppressions

First you’ll need to have a quick look at what the Python devs say about this... in short, for performance reasons python deliberately does some things that look suspicious to valgrind but in fact are not. So you’ll need to use the suppressions file that comes with python, called valgrind-python.supp. Download that via that link and READ THE FILE, there are some suppressions you need to uncomment in there, if you don’t intend to rebuild python itself.

Running ecto under valgrind

You must run valgrind on the python interpreter... not the script! That is, not:

valgrind --tool=memcheck --suppressions=valgrind-python.supp myscript.py

but:

valgrind --tool=memcheck --suppressions=valgrind-python.supp python myscript.py

This is basically the same situation/mechanism as when running ecto scripts under gdb.

Profiling

To find the performance bottlenecks in your application you’ll probably want to start with the graph that the threadpool scheduler prints out at the end of each run, then narrow down your script to the bits that take the longest time. You can run under a Threadpool or Singlethreaded scheduler, but valgrind will (on purpose) interfere with your threads and ensure that things run essentially singlethreaded, so the Singlethreaded scheduler probably makes more sense: just less mechanics to go wrong.

I like to use some of callgrinds’ nifty features to take a sample of a plasm while it is running in a steady state, i.e. I start my plasm like this:

% valgrind --tool=callgrind --instr-atstart=no python ./colorize_clusters.py
==5702== Callgrind, a call-graph generating cache profiler
==5702== Copyright (C) 2002-2010, and GNU GPL'd, by Josef Weidendorfer et al.
==5702== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==5702== Command: python ../src/pcl/samples/ros/colorize_clusters.py
==5702==
==5702== For interactive control, run 'callgrind_control -h'.
[ INFO] [1313713285.708789618]: Initialied ros. node_name: /colorize_clusters_1313713285457983466
[ INFO] [1313713288.407112127]: Subscribed to topic:/camera/depth_registered/points with queue size of 2

Where the –instr-atstart=no means that valgrind won’t actually do anything. When my app is running in a steady state, in another window execute:

% callgrind_control --instr=on
PID 5702: python ./colorize_clusters.py [requesting '+Instrumentation'...]
OK.

Now you’ve started sampling. Things will slow down... a lot. After I think callgrind has had time to collect some statistics (say twenty runs through the graph), provoke a dump:

% callgrind_control --dump=tenframes_colorized
PID 5702: python ./colorize_clusters.py [requesting 'Dump tenframes_colorized'...]
OK.

Now you can run that most awesome of tools kcachegrind on the files that it outputs.

Debugging

Same as above if you’re using valgrind’s tool memcheck: run valgrind on the python interpreter, passing the script name as an argument. The