Logging and debugging in OpenStack
Debugging a piece of code is as important as writing it. End users rarely know much about this stage of software development, but… believe me, they tend to notice when a program was not properly debugged!
The importance of debugging is much more noticeable on large projects like OpenStack, which need both developers and regular users to contribute and report when something is not going well. Two pairs of eyes are better than one… or, in this case, several thousand.
In #openstack-101 I had the pleasure of meeting many people who wanted to start contributing to OpenStack, but I noticed that a lot of them didn’t know how to provide the right information to show where the problem was when they hit a blocker. With this in mind I decided to write this post :)
The first thing to do when we see strange behavior that repeats with the same input data is to look at the logs.
Logs are files, usually plain text and arranged chronologically, that contain information about the code that is executed to perform an action.
By analyzing the information found there, you may be able to detect what the problem is or, at least, where the error occurs.
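As a rough illustration of that first pass over a log, here is a minimal sketch that keeps only the lines logged at an error-like level. The line layout, the `example.service` logger names and the sample lines are all made up for this example; real services may format their logs differently, so adjust the pattern to what you actually see.

```python
import re

# Assumption: each line carries its level as a standalone upper-case word.
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL|TRACE)\b")


def lines_at_level(lines, levels=("ERROR", "CRITICAL", "TRACE")):
    """Yield (line_number, line) for lines whose first level word is in levels."""
    for number, line in enumerate(lines, start=1):
        match = LEVEL_RE.search(line)
        if match and match.group(1) in levels:
            yield number, line.rstrip("\n")


# Invented sample lines, not taken from any real service.
SAMPLE_LOG = """\
2013-03-01 10:00:00 INFO example.service Starting request
2013-03-01 10:00:01 ERROR example.service Request failed
2013-03-01 10:00:02 DEBUG example.service Cleaning up
"""

if __name__ == "__main__":
    for number, line in lines_at_level(SAMPLE_LOG.splitlines()):
        print(number, line)
```

With a real file you would pass an open file object instead of `SAMPLE_LOG.splitlines()`, since iterating a file also yields one line at a time.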
The location of the logs in the system may vary depending on the implementation, but in GNU/Linux they usually are stored in
In OpenStack the location of these logs depends on how you deployed it: with DevStack or by hand, and whether from your distribution’s repository packages or from source.
Also, some services won’t log all the info you need to diagnose failures unless you set debug=true in the DEFAULT section of their configuration. Check the configuration files for the services you want to debug (e.g. for Nova, look for nova.conf under /etc/nova) and add this setting (thanks lxsli for the heads up!).
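If you want to check quickly whether a service’s configuration already asks for debug output, something like the following sketch may help. It uses Python’s standard `configparser` as a stand-in, which is an assumption on my side: the services parse their configuration with their own machinery, so treat this only as a convenience check. The function name `debug_enabled` is made up for this example.

```python
import configparser


def debug_enabled(conf_text):
    """Return True if the DEFAULT section of an INI-style text enables debug."""
    # strict=False tolerates repeated options; interpolation=None leaves
    # any '%' characters in values alone.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getboolean("DEFAULT", "debug", fallback=False)


if __name__ == "__main__":
    # Path taken from the text above; adjust it for the service you debug.
    with open("/etc/nova/nova.conf") as conf_file:
        print(debug_enabled(conf_file.read()))
```

Note that `configparser` lower-cases option names, so this check does not care how the option is capitalized in the file.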
By default, DevStack doesn’t log. To enable this option you will need to add the following lines to the
LOGFILE=$DEST/stack.sh.log – Name and location of the stack.sh script logs. Without this line the output of this script will appear in the terminal where you run it, without being stored in any location.
SCREEN_LOGDIR=$DEST/logs/screen – Location of the screen logs. DevStack runs OpenStack services under GNU Screen, which lets you view their current execution status. The logs simply store the output of these screens.
LOGDAYS=1 – Number of days to be logged. The old logs are replaced with the latest runs.
$DEST is the location in which the logs will be located. You can set this value as you wish.
As I mentioned, screen logs are just a record of the screens started by DevStack, but you can also check the screens themselves at any time during the execution (I will tell you how in a moment). The information they provide is very useful for debugging any OpenStack service.
GNU Screen provides the same functions as the DEC VT100 terminal, plus some additional features. If you have no experience with this terminal, make sure to write down the following shortcuts.
|Attach to a screen||
|Scroll in a screen||
|Detach from the screen||
Here you will find most service logs in
/var/log/service, where service is the name of the desired service. For example, in
/var/log/keystone you will find Keystone logs.
In my experience, the only service that differs from this default location is Horizon. You can find Horizon logs in
/var/log/apache2 on Ubuntu and in
/var/log/httpd on Fedora (other distro? Let me know!).
For now this is all I can say about using logs on manual deployments. I will add more details after experimenting a bit more with my new OpenStack deployment on Fedora 18 ;) (if I manage to make it work!).
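The locations above can be written down as a small lookup. This is only a sketch of what the text says: the helper name `log_dir` and the `distro` argument are made up for this example, and it covers just the two distributions mentioned here.

```python
from pathlib import PurePosixPath

# Horizon is served through the web server, so its logs live with the
# web server logs (locations as described in the text above).
HORIZON_LOG_DIRS = {
    "ubuntu": "/var/log/apache2",
    "fedora": "/var/log/httpd",
}


def log_dir(service, distro="ubuntu"):
    """Return the directory where a service's logs are expected to be."""
    if service == "horizon":
        return PurePosixPath(HORIZON_LOG_DIRS[distro])
    return PurePosixPath("/var/log") / service


if __name__ == "__main__":
    print(log_dir("keystone"))
    print(log_dir("horizon", distro="fedora"))
```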
Once you have detected where the problem is, or at least narrowed down the context where it could be, you need to determine what is happening. For that we use a debugger.
A debugger, or debugging tool, is a program used to test and debug other code.
To use it, simply import it and set a breakpoint where you consider the problem is. This is done by adding just two lines of code.
|Import the debugger||
|Set the breakpoint||
Once this is done, you will need to run the program from the Python interpreter and wait for the execution to reach the breakpoint. There you can inspect variables, analyze the context and fix the error.
For a small program, debugging is as simple as that, but in something like OpenStack the procedure is not so straightforward. First you will need to figure out how to set up the environment and get the piece of code you want to debug to run.
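As a small sketch of that workflow with Pdb, the Python debugger: the `average` function and the `DEMO_PDB` environment variable are hypothetical and exist only so that the breakpoint fires when you ask for it. In your own code you would normally just drop the two marked lines where you suspect the problem is.

```python
import os
import pdb  # line 1 of 2: import the debugger


def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = sum(values)
    # DEMO_PDB is a made-up switch for this sketch only.
    if os.environ.get("DEMO_PDB") == "1":
        pdb.set_trace()  # line 2 of 2: execution pauses here
    return total / len(values)


if __name__ == "__main__":
    print(average([1, 2, 3]))
```

Run it with `DEMO_PDB=1` set in the environment and you land at the `(Pdb)` prompt. There, `p total` prints a variable, `n` executes the next line and `c` continues the execution.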
For most services, simply open the Python interpreter, import the necessary libraries, connect to the service and reproduce the bug you want to inspect.
Each service has different APIs, so you will probably need to do some research on the code before trying to debug it. That is what I meant by “importing libraries” and “connecting to the service”.
I guess that if you are still reading, it is because you’re already familiar with the code, but it is worth mentioning.
You can check great application examples in Who Are You and What Are You Going to Do?, where Anita shares her experience debugging Nova’s image management, and in Debugging nova — a small illustration with pdb, where Kashyap shows step by step how he discovered that keypairs in Nova don’t support non-ASCII characters.
Once again the special case here is Horizon. As it is built with Django, the way to debug this service is different from the rest.
You will need to run a development web server and reproduce the bug from the browser.
In the interpreter where you started the server you will see how the different calls are executed up to the breakpoint, and once there you can explore the context to look for a solution.
As with other services, you import Pdb and set the breakpoint. The development web server is run simply with
python -m pdb manage.py runserver from the folder in which manage.py is placed (the Horizon root folder).
You will see that the interpreter automatically stops and waits for input. Continue the execution by typing
c and you will get the development web server’s IP address and port. The rest is just a matter of using the Dashboard as you usually do until you reproduce the bug and reach the breakpoint.
You can restart the server by pressing Ctrl-C (the usual Pdb behavior is to restart the script once it finishes), and you can kill it by hitting Ctrl-C a few more times.
For more details you can visit Pdb: Using the Python debugger in Django, a post by Mike Tigas whose simplicity and completeness dazzled me.
A trick: Debugging with the tests
For readers who are not yet familiar with the OpenStack code: in every OpenStack service you will find files, called tests, in which some use cases are exercised and their output is checked against an expected result, ensuring that a feature works correctly. Technically speaking, OpenStack uses Nose, Mox and Selenium (the latter in Horizon) to test its services.
This doesn’t work in every case, since there aren’t tests for everything and the environment you can access is constrained, but at least it is a good way to start using pdb.
As always, I get excited and end up trying to cover really extensive topics in a single post, and because of that I leave many things out. So I prefer to share some links where you can learn more about logging and debugging.
Pdb – Official manual for Pdb, the Python debugger. Pdb has many more features than the ones I mentioned here! Highly recommended, like all the official Python docs.
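To give an idea of how a test can serve as the entry point for a debugging session, here is a self-contained sketch using the standard `unittest` module. The function `normalize_name`, the test case and the `DEMO_PDB` switch are all hypothetical; in a real service you would put the breakpoint in the service code and run one of its existing tests.

```python
import os
import pdb
import unittest


def normalize_name(name):
    """Hypothetical code under test: trim and lower-case a name."""
    # DEMO_PDB is a made-up switch so the breakpoint is opt-in here.
    if os.environ.get("DEMO_PDB") == "1":
        pdb.set_trace()  # the test run pauses here
    return name.strip().lower()


class NormalizeNameTest(unittest.TestCase):
    def test_strips_and_lowercases(self):
        self.assertEqual(normalize_name("  KeyPair "), "keypair")


def run_tests():
    """Run the test case above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeNameTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

If you run tests through Nose instead, keep in mind that it captures standard output by default, so you will most likely want its `-s` option to be able to see and use the Pdb prompt.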
Debugging OpenStack with pycharm and pydevd
Configuring Logging – OpenStack Config Reference
Bug collection 2560×1440 wallpaper – In case you liked the featured image ;) I had it in my collection and I don’t know the author… if someone knows who it is, please let me know!
XKCD comics – Yeah, I spent part of my weekend with these comics