Archive for November, 2009

I maintain a few open source projects that use SVN, so notes like “Fixes bug #123, patch by J. Random Hacker” in commit messages are very common. When I started using Bazaar for Picard, I thought it would be nice to handle these natively. Bazaar has been able to store bug metadata since version 0.16, using the bzr commit --fixes option, so that was nice. It kind of inspired me to add the other part, the author name, which was a little more important to me than bug numbers. I wanted contributors who send plain patches to be equally credited for their work in the default branch viewing tools. I knew that Git had the concept of a separate “committer” and “change author” and I really liked the idea, so I submitted a patch to Bazaar to add something similar (it landed in bzr 0.91). The change allows you to specify the author name on commit, which is then stored in the revision along with the committer name. So you can run a command like this:

bzr commit --author "J. Random Hacker <jr@example.com>" --fixes project:123 -m "Blah, blah, ..."

And then see the author name in other tools like bzr log or bzr annotate:

------------------------------------------------------------
revno: 1
author: J. Random Hacker <jr@example.com>
committer: Lukáš Lalinský <lalinsky@gmail.com>
[...]

Naturally, commit, log and annotate from QBzr have also supported this since the day I wrote the patch. It’s a shame that bzr log only displays author names, not the bug information, because that leaves a useful feature quite hidden if you are not using any GUI plugin. I think QBzr users tend to use these features more often, because the commit dialog makes it very visible that the option exists, but also because bzr qlog will then nicely present the metadata (labels in the revision graph, clickable links, search for bug numbers, etc.). Here you can see an example with a revision that fixes two bugs and where the committer is different from the change author:

[Screenshot: qlog-picard]

I often have to turn on my laptop in situations where I need it to be quiet. It’s easy to disable the login sound in GNOME, but in the new Ubuntu release it became quite hard to disable the GDM startup sound. Previously it was possible to simply use gdmsetup to change the sound, themes, etc. However, in recent versions of GDM (like the one included in Ubuntu 9.10), the window was reduced to a single question: whether it should log me in automatically or ask for the password. The old configuration file gdm.conf is also gone, replaced with GConf-based configuration. The GDM documentation says, as an example, that sound can be disabled by changing the /apps/gdm/simple-greeter/settings-manager-plugins/sound/active GConf key, so I tried to set it to false, but that didn’t help. I eventually managed to fix it thanks to an Ubuntu bug report, where somebody mentions the /desktop/gnome/sound/event_sounds key.

sudo -u gdm gconftool-2 --set /desktop/gnome/sound/event_sounds --type bool false

I’m still not sure why it works and why this key affects GDM, but as long as it’s quiet…

I got a new laptop last week and it came with an extra RAM module. I thought it would be fun to have more RAM in the laptop than I have in my desktop machine, so I put it in, and to my surprise Ubuntu was reporting only 3GB of RAM, even though the machine had 2×2GB modules. I checked the BIOS and it correctly said the machine has 4GB of RAM. It turns out that on a 32-bit machine you can use only about 3GB of RAM with the standard addressing method, because part of the 4GB address space is reserved for hardware. There is an extension to work around this, called PAE, but the default Linux kernel in Ubuntu has it disabled. I was afraid I’d have to compile my own kernel, but fortunately there is a package with PAE enabled, so I only had to do:

sudo apt-get install linux-generic-pae

Reboot and woohoo, /proc/meminfo now shows the full 4 gigabytes.
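
If you would rather check the number from a script than read the file by eye, here is a minimal sketch. The function name mem_total_gib is made up for this example; the only thing it relies on is the "MemTotal:" line that /proc/meminfo reports in kB.

```python
def mem_total_gib(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, converted to GiB.

    Hypothetical helper for illustration; it only looks at the
    "MemTotal:" line, whose value the kernel reports in kB.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like "MemTotal:        4046284 kB".
            kib = int(line.split()[1])
            return kib / (1024.0 * 1024.0)
    raise ValueError("no MemTotal line found")
```

On the laptop itself you would call it as mem_total_gib(open("/proc/meminfo").read()). Expect a value a little below the installed amount, since the kernel keeps some memory for itself.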

Ever since I started using Qt, I have loved the “implicit sharing” concept it uses for its strings and container types. It became so much easier to pass this data around. I wasn’t aware that some STL implementations have copy-on-write semantics for strings as well. When I saw some recommendations for std::string on Stack Overflow, I decided to check out the implementation in GCC and discovered that it does indeed do some reference counting.

So the next step was comparing the implementations. I wrote a little program today to check how QString and std::wstring compare in terms of copy-on-write performance. Since QChar is 2 bytes and wchar_t is 4 bytes on my machine, it wouldn’t be a completely fair comparison, so I’ve also included std::string. The results were quite surprising to me. STL almost always does better, but for some reason I wasn’t able to make it avoid dereferencing the string on read-only operations.

                 wchar_t*    QString     std::wstring   std::string
Read             0 ms        0 ms        2143 ms        2224 ms
Write            0 ms        5588 ms     2621 ms        2570 ms
Copy             1618 ms     601 ms      116 ms         117 ms
Copy + read      -           601 ms      6161 ms        5079 ms
Copy + write     -           11036 ms    6822 ms        6843 ms
Copy + append    -           5801 ms     4650 ms        3482 ms

The table shows times in milliseconds for 10,000,000 repeated operations on a 200-character string.

  • “Read” just reads the string one character at a time, using the default [] operator. I’ve tried hard to find a cheaper way to do this for STL strings, but I failed (I wasn’t interested in using s.data() and then working with the primitive array, I wanted to work with the object directly).
  • “Write” writes to all characters of the string, again one character at a time.
  • “Copy” assigns one string to another, using the default = operator for string classes and memcpy for wchar_t*.
  • “Copy + read” is the same as “Copy”, followed by “Read” performed on the copy.
  • “Copy + write” is the same as “Copy”, followed by “Write” performed on the copy.
  • “Copy + append” is again the same as “Copy”, followed by appending a short string to the copy.

I guess I should note that this wasn’t meant to be a generic benchmark of the string classes. I just wanted to know performance details about the copy-on-write implementations in them. The conclusion for me is that the STL strings in GCC are better than I always thought, but the fact that they dereference the data on read-only operations is not very nice.

I have written a few Twisted scripts at work that parse incoming data from a socket and save it in a MySQL database, using the MySQLdb package. It’s a well-known fact that the MySQL server will close connections that are inactive for some time, and yet I forgot to handle it in the last script I wrote. Previously I solved the problem by remembering the last time I used the connection and forcing a reconnect based on this value, or, when I needed a connection pool, by using the recycle option in SQLAlchemy’s connection pool (which does basically the same thing). But when I found the problem in the latest script today, I thought I should finally solve it properly, so I started Googling…

I found out about the mysql_ping() function, which seemed perfect for this, especially in combination with the MYSQL_OPT_RECONNECT option. The MySQLdb User’s Guide mentions a wrapper for mysql_ping(), but nothing about the MYSQL_OPT_RECONNECT option. It wouldn’t be me if I didn’t download the source code to check if there really isn’t any way to set the option.

It turns out that the wrapper for mysql_ping() accepts a boolean argument, to set the option locally. It’s even nicely documented in the docstring for the method. Too bad I didn’t look at the API documentation before reading the source code. :)

Anyway, I ended up with code like this and it seems to be working nicely:

if self.db is None:
    self.db = MySQLdb.connect(...)
else:
    self.db.ping(True)
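
For context, here is a minimal sketch of how that fragment might sit inside a small helper. The class name Database, the connection() method and the connect argument are made up for this example; only the ping(True) call comes from the snippet above. The connection is created by a callable passed in from outside, so the sketch does not guess at any MySQLdb.connect() parameters.

```python
class Database(object):
    """Hypothetical helper: connect lazily, ping before every later use."""

    def __init__(self, connect):
        # `connect` is any zero-argument callable that returns a
        # connection object (an assumption made for this sketch),
        # for example a function wrapping MySQLdb.connect(...).
        self._connect = connect
        self.db = None

    def connection(self):
        if self.db is None:
            # First use: open the connection.
            self.db = self._connect()
        else:
            # Later uses: ping(True) asks the client library to
            # reconnect if the server has closed the idle connection.
            self.db.ping(True)
        return self.db
```

Code that needs the database then calls connection() each time instead of holding on to the connection object, so the ping happens right before every use.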