
DWIM — Trying to make the computer Do What I Mean

The case of the magically appearing mmap window

The moral of this story is to make sure you always have your function declarations available wherever you use a function, even if you're sure that you're using the function properly. In libgit2 we access pack files by mapping parts of them, and those mmap windows are stored in a singly-linked list like so:

typedef struct git_mwindow {
    struct git_mwindow *next;
    ...
} git_mwindow;

I also had a function that accepts a git_mwindow *w parameter. As I'd been looking at the mwindow code and was used to seeing &w in many places (that code often works with pointers to pointers), I wrote that instead of simply w. Interestingly enough, the code didn't break immediately but gave wrong results a bit later on. I suspected some odd record-keeping in the mwindow code, so I wrote a loop to dump the window list to the console whenever we tried to locate an open window for a particular range.

What I saw confused me even more: a new window had appeared in front of the one we should be using! Not only was there a rogue window open, but it also contained completely wrong values except for the next pointer. So, where was this window coming from? I had instrumented the rest of the code and the only place where this could happen was during the function call, which is when I finally managed to see that I was passing a pointer to my pointer instead of my pointer. Had I moved the function declaration to a header earlier, the compiler would have told me.
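As a rough sketch of what "the compiler would have told me" means, here is a small list-walking function with its prototype in scope. The name count_windows is hypothetical and not part of libgit2, and the struct is cut down to the one field shown above. With the prototype visible, a call such as count_windows(&w) on a git_mwindow *w passes a git_mwindow ** where a git_mwindow * is expected, and GCC and Clang diagnose that as an incompatible pointer type. Without any declaration in scope, older compilers would typically accept the call with at most an implicit-declaration warning.

```c
#include <stddef.h>

/* Cut-down stand-in for the real struct; only `next` comes from the post. */
typedef struct git_mwindow {
    struct git_mwindow *next;
} git_mwindow;

/* Hypothetical prototype. In a real project this would live in a header
 * that every caller includes, so the argument types get checked. */
size_t count_windows(const git_mwindow *w);

/* Walks the singly-linked list and returns the number of windows in it. */
size_t count_windows(const git_mwindow *w)
{
    size_t n = 0;

    for (; w != NULL; w = w->next)
        n++;

    return n;
}
```

Calling count_windows(w) with the head of the list is the intended use; count_windows(&w) is the mistake described above, and it is only caught when the prototype is visible at the call site.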

But why did it seem that there was an extra window appearing? This is a good way to show how structs in C work. When I passed the pointer to the pointer, the function (or more importantly, my function to dump the list) thought it was a pointer to a struct and treated it as such. In C, the first field in a struct must have the same address as the struct itself (that is to say, there is no padding allowed before the first field). What the function received was actually the address of my pointer variable, and since a struct and its next field share an address, looking at w->next through that bogus pointer meant reading the value of my pointer, that is, the address of the real window. That is the only thing that was right; reading the rest of the fields would be reading values from the caller's stack, which have no meaning in our context.
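The layout argument can be checked with a small sketch. The offset field and both helper names are made up for illustration; only next comes from the struct above. Rather than actually dereferencing a mistyped pointer, which is not something to rely on, read_next_at uses memcpy to read a pointer-sized value from the place where a callee would expect next to live.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the real struct; `offset` is a hypothetical extra field. */
typedef struct git_mwindow {
    struct git_mwindow *next;
    size_t offset;
} git_mwindow;

/* Returns 1 if the struct and its first field share an address. */
static int first_field_shares_address(const git_mwindow *win)
{
    return (const void *)win == (const void *)&win->next
        && offsetof(git_mwindow, next) == 0;
}

/*
 * Simulates a callee that was handed `addr` and believes it points at a
 * git_mwindow: it reads a pointer-sized value from the offset of `next`.
 * memcpy keeps the read well-defined whatever `addr` really points at,
 * as long as at least sizeof(git_mwindow *) bytes are readable there.
 */
static git_mwindow *read_next_at(const void *addr)
{
    git_mwindow *next;

    assert(addr != NULL);
    memcpy(&next, (const char *)addr + offsetof(git_mwindow, next), sizeof(next));

    return next;
}
```

Given git_mwindow real = { NULL, 42 }; and git_mwindow *w = &real;, read_next_at(&real) yields NULL as expected, while read_next_at(&w) yields &real: the address of the pointer variable looks like a window whose next is the real window, which is exactly the phantom entry at the head of the list.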

And there we have it.

    git_mwindow *w;
    some_function(some_var, &w);

can make code think that there is an extra entry in the linked list.

--
Carlos Martín Nieto <cmn@dwim.me>