Linked lists, pointer tricks, and good taste

In an interview at TED 2016 (2:10 pm), Linus Torvalds talks about good coding style . As an example, he gives two options for removing elements from singly linked lists (see below). The first option has a special case, while the other does not. Linus prefers the latter.



His comment:



[...] No need to think about why there is no if statement here. It's important to look at the problem from a different perspective and rewrite it so that the special case disappears and becomes a regular case, and that's good code . - L. Torvalds


Linus shows a fairly simple C-style pseudocode as an example. But does not provide a conceptual explanation. Therefore, it is not immediately clear how an indirect pointer works.



Let's take a closer look at this solution and its advantages. As a bonus, not only is the deletion shown, but also the insertion of an element via indirect addressing.



The code



The basic data structure for a singly linked list of integers is shown in Fig. 1.





Fig. 1. A singly linked list of integers



Numbers are randomly selected integer values, and arrows correspond to pointers: head



 - this is a type pointer IntListItem*



, all blocks are instances of a structure IntListItem



, each with its own variable ( next



in the code) of the type IntListItem*



, which points to the next element.



Data structure implementation in C:



struct IntListItem {
    int value;
    struct IntListItem* next;
};
typedef struct IntListItem IntListItem;

struct IntList {
    IntListItem* head;
};
typedef struct IntList IntList;
      
      





Also the (minimal) API:



/* The textbook version */
void remove_cs101(IntList* l, IntListItem* target);
/* A more elegant solution */
void remove_elegant(IntList* l, IntListItem* target);
      
      





Now let's look at implementations remove_cs101()



and remove_elegant()



. The code of the examples does not contradict the pseudocode from Linus's example, but it compiles and runs.



Basic version





Figure: 2. Conceptual model of the list data structure in the basic algorithm



void remove_cs101(IntList *l, IntListItem *target)
{
    IntListItem *cur = l->head, *prev = NULL;
    while (cur != target) {
        prev = cur;
        cur = cur->next;
    }
    if (prev) {
        prev->next = cur->next;
    } else {
        l->head = cur->next;
    }
}
      
      





In the standard approach, two traversal pointers cur



and prev



, which mark the current and previous traversal position in the list, respectively. cur



starts at the head of the list head



and moves forward until the target is found. In turn, it prev



starts with NULL



and is subsequently updated to the previous value cur



each time it moves forward. When the target is found, the algorithm checks if it is not prev



zero. If so, it cur



points to the first item in the list, in which case deleting means moving the head of the list forward.



A more elegant solution



The more elegant version has less code and does not require a separate branch to remove the first item in the list.



void remove_elegant(IntList *l, IntListItem *target)
{
    IntListItem **p = &l->head;
    while ((*p) != target) {
        p = &(*p)->next;
    }
    *p = target->next;
}
      
      





The code uses an indirect pointer p



that contains the address of the pointer to the list item, starting at the address head



. In each iteration, this pointer is expanded to include the address of the pointer to the next item in the list, that is, the address of the item next



in the current one IntListItem



. When the pointer to the list item (*p)



is equal target



, we exit the search loop and remove the item from the list.



How it works?



An indirect pointer has p



two conceptual advantages:



  1. Allows you to interpret a linked list in such a way that the pointer head



    becomes an integral part of the data structure. This eliminates the need for a special case to remove the first item.

  2. It also allows you to evaluate the state of the loop while



    without having to release the pointer pointing to target



    . This allows you to change the pointer to target



    and get around with a single iterator, unlike prev



    and cur



    .


Let's take a look at each item in turn.



Head pointer integration



The Standard Model interprets a linked list as a sequence of instances IntListItem



. The start of the sequence can be obtained using a pointer head



. This leads to the conceptual model shown in Fig. 2. The pointer is head



simply treated as a handle to access the beginning of the list. prev



and cur



are type pointers IntListItem*



and always point to or NULL



.



The elegant implementation uses an indirect addressing scheme that gives a different view of the data structure:





Fig. 3. Conceptual model of the list data structure in a more elegant solution



Here p



represents the typeIntListItem**



and contains the address of the pointer to the current list item. When the pointer moves forward, its address changes to the next element.



In code, it looks like p = &(*p)->next



:



  1. (*p)



    : dereference the address of the pointer to the current list item.

  2. ->next



    : dereference this pointer again and select the field with the address of the next element.

  3. &



    : take the address of this field (that is, get a pointer to it).


This is in line with the interpretation of a data structure, where a list is a sequence of pointers to elements IntListItem



(Figure 3).



Finishing touch



An additional advantage of this particular interpretation is that it allows the pointer next



to be edited for the current element's predecessor throughout the traversal .



If p



contains the address of a pointer to an element of the list, the comparison in the search loop becomes:



while ((*p) != target) {
    ...
}
      
      





The search cycle will end if it (*p)



is equal target



, and as soon as this happens, we can still change (*p)



, since we hold its address p



. Thus, even though the loop iterates to the end, a descriptor (field address next



or pointer head



) is stored that can be used to directly change the pointer to an element.



That's why we can change the input to the element pointer to point to a different location, using *p = target->next



, and therefore we do not need bypass pointers prev



and cur



to remove the item.



New application



It turns out that the same idea can be applied to an extremely laconic implementation of another function in linked lists:, insert_before()



that is, inserting this element before another.



Insertion before existing element



First, let's add the following declaration to list.h



:



void insert_before(IntList *l, IntListItem *before, IntListItem *item);
      
      





The function will take a pointer to the list l, a pointer before an item in this list, and a pointer to a new list item that the function will insert before it.



Fast refactoring



Before moving on, let's make the search loop a separate function:



static inline IntListItem **find_indirect(IntList *l, IntListItem *target)
{
    IntListItem **p = &l->head;
    while ((*p) && (*p) != target) {
        p = &(*p)->next;
    }
    return p;
}
      
      





and use it in remove_elegant()



:



void remove_elegant(IntList *l, IntListItem *target)
{
    IntListItem **p = find_indirect(l, target);
    *p = target->next;
}
      
      





Insert_before () implementation



Using find_indirect()



it is easy to implement insert_before()



:



void insert_before(IntList *l, IntListItem *before, IntListItem *item)
{
    IntListItem **p = find_indirect(l, before);
    *p = item;
    item->next = before;
}
      
      





Particularly pleasing is the solid semantics for edge cases: if it before



points to the head of the list, a new element will be inserted at the beginning, if it before



is null or invalid (that is, does not exist in l



), a new element will be added at the end.



Conclusion



The prerequisite for a more elegant solution for deleting elements is one simple change: an indirect pointer IntListItem**



to iterate over pointers to list elements. Everything else flows from there: there is no need for special cases or branches. One iterator is enough to find and remove the target element. And it turns out that the same approach provides an elegant solution for inserting in general and inserting before an existing element in particular.



So, going back to Linus's initial remark, is this an indicator of good taste? Hard to say. But there is clearly a creative and very elegant solution to a well-known problem.



All Articles