His comment:
[...] No need to think about why there is no if statement here. It's important to look at the problem from a different perspective and rewrite it so that the special case disappears and becomes a regular case, and that's good code . - L. Torvalds
Linus shows a fairly simple C-style pseudocode as an example. But does not provide a conceptual explanation. Therefore, it is not immediately clear how an indirect pointer works.
Let's take a closer look at this solution and its advantages. As a bonus, not only is the deletion shown, but also the insertion of an element via indirect addressing.
The code
The basic data structure for a singly linked list of integers is shown in Fig. 1.
Fig. 1. A singly linked list of integers
Numbers are randomly selected integer values, and arrows correspond to pointers:
head
- this is a type pointer
IntListItem*
, all blocks are instances of a structure
IntListItem
, each with its own variable (
next
in the code) of the type
IntListItem*
, which points to the next element.
Data structure implementation in C:
struct IntListItem {
int value;
struct IntListItem* next;
};
typedef struct IntListItem IntListItem;
struct IntList {
IntListItem* head;
};
typedef struct IntList IntList;
Also the (minimal) API:
/* The textbook version */
void remove_cs101(IntList* l, IntListItem* target);
/* A more elegant solution */
void remove_elegant(IntList* l, IntListItem* target);
Now let's look at implementations
remove_cs101()
and
remove_elegant()
. The code of the examples does not contradict the pseudocode from Linus's example, but it compiles and runs.
Basic version
Figure: 2. Conceptual model of the list data structure in the basic algorithm
void remove_cs101(IntList *l, IntListItem *target)
{
IntListItem *cur = l->head, *prev = NULL;
while (cur != target) {
prev = cur;
cur = cur->next;
}
if (prev) {
prev->next = cur->next;
} else {
l->head = cur->next;
}
}
In the standard approach, two traversal pointers
cur
and
prev
, which mark the current and previous traversal position in the list, respectively.
cur
starts at the head of the list
head
and moves forward until the target is found. In turn, it
prev
starts with
NULL
and is subsequently updated to the previous value
cur
each time it moves forward. When the target is found, the algorithm checks if it is not
prev
zero. If so, it
cur
points to the first item in the list, in which case deleting means moving the head of the list forward.
A more elegant solution
The more elegant version has less code and does not require a separate branch to remove the first item in the list.
void remove_elegant(IntList *l, IntListItem *target)
{
IntListItem **p = &l->head;
while ((*p) != target) {
p = &(*p)->next;
}
*p = target->next;
}
The code uses an indirect pointer
p
that contains the address of the pointer to the list item, starting at the address
head
. In each iteration, this pointer is expanded to include the address of the pointer to the next item in the list, that is, the address of the item
next
in the current one
IntListItem
. When the pointer to the list item
(*p)
is equal
target
, we exit the search loop and remove the item from the list.
How it works?
An indirect pointer has
p
two conceptual advantages:
- Allows you to interpret a linked list in such a way that the pointer
head
becomes an integral part of the data structure. This eliminates the need for a special case to remove the first item.
- It also allows you to evaluate the state of the loop
while
without having to release the pointer pointing totarget
. This allows you to change the pointer totarget
and get around with a single iterator, unlikeprev
andcur
.
Let's take a look at each item in turn.
Head pointer integration
The Standard Model interprets a linked list as a sequence of instances
IntListItem
. The start of the sequence can be obtained using a pointer
head
. This leads to the conceptual model shown in Fig. 2. The pointer is
head
simply treated as a handle to access the beginning of the list.
prev
and
cur
are type pointers
IntListItem*
and always point to or
NULL
.
The elegant implementation uses an indirect addressing scheme that gives a different view of the data structure:
Fig. 3. Conceptual model of the list data structure in a more elegant solution
Here
p
represents the type
IntListItem**
and contains the address of the pointer to the current list item. When the pointer moves forward, its address changes to the next element.
In code, it looks like
p = &(*p)->next
:
(*p)
: dereference the address of the pointer to the current list item.
->next
: dereference this pointer again and select the field with the address of the next element.
&
: take the address of this field (that is, get a pointer to it).
This is in line with the interpretation of a data structure, where a list is a sequence of pointers to elements
IntListItem
(Figure 3).
Finishing touch
An additional advantage of this particular interpretation is that it allows the pointer
next
to be edited for the current element's predecessor throughout the traversal .
If
p
contains the address of a pointer to an element of the list, the comparison in the search loop becomes:
while ((*p) != target) {
...
}
The search cycle will end if it
(*p)
is equal
target
, and as soon as this happens, we can still change
(*p)
, since we hold its address
p
. Thus, even though the loop iterates to the end, a descriptor (field address
next
or pointer
head
) is stored that can be used to directly change the pointer to an element.
That's why we can change the input to the element pointer to point to a different location, using
*p = target->next
, and therefore we do not need bypass pointers
prev
and
cur
to remove the item.
New application
It turns out that the same idea can be applied to an extremely laconic implementation of another function in linked lists:,
insert_before()
that is, inserting this element before another.
Insertion before existing element
First, let's add the following declaration to
list.h
:
void insert_before(IntList *l, IntListItem *before, IntListItem *item);
The function will take a pointer to the list l, a pointer before an item in this list, and a pointer to a new list item that the function will insert before it.
Fast refactoring
Before moving on, let's make the search loop a separate function:
static inline IntListItem **find_indirect(IntList *l, IntListItem *target)
{
IntListItem **p = &l->head;
while ((*p) && (*p) != target) {
p = &(*p)->next;
}
return p;
}
and use it in
remove_elegant()
:
void remove_elegant(IntList *l, IntListItem *target)
{
IntListItem **p = find_indirect(l, target);
*p = target->next;
}
Insert_before () implementation
Using
find_indirect()
it is easy to implement
insert_before()
:
void insert_before(IntList *l, IntListItem *before, IntListItem *item)
{
IntListItem **p = find_indirect(l, before);
*p = item;
item->next = before;
}
Particularly pleasing is the solid semantics for edge cases: if it
before
points to the head of the list, a new element will be inserted at the beginning, if it
before
is null or invalid (that is, does not exist in
l
), a new element will be added at the end.
Conclusion
The prerequisite for a more elegant solution for deleting elements is one simple change: an indirect pointer
IntListItem**
to iterate over pointers to list elements. Everything else flows from there: there is no need for special cases or branches. One iterator is enough to find and remove the target element. And it turns out that the same approach provides an elegant solution for inserting in general and inserting before an existing element in particular.
So, going back to Linus's initial remark, is this an indicator of good taste? Hard to say. But there is clearly a creative and very elegant solution to a well-known problem.