Moving toward a standarized qsort_r(3)
First introduced in Research Unix V3 and eventually standardized as part of C Standard Library as of ANSI/ISO C89, qsort(3) provided an abstract interface where programmers can perform sort operation over an array of objects, by supplying a pointer to that array, the total number of the member objects, the size of individual member object, as well as a compare function. The compare function is expected to take two parameters, both pointers to a member object, and it shall return a negative number, 0, or a positive number, if the first object is considered smaller, equal, or greater respectively.
The standard did not require a specific algorithm to be used for
qsort(3). On BSD derived systems, there were several major rewrites
of the function. It was initially introduced in 1980, possibly by
Bill Joy. In 1983,
qsort(3) was first rewritten
by Earl Cohen for better performance; Keith Bostic reimplemented it in 1990
which in turn was replaced in 1992 by the Bentley & McIlroy implementation,
which was later described in detail in the Engineering a Sort Function paper
by the two authors, which is the basis of
qsort(3) implementation in modern
This interface does have some limitations. In some use cases, the caller may want to have a context associated with the data being compared, for example. Because the comparator function only takes two parameters, which both represents pointers to different data members, it is tricky to have the compare function to figure out what that context was, so usually programmers would end up using a global variable to store the context, which can become a problem later, because the code is no longer reentrant/thread safe.
To address this, in 2002, Garrett Wollman
created a new interface called
This new interface added a new parameter,
thunk, to the standard
interface. The parameter is passed down to the comparator function, which can
be used to represent the context for the comparator.
Now we have an extension which is not yet standardized, and unfortunately and unsurprisingly, some people have different opinions when it comes to adding new parameters.
Although it is not entirely impossible to use C syntax to represent objects,
it is not very straightforward to make use of features that are native to
object oriented programming languages such as C++. For example, it was tricky
to support polymorphism in C without some effort, until the recent introduction
_Generic in the C11 standard that enabled selection of function at compile
time based on the parameters. And even with
_Generic, supporting existing
code can be a headache, because existing binaries would be expecting the old
symbol to exist.
Therefore, it is a big deal when introducing a new interface in the C library and decisions have to be taken seriously. It is especially important to not introduce something incompatible with others, because it would be a nightmare for software developers who develop software that support multiple platforms.
In 2004, there was a feature request for GNU libc to implement the same interface, which was initially rejected by glibc maintainers, citing that there are other ways to perform the same task and it was not a standardized interface.
I would say that the critics were exactly right for a C Runtime library that
is being used by many people, however, apparently they have changed their
mind later. In 2007, the glibc maintainers decided to invent their own,
slightly different and conflicting
Now, there are two different
qsort_r(3) APIs. The BSD
void qsort_r(void *base, size_t nmemb, size_t size, void *thunk, int (*compar)(void *, const void *, const void *));
And the GNU
void qsort_r(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *, void *), void *thunk);
The BSD qsort_r(3) interface
Personally, I like the BSD
qsort_r(3) interface more, because for the
comparator function, the first parameter is the context pointer (
When implementing C Block API
qsort_b(3)) and when we want the context pointer (block literal) to
be passed to the comparator function, this way of parameter ordering has the
added benefit that it happens to match how C Block ABI passes the block literal
pointer. In fact, this is exactly why
qsort_b(3) is implemented using
qsort_r(3) currently. With the GNU
qsort_r(3) interface, this will
no longer be possible.
How third party software is supporting different
qsort_r(3) interfaces today?
Just like Dr. Malcolm said in the movie Jurassic Park: Life, uh, finds a way. Developers have come to different ways to solve this.
qsort_r(3) interface was first introduced in FreeBSD, it is
relatively easy to use the operating system as a signal for which variant
of interface is used. This works to some extent, and has the benefit of
not requiring a configure script and using a header file, but if the user
is using a different C runtime library, it is not accurate.
So it seems that it is better to have a configure script of some kind to detect and have the code to do things differently. There are many different ways of detecting the interface style, but they can be basically be categorized into three ways:
- detect the OS name, which shares the same problem of the approach above, and
- to have configure script to compile and link a small program that calls
- to have configure script to compile a small program that redefines
qsort_r(3)the same way of the desired style.
The last two methods are more reliable, and the first method can be changed to work in a not very future-proof way. I will discuss different options later.
What about other sort routines?
In 2020 (0d2fabfc0439), Edward Tomasz Napierala added a qsort_s(3) as defined in C11 standard in the hope that it would help make porting software from Linux easier.
Note that while
qsort_s(3) is now officially standardized, it is not
guaranteed that you will have the same interface on other platforms.
Windows, for example, have
its own version of qsort_s that was different.
GNU libc, on the other hand, did not implement
The standardization of
In 2014, Austin Group (the standardization body for POSIX) has a discussion
to make the GNU qsort_r(3) POSIX standard, and in 2021 the proposal was adopted.
As much as I dislike the GNU
qsort_r(3) interface as explained above, I still
see this as a positive move because, after years, there is some hope that one
can just write code in one way and expect it to work everywhere.
Our work toward standardized
There were several attempts to address the problems related to standardization
qsort_r(3) issue. Ed Schouten in 2018 started the latest effort of adopting
qsort_r(3), but the change was never landed because so many
ports were affected by the change. During the labor day vacation, I have decided
to take a look at the review queue and discovered that this is still in
progress. So I went ahead and take the responsibility of making this
happen based on Ed’s earlier work.
At FreeBSD, we take ABI stability very seriously and are proud of it. What is ABI (Application binary interface)? It is the interface between two binary modules, this includes system call interfaces and shared library interfaces, among other interfaces. ABI stability is maintained by keeping the old interfaces available (sometimes as an optional installable package, sometimes as an default-enabled option in official builds, and typically implemented as a thin shim wrapper around the new interface). By keeping ABI stability, binaries built for an earlier release of the operating system can continue to run without modification.
As of FreeBSD 7.0, versioned symbols were implemented for the C Runtime
Library, also known as
libc.so. Because of the availability of versioned
symbols, we can now support both interfaces in binaries by providing two
versions of the symbol: the default,
qsort_r@@FBSD_1.7 and the historical
For FreeBSD, the
qsort family of functions are sharing the same template
and instantiated for individual functions to support their function
signatures, because the bodies are largely identical. In Ed’s proposal,
qsort_r is implemented by creating a new type of
as well as
cmp_t. The historical
qsort_r is renamed to
qsort_r@FBSD_1.0 will reference it.
As stated before, the BSD version of
qsort_r(3) has a function signature
that is friendly to C Block ABI. Since we will keep the historical
qsort_r(3) implementation, there is no point in making an adapter
qsort_b(3) comparator function, because that would be inefficient.
qsort_b(3) has been modified to call the traditional
stdlib.h, we would be offering the POSIX (GNU) version of
definition, and convert all in-tree callers to use the new interface. However,
there are far too many software that already did their inaccurate detection
of BSD style
qsort_r(3) and some of them will use the historical interface
when they are built on FreeBSD, which will cause the application to crash.
The initial version of Ed’s proposal has defined a macro called
_Generic is available. What it did was to check the 4th parameter to
and if it looked like the comparator function pointer, it would convert the
call to call an integer, which is invalid in C, instead.
This approach will help us to detect applications that blindly use the
qsort_r function signature on FreeBSD and break these
applications, but it has two drawbacks:
- It would break configure scripts that redefines
qsort_ris now a macro; the fix is easy: add a pair of parentheses around the
- it would break existing code that was written exclusively for FreeBSD.
_Generic can also be used to redirect calls with the BSD style
comparator, an obvious alternative to this approach would be to just
redirect the call to the historical interface, as we are keeping it anyway.
Doing so comes with a price: if a GNU style caller has called it with a
function pointer that matches the BSD style comparator as thunk, the code
would be broken because now it would be calling the BSD interface.
The GNU libc defines the
thunk parameter as
void *, and from the
intention of the parameter, the supermajority of programmers shall be
using a data pointer for
thunk, so I have decided to just sacrifice
this usage in favor of compatibility with traditional BSD interface.
Those who really need to pass a function pointer as
explicitly cast it to
There is another limitation with
_Generic: it does not work with C++.
In my latest revision to the patch, I have created a
static inline function
qsort_r with the historical BSD function signature, which calls the
qsort_r(3) interface directly.
So in summary, C or C++ code calling the BSD style interface would call the
historical interface, and others would be calling the new, standardized (GNU)
interface. The only exception is that GNU callers who passes a function pointer
as thunk, and that function pointer happen to match the historical BSD style
interface, in which case, they should be casted to
void * (and they should)
The kernel ships with a version of
qsort_r, too. It would be replaced with
the standardized API at the same time, but no source level compatibility shims
will be provided, because the kernel
qsort_r was undocumented.
This change is mainly for consistency and code sharing between kernel and
configure scripts? In a recent survey performed by FreeBSD portmgr
team (kudos to Antoine Brodin who did it) with an
exp-run, there are many
applications that only checks if the system is *BSD, and in which case would
use the BSD interface without checking, and because our compatibility shims,
some of these scripts have to be patched to use the
(qsort_r) notion instead.
My plan is to fix these as much as possible, or to the extent that the popular
packages are patched, before the
qsort_r(3) change gets landed.
What would happen next? And what would be the best practices for application writers?
My plan is to make the new
qsort_r(3) interface available as of
FreeBSD 14.0-RELEASE. The change is disruptive, so it would not be
backported to earlier FreeBSD releases. The current progress can be seen
at the code review and the exp-run bug.
For application writers who target FreeBSD exclusively, no code modification is needed and the code should continue to work.
For application writers who target primarily Linux systems, the current
proposal will make the code work unless it happens to redefine
qsort_r to the
standardized form without parentheses, in which case the fix is fairly easy:
add a pair of parentheses around
qsort_r, or don’t redefine it in the code,
because stdlib.h should have done that for you and if it didn’t, it is fine to
complain and have the distro fix it.
For applications writers who target both Linux and FreeBSD platforms, we
recommend using redefinition to detect the variant, if the application’s
support for both variants are native (some libraries,
like Isaac Turner’s sort_r library
involves a translation adapter, which could add cost for one variant).
If your application does relies on an adapter, use compile + link detection,
and when both variants are available, use the one that directly calls
for optimal performance.
When no configure script is used, the detection of FreeBSD should be amended from:
#if (defined(__FreeBSD__) && !defined(qsort_r))
And the equivalent of:
Shall be amended to:
#if defined(__GLIBC__) || (defined (__FreeBSD__) && defined(qsort_r))
and if the code chooses to redefine
qsort_r, wrap it with parentheses like