Moving toward a standarized qsort_r(3)
History of qsort_r(3)
First introduced in Research Unix V3 and eventually standardized as part of C Standard Library as of ANSI/ISO C89, qsort(3) provided an abstract interface where programmers can perform sort operation over an array of objects, by supplying a pointer to that array, the total number of the member objects, the size of individual member object, as well as a compare function. The compare function is expected to take two parameters, both pointers to a member object, and it shall return a negative number, 0, or a positive number, if the first object is considered smaller, equal, or greater respectively.
The standard did not require a specific algorithm to be used for
qsort(3)
. On BSD derived systems, there were several major rewrites
of the function. It was initially introduced in 1980, possibly by
Bill Joy. In 1983, qsort(3)
was first rewritten
by Earl Cohen for better performance; Keith Bostic reimplemented it in 1990
which in turn was replaced in 1992 by the Bentley & McIlroy implementation,
which was later described in detail in the Engineering a Sort Function paper
by the two authors, which is the basis of qsort(3)
implementation in modern
BSD systems.
This interface does have some limitations. In some use cases, the caller may want to have a context associated with the data being compared, for example. Because the comparator function only takes two parameters, which both represents pointers to different data members, it is tricky to have the compare function to figure out what that context was, so usually programmers would end up using a global variable to store the context, which can become a problem later, because the code is no longer reentrant/thread safe.
To address this, in 2002, Garrett Wollman
created a new interface called qsort_r(3)
(eca67d510483).
This new interface added a new parameter, thunk
, to the standard qsort(3)
interface. The parameter is passed down to the comparator function, which can
be used to represent the context for the comparator.
Now we have an extension which is not yet standardized, and unfortunately and unsurprisingly, some people have different opinions when it comes to adding new parameters.
The problem
Although it is not entirely impossible to use C syntax to represent objects,
it is not very straightforward to make use of features that are native to
object oriented programming languages such as C++. For example, it was tricky
to support polymorphism in C without some effort, until the recent introduction
of _Generic
in the C11 standard that enabled selection of function at compile
time based on the parameters. And even with _Generic
, supporting existing
code can be a headache, because existing binaries would be expecting the old
symbol to exist.
Therefore, it is a big deal when introducing a new interface in the C library and decisions have to be taken seriously. It is especially important to not introduce something incompatible with others, because it would be a nightmare for software developers who develop software that support multiple platforms.
In 2004, there was a feature request for GNU libc to implement the same interface, which was initially rejected by glibc maintainers, citing that there are other ways to perform the same task and it was not a standardized interface.
I would say that the critics were exactly right for a C Runtime library that
is being used by many people, however, apparently they have changed their
mind later. In 2007, the glibc maintainers decided to invent their own,
slightly different and conflicting qsort_r(3)
interface.
Now, there are two different qsort_r(3)
APIs. The BSD qsort_r(3)
is:
void qsort_r(void *base, size_t nmemb, size_t size, void *thunk,
int (*compar)(void *, const void *, const void *));
And the GNU qsort_r(3)
is:
void qsort_r(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *, void *), void *thunk);
The BSD qsort_r(3) interface
Personally, I like the BSD qsort_r(3)
interface more, because for the
comparator function, the first parameter is the context pointer (thunk
).
When implementing C Block API
(namely, qsort_b(3)
) and when we want the context pointer (block literal) to
be passed to the comparator function, this way of parameter ordering has the
added benefit that it happens to match how C Block ABI passes the block literal
pointer. In fact, this is exactly why qsort_b(3)
is implemented using
qsort_r(3)
currently. With the GNU qsort_r(3)
interface, this will
no longer be possible.
How third party software is supporting different qsort_r(3)
interfaces today?
Just like Dr. Malcolm said in the movie Jurassic Park: Life, uh, finds a way. Developers have come to different ways to solve this.
Since the qsort_r(3)
interface was first introduced in FreeBSD, it is
relatively easy to use the operating system as a signal for which variant
of interface is used. This works to some extent, and has the benefit of
not requiring a configure script and using a header file, but if the user
is using a different C runtime library, it is not accurate.
So it seems that it is better to have a configure script of some kind to detect and have the code to do things differently. There are many different ways of detecting the interface style, but they can be basically be categorized into three ways:
- detect the OS name, which shares the same problem of the approach above, and
- to have configure script to compile and link a small program that calls
qsort_r(3)
, and - to have configure script to compile a small program that redefines
qsort_r(3)
the same way of the desired style.
The last two methods are more reliable, and the first method can be changed to work in a not very future-proof way. I will discuss different options later.
What about other sort routines?
In 2020 (0d2fabfc0439), Edward Tomasz Napierala added a qsort_s(3) as defined in C11 standard in the hope that it would help make porting software from Linux easier.
Note that while qsort_s(3)
is now officially standardized, it is not
guaranteed that you will have the same interface on other platforms.
Windows, for example, have
its own version of qsort_s that was different.
GNU libc, on the other hand, did not implement qsort_s(3)
.
The standardization of qsort_r(3)
In 2014, Austin Group (the standardization body for POSIX) has a discussion
to make the GNU qsort_r(3) POSIX standard, and in 2021 the proposal was adopted.
As much as I dislike the GNU qsort_r(3)
interface as explained above, I still
see this as a positive move because, after years, there is some hope that one
can just write code in one way and expect it to work everywhere.
Our work toward standardized qsort_r(3)
There were several attempts to address the problems related to standardization
of qsort_r(3)
issue. Ed Schouten in 2018 started the latest effort of adopting
the standardized qsort_r(3)
, but the change was never landed because so many
ports were affected by the change. During the labor day vacation, I have decided
to take a look at the review queue and discovered that this is still in
progress. So I went ahead and take the responsibility of making this
happen based on Ed’s earlier work.
At FreeBSD, we take ABI stability very seriously and are proud of it. What is ABI (Application binary interface)? It is the interface between two binary modules, this includes system call interfaces and shared library interfaces, among other interfaces. ABI stability is maintained by keeping the old interfaces available (sometimes as an optional installable package, sometimes as an default-enabled option in official builds, and typically implemented as a thin shim wrapper around the new interface). By keeping ABI stability, binaries built for an earlier release of the operating system can continue to run without modification.
As of FreeBSD 7.0, versioned symbols were implemented for the C Runtime
Library, also known as libc.so
. Because of the availability of versioned
symbols, we can now support both interfaces in binaries by providing two
versions of the symbol: the default, qsort_r@@FBSD_1.7
and the historical
qsort_r@FBSD_1.0
.
For FreeBSD, the qsort
family of functions are sharing the same template
and instantiated for individual functions to support their function
signatures, because the bodies are largely identical. In Ed’s proposal,
the new qsort_r
is implemented by creating a new type of CMP
macro
as well as cmp_t
. The historical qsort_r
is renamed to __qsort_r_compat
,
and qsort_r@FBSD_1.0
will reference it.
As stated before, the BSD version of qsort_r(3)
has a function signature
that is friendly to C Block ABI. Since we will keep the historical
qsort_r(3)
implementation, there is no point in making an adapter
around qsort_b(3)
comparator function, because that would be inefficient.
Instead, qsort_b(3)
has been modified to call the traditional qsort_r(3)
interface directly.
For stdlib.h
, we would be offering the POSIX (GNU) version of qsort_r(3)
definition, and convert all in-tree callers to use the new interface. However,
there are far too many software that already did their inaccurate detection
of BSD style qsort_r(3)
and some of them will use the historical interface
when they are built on FreeBSD, which will cause the application to crash.
The initial version of Ed’s proposal has defined a macro called qsort_r
when
_Generic
is available. What it did was to check the 4th parameter to qsort_r
,
and if it looked like the comparator function pointer, it would convert the
call to call an integer, which is invalid in C, instead.
This approach will help us to detect applications that blindly use the
historical BSD qsort_r
function signature on FreeBSD and break these
applications, but it has two drawbacks:
- It would break configure scripts that redefines
qsort_r
, becauseqsort_r
is now a macro; the fix is easy: add a pair of parentheses around theqsort_r
; and - it would break existing code that was written exclusively for FreeBSD.
Because _Generic
can also be used to redirect calls with the BSD style
comparator, an obvious alternative to this approach would be to just
redirect the call to the historical interface, as we are keeping it anyway.
Doing so comes with a price: if a GNU style caller has called it with a
function pointer that matches the BSD style comparator as thunk, the code
would be broken because now it would be calling the BSD interface.
The GNU libc defines the thunk
parameter as void *
, and from the
intention of the parameter, the supermajority of programmers shall be
using a data pointer for thunk
, so I have decided to just sacrifice
this usage in favor of compatibility with traditional BSD interface.
Those who really need to pass a function pointer as thunk
should
explicitly cast it to void *
.
There is another limitation with _Generic
: it does not work with C++.
In my latest revision to the patch, I have created a static inline
function
called qsort_r
with the historical BSD function signature, which calls the
historical qsort_r(3)
interface directly.
So in summary, C or C++ code calling the BSD style interface would call the
historical interface, and others would be calling the new, standardized (GNU)
interface. The only exception is that GNU callers who passes a function pointer
as thunk, and that function pointer happen to match the historical BSD style
interface, in which case, they should be casted to void *
(and they should)
when calling qsort_r
.
The kernel ships with a version of qsort_r
, too. It would be replaced with
the standardized API at the same time, but no source level compatibility shims
will be provided, because the kernel qsort_r
was undocumented.
This change is mainly for consistency and code sharing between kernel and
userland.
What about configure
scripts? In a recent survey performed by FreeBSD portmgr
team (kudos to Antoine Brodin who did it) with an exp-run
, there are many
applications that only checks if the system is *BSD, and in which case would
use the BSD interface without checking, and because our compatibility shims,
some of these scripts have to be patched to use the (qsort_r)
notion instead.
My plan is to fix these as much as possible, or to the extent that the popular
packages are patched, before the qsort_r(3)
change gets landed.
What would happen next? And what would be the best practices for application writers?
My plan is to make the new qsort_r(3)
interface available as of
FreeBSD 14.0-RELEASE. The change is disruptive, so it would not be
backported to earlier FreeBSD releases. The current progress can be seen
at the code review and the exp-run bug.
For application writers who target FreeBSD exclusively, no code modification is needed and the code should continue to work.
For application writers who target primarily Linux systems, the current
proposal will make the code work unless it happens to redefine qsort_r
to the
standardized form without parentheses, in which case the fix is fairly easy:
add a pair of parentheses around qsort_r
, or don’t redefine it in the code,
because stdlib.h should have done that for you and if it didn’t, it is fine to
complain and have the distro fix it.
For applications writers who target both Linux and FreeBSD platforms, we
recommend using redefinition to detect the variant, if the application’s
support for both variants are native (some libraries,
like Isaac Turner’s sort_r library
involves a translation adapter, which could add cost for one variant).
If your application does relies on an adapter, use compile + link detection,
and when both variants are available, use the one that directly calls qsort_r
for optimal performance.
When no configure script is used, the detection of FreeBSD should be amended from:
#if defined(__FreeBSD__)
To:
#if (defined(__FreeBSD__) && !defined(qsort_r))
And the equivalent of:
#if defined(__GLIBC__)
Shall be amended to:
#if defined(__GLIBC__) || (defined (__FreeBSD__) && defined(qsort_r))
and if the code chooses to redefine qsort_r
, wrap it with parentheses like
(qsort_r)
.