Commit 93a49594 authored by Tollef Fog Heen

Import jemalloc


git-svn-id: http://www.varnish-cache.org/svn/trunk/varnish-cache@3215 d4fa192b-c00b-0410-8231-f00ffab90ce4
parent 38368054
CFLAGS := -O3 -g
# See source code comments to avoid memory leaks when enabling MALLOC_MAG.
#CPPFLAGS := -DMALLOC_PRODUCTION -DMALLOC_MAG
CPPFLAGS := -DMALLOC_PRODUCTION

all: libjemalloc.so.0 libjemalloc_mt.so.0

jemalloc_linux_mt.o: jemalloc_linux.c
	gcc $(CFLAGS) -c -DPIC -fPIC $(CPPFLAGS) -D__isthreaded=true -o $@ $+

jemalloc_linux.o: jemalloc_linux.c
	gcc $(CFLAGS) -c -DPIC -fPIC $(CPPFLAGS) -D__isthreaded=false -o $@ $+

libjemalloc_mt.so.0: jemalloc_linux_mt.o
	gcc -shared -lpthread -o $@ $+
	ln -sf $@ libjemalloc_mt.so

libjemalloc.so.0: jemalloc_linux.o
	gcc -shared -lpthread -o $@ $+
	ln -sf $@ libjemalloc.so

clean:
	rm -f *.o *.so.0 *.so
This is a minimal-effort stand-alone jemalloc distribution for Linux. The main
rough spots are:

* __isthreaded must be hard-coded, since the pthreads library really needs to
  be involved in order to toggle it at run time. Therefore, this distribution
  builds two separate libraries:

  + libjemalloc_mt.so.0 : Use for multi-threaded applications.
  + libjemalloc.so.0 : Use for single-threaded applications.

  Both libraries link against libpthread, though with a bit more code hacking,
  this dependency could be removed for the single-threaded version.

* MALLOC_MAG (thread-specific caching, using magazines) is disabled, because
  special effort is required to avoid memory leaks when it is enabled. To make
  cleanup automatic, we would need help from the pthreads library. If you
  enable MALLOC_MAG, be sure to call _malloc_thread_cleanup() in each thread
  just before it exits (see the sketch after this list).

* The code that determines the number of CPUs is sketchy. The trouble is that
  we must avoid any memory allocation during early initialization.
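
A minimal sketch of that cleanup call (the worker function and the prototype
declaration are illustrative; only _malloc_thread_cleanup() itself comes from
this distribution):

  #include <pthread.h>
  #include <stdlib.h>

  /* Provided by jemalloc when built with MALLOC_MAG (prototype assumed). */
  void _malloc_thread_cleanup(void);

  /* Thread start routine passed to pthread_create(). */
  static void *worker(void *arg)
  {
          void *buf = malloc(4096);

          (void)arg;
          /* ... use buf ... */
          free(buf);
          _malloc_thread_cleanup();  /* flush this thread's magazines */
          return NULL;
  }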

In order to build:

  make

This generates two shared libraries, which you can either link against or
pre-load.

Linking and running, where /path/to is the path to libjemalloc (-lpthread is
required even for libjemalloc.so):

  gcc app.o -o app -L/path/to -ljemalloc_mt -lpthread
  LD_LIBRARY_PATH=/path/to app

Pre-loading:

  LD_PRELOAD=/path/to/libjemalloc_mt.so.0 app

jemalloc has a lot of run-time tuning options. See the man page for details:

  nroff -man malloc.3 | less

In particular, take a look at the B, F, and N options. If you enable
MALLOC_MAG, look at the G and R options.

If your application is crashing, or performance seems to be lacking, enable
assertions and statistics gathering by removing MALLOC_PRODUCTION from CPPFLAGS
in the Makefile. In order to print a statistics summary at program exit, run
your application like:

  LD_PRELOAD=/path/to/libjemalloc_mt.so.0 MALLOC_OPTIONS=P app
Please contact Jason Evans <jasone@canonware.com> with questions, comments, bug
reports, etc.
.\" Copyright (c) 1980, 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the American National Standards Committee X3, on Information
.\" Processing Systems.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)malloc.3 8.1 (Berkeley) 6/4/93
.\" $FreeBSD: head/lib/libc/stdlib/malloc.3 182225 2008-08-27 02:00:53Z jasone $
.\"
.Dd August 26, 2008
.Dt MALLOC 3
.Os
.Sh NAME
.Nm malloc , calloc , realloc , free , reallocf , malloc_usable_size
.Nd general purpose memory allocation functions
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In stdlib.h
.Ft void *
.Fn malloc "size_t size"
.Ft void *
.Fn calloc "size_t number" "size_t size"
.Ft void *
.Fn realloc "void *ptr" "size_t size"
.Ft void *
.Fn reallocf "void *ptr" "size_t size"
.Ft void
.Fn free "void *ptr"
.Ft const char *
.Va _malloc_options ;
.Ft void
.Fo \*(lp*_malloc_message\*(rp
.Fa "const char *p1" "const char *p2" "const char *p3" "const char *p4"
.Fc
.In malloc_np.h
.Ft size_t
.Fn malloc_usable_size "const void *ptr"
.Sh DESCRIPTION
The
.Fn malloc
function allocates
.Fa size
bytes of uninitialized memory.
The allocated space is suitably aligned (after possible pointer coercion)
for storage of any type of object.
.Pp
The
.Fn calloc
function allocates space for
.Fa number
objects,
each
.Fa size
bytes in length.
The result is identical to calling
.Fn malloc
with an argument of
.Dq "number * size" ,
with the exception that the allocated memory is explicitly initialized
to zero bytes.
.Pp
The
.Fn realloc
function changes the size of the previously allocated memory referenced by
.Fa ptr
to
.Fa size
bytes.
The contents of the memory are unchanged up to the lesser of the new and
old sizes.
If the new size is larger,
the contents of the newly allocated portion of the memory are undefined.
Upon success, the memory referenced by
.Fa ptr
is freed and a pointer to the newly allocated memory is returned.
Note that
.Fn realloc
and
.Fn reallocf
may move the memory allocation, resulting in a different return value than
.Fa ptr .
If
.Fa ptr
is
.Dv NULL ,
the
.Fn realloc
function behaves identically to
.Fn malloc
for the specified size.
.Pp
The
.Fn reallocf
function is identical to the
.Fn realloc
function, except that it
will free the passed pointer when the requested memory cannot be allocated.
This is a
.Fx
specific API designed to ease the problems with traditional coding styles
for realloc causing memory leaks in libraries.
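.Pp
For example (a sketch; the variable names and error handling are illustrative):
.Bd -literal -offset indent
/* Traditional pattern; leaks the old buffer if realloc() fails: */
buf = realloc(buf, nsize);

/* With reallocf(), the old buffer is freed on failure instead: */
if ((buf = reallocf(buf, nsize)) == NULL)
	return (-1);
.Ed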
.Pp
The
.Fn free
function causes the allocated memory referenced by
.Fa ptr
to be made available for future allocations.
If
.Fa ptr
is
.Dv NULL ,
no action occurs.
.Pp
The
.Fn malloc_usable_size
function returns the usable size of the allocation pointed to by
.Fa ptr .
The return value may be larger than the size that was requested during
allocation.
The
.Fn malloc_usable_size
function is not a mechanism for in-place
.Fn realloc ;
rather it is provided solely as a tool for introspection purposes.
Any discrepancy between the requested allocation size and the size reported by
.Fn malloc_usable_size
should not be depended on, since such behavior is entirely
implementation-dependent.
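.Pp
For example (a sketch for introspection only; the variable names are
illustrative):
.Bd -literal -offset indent
void *p = malloc(100);
size_t reserved = malloc_usable_size(p);	/* may be larger than 100 */
.Ed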
.Sh TUNING
When the first call is made to one of these memory allocation routines,
various flags are set or reset once, which affects the workings of this
allocator implementation.
.Pp
The
.Dq name
of the file referenced by the symbolic link named
.Pa /etc/malloc.conf ,
the value of the environment variable
.Ev MALLOC_OPTIONS ,
and the string pointed to by the global variable
.Va _malloc_options
will be interpreted, in that order, from left to right as flags.
.Pp
Each flag is a single letter, optionally prefixed by a non-negative base 10
integer repetition count.
For example,
.Dq 3N
is equivalent to
.Dq NNN .
Some flags control parameter magnitudes, where uppercase increases the
magnitude, and lowercase decreases the magnitude.
Other flags control boolean parameters, where uppercase indicates that a
behavior is set, or on, and lowercase means that a behavior is not set, or off.
.Bl -tag -width indent
.It A
All warnings (except for the warning about unknown
flags being set) become fatal.
The process will call
.Xr abort 3
in these cases.
.It B
Double/halve the per-arena lock contention threshold at which a thread is
randomly re-assigned to an arena.
This dynamic load balancing tends to push threads away from highly contended
arenas, which avoids worst case contention scenarios in which threads
disproportionately utilize arenas.
However, due to the highly dynamic load that applications may place on the
allocator, it is impossible for the allocator to know in advance how sensitive
it should be to contention over arenas.
Therefore, some applications may benefit from increasing or decreasing this
threshold parameter.
This option is not available for some configurations (non-PIC).
.It C
Double/halve the size of the maximum size class that is a multiple of the
cacheline size (64).
Above this size, subpage spacing (256 bytes) is used for size classes.
The default value is 512 bytes.
.It D
Use
.Xr sbrk 2
to acquire memory in the data storage segment (DSS).
This option is enabled by default.
See the
.Dq M
option for related information and interactions.
.It F
Double/halve the per-arena maximum number of dirty unused pages that are
allowed to accumulate before informing the kernel about at least half of those
pages via
.Xr madvise 2 .
This provides the kernel with sufficient information to recycle dirty pages if
physical memory becomes scarce and the pages remain unused.
The default is 512 pages per arena;
.Ev MALLOC_OPTIONS=10f
will prevent any dirty unused pages from accumulating.
.It G
When there are multiple threads, use thread-specific caching for objects that
are smaller than one page.
This option is enabled by default.
Thread-specific caching allows many allocations to be satisfied without
performing any thread synchronization, at the cost of increased memory use.
See the
.Dq R
option for related tuning information.
This option is not available for some configurations (non-PIC).
.It J
Each byte of new memory allocated by
.Fn malloc ,
.Fn realloc
or
.Fn reallocf
will be initialized to 0xa5.
All memory returned by
.Fn free ,
.Fn realloc
or
.Fn reallocf
will be initialized to 0x5a.
This is intended for debugging and will impact performance negatively.
.It K
Double/halve the virtual memory chunk size.
The default chunk size is 1 MB.
.It M
Use
.Xr mmap 2
to acquire anonymously mapped memory.
This option is enabled by default.
If both the
.Dq D
and
.Dq M
options are enabled, the allocator prefers the DSS over anonymous mappings,
but allocation only fails if memory cannot be acquired via either method.
If neither option is enabled, then the
.Dq M
option is implicitly enabled in order to assure that there is a method for
acquiring memory.
.It N
Double/halve the number of arenas.
The default number of arenas is two times the number of CPUs, or one if there
is a single CPU.
.It P
Various statistics are printed at program exit via an
.Xr atexit 3
function.
This has the potential to cause deadlock for a multi-threaded process that exits
while one or more threads are executing in the memory allocation functions.
Therefore, this option should only be used with care; it is primarily intended
as a performance tuning aid during application development.
.It Q
Double/halve the size of the maximum size class that is a multiple of the
quantum (8 or 16 bytes, depending on architecture).
Above this size, cacheline spacing is used for size classes.
The default value is 128 bytes.
.It R
Double/halve magazine size, which approximately doubles/halves the number of
rounds in each magazine.
Magazines are used by the thread-specific caching machinery to acquire and
release objects in bulk.
Increasing the magazine size decreases locking overhead, at the expense of
increased memory usage.
This option is not available for some configurations (non-PIC).
.It U
Generate
.Dq utrace
entries for
.Xr ktrace 1 ,
for all operations.
Consult the source for details on this option.
.It V
Attempting to allocate zero bytes will return a
.Dv NULL
pointer instead of
a valid pointer.
(The default behavior is to make a minimal allocation and return a
pointer to it.)
This option is provided for System V compatibility.
This option is incompatible with the
.Dq X
option.
.It X
Rather than return failure for any allocation function,
display a diagnostic message on
.Dv stderr
and cause the program to drop
core (using
.Xr abort 3 ) .
This option should be set at compile time by including the following in
the source code:
.Bd -literal -offset indent
_malloc_options = "X";
.Ed
.It Z
Each byte of new memory allocated by
.Fn malloc ,
.Fn realloc
or
.Fn reallocf
will be initialized to 0.
Note that this initialization only happens once for each byte, so
.Fn realloc
and
.Fn reallocf
calls do not zero memory that was previously allocated.
This is intended for debugging and will impact performance negatively.
.El
.Pp
The
.Dq J
and
.Dq Z
options are intended for testing and debugging.
An application which changes its behavior when these options are used
is flawed.
.Sh IMPLEMENTATION NOTES
Traditionally, allocators have used
.Xr sbrk 2
to obtain memory, which is suboptimal for several reasons, including race
conditions, increased fragmentation, and artificial limitations on maximum
usable memory.
This allocator uses both
.Xr sbrk 2
and
.Xr mmap 2
by default, but it can be configured at run time to use only one or the other.
If resource limits are not a primary concern, the preferred configuration is
.Ev MALLOC_OPTIONS=dM
or
.Ev MALLOC_OPTIONS=DM .
When so configured, the
.Ar datasize
resource limit has little practical effect for typical applications; use
.Ev MALLOC_OPTIONS=Dm
if that is a concern.
Regardless of allocator configuration, the
.Ar vmemoryuse
resource limit can be used to bound the total virtual memory used by a
process, as described in
.Xr limits 1 .
.Pp
This allocator uses multiple arenas in order to reduce lock contention for
threaded programs on multi-processor systems.
This works well with regard to threading scalability, but incurs some costs.
There is a small fixed per-arena overhead, and additionally, arenas manage
memory completely independently of each other, which means a small fixed
increase in overall memory fragmentation.
These overheads are not generally an issue, given the number of arenas normally
used.
Note that using substantially more arenas than the default is not likely to
improve performance, mainly due to reduced cache performance.
However, it may make sense to reduce the number of arenas if an application
does not make much use of the allocation functions.
.Pp
In addition to multiple arenas, this allocator supports thread-specific
caching for small objects (smaller than one page), in order to make it
possible to completely avoid synchronization for most small allocation requests.
Such caching allows very fast allocation in the common case, but it increases
memory usage and fragmentation, since a bounded number of objects can remain
allocated in each thread cache.
.Pp
Memory is conceptually broken into equal-sized chunks, where the chunk size is
a power of two that is greater than the page size.
Chunks are always aligned to multiples of the chunk size.
This alignment makes it possible to find metadata for user objects very
quickly.
.Pp
User objects are broken into three categories according to size: small, large,
and huge.
Small objects are smaller than one page.
Large objects are smaller than the chunk size.
Huge objects are a multiple of the chunk size.
Small and large objects are managed by arenas; huge objects are managed
separately in a single data structure that is shared by all threads.
Huge objects are used by applications infrequently enough that this single
data structure is not a scalability issue.
.Pp
Each chunk that is managed by an arena tracks its contents as runs of
contiguous pages (unused, backing a set of small objects, or backing one large
object).
The combination of chunk alignment and chunk page maps makes it possible to
determine all metadata regarding small and large allocations in constant time.
.Pp
Small objects are managed in groups by page runs.
Each run maintains a bitmap that tracks which regions are in use.
Allocation requests that are no more than half the quantum (8 or 16, depending
on architecture) are rounded up to the nearest power of two.
Allocation requests that are more than half the quantum, but no more than the
minimum cacheline-multiple size class (see the
.Dq Q
option) are rounded up to the nearest multiple of the quantum.
Allocation requests that are more than the minimum cacheline-multiple size
class, but no more than the minimum subpage-multiple size class (see the
.Dq C
option) are rounded up to the nearest multiple of the cacheline size (64).
Allocation requests that are more than the minimum subpage-multiple size class
are rounded up to the nearest multiple of the subpage size (256).
Allocation requests that are more than one page, but small enough to fit in
an arena-managed chunk (see the
.Dq K
option), are rounded up to the nearest run size.
Allocation requests that are too large to fit in an arena-managed chunk are
rounded up to the nearest multiple of the chunk size.
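.Pp
For example, assuming a quantum of 16 bytes and the default
.Dq Q
and
.Dq C
size class boundaries (128 and 512 bytes), requests round up as follows
(illustrative values only):
.Bd -literal -offset indent
request    7  -->  size class    8  (power of two)
request  100  -->  size class  112  (multiple of the quantum)
request  200  -->  size class  256  (multiple of the cacheline size)
request 1000  -->  size class 1024  (multiple of the subpage size)
.Ed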
.Pp
Allocations are packed tightly together, which can be an issue for
multi-threaded applications.
If you need to assure that allocations do not suffer from cacheline sharing,
round your allocation requests up to the nearest multiple of the cacheline
size.
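.Pp
A minimal sketch of such rounding (the cacheline size of 64 bytes is taken
from the description above):
.Bd -literal -offset indent
#define CACHELINE ((size_t)64)

size_t padded = (size + CACHELINE - 1) & ~(CACHELINE - 1);
void *p = malloc(padded);
.Ed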
.Sh DEBUGGING MALLOC PROBLEMS
The first thing to do is to set the
.Dq A
option.
This option forces a coredump (if possible) at the first sign of trouble,
rather than the normal policy of trying to continue if at all possible.
.Pp
It is probably also a good idea to recompile the program with suitable
options and symbols for debugger support.
.Pp
If the program starts to give unusual results, coredump or generally behave
differently without emitting any of the messages mentioned in the next
section, it is likely because it depends on the storage being filled with
zero bytes.
Try running it with the
.Dq Z
option set;
if that improves the situation, this diagnosis has been confirmed.
If the program still misbehaves,
the likely problem is accessing memory outside the allocated area.
.Pp
Alternatively, if the symptoms are not easy to reproduce, setting the
.Dq J
option may help provoke the problem.
.Pp
In truly difficult cases, the
.Dq U
option, if supported by the kernel, can provide a detailed trace of
all calls made to these functions.
.Pp
Unfortunately this implementation does not provide much detail about
the problems it detects; the performance impact for storing such information
would be prohibitive.
There are a number of allocator implementations available on the Internet
which focus on detecting and pinpointing problems by trading performance for
extra sanity checks and detailed diagnostics.
.Sh DIAGNOSTIC MESSAGES
If any of the memory allocation/deallocation functions detect an error or
warning condition, a message will be printed to file descriptor
.Dv STDERR_FILENO .
Errors will result in the process dumping core.
If the
.Dq A
option is set, all warnings are treated as errors.
.Pp
The
.Va _malloc_message
variable allows the programmer to override the function which emits
the text strings forming the errors and warnings if for some reason
the
.Dv stderr
file descriptor is not suitable for this.
Please note that doing anything which tries to allocate memory in
this function is likely to result in a crash or deadlock.
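.Pp
For example (a sketch; the
.Va log_fd
descriptor is hypothetical, and
.Xr write 2
is used because this function must not allocate memory):
.Bd -literal -offset indent
static void
log_message(const char *p1, const char *p2, const char *p3,
    const char *p4)
{
	write(log_fd, p1, strlen(p1));
	write(log_fd, p2, strlen(p2));
	write(log_fd, p3, strlen(p3));
	write(log_fd, p4, strlen(p4));
}

/* During program initialization: */
_malloc_message = log_message;
.Ed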
.Pp
All messages are prefixed by
.Dq Ao Ar progname Ac Ns Li : (malloc) .
.Sh RETURN VALUES
The
.Fn malloc
and
.Fn calloc
functions return a pointer to the allocated memory if successful; otherwise
a
.Dv NULL
pointer is returned and
.Va errno
is set to
.Er ENOMEM .
.Pp
The
.Fn realloc
and
.Fn reallocf
functions return a pointer, possibly identical to
.Fa ptr ,
to the allocated memory
if successful; otherwise a
.Dv NULL
pointer is returned, and
.Va errno
is set to
.Er ENOMEM
if the error was the result of an allocation failure.
The
.Fn realloc
function always leaves the original buffer intact
when an error occurs, whereas
.Fn reallocf
deallocates it in this case.
.Pp
The
.Fn free
function returns no value.
.Pp
The
.Fn malloc_usable_size
function returns the usable size of the allocation pointed to by
.Fa ptr .
.Sh ENVIRONMENT
The following environment variables affect the execution of the allocation
functions:
.Bl -tag -width ".Ev MALLOC_OPTIONS"
.It Ev MALLOC_OPTIONS
If the environment variable
.Ev MALLOC_OPTIONS
is set, the characters it contains will be interpreted as flags to the
allocation functions.
.El
.Sh EXAMPLES
To dump core whenever a problem occurs:
.Pp
.Bd -literal -offset indent
ln -s 'A' /etc/malloc.conf
.Ed
.Pp
To specify in the source that a program does no return value checking
on calls to these functions:
.Bd -literal -offset indent
_malloc_options = "X";
.Ed
.Sh SEE ALSO
.Xr limits 1 ,
.Xr madvise 2 ,
.Xr mmap 2 ,
.Xr sbrk 2 ,
.Xr alloca 3 ,
.Xr atexit 3 ,
.Xr getpagesize 3 ,
.Xr memory 3 ,
.Xr posix_memalign 3
.Sh STANDARDS
The
.Fn malloc ,
.Fn calloc ,
.Fn realloc
and
.Fn free
functions conform to
.St -isoC .
.Sh HISTORY
The
.Fn reallocf
function first appeared in
.Fx 3.0 .
.Pp
The
.Fn malloc_usable_size
function first appeared in
.Fx 7.0 .
/******************************************************************************
*
* Copyright (C) 2008 Jason Evans <jasone@FreeBSD.org>.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice(s), this list of conditions and the following disclaimer
* unmodified other than the allowable addition of one or more
* copyright notices.
* 2. Redistributions in binary form must reproduce the above copyright
* notice(s), this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
* EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
* PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
* BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
* WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
* OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
* EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
******************************************************************************
*
* cpp macro implementation of left-leaning red-black trees.
*
* Usage:
*
* (Optional, see assert(3).)
* #define NDEBUG
*
* (Required.)
* #include <assert.h>
* #include <rb.h>
* ...
*
* All operations are done non-recursively. Parent pointers are not used, and
* color bits are stored in the least significant bit of right-child pointers,
* thus making node linkage as compact as is possible for red-black trees.
*
* Some macros use a comparison function pointer, which is expected to have the
* following prototype:
*
*   int (a_cmp *)(a_type *a_node, a_type *a_other);
*                         ^^^^^^
*                      or a_key
*
* Interpretation of comparison function return values:
*
* -1 : a_node < a_other
* 0 : a_node == a_other
* 1 : a_node > a_other
*
* In all cases, the a_node or a_key macro argument is the first argument to the
* comparison function, which makes it possible to write comparison functions
* that treat the first argument specially.
*
******************************************************************************/
#ifndef RB_H_
#define RB_H_
//__FBSDID("$FreeBSD: head/lib/libc/stdlib/rb.h 178995 2008-05-14 18:33:13Z jasone $");
/* Node structure. */
#define rb_node(a_type) \
struct { \
a_type *rbn_left; \
a_type *rbn_right_red; \
}
/* Root structure. */
#define rb_tree(a_type) \
struct { \
a_type *rbt_root; \
a_type rbt_nil; \
}
/* Left accessors. */
#define rbp_left_get(a_type, a_field, a_node) \
((a_node)->a_field.rbn_left)
#define rbp_left_set(a_type, a_field, a_node, a_left) do { \
(a_node)->a_field.rbn_left = a_left; \
} while (0)
/* Right accessors. */
#define rbp_right_get(a_type, a_field, a_node) \
((a_type *) (((intptr_t) (a_node)->a_field.rbn_right_red) \
& ((ssize_t)-2)))
#define rbp_right_set(a_type, a_field, a_node, a_right) do { \
(a_node)->a_field.rbn_right_red = (a_type *) (((uintptr_t) a_right) \
| (((uintptr_t) (a_node)->a_field.rbn_right_red) & ((size_t)1))); \
} while (0)
/* Color accessors. */
#define rbp_red_get(a_type, a_field, a_node) \
((bool) (((uintptr_t) (a_node)->a_field.rbn_right_red) \
& ((size_t)1)))
#define rbp_color_set(a_type, a_field, a_node, a_red) do { \
(a_node)->a_field.rbn_right_red = (a_type *) ((((intptr_t) \
(a_node)->a_field.rbn_right_red) & ((ssize_t)-2)) \
| ((ssize_t)a_red)); \
} while (0)
#define rbp_red_set(a_type, a_field, a_node) do { \
(a_node)->a_field.rbn_right_red = (a_type *) (((uintptr_t) \
(a_node)->a_field.rbn_right_red) | ((size_t)1)); \
} while (0)
#define rbp_black_set(a_type, a_field, a_node) do { \
(a_node)->a_field.rbn_right_red = (a_type *) (((intptr_t) \
(a_node)->a_field.rbn_right_red) & ((ssize_t)-2)); \
} while (0)
/* Node initializer. */
#define rbp_node_new(a_type, a_field, a_tree, a_node) do { \
rbp_left_set(a_type, a_field, (a_node), &(a_tree)->rbt_nil); \
rbp_right_set(a_type, a_field, (a_node), &(a_tree)->rbt_nil); \
rbp_red_set(a_type, a_field, (a_node)); \
} while (0)
/* Tree initializer. */
#define rb_new(a_type, a_field, a_tree) do { \
(a_tree)->rbt_root = &(a_tree)->rbt_nil; \
rbp_node_new(a_type, a_field, a_tree, &(a_tree)->rbt_nil); \
rbp_black_set(a_type, a_field, &(a_tree)->rbt_nil); \
} while (0)
/* Tree operations. */
#define rbp_black_height(a_type, a_field, a_tree, r_height) do { \
a_type *rbp_bh_t; \
for (rbp_bh_t = (a_tree)->rbt_root, (r_height) = 0; \
rbp_bh_t != &(a_tree)->rbt_nil; \
rbp_bh_t = rbp_left_get(a_type, a_field, rbp_bh_t)) { \
if (rbp_red_get(a_type, a_field, rbp_bh_t) == false) { \
(r_height)++; \
} \
} \
} while (0)
#define rbp_first(a_type, a_field, a_tree, a_root, r_node) do { \
for ((r_node) = (a_root); \
rbp_left_get(a_type, a_field, (r_node)) != &(a_tree)->rbt_nil; \
(r_node) = rbp_left_get(a_type, a_field, (r_node))) { \
} \
} while (0)
#define rbp_last(a_type, a_field, a_tree, a_root, r_node) do { \
for ((r_node) = (a_root); \
rbp_right_get(a_type, a_field, (r_node)) != &(a_tree)->rbt_nil; \
(r_node) = rbp_right_get(a_type, a_field, (r_node))) { \
} \
} while (0)
#define rbp_next(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \
if (rbp_right_get(a_type, a_field, (a_node)) \
!= &(a_tree)->rbt_nil) { \
rbp_first(a_type, a_field, a_tree, rbp_right_get(a_type, \
a_field, (a_node)), (r_node)); \
} else { \
a_type *rbp_n_t = (a_tree)->rbt_root; \
assert(rbp_n_t != &(a_tree)->rbt_nil); \
(r_node) = &(a_tree)->rbt_nil; \
while (true) { \
int rbp_n_cmp = (a_cmp)((a_node), rbp_n_t); \
if (rbp_n_cmp < 0) { \
(r_node) = rbp_n_t; \
rbp_n_t = rbp_left_get(a_type, a_field, rbp_n_t); \
} else if (rbp_n_cmp > 0) { \
rbp_n_t = rbp_right_get(a_type, a_field, rbp_n_t); \
} else { \
break; \
} \
assert(rbp_n_t != &(a_tree)->rbt_nil); \
} \
} \
} while (0)
#define rbp_prev(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \
if (rbp_left_get(a_type, a_field, (a_node)) != &(a_tree)->rbt_nil) {\
rbp_last(a_type, a_field, a_tree, rbp_left_get(a_type, \
a_field, (a_node)), (r_node)); \
} else { \
a_type *rbp_p_t = (a_tree)->rbt_root; \
assert(rbp_p_t != &(a_tree)->rbt_nil); \
(r_node) = &(a_tree)->rbt_nil; \
while (true) { \
int rbp_p_cmp = (a_cmp)((a_node), rbp_p_t); \
if (rbp_p_cmp < 0) { \
rbp_p_t = rbp_left_get(a_type, a_field, rbp_p_t); \
} else if (rbp_p_cmp > 0) { \
(r_node) = rbp_p_t; \
rbp_p_t = rbp_right_get(a_type, a_field, rbp_p_t); \
} else { \
break; \
} \
assert(rbp_p_t != &(a_tree)->rbt_nil); \
} \
} \
} while (0)
#define rb_first(a_type, a_field, a_tree, r_node) do { \
rbp_first(a_type, a_field, a_tree, (a_tree)->rbt_root, (r_node)); \
if ((r_node) == &(a_tree)->rbt_nil) { \
(r_node) = NULL; \
} \
} while (0)
#define rb_last(a_type, a_field, a_tree, r_node) do { \
rbp_last(a_type, a_field, a_tree, (a_tree)->rbt_root, r_node); \
if ((r_node) == &(a_tree)->rbt_nil) { \
(r_node) = NULL; \
} \
} while (0)
#define rb_next(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \
rbp_next(a_type, a_field, a_cmp, a_tree, (a_node), (r_node)); \
if ((r_node) == &(a_tree)->rbt_nil) { \
(r_node) = NULL; \
} \
} while (0)
#define rb_prev(a_type, a_field, a_cmp, a_tree, a_node, r_node) do { \
rbp_prev(a_type, a_field, a_cmp, a_tree, (a_node), (r_node)); \
if ((r_node) == &(a_tree)->rbt_nil) { \
(r_node) = NULL; \
} \
} while (0)
#define rb_search(a_type, a_field, a_cmp, a_tree, a_key, r_node) do { \
int rbp_se_cmp; \
(r_node) = (a_tree)->rbt_root; \
while ((r_node) != &(a_tree)->rbt_nil \
&& (rbp_se_cmp = (a_cmp)((a_key), (r_node))) != 0) { \
if (rbp_se_cmp < 0) { \
(r_node) = rbp_left_get(a_type, a_field, (r_node)); \
} else { \
(r_node) = rbp_right_get(a_type, a_field, (r_node)); \
} \
} \
if ((r_node) == &(a_tree)->rbt_nil) { \
(r_node) = NULL; \
} \
} while (0)
/*
* Find a match if it exists. Otherwise, find the next greater node, if one
* exists.
*/
#define rb_nsearch(a_type, a_field, a_cmp, a_tree, a_key, r_node) do { \
a_type *rbp_ns_t = (a_tree)->rbt_root; \
(r_node) = NULL; \
while (rbp_ns_t != &(a_tree)->rbt_nil) { \
int rbp_ns_cmp = (a_cmp)((a_key), rbp_ns_t); \
if (rbp_ns_cmp < 0) { \
(r_node) = rbp_ns_t; \
rbp_ns_t = rbp_left_get(a_type, a_field, rbp_ns_t); \
} else if (rbp_ns_cmp > 0) { \
rbp_ns_t = rbp_right_get(a_type, a_field, rbp_ns_t); \
} else { \
(r_node) = rbp_ns_t; \
break; \
} \
} \
} while (0)
/*
* Find a match if it exists. Otherwise, find the previous lesser node, if one
* exists.
*/
#define rb_psearch(a_type, a_field, a_cmp, a_tree, a_key, r_node) do { \
a_type *rbp_ps_t = (a_tree)->rbt_root; \
(r_node) = NULL; \
while (rbp_ps_t != &(a_tree)->rbt_nil) { \
int rbp_ps_cmp = (a_cmp)((a_key), rbp_ps_t); \
if (rbp_ps_cmp < 0) { \
rbp_ps_t = rbp_left_get(a_type, a_field, rbp_ps_t); \
} else if (rbp_ps_cmp > 0) { \
(r_node) = rbp_ps_t; \
rbp_ps_t = rbp_right_get(a_type, a_field, rbp_ps_t); \
} else { \
(r_node) = rbp_ps_t; \
break; \
} \
} \
} while (0)
#define rbp_rotate_left(a_type, a_field, a_node, r_node) do { \
(r_node) = rbp_right_get(a_type, a_field, (a_node)); \
rbp_right_set(a_type, a_field, (a_node), \
rbp_left_get(a_type, a_field, (r_node))); \
rbp_left_set(a_type, a_field, (r_node), (a_node)); \
} while (0)
#define rbp_rotate_right(a_type, a_field, a_node, r_node) do { \
(r_node) = rbp_left_get(a_type, a_field, (a_node)); \
rbp_left_set(a_type, a_field, (a_node), \
rbp_right_get(a_type, a_field, (r_node))); \
rbp_right_set(a_type, a_field, (r_node), (a_node)); \
} while (0)
#define rbp_lean_left(a_type, a_field, a_node, r_node) do { \
bool rbp_ll_red; \
rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \
rbp_ll_red = rbp_red_get(a_type, a_field, (a_node)); \
rbp_color_set(a_type, a_field, (r_node), rbp_ll_red); \
rbp_red_set(a_type, a_field, (a_node)); \
} while (0)
#define rbp_lean_right(a_type, a_field, a_node, r_node) do { \
bool rbp_lr_red; \
rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \
rbp_lr_red = rbp_red_get(a_type, a_field, (a_node)); \
rbp_color_set(a_type, a_field, (r_node), rbp_lr_red); \
rbp_red_set(a_type, a_field, (a_node)); \
} while (0)
#define rbp_move_red_left(a_type, a_field, a_node, r_node) do { \
a_type *rbp_mrl_t, *rbp_mrl_u; \
rbp_mrl_t = rbp_left_get(a_type, a_field, (a_node)); \
rbp_red_set(a_type, a_field, rbp_mrl_t); \
rbp_mrl_t = rbp_right_get(a_type, a_field, (a_node)); \
rbp_mrl_u = rbp_left_get(a_type, a_field, rbp_mrl_t); \
if (rbp_red_get(a_type, a_field, rbp_mrl_u)) { \
rbp_rotate_right(a_type, a_field, rbp_mrl_t, rbp_mrl_u); \
rbp_right_set(a_type, a_field, (a_node), rbp_mrl_u); \
rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \
rbp_mrl_t = rbp_right_get(a_type, a_field, (a_node)); \
if (rbp_red_get(a_type, a_field, rbp_mrl_t)) { \
rbp_black_set(a_type, a_field, rbp_mrl_t); \
rbp_red_set(a_type, a_field, (a_node)); \
rbp_rotate_left(a_type, a_field, (a_node), rbp_mrl_t); \
rbp_left_set(a_type, a_field, (r_node), rbp_mrl_t); \
} else { \
rbp_black_set(a_type, a_field, (a_node)); \
} \
} else { \
rbp_red_set(a_type, a_field, (a_node)); \
rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \
} \
} while (0)
#define rbp_move_red_right(a_type, a_field, a_node, r_node) do { \
a_type *rbp_mrr_t; \
rbp_mrr_t = rbp_left_get(a_type, a_field, (a_node)); \
if (rbp_red_get(a_type, a_field, rbp_mrr_t)) { \
a_type *rbp_mrr_u, *rbp_mrr_v; \
rbp_mrr_u = rbp_right_get(a_type, a_field, rbp_mrr_t); \
rbp_mrr_v = rbp_left_get(a_type, a_field, rbp_mrr_u); \
if (rbp_red_get(a_type, a_field, rbp_mrr_v)) { \
rbp_color_set(a_type, a_field, rbp_mrr_u, \
rbp_red_get(a_type, a_field, (a_node))); \
rbp_black_set(a_type, a_field, rbp_mrr_v); \
rbp_rotate_left(a_type, a_field, rbp_mrr_t, rbp_mrr_u); \
rbp_left_set(a_type, a_field, (a_node), rbp_mrr_u); \
rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \
rbp_rotate_left(a_type, a_field, (a_node), rbp_mrr_t); \
rbp_right_set(a_type, a_field, (r_node), rbp_mrr_t); \
} else { \
rbp_color_set(a_type, a_field, rbp_mrr_t, \
rbp_red_get(a_type, a_field, (a_node))); \
rbp_red_set(a_type, a_field, rbp_mrr_u); \
rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \
rbp_rotate_left(a_type, a_field, (a_node), rbp_mrr_t); \
rbp_right_set(a_type, a_field, (r_node), rbp_mrr_t); \
} \
rbp_red_set(a_type, a_field, (a_node)); \
} else { \
rbp_red_set(a_type, a_field, rbp_mrr_t); \
rbp_mrr_t = rbp_left_get(a_type, a_field, rbp_mrr_t); \
if (rbp_red_get(a_type, a_field, rbp_mrr_t)) { \
rbp_black_set(a_type, a_field, rbp_mrr_t); \
rbp_rotate_right(a_type, a_field, (a_node), (r_node)); \
rbp_rotate_left(a_type, a_field, (a_node), rbp_mrr_t); \
rbp_right_set(a_type, a_field, (r_node), rbp_mrr_t); \
} else { \
rbp_rotate_left(a_type, a_field, (a_node), (r_node)); \
} \
} \
} while (0)
#define rb_insert(a_type, a_field, a_cmp, a_tree, a_node) do { \
a_type rbp_i_s; \
a_type *rbp_i_g, *rbp_i_p, *rbp_i_c, *rbp_i_t, *rbp_i_u; \
int rbp_i_cmp = 0; \
rbp_i_g = &(a_tree)->rbt_nil; \
rbp_left_set(a_type, a_field, &rbp_i_s, (a_tree)->rbt_root); \
rbp_right_set(a_type, a_field, &rbp_i_s, &(a_tree)->rbt_nil); \
rbp_black_set(a_type, a_field, &rbp_i_s); \
rbp_i_p = &rbp_i_s; \
rbp_i_c = (a_tree)->rbt_root; \
/* Iteratively search down the tree for the insertion point, */\
/* splitting 4-nodes as they are encountered. At the end of each */\
/* iteration, rbp_i_g->rbp_i_p->rbp_i_c is a 3-level path down */\
/* the tree, assuming a sufficiently deep tree. */\
while (rbp_i_c != &(a_tree)->rbt_nil) { \
rbp_i_t = rbp_left_get(a_type, a_field, rbp_i_c); \
rbp_i_u = rbp_left_get(a_type, a_field, rbp_i_t); \
if (rbp_red_get(a_type, a_field, rbp_i_t) \
&& rbp_red_get(a_type, a_field, rbp_i_u)) { \
/* rbp_i_c is the top of a logical 4-node, so split it. */\
/* This iteration does not move down the tree, due to the */\
/* disruptiveness of node splitting. */\
/* */\
/* Rotate right. */\
rbp_rotate_right(a_type, a_field, rbp_i_c, rbp_i_t); \
/* Pass red links up one level. */\
rbp_i_u = rbp_left_get(a_type, a_field, rbp_i_t); \
rbp_black_set(a_type, a_field, rbp_i_u); \
if (rbp_left_get(a_type, a_field, rbp_i_p) == rbp_i_c) { \
rbp_left_set(a_type, a_field, rbp_i_p, rbp_i_t); \
rbp_i_c = rbp_i_t; \
} else { \
/* rbp_i_c was the right child of rbp_i_p, so rotate */\
/* left in order to maintain the left-leaning */\
/* invariant. */\
assert(rbp_right_get(a_type, a_field, rbp_i_p) \
== rbp_i_c); \
rbp_right_set(a_type, a_field, rbp_i_p, rbp_i_t); \
rbp_lean_left(a_type, a_field, rbp_i_p, rbp_i_u); \
if (rbp_left_get(a_type, a_field, rbp_i_g) == rbp_i_p) {\
rbp_left_set(a_type, a_field, rbp_i_g, rbp_i_u); \
} else { \
assert(rbp_right_get(a_type, a_field, rbp_i_g) \
== rbp_i_p); \
rbp_right_set(a_type, a_field, rbp_i_g, rbp_i_u); \
} \
rbp_i_p = rbp_i_u; \
rbp_i_cmp = (a_cmp)((a_node), rbp_i_p); \
if (rbp_i_cmp < 0) { \
rbp_i_c = rbp_left_get(a_type, a_field, rbp_i_p); \
} else { \
assert(rbp_i_cmp > 0); \
rbp_i_c = rbp_right_get(a_type, a_field, rbp_i_p); \
} \
continue; \
} \
} \
rbp_i_g = rbp_i_p; \
rbp_i_p = rbp_i_c; \
rbp_i_cmp = (a_cmp)((a_node), rbp_i_c); \
if (rbp_i_cmp < 0) { \
rbp_i_c = rbp_left_get(a_type, a_field, rbp_i_c); \
} else { \
assert(rbp_i_cmp > 0); \
rbp_i_c = rbp_right_get(a_type, a_field, rbp_i_c); \
} \
} \
/* rbp_i_p now refers to the node under which to insert. */\
rbp_node_new(a_type, a_field, a_tree, (a_node)); \
if (rbp_i_cmp > 0) { \
rbp_right_set(a_type, a_field, rbp_i_p, (a_node)); \
rbp_lean_left(a_type, a_field, rbp_i_p, rbp_i_t); \
if (rbp_left_get(a_type, a_field, rbp_i_g) == rbp_i_p) { \
rbp_left_set(a_type, a_field, rbp_i_g, rbp_i_t); \
} else if (rbp_right_get(a_type, a_field, rbp_i_g) == rbp_i_p) {\
rbp_right_set(a_type, a_field, rbp_i_g, rbp_i_t); \
} \
} else { \
rbp_left_set(a_type, a_field, rbp_i_p, (a_node)); \
} \
/* Update the root and make sure that it is black. */\
(a_tree)->rbt_root = rbp_left_get(a_type, a_field, &rbp_i_s); \
rbp_black_set(a_type, a_field, (a_tree)->rbt_root); \
} while (0)
#define rb_remove(a_type, a_field, a_cmp, a_tree, a_node) do { \
a_type rbp_r_s; \
a_type *rbp_r_p, *rbp_r_c, *rbp_r_xp, *rbp_r_t, *rbp_r_u; \
int rbp_r_cmp; \
rbp_left_set(a_type, a_field, &rbp_r_s, (a_tree)->rbt_root); \
rbp_right_set(a_type, a_field, &rbp_r_s, &(a_tree)->rbt_nil); \
rbp_black_set(a_type, a_field, &rbp_r_s); \
rbp_r_p = &rbp_r_s; \
rbp_r_c = (a_tree)->rbt_root; \
rbp_r_xp = &(a_tree)->rbt_nil; \
/* Iterate down the tree, but always transform 2-nodes to 3- or */\
/* 4-nodes in order to maintain the invariant that the current */\
/* node is not a 2-node. This allows simple deletion once a leaf */\
/* is reached. Handle the root specially though, since there may */\
/* be no way to convert it from a 2-node to a 3-node. */\
rbp_r_cmp = (a_cmp)((a_node), rbp_r_c); \
if (rbp_r_cmp < 0) { \
rbp_r_t = rbp_left_get(a_type, a_field, rbp_r_c); \
rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \
if (rbp_red_get(a_type, a_field, rbp_r_t) == false \
&& rbp_red_get(a_type, a_field, rbp_r_u) == false) { \
/* Apply standard transform to prepare for left move. */\
rbp_move_red_left(a_type, a_field, rbp_r_c, rbp_r_t); \
rbp_black_set(a_type, a_field, rbp_r_t); \
rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t); \
rbp_r_c = rbp_r_t; \
} else { \
/* Move left. */\
rbp_r_p = rbp_r_c; \
rbp_r_c = rbp_left_get(a_type, a_field, rbp_r_c); \
} \
} else { \
if (rbp_r_cmp == 0) { \
assert((a_node) == rbp_r_c); \
if (rbp_right_get(a_type, a_field, rbp_r_c) \
== &(a_tree)->rbt_nil) { \
/* Delete root node (which is also a leaf node). */\
if (rbp_left_get(a_type, a_field, rbp_r_c) \
!= &(a_tree)->rbt_nil) { \
rbp_lean_right(a_type, a_field, rbp_r_c, rbp_r_t); \
rbp_right_set(a_type, a_field, rbp_r_t, \
&(a_tree)->rbt_nil); \
} else { \
rbp_r_t = &(a_tree)->rbt_nil; \
} \
rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t); \
} else { \
/* This is the node we want to delete, but we will */\
/* instead swap it with its successor and delete the */\
/* successor. Record enough information to do the */\
/* swap later. rbp_r_xp is the a_node's parent. */\
rbp_r_xp = rbp_r_p; \
rbp_r_cmp = 1; /* Note that deletion is incomplete. */\
} \
} \
if (rbp_r_cmp == 1) { \
if (rbp_red_get(a_type, a_field, rbp_left_get(a_type, \
a_field, rbp_right_get(a_type, a_field, rbp_r_c))) \
== false) { \
rbp_r_t = rbp_left_get(a_type, a_field, rbp_r_c); \
if (rbp_red_get(a_type, a_field, rbp_r_t)) { \
/* Standard transform. */\
rbp_move_red_right(a_type, a_field, rbp_r_c, \
rbp_r_t); \
} else { \
/* Root-specific transform. */\
rbp_red_set(a_type, a_field, rbp_r_c); \
rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \
if (rbp_red_get(a_type, a_field, rbp_r_u)) { \
rbp_black_set(a_type, a_field, rbp_r_u); \
rbp_rotate_right(a_type, a_field, rbp_r_c, \
rbp_r_t); \
rbp_rotate_left(a_type, a_field, rbp_r_c, \
rbp_r_u); \
rbp_right_set(a_type, a_field, rbp_r_t, \
rbp_r_u); \
} else { \
rbp_red_set(a_type, a_field, rbp_r_t); \
rbp_rotate_left(a_type, a_field, rbp_r_c, \
rbp_r_t); \
} \
} \
rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t); \
rbp_r_c = rbp_r_t; \
} else { \
/* Move right. */\
rbp_r_p = rbp_r_c; \
rbp_r_c = rbp_right_get(a_type, a_field, rbp_r_c); \
} \
} \
} \
if (rbp_r_cmp != 0) { \
while (true) { \
assert(rbp_r_p != &(a_tree)->rbt_nil); \
rbp_r_cmp = (a_cmp)((a_node), rbp_r_c); \
if (rbp_r_cmp < 0) { \
rbp_r_t = rbp_left_get(a_type, a_field, rbp_r_c); \
if (rbp_r_t == &(a_tree)->rbt_nil) { \
/* rbp_r_c now refers to the successor node to */\
/* relocate, and rbp_r_xp/a_node refer to the */\
/* context for the relocation. */\
if (rbp_left_get(a_type, a_field, rbp_r_xp) \
== (a_node)) { \
rbp_left_set(a_type, a_field, rbp_r_xp, \
rbp_r_c); \
} else { \
assert(rbp_right_get(a_type, a_field, \
rbp_r_xp) == (a_node)); \
rbp_right_set(a_type, a_field, rbp_r_xp, \
rbp_r_c); \
} \
rbp_left_set(a_type, a_field, rbp_r_c, \
rbp_left_get(a_type, a_field, (a_node))); \
rbp_right_set(a_type, a_field, rbp_r_c, \
rbp_right_get(a_type, a_field, (a_node))); \
rbp_color_set(a_type, a_field, rbp_r_c, \
rbp_red_get(a_type, a_field, (a_node))); \
if (rbp_left_get(a_type, a_field, rbp_r_p) \
== rbp_r_c) { \
rbp_left_set(a_type, a_field, rbp_r_p, \
&(a_tree)->rbt_nil); \
} else { \
assert(rbp_right_get(a_type, a_field, rbp_r_p) \
== rbp_r_c); \
rbp_right_set(a_type, a_field, rbp_r_p, \
&(a_tree)->rbt_nil); \
} \
break; \
} \
rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \
if (rbp_red_get(a_type, a_field, rbp_r_t) == false \
&& rbp_red_get(a_type, a_field, rbp_r_u) == false) { \
rbp_move_red_left(a_type, a_field, rbp_r_c, \
rbp_r_t); \
if (rbp_left_get(a_type, a_field, rbp_r_p) \
== rbp_r_c) { \
rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t);\
} else { \
rbp_right_set(a_type, a_field, rbp_r_p, \
rbp_r_t); \
} \
rbp_r_c = rbp_r_t; \
} else { \
rbp_r_p = rbp_r_c; \
rbp_r_c = rbp_left_get(a_type, a_field, rbp_r_c); \
} \
} else { \
/* Check whether to delete this node (it has to be */\
/* the correct node and a leaf node). */\
if (rbp_r_cmp == 0) { \
assert((a_node) == rbp_r_c); \
if (rbp_right_get(a_type, a_field, rbp_r_c) \
== &(a_tree)->rbt_nil) { \
/* Delete leaf node. */\
if (rbp_left_get(a_type, a_field, rbp_r_c) \
!= &(a_tree)->rbt_nil) { \
rbp_lean_right(a_type, a_field, rbp_r_c, \
rbp_r_t); \
rbp_right_set(a_type, a_field, rbp_r_t, \
&(a_tree)->rbt_nil); \
} else { \
rbp_r_t = &(a_tree)->rbt_nil; \
} \
if (rbp_left_get(a_type, a_field, rbp_r_p) \
== rbp_r_c) { \
rbp_left_set(a_type, a_field, rbp_r_p, \
rbp_r_t); \
} else { \
rbp_right_set(a_type, a_field, rbp_r_p, \
rbp_r_t); \
} \
break; \
} else { \
/* This is the node we want to delete, but we */\
/* will instead swap it with its successor */\
/* and delete the successor. Record enough */\
/* information to do the swap later. */\
/* rbp_r_xp is a_node's parent. */\
rbp_r_xp = rbp_r_p; \
} \
} \
rbp_r_t = rbp_right_get(a_type, a_field, rbp_r_c); \
rbp_r_u = rbp_left_get(a_type, a_field, rbp_r_t); \
if (rbp_red_get(a_type, a_field, rbp_r_u) == false) { \
rbp_move_red_right(a_type, a_field, rbp_r_c, \
rbp_r_t); \
if (rbp_left_get(a_type, a_field, rbp_r_p) \
== rbp_r_c) { \
rbp_left_set(a_type, a_field, rbp_r_p, rbp_r_t);\
} else { \
rbp_right_set(a_type, a_field, rbp_r_p, \
rbp_r_t); \
} \
rbp_r_c = rbp_r_t; \
} else { \
rbp_r_p = rbp_r_c; \
rbp_r_c = rbp_right_get(a_type, a_field, rbp_r_c); \
} \
} \
} \
} \
/* Update root. */\
(a_tree)->rbt_root = rbp_left_get(a_type, a_field, &rbp_r_s); \
} while (0)
/*
* The rb_wrap() macro provides a convenient way to wrap functions around the
* cpp macros. The main benefits of wrapping are that 1) repeated macro
* expansion can cause code bloat, especially for rb_{insert,remove}(), and
* 2) type, linkage, comparison functions, etc. need not be specified at every
* call point.
*/
#define rb_wrap(a_attr, a_prefix, a_tree_type, a_type, a_field, a_cmp) \
a_attr void \
a_prefix##new(a_tree_type *tree) { \
rb_new(a_type, a_field, tree); \
} \
a_attr a_type * \
a_prefix##first(a_tree_type *tree) { \
a_type *ret; \
rb_first(a_type, a_field, tree, ret); \
return (ret); \
} \
a_attr a_type * \
a_prefix##last(a_tree_type *tree) { \
a_type *ret; \
rb_last(a_type, a_field, tree, ret); \
return (ret); \
} \
a_attr a_type * \
a_prefix##next(a_tree_type *tree, a_type *node) { \
a_type *ret; \
rb_next(a_type, a_field, a_cmp, tree, node, ret); \
return (ret); \
} \
a_attr a_type * \
a_prefix##prev(a_tree_type *tree, a_type *node) { \
a_type *ret; \
rb_prev(a_type, a_field, a_cmp, tree, node, ret); \
return (ret); \
} \
a_attr a_type * \
a_prefix##search(a_tree_type *tree, a_type *key) { \
a_type *ret; \
rb_search(a_type, a_field, a_cmp, tree, key, ret); \
return (ret); \
} \
a_attr a_type * \
a_prefix##nsearch(a_tree_type *tree, a_type *key) { \
a_type *ret; \
rb_nsearch(a_type, a_field, a_cmp, tree, key, ret); \
return (ret); \
} \
a_attr a_type * \
a_prefix##psearch(a_tree_type *tree, a_type *key) { \
a_type *ret; \
rb_psearch(a_type, a_field, a_cmp, tree, key, ret); \
return (ret); \
} \
a_attr void \
a_prefix##insert(a_tree_type *tree, a_type *node) { \
rb_insert(a_type, a_field, a_cmp, tree, node); \
} \
a_attr void \
a_prefix##remove(a_tree_type *tree, a_type *node) { \
rb_remove(a_type, a_field, a_cmp, tree, node); \
}
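/*
 * Example usage (a sketch; the node type, field name, and comparison function
 * below are hypothetical, not part of this header):
 *
 *   typedef struct node_s node_t;
 *   struct node_s {
 *       int key;
 *       rb_node(node_t) link;
 *   };
 *
 *   static int
 *   node_cmp(node_t *a, node_t *b)
 *   {
 *       return ((a->key > b->key) - (a->key < b->key));
 *   }
 *
 *   typedef rb_tree(node_t) node_tree_t;
 *   rb_wrap(static, node_tree_, node_tree_t, node_t, link, node_cmp)
 *
 *   node_tree_t tree;
 *   node_t n = {42};
 *   node_tree_new(&tree);
 *   node_tree_insert(&tree, &n);
 *   assert(node_tree_search(&tree, &n) == &n);
 *   node_tree_remove(&tree, &n);
 */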
/*
* The iterators simulate recursion via an array of pointers that store the
* current path. This is critical to performance, since a series of calls to
* rb_{next,prev}() would require time proportional to (n lg n), whereas this
* implementation only requires time proportional to (n).
*
* Since the iterators cache a path down the tree, any tree modification may
* cause the cached path to become invalid. In order to continue iteration,
* use something like the following sequence:
*
* {
* a_type *node, *tnode;
*
* rb_foreach_begin(a_type, a_field, a_tree, node) {
* ...
* rb_next(a_type, a_field, a_cmp, a_tree, node, tnode);
* rb_remove(a_type, a_field, a_cmp, a_tree, node);
* rb_foreach_next(a_type, a_field, a_cmp, a_tree, tnode);
* ...
* } rb_foreach_end(a_type, a_field, a_tree, node)
* }
*
* Note that this idiom is not advised if every iteration modifies the tree,
* since in that case there is no algorithmic complexity improvement over a
* series of rb_{next,prev}() calls, thus making the setup overhead wasted
* effort.
*/
#define rb_foreach_begin(a_type, a_field, a_tree, a_var) { \
/* Compute the maximum possible tree depth (3X the black height). */\
unsigned rbp_f_height; \
rbp_black_height(a_type, a_field, a_tree, rbp_f_height); \
rbp_f_height *= 3; \
{ \
/* Initialize the path to contain the left spine. */\
a_type *rbp_f_path[rbp_f_height]; \
a_type *rbp_f_node; \
bool rbp_f_synced = false; \
unsigned rbp_f_depth = 0; \
if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \
rbp_f_path[rbp_f_depth] = (a_tree)->rbt_root; \
rbp_f_depth++; \
while ((rbp_f_node = rbp_left_get(a_type, a_field, \
rbp_f_path[rbp_f_depth-1])) != &(a_tree)->rbt_nil) { \
rbp_f_path[rbp_f_depth] = rbp_f_node; \
rbp_f_depth++; \
} \
} \
/* While the path is non-empty, iterate. */\
while (rbp_f_depth > 0) { \
(a_var) = rbp_f_path[rbp_f_depth-1];
/* Only use if modifying the tree during iteration. */
#define rb_foreach_next(a_type, a_field, a_cmp, a_tree, a_node) \
/* Re-initialize the path to contain the path to a_node. */\
rbp_f_depth = 0; \
if (a_node != NULL) { \
if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \
rbp_f_path[rbp_f_depth] = (a_tree)->rbt_root; \
rbp_f_depth++; \
rbp_f_node = rbp_f_path[0]; \
while (true) { \
int rbp_f_cmp = (a_cmp)((a_node), \
rbp_f_path[rbp_f_depth-1]); \
if (rbp_f_cmp < 0) { \
rbp_f_node = rbp_left_get(a_type, a_field, \
rbp_f_path[rbp_f_depth-1]); \
} else if (rbp_f_cmp > 0) { \
rbp_f_node = rbp_right_get(a_type, a_field, \
rbp_f_path[rbp_f_depth-1]); \
} else { \
break; \
} \
assert(rbp_f_node != &(a_tree)->rbt_nil); \
rbp_f_path[rbp_f_depth] = rbp_f_node; \
rbp_f_depth++; \
} \
} \
} \
rbp_f_synced = true;
#define rb_foreach_end(a_type, a_field, a_tree, a_var) \
if (rbp_f_synced) { \
rbp_f_synced = false; \
continue; \
} \
/* Find the successor. */\
if ((rbp_f_node = rbp_right_get(a_type, a_field, \
rbp_f_path[rbp_f_depth-1])) != &(a_tree)->rbt_nil) { \
/* The successor is the left-most node in the right */\
/* subtree. */\
rbp_f_path[rbp_f_depth] = rbp_f_node; \
rbp_f_depth++; \
while ((rbp_f_node = rbp_left_get(a_type, a_field, \
rbp_f_path[rbp_f_depth-1])) != &(a_tree)->rbt_nil) { \
rbp_f_path[rbp_f_depth] = rbp_f_node; \
rbp_f_depth++; \
} \
} else { \
/* The successor is above the current node. Unwind */\
/* until a left-leaning edge is removed from the */\
/* path, or the path is empty. */\
for (rbp_f_depth--; rbp_f_depth > 0; rbp_f_depth--) { \
if (rbp_left_get(a_type, a_field, \
rbp_f_path[rbp_f_depth-1]) \
== rbp_f_path[rbp_f_depth]) { \
break; \
} \
} \
} \
} \
} \
}
#define rb_foreach_reverse_begin(a_type, a_field, a_tree, a_var) { \
/* Compute the maximum possible tree depth (3X the black height). */\
unsigned rbp_fr_height; \
rbp_black_height(a_type, a_field, a_tree, rbp_fr_height); \
rbp_fr_height *= 3; \
{ \
/* Initialize the path to contain the right spine. */\
a_type *rbp_fr_path[rbp_fr_height]; \
a_type *rbp_fr_node; \
bool rbp_fr_synced = false; \
unsigned rbp_fr_depth = 0; \
if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \
rbp_fr_path[rbp_fr_depth] = (a_tree)->rbt_root; \
rbp_fr_depth++; \
while ((rbp_fr_node = rbp_right_get(a_type, a_field, \
rbp_fr_path[rbp_fr_depth-1])) != &(a_tree)->rbt_nil) { \
rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \
rbp_fr_depth++; \
} \
} \
/* While the path is non-empty, iterate. */\
while (rbp_fr_depth > 0) { \
(a_var) = rbp_fr_path[rbp_fr_depth-1];
/* Only use if modifying the tree during iteration. */
#define rb_foreach_reverse_prev(a_type, a_field, a_cmp, a_tree, a_node) \
/* Re-initialize the path to contain the path to a_node. */\
rbp_fr_depth = 0; \
if (a_node != NULL) { \
if ((a_tree)->rbt_root != &(a_tree)->rbt_nil) { \
rbp_fr_path[rbp_fr_depth] = (a_tree)->rbt_root; \
rbp_fr_depth++; \
rbp_fr_node = rbp_fr_path[0]; \
while (true) { \
int rbp_fr_cmp = (a_cmp)((a_node), \
rbp_fr_path[rbp_fr_depth-1]); \
if (rbp_fr_cmp < 0) { \
rbp_fr_node = rbp_left_get(a_type, a_field, \
rbp_fr_path[rbp_fr_depth-1]); \
} else if (rbp_fr_cmp > 0) { \
rbp_fr_node = rbp_right_get(a_type, a_field,\
rbp_fr_path[rbp_fr_depth-1]); \
} else { \
break; \
} \
assert(rbp_fr_node != &(a_tree)->rbt_nil); \
rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \
rbp_fr_depth++; \
} \
} \
} \
rbp_fr_synced = true;
#define rb_foreach_reverse_end(a_type, a_field, a_tree, a_var) \
if (rbp_fr_synced) { \
rbp_fr_synced = false; \
continue; \
} \
if (rbp_fr_depth == 0) { \
/* rb_foreach_reverse_sync() was called with a NULL */\
/* a_node. */\
break; \
} \
/* Find the predecessor. */\
if ((rbp_fr_node = rbp_left_get(a_type, a_field, \
rbp_fr_path[rbp_fr_depth-1])) != &(a_tree)->rbt_nil) { \
/* The predecessor is the right-most node in the left */\
/* subtree. */\
rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \
rbp_fr_depth++; \
while ((rbp_fr_node = rbp_right_get(a_type, a_field, \
rbp_fr_path[rbp_fr_depth-1])) != &(a_tree)->rbt_nil) {\
rbp_fr_path[rbp_fr_depth] = rbp_fr_node; \
rbp_fr_depth++; \
} \
} else { \
/* The predecessor is above the current node. Unwind */\
/* until a right-leaning edge is removed from the */\
/* path, or the path is empty. */\
for (rbp_fr_depth--; rbp_fr_depth > 0; rbp_fr_depth--) {\
if (rbp_right_get(a_type, a_field, \
rbp_fr_path[rbp_fr_depth-1]) \
== rbp_fr_path[rbp_fr_depth]) { \
break; \
} \
} \
} \
} \
} \
}
#endif /* RB_H_ */