uplex-varnish / slash / Commits

Unverified commit 81a5c96c, authored Aug 02, 2023 by Nils Goroll

    Add lru_exponent parameter

parent 9342b7c6

Showing 7 changed files with 69 additions and 14 deletions (+69 -14)
  CHANGES.rst                +5   -0
  INSTALL.rst                +6   -0
  src/fellow_cache.c         +4   -2
  src/fellow_tune.c          +1   -0
  src/tbl/fellow_tunables.h  +1   -0
  src/vmod_slash.man.rst     +28  -8
  src/vmod_slash.vcc         +24  -4
CHANGES.rst
@@ -21,6 +21,11 @@ fellow
 .. https://gitlab.com/uplex/varnish/slash/-/commit/
 
+* To cater for massively parallel systems with dozens of CPUs, the
+  parameter ``lru_exponent`` has been introduced to scale the number
+  of LRU lists (and corresponding eviction threads) between 1 and 64
+  (corresponding to ``lru_exponent = 0`` to ``lru_exponent = 6``).
+
 * The allocation policy for disk regions has been improved. This
   should reduce fragmentation and pressure on LRU as well as improve
   response times (`a0e8e8f779f4ad8569ccc9c3b7eaee08dc79cfa4`_).
...
...
INSTALL.rst
@@ -35,6 +35,12 @@ recommendations for optimal fellow storage performance
   Note that a fellow storage using any of the `xxhash`_ hashes can
   only be loaded by an instance with `xxhash`_ support compiled in.
 
+* On big systems with many CPUs, ``lru_exponent`` can be tuned to
+  achieve maximum performance with hundreds of thousands of requests
+  per second.
+
+  Reasonable values are yet to be determined experimentally.
+
 compiling
 ~~~~~~~~~
src/fellow_cache.c
@@ -692,7 +692,6 @@ struct fellow_busy {

struct fellow_cache_lrus {
	unsigned		magic;
#define FELLOW_CACHE_LRUS_MAGIC	0xadad56fb
	uint8_t			exponent;
	pthread_mutex_t		mtx;
	struct fellow_cache_lru	*lru[1 << MAX_NLRU_EXPONENT];
};
...
...
@@ -750,6 +749,7 @@ fellow_cache_get_lru(struct fellow_cache *fc, uint64_t n)
{
	struct fellow_cache_lrus *lrus;
	struct fellow_cache_lru *lru;
	struct stvfe_tune *tune;
	uint8_t exponent;
	pthread_t thr;
	size_t i;
...
...
@@ -757,8 +757,10 @@ fellow_cache_get_lru(struct fellow_cache *fc, uint64_t n)
 	CHECK_OBJ_NOTNULL(fc, FELLOW_CACHE_MAGIC);
 	lrus = fc->lrus;
 	CHECK_OBJ_NOTNULL(lrus, FELLOW_CACHE_LRUS_MAGIC);
+	tune = fc->tune;
+	CHECK_OBJ_NOTNULL(tune, STVFE_TUNE_MAGIC);
-	exponent = lrus->exponent;
+	exponent = tune->lru_exponent;
 	assert(exponent <= MAX_NLRU_EXPONENT);
 	i = exponent ? fib(n, exponent) : 0;
...
...
src/fellow_tune.c
@@ -83,6 +83,7 @@ stvfe_tune_check(struct stvfe_tune *tune)
 	}
 
 	sz = tune->memsz >> (tune->chunk_exponent + 3);
+	sz >>= tune->lru_exponent;
 	assert(sz <= UINT_MAX);
 	l = (unsigned)sz;
 	if (tune->mem_reserve_chunks > l) {
...
...
src/tbl/fellow_tunables.h
@@ -42,6 +42,7 @@ TUNE(float, log_rewrite_ratio, 0.5, 0.001, FLT_MAX);
 // reserve chunk is the larger of chunk_exponent and result from logbuffer size
 TUNE(unsigned, chunk_exponent, 20 /* 1MB */, 12 /* 4KB */, 30 /* 1GB */);
 TUNE(uint8_t, wait_table_exponent, 10, 6, 32);
+TUNE(uint8_t, lru_exponent, 0, 0, 6);
 TUNE(unsigned, dsk_reserve_chunks, 4, 2, UINT_MAX);
 TUNE(unsigned, mem_reserve_chunks, 1, 0, UINT_MAX);
 TUNE(size_t, objsize_hint, 256 * 1024, 4096, SIZE_MAX);
...
...
src/vmod_slash.man.rst
@@ -481,8 +481,8 @@ will be used (which might fail if insufficient memory is available).

.. _xfellow.tune():

-STRING xfellow.tune([INT logbuffer_size], [DURATION logbuffer_flush_interval], [REAL log_rewrite_ratio], [INT chunk_exponent], [BYTES chunk_bytes], [INT wait_table_exponent], [INT dsk_reserve_chunks], [INT mem_reserve_chunks], [BYTES objsize_hint], [BYTES objsize_max], [INT cram], [INT readahead], [BYTES discard_immediate], [INT io_batch_min], [INT io_batch_max], [ENUM hash_obj], [ENUM hash_log], [ENUM ioerr_obj], [ENUM ioerr_log], [ENUM allocerr_obj], [ENUM allocerr_log])
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+STRING xfellow.tune([INT logbuffer_size], [DURATION logbuffer_flush_interval], [REAL log_rewrite_ratio], [INT chunk_exponent], [BYTES chunk_bytes], [INT wait_table_exponent], [INT lru_exponent], [INT dsk_reserve_chunks], [INT mem_reserve_chunks], [BYTES objsize_hint], [BYTES objsize_max], [INT cram], [INT readahead], [BYTES discard_immediate], [INT io_batch_min], [INT io_batch_max], [ENUM hash_obj], [ENUM hash_log], [ENUM ioerr_obj], [ENUM ioerr_log], [ENUM allocerr_obj], [ENUM allocerr_log])
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

::
...
...
@@ -493,6 +493,7 @@ STRING xfellow.tune([INT logbuffer_size], [DURATION logbuffer_flush_interval], [
       [INT chunk_exponent],
       [BYTES chunk_bytes],
       [INT wait_table_exponent],
+      [INT lru_exponent],
       [INT dsk_reserve_chunks],
       [INT mem_reserve_chunks],
       [BYTES objsize_hint],
...
...
@@ -589,6 +590,24 @@ fellow storage can be fine tuned:
   disk. Once an object is read, its body data is read in parallel
   independent of this limit.

+* *lru_exponent*
+
+  TL;DR: 2-logarithm of the number of LRU lists
+
+  - unit: number of LRU lists as a power of two
+  - default: 0
+  - minimum: 0
+  - maximum: 6
+
+  On large systems with mostly memory-bound access, the LRU list
+  becomes the main point of contention, as segments are removed from
+  and re-added to the LRU frequently.
+
+  A single LRU (``lru_exponent=0``) is most fair: only the absolute
+  least recently used segment is ever evicted. But more LRUs reduce
+  contention on the LRU lists significantly and improve parallelism
+  of evictions.
+
 * *dsk_reserve_chunks*

   - unit: scalar
...
...
@@ -614,10 +633,10 @@ fellow storage can be fine tuned:
   - minimum: 0
   - maximum: memsize / 8 / chunk_bytes

-  specifies a number of chunks to reserve in memory. The reserve is
-  used to provide memory for new objects or objects staged from disk
-  to memory when memory is otherwise full. It can help reduce
-  latencies in these situations at the expense of some memory
+  specifies a number of chunks to reserve in memory per LRU. The
+  reserve is used to provide memory for new objects or objects staged
+  from disk to memory when memory is otherwise full. It can help
+  reduce latencies in these situations at the expense of some memory
   unavailable for caching.

  The value is capped such that the number of reserved chunks times
...
...
@@ -832,8 +851,8 @@ Restricted to: ``vcl_init``.

.. _slash.tune_fellow():

-STRING tune_fellow(STEVEDORE storage, [INT logbuffer_size], [DURATION logbuffer_flush_interval], [REAL log_rewrite_ratio], [INT chunk_exponent], [BYTES chunk_bytes], [INT wait_table_exponent], [INT dsk_reserve_chunks], [INT mem_reserve_chunks], [BYTES objsize_hint], [BYTES objsize_max], [INT cram], [INT readahead], [BYTES discard_immediate], [INT io_batch_min], [INT io_batch_max], [ENUM hash_obj], [ENUM hash_log], [ENUM ioerr_obj], [ENUM ioerr_log], [ENUM allocerr_obj], [ENUM allocerr_log])
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+STRING tune_fellow(STEVEDORE storage, [INT logbuffer_size], [DURATION logbuffer_flush_interval], [REAL log_rewrite_ratio], [INT chunk_exponent], [BYTES chunk_bytes], [INT wait_table_exponent], [INT lru_exponent], [INT dsk_reserve_chunks], [INT mem_reserve_chunks], [BYTES objsize_hint], [BYTES objsize_max], [INT cram], [INT readahead], [BYTES discard_immediate], [INT io_batch_min], [INT io_batch_max], [ENUM hash_obj], [ENUM hash_log], [ENUM ioerr_obj], [ENUM ioerr_log], [ENUM allocerr_obj], [ENUM allocerr_log])
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

::
...
...
@@ -845,6 +864,7 @@ STRING tune_fellow(STEVEDORE storage, [INT logbuffer_size], [DURATION logbuffer_
       [INT chunk_exponent],
       [BYTES chunk_bytes],
       [INT wait_table_exponent],
+      [INT lru_exponent],
       [INT dsk_reserve_chunks],
       [INT mem_reserve_chunks],
       [BYTES objsize_hint],
...
...
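For illustration only, a hypothetical VCL fragment setting the new parameter from ``vcl_init``; the stevedore name ``fellow`` and the value 4 (16 LRU lists) are assumptions, not from the commit:

```vcl
vcl 4.1;

import slash;

backend default { .host = "127.0.0.1"; }

sub vcl_init {
	# Assumes varnishd was started with a fellow stevedore
	# named "fellow"; 1 << 4 == 16 LRU lists (value untested,
	# reasonable values are yet to be determined).
	slash.tune_fellow(storage.fellow, lru_exponent = 4);
}
```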
src/vmod_slash.vcc
@@ -430,6 +430,7 @@ $Method STRING .tune(
 	[ INT chunk_exponent ],
 	[ BYTES chunk_bytes ],
 	[ INT wait_table_exponent ],
+	[ INT lru_exponent ],
 	[ INT dsk_reserve_chunks ],
 	[ INT mem_reserve_chunks ],
 	[ BYTES objsize_hint ],
...
...
@@ -525,6 +526,24 @@ fellow storage can be fine tuned:
   disk. Once an object is read, its body data is read in parallel
   independent of this limit.

+* *lru_exponent*
+
+  TL;DR: 2-logarithm of the number of LRU lists
+
+  - unit: number of LRU lists as a power of two
+  - default: 0
+  - minimum: 0
+  - maximum: 6
+
+  On large systems with mostly memory-bound access, the LRU list
+  becomes the main point of contention, as segments are removed from
+  and re-added to the LRU frequently.
+
+  A single LRU (``lru_exponent=0``) is most fair: only the absolute
+  least recently used segment is ever evicted. But more LRUs reduce
+  contention on the LRU lists significantly and improve parallelism
+  of evictions.
+
 * *dsk_reserve_chunks*

   - unit: scalar
...
...
@@ -550,10 +569,10 @@ fellow storage can be fine tuned:
   - minimum: 0
   - maximum: memsize / 8 / chunk_bytes

-  specifies a number of chunks to reserve in memory. The reserve is
-  used to provide memory for new objects or objects staged from disk
-  to memory when memory is otherwise full. It can help reduce
-  latencies in these situations at the expense of some memory
+  specifies a number of chunks to reserve in memory per LRU. The
+  reserve is used to provide memory for new objects or objects staged
+  from disk to memory when memory is otherwise full. It can help
+  reduce latencies in these situations at the expense of some memory
   unavailable for caching.

  The value is capped such that the number of reserved chunks times
...
...
@@ -761,6 +780,7 @@ $Function STRING tune_fellow(
 	[ INT chunk_exponent ],
 	[ BYTES chunk_bytes ],
 	[ INT wait_table_exponent ],
+	[ INT lru_exponent ],
 	[ INT dsk_reserve_chunks ],
 	[ INT mem_reserve_chunks ],
 	[ BYTES objsize_hint ],
...
...