uplex-varnish / slash
Commit db9aacca (unverified)
authored Feb 06, 2024 by Nils Goroll
doc: Add details on sizing
Closes #46
parent eb0d61cf
Showing 2 changed files with 172 additions and 0 deletions (+172 -0)
src/vmod_slash.man.rst  +86 -0
src/vmod_slash.vcc  +86 -0
src/vmod_slash.man.rst
...
...
@@ -461,6 +461,11 @@ a global fellow storage. *Note* that this kind of dynamic storage
removal is a new feature first introduced with `fellow` and might not
work perfectly yet.
When it comes to cache sizes, there is generally no such thing as
"too big": more cache is always better, but `fellow` only supports a
memory cache up to the size of the disk cache. For more information,
see `slash_fellow_size`_.
On Linux, the memory cache will be allocated from huge pages, if
available and if *memsize* is larger than a huge page. *memsize* will
then be rounded up to a multiple of the respective huge page size.
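
For illustration, the rounding amounts to the following sketch (the
2MB huge page size is an assumption; the actual value depends on the
system configuration)::

    # Sketch of the huge page rounding described above; 2MB is the
    # common x86-64 huge page size, but only an assumption here.
    HUGE_PAGE_SIZE = 2 * 1024 * 1024

    def effective_memsize(memsize):
        if memsize <= HUGE_PAGE_SIZE:
            return memsize                        # huge pages not used
        pages = -(-memsize // HUGE_PAGE_SIZE)     # ceiling division
        return pages * HUGE_PAGE_SIZE

    # A memsize of 1000MB stays at 1000MB (already a multiple of 2MB),
    # while 1000MB + 1 byte would be rounded up to 1002MB.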
...
...
@@ -486,6 +491,87 @@ error with sizing requirements.
*delete* specifies whether the storage is to be emptied.
.. _slash_fellow_size:
Sizing fellow storage
~~~~~~~~~~~~~~~~~~~~~
This section is intended to provide guidance on cache sizing by
explaining the overall cache organization and giving ballpark figures
for object sizes.
A simple, yet fundamental insight is that, with `fellow`, there is no
such thing as "delivering objects directly from disk". While hardware
architectures exist which allow DMA directly from flash storage,
`fellow` implements a "disk" and a "memory" tier, with all reads and
writes going through RAM first. This architecture has been shown to be
most efficient both in terms of performance and price/performance, but
it establishes a fundamental principle for sizing: the memory cache
should be big enough to hold all actively/frequently accessed
data. Writes happen to memory and need to be written to disk before
the memory can be re-used; reads go into memory, from where the data
can be accessed.
Besides the always consistent, eventually persistent log, the central
disk structure is the ``fellow_disk_obj``. It contains the fixed and
variable object attributes defined by Varnish-Cache (most importantly
headers) and pointers to the first body segments. For efficiency (log,
memory), this structure is addressed by a single 64-bit value. Because
`fellow` uses a minimum disk block size of 4KB, a ``fellow_disk_obj``
can have sizes between 4KB and just under 16MB. Under optimal
circumstances, it takes only 4KB, but needs to grow bigger if longer
headers or vary specifications need to be stored.
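
As a hypothetical illustration (the exact on-disk layout is not part
of this documentation), the growth can be thought of as rounding the
attribute data up to a power of two within those bounds::

    # Ballpark sketch of fellow_disk_obj sizing. Only the 4KB minimum
    # block size and the just-under-16MB ceiling come from the text
    # above; treating the size as a plain power-of-two rounding of the
    # attribute data is a simplifying assumption for illustration.
    MIN_DISK_OBJ = 4 * 1024            # minimum disk block size
    MAX_DISK_OBJ = 16 * 1024 * 1024    # objects stay just under 16MB

    def disk_obj_size(attr_bytes):
        """Size class for headers, vary specification etc."""
        assert attr_bytes < MAX_DISK_OBJ
        size = MIN_DISK_OBJ
        while size < attr_bytes:
            size *= 2
        return size

    # A few KB of headers fit the minimal 4KB object, while e.g.
    # 100KB of headers would occupy a 128KB fellow_disk_obj.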
When read into memory, a companion data structure named
``fellow_cache_obj`` is created. Under ideal circumstances (small
headers), both data structures fit into a single 4KB allocation or
even less, but as a rule of thumb, the amount of memory needed per
actively accessed object should be assumed to be 4KB plus the size of
the headers and vary specification. Both ``fellow_disk_obj`` and
``fellow_cache_obj`` remain in memory for as long as any part of the
object is accessed.
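
Expressed as arithmetic, this rule of thumb reads (a sketch, not the
exact accounting)::

    def mem_per_active_object(header_bytes, vary_bytes=0):
        # 4KB base for fellow_disk_obj + fellow_cache_obj, plus the
        # size of the headers and the vary specification
        return 4 * 1024 + header_bytes + vary_bytes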
The object body is organized in chunks of 2^\ *chunk_exponent* bytes,
called segments. Segments are the smallest I/O units of object bodies
and are LRU-cached individually, allowing `fellow` to handle objects
bigger than *memsize*: When an object body is iterated over, up to
*readahead* segments are referenced and, if necessary, asynchronously
read into cache in advance. Segments outside the readahead window
which are not concurrently accessed by other threads either reside in
memory on the LRU or only on disk. The disk and memory storage needed
in addition to the actual data amounts to roughly 64 bytes per segment
on disk and another 64 bytes per segment in memory. This metadata is
organized in larger units called segment lists, which are sized
between 4KB for 63 segments and 4MB for 65534 segments. Segment lists
are read asynchronously and LRU'd together with the respective
``fellow_cache_obj``.
Consequently, the *chunk_bytes* / *chunk_exponent* parameter should be
chosen such that a typical object needs only a small number of chunks,
which requires an appropriately sized memory cache: To ensure that the
cache can always move data, the parameter is hard capped at 1/1024 of
the memory cache size, so, for example, 1MB chunks require a memory
cache of at least 1GB.
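
The following sketch puts the last two paragraphs into numbers: the
segment count and metadata overhead for a given body size, and the
hard cap on the chunk size (function names are illustrative)::

    def segment_overhead(body_bytes, chunk_exponent):
        # roughly 64 bytes of metadata per segment on disk and
        # another 64 bytes per segment in memory
        chunk_bytes = 1 << chunk_exponent
        segments = -(-body_bytes // chunk_bytes)   # ceiling division
        return segments, segments * 64, segments * 64

    def max_chunk_bytes(memsize):
        # chunk_bytes is hard capped at 1/1024 of the memory cache,
        # e.g. 1MB chunks require at least a 1GB memory cache
        return memsize // 1024

    # A 1GB object with 1MB chunks is split into 1024 segments, i.e.
    # about 64KB of segment metadata on disk plus 64KB in memory.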
Extended attributes (currently only used for ESI data) use a separate
segment, which is only read on demand and also LRU'd with the
respective object.
"Busy" objects going into cache while being fetched from a backend
have the same memory requirements as "finished" objects, but need
another 8KB of memory on top while being created.
To achieve high efficiency and to support Direct I/O, the buddy
allocator used to organize both the disk and the memory cache rounds
every allocation up to the next power of two. For this reason, it is
normal for :ref:`slashmap(1)` to show substantial amounts of free
memory (like 30-40%) in page sizes below 4KB even when LRU is active.
To summarize, one should assume for memory sizing at least the amount
of data actively accessed, plus 4KB per object, plus 8KB per "busy"
object.
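
Following this summary, a ballpark lower bound for *memsize* could be
estimated as in this sketch (all inputs are workload estimates the
operator has to provide)::

    def memsize_estimate(active_bytes, active_objects, busy_objects):
        # actively accessed data, plus 4KB per active object, plus
        # another 8KB per object concurrently being fetched ("busy")
        return (active_bytes
                + active_objects * 4 * 1024
                + busy_objects * 8 * 1024)

    # e.g. 50GB of hot data in one million objects with 1000
    # concurrent backend fetches suggests at least
    # 50GB + ~4GB + ~8MB of memory cache.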
.. _slash_fellow_resize:
Resizing fellow storage
...
...
src/vmod_slash.vcc
(the same additions as in src/vmod_slash.man.rst above, in hunks
@@ -405,6 +405,11 @@ and @@ -430,6 +435,87 @@)