uplex-varnish / k8s-ingress

Commit 140f1335, authored Jan 21, 2019 by Geoff Simmons

Update docs about self-sharding.

Thanks to @martin, @slink and @lf for their comments.
parent 314f2e38

Showing 1 changed file with 46 additions and 14 deletions:
docs/self-sharding.md (+46, -14)
# Self-sharding caches in a cluster

A ``VarnishConfig`` resource may be configured so that the Varnish
instances in a cluster that implement Ingress also implement sharding
of their caches -- for each request, there is an instance in the
cluster that "owns" the potentially cached response, and other
instances in the cluster forward the request to that instance. We
refer to this as "self-sharding", because no sharding on the part of a
component that forwards requests to the cluster (such as a load
balancer) is required to shard the requests. Requests may be
distributed to the cluster in any way, and the Varnish instances take
care of the sharding.
A sample manifest for the ``VarnishConfig`` Custom Resource to
configure self-sharding is in the
...
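The path to the sample manifest is elided in this excerpt. As an
illustrative sketch only -- the API group, version, and every field
name other than ``max-secondary-ttl`` (which appears later in this
document) are assumptions, not confirmed here -- a self-sharding
configuration might look like:

```yaml
# Hypothetical sketch of a VarnishConfig manifest enabling
# self-sharding. Field names other than max-secondary-ttl are
# assumed for illustration.
apiVersion: "ingress.varnish-cache.org/v1alpha1"
kind: VarnishConfig
metadata:
  name: self-sharding-cfg
spec:
  self-sharding:
    # Upper bound on how long a "secondary" instance may keep its
    # copy of a response fetched from the primary instance.
    max-secondary-ttl: 2m
```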
@@ -39,6 +40,11 @@ Some of the effects of this design are:
  for cacheable responses in the system. The memory load for the cache
  is approximately duplicated by each instance.

    * If T is the current total size of the cache (the sum of the
      sizes of all distinct cached responses), then the memory load at
      each of N replicas in the cluster is approximately T, and the
      total load in the cluster is approximately T*N.

* For N replicas in the Varnish cluster, cacheable responses are
  fetched N times from the Services that send them; and they are
  re-fetched N times when the TTL expires.
...
@@ -48,19 +54,30 @@ Some of the effects of this design are:
  happens to forward the request to a Varnish instance that doesn't
  have it yet in its cache.

* If a cached response changes after its TTL elapses, then the
  instances in the cluster may return different cached responses to
  the same request, if some of them have the version before the
  change, and others after the change.
The purpose of self-sharding is to:
* distribute memory load for the cache among the instances in the
  cluster.

    * If T is the total cache size as described above, then the total
      memory load in the cluster is approximately T, and the load at
      each replica is approximately T/N.
* reduce the request load on Services, so that there is only one fetch
  for a cacheable response, from only one Varnish instance, until the
  TTL expires.
* ensure that if a cached response has been fetched from a Service
  just once, then any further request for the same object will be a
  cache hit until the TTL expires, regardless of which Varnish
  instance receives the request from the LoadBalancer. The cache hit
  will always be the same response -- the object most recently fetched
  from a Service.
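The memory-load arithmetic in the bullets above can be made concrete
with a small worked example (the numbers are purely illustrative):

```python
# T is the total size of all distinct cached responses; N is the
# number of Varnish replicas in the cluster. Illustrative numbers.
T = 8  # GiB of distinct cacheable content
N = 4  # Varnish replicas

# Without self-sharding, every replica duplicates the whole cache.
per_replica_unsharded = T    # ~8 GiB at each replica
cluster_unsharded = T * N    # ~32 GiB across the cluster

# With self-sharding, each response has exactly one primary cache.
cluster_sharded = T          # ~8 GiB across the cluster
per_replica_sharded = T / N  # ~2 GiB at each replica

print(cluster_unsharded, per_replica_sharded)  # → 32 2.0
```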
## Clustering with sharding
...
@@ -80,6 +97,18 @@ without forwarding. No special configuration for the LoadBalancer is
required -- it can, for example, continue distributing requests to
the Varnish instances in round-robin order.
This is done by applying Varnish's [shard
director](https://varnish-cache.org/docs/6.1/reference/vmod_directors.generated.html#new-xshard-directors-shard)
to the Varnish instances in the cluster. If an instance finds that the
director shards the request to itself, then it handles the request
itself as the primary cache for the request. See the documentation at
the link for more details about the shard director.
If a request is evaluated so that the response won't be cacheable in
any case (such as POST requests in default configurations), then the
request is forwarded to the Service directly, since there is no point
in forwarding it to another cache.
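As a rough sketch of this mechanism -- not the VCL actually generated
by the Ingress controller, and with invented backend names and
addresses -- applying the shard director could look like:

```vcl
vcl 4.0;

import directors;

# One backend per Varnish Pod in the cluster; names and addresses
# here are invented for illustration.
backend varnish_pod_0 { .host = "172.17.0.10"; }
backend varnish_pod_1 { .host = "172.17.0.11"; }

sub vcl_init {
	new shard_cluster = directors.shard();
	shard_cluster.add_backend(varnish_pod_0);
	shard_cluster.add_backend(varnish_pod_1);
	shard_cluster.reconfigure();
}

sub vcl_recv {
	# The shard director deterministically picks the instance that
	# "owns" this request (here hashed over the URL). Every
	# instance computes the same owner, so requests for the same
	# object converge on one primary cache; the generated VCL
	# additionally recognizes when the owner is the instance
	# itself and handles the request locally.
	set req.backend_hint = shard_cluster.backend(by=URL);
}
```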
When each instance is loaded with the same VCL configuration generated
by the Ingress controller, then they each forward requests in the same
way. When Varnish instances (Pods) are added to or removed from the
...
@@ -93,7 +122,9 @@ Some features that result from self-sharding are:
  instance receives a cacheable response from another instance, it may
  also cache the response, but only for a limited time bounded by the
  ``max-secondary-ttl`` parameter described below. So the total memory
  load for the cache in the cluster is reduced, while multiple copies
  of frequently requested objects are still kept for low response
  latencies.
* A cacheable response is fetched from a Service only once, from the
  instance that handles the request itself without forwarding. When
...
@@ -102,7 +133,8 @@ Some features that result from self-sharding are:
* If a cacheable response is in at least one of the cluster's caches,
  then subsequent requests for the same object will be cache hits
  while the TTL is still valid, regardless of which Varnish instance
  received the request from the LoadBalancer. Downstream responses are
  consistent, since there is one primary cache for each response.
* When instances are added to or removed from the cluster, the
  forwarding of requests to instances changes only as necessary. New
...