- 03 Aug, 2020 1 commit
Geoff Simmons authored
Ref #36
- 31 Jul, 2020 1 commit
Geoff Simmons authored
The haproxy container now runs the app k8s-crt-dnldr, and no longer runs http-faccess. See https://code.uplex.de/k8s/k8s-crt-dnldr

k8s-crt-dnldr runs a k8s client that reads Secrets, filtered for TLS (type: kubernetes.io/tls). It provides a REST API with which a client can instruct it to write (PUT) or remove (DELETE) a pem file (concatenated crt and key) corresponding to a TLS Secret in the cluster. By default, these are written to /etc/ssl/private, where haproxy reads certificates. After the next haproxy reload following the write or delete, haproxy starts or stops using the certificate.

Once k8s-crt-dnldr has been instructed to store a Secret, it responds to Update and Delete events for the Secret by updating or deleting the file on its own. The controller currently sends commands to do so as well, but in practice k8s-crt-dnldr has already changed the certificate itself (this is not an error).

This means that viking Pods must have RBAC rights to read Secrets (the fact that these are filtered for TLS is not expressible in RBAC). That in turn means that viking Pods must be assigned a service account name, to get the RBAC role binding. The controller no longer needs RBAC write privileges for Secrets, and the "tls-cert" Secret with the hard-wired name is no longer necessary. The Secret volume that projected "tls-cert" into viking Pods has been removed.

The port "faccess" in the headless Service for viking admin has been renamed to "crt-dnldr".

Addresses #36
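To make the write/remove flow concrete, here is a minimal sketch of a client driving such a REST API. This is not code from the repo: the path scheme /certs/{namespace}/{name} and the port are hypothetical placeholders; the real interface is defined by k8s-crt-dnldr itself.

    package main

    import (
        "fmt"
        "net/http"
    )

    // certRequest asks a k8s-crt-dnldr instance to write (PUT) or remove
    // (DELETE) the pem file for the named TLS Secret; haproxy picks up
    // the change after its next reload. The URL layout is a placeholder.
    func certRequest(method, base, namespace, name string) error {
        url := fmt.Sprintf("%s/certs/%s/%s", base, namespace, name)
        req, err := http.NewRequest(method, url, nil)
        if err != nil {
            return err
        }
        resp, err := http.DefaultClient.Do(req)
        if err != nil {
            return err
        }
        defer resp.Body.Close()
        if resp.StatusCode >= 300 {
            return fmt.Errorf("%s %s: %s", method, url, resp.Status)
        }
        return nil
    }

    func main() {
        // Store the pem file for Secret "my-tls" in namespace "default" ...
        _ = certRequest(http.MethodPut, "http://localhost:8080", "default", "my-tls")
        // ... and remove it again.
        _ = certRequest(http.MethodDelete, "http://localhost:8080", "default", "my-tls")
    }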
- 23 Jul, 2020 1 commit
Geoff Simmons authored
- 21 Jul, 2020 2 commits
Geoff Simmons authored
Geoff Simmons authored
- 20 Jul, 2020 1 commit
Lars Fenneberg authored
- 10 Jul, 2020 7 commits
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
In test/e2e.sh, we run the test twice, to confirm that VMOD dynamic picks up the changed IP for the DNS name. For now, just run the test once in the pipeline.
Geoff Simmons authored
These are unfortunately set up to run in a specific order. The order should currently match the order in test/e2e.sh.
Geoff Simmons authored
Since it has been failing consistently in the CI pipeline.
- 09 Jul, 2020 4 commits
Geoff Simmons authored
VCL label and source file names changed.
Geoff Simmons authored
Update comments and fix up whitespace while we're here.
Geoff Simmons authored
Add the port "configured" to the headless Varnish admin Service, which responds with status 200 when an Ingress is configured, and 503 otherwise. This replaces the previous use of the Ready state to determine whether the Pods are currently implementing an Ingress.

This is actually a small change to the Varnish images and the admin Service, but a wide-ranging change for testing, since we now check the configured port before verifying a configuration (rather than waiting for the Ready state). Common test code is now in the bash library test/utils.sh.

This commit also includes a fix for the repeated test of the ExternalName example, which verifies that changed IP addresses for ExternalName Services are picked up by VMOD dynamic. The test waits for the Ready state of the IngressBackends. The second time around, kubectl wait sometimes picked up previous versions of the Pods that were in the process of terminating. These of course never became Ready, and the wait timed out. Now we wait for those Pods to be deleted before proceeding with the second test.
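As a rough sketch of the idea behind the "configured" port (assumed, not the actual image code; isConfigured and the port number are placeholders):

    package main

    import "net/http"

    // isConfigured is a hypothetical stand-in for however the Varnish
    // image determines that an Ingress configuration has been loaded.
    func isConfigured() bool {
        return false
    }

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            if isConfigured() {
                w.WriteHeader(http.StatusOK) // 200: an Ingress is configured
                return
            }
            w.WriteHeader(http.StatusServiceUnavailable) // 503: not configured
        })
        http.ListenAndServe(":8000", nil) // port is a placeholder
    }

Tests can then poll this port until it returns 200 before verifying a configuration.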
Geoff Simmons authored
- 07 Jul, 2020 1 commit
Geoff Simmons authored
- 06 Jul, 2020 3 commits
Tim Leers authored
Geoff Simmons authored
Geoff Simmons authored
Previously, all sync operations (add/update/delete for the resource types that the controller watches, and the cluster changes brought about for them if necessary) were only characterized by a Go error variable. The only conditions that mattered were nil or non-nil.

On a nil return, success was logged, and a SyncSuccess Event was generated for the resource that was synced. But this was done even if no change was required in the cluster, and even if the resource had nothing to do with Ingress or the viking application. This led to many superfluous Events.

On a non-nil return, a warning Event was generated and the sync operation was re-queued, using the workqueue's rate-limiting delay. This was done regardless of the type of error. Since the initial delay is quite rapid (subsequent re-queues begin to back off the delay), it led to many re-queues, and many Events. But it is very common that a brief delay is predictable, when not all of the necessary information is available to the controller yet, so the rapid retries just generated a lot of noise. In other cases, retries will not improve the situation -- an invalid config, for example, will still be invalid on the next attempt.

This commit introduces pkg/update and the type Status, which classifies the result of a sync operation. All of the sync methods now return an object of this type, which in turn determines how the controller handles errors, logs results, and generates Events. The Status types are:

- Success: a cluster change was necessary and was executed successfully. The result is logged, and an Event with Reason SyncSuccess is generated (as before).

- Noop: no cluster change was necessary. The result is logged, but no Event is generated. This reduces Event generation considerably.

- Fatal: an unrecoverable error (retries won't help). The result is logged and a SyncFatalError warning Event is generated, but no retries are attempted.

- Recoverable: an error that might do better on retry. The result is logged, a SyncRecoverableError warning Event is generated, and the operation is re-queued with the rate-limiting delay (as before).

- Incomplete: a cluster change is necessary, but some information is missing. The result is logged, a SyncIncomplete warning Event is generated, and the operation is re-queued with a delay. The delay is currently hard-wired to 5s, but will be made configurable.

We'll probably tweak some of the decisions about which status types are chosen for which results. But this has already improved the controller's error handling, and has considerably reduced its verbosity, with respect to both logging and Event generation.
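A rough sketch of what such a classification type can look like; the five type names are from this commit, but the layout and helper methods are assumptions, not the repo's actual pkg/update:

    // Package update: a sketch under assumptions; the real pkg/update
    // may differ in layout and naming.
    package update

    // Type classifies the result of a sync operation.
    type Type uint8

    const (
        Noop        Type = iota // no cluster change was necessary; log only
        Success                 // change executed; generate a SyncSuccess Event
        Fatal                   // unrecoverable; warn, but do not retry
        Recoverable             // retry with the workqueue's rate-limiting delay
        Incomplete              // information missing; retry after a fixed delay (5s for now)
    )

    // Status is what the sync methods return instead of a bare error.
    type Status struct {
        Type Type
        Err  error // underlying error, if any
    }

    // Retry reports whether the controller should re-queue the operation.
    func (s Status) Retry() bool {
        return s.Type == Recoverable || s.Type == Incomplete
    }

    // IsError reports whether the result should produce a warning Event.
    func (s Status) IsError() bool {
        return s.Type != Success && s.Type != Noop
    }

The controller can then branch on Status.Type to decide between logging only, generating an Event, re-queuing with the rate-limiting delay, or re-queuing after the fixed 5s delay.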
- 01 Jul, 2020 5 commits
Geoff Simmons authored
These correspond to properties that can be set in VMOD dynamic. Currently we have:

- dnsRetryDelay: set as the ttl in the VMOD. Since we get DNS TTLs from the server, it effectively sets the retry delay after a lookup gets negative results. More recent versions of the VMOD have a separate parameter for this purpose, so this should be updated soon.

- domainUsageTimeout: corresponds to the domain_usage_timeout param of the VMOD director.

- firstLookupTimeout: corresponds to the first_lookup_timeout param of the VMOD director.

- resolverTimeout: set with the .set_timeout() method of the VMOD resolver object.

- resolverIdleTimeout: set with the .set_idle_timeout() method of the VMOD resolver object.

- maxDNSQueries: set with the .set_limit_outstanding_queries() method of the VMOD resolver object.

- followDNSRedirects: set with the .set_follow_redirects() method of the VMOD resolver object.
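For illustration, a sketch of how these fields might appear in the Go spec for the BackendConfig resource; this is assumed, not the repo's actual CRD types, and the timeouts are simplified to plain duration strings (e.g. "30s"):

    package v1alpha1

    // BackendConfigSpec (hypothetical layout) carries the properties
    // listed above, mapped onto VMOD dynamic by the controller.
    type BackendConfigSpec struct {
        DNSRetryDelay       string `json:"dnsRetryDelay,omitempty"`
        DomainUsageTimeout  string `json:"domainUsageTimeout,omitempty"`
        FirstLookupTimeout  string `json:"firstLookupTimeout,omitempty"`
        ResolverTimeout     string `json:"resolverTimeout,omitempty"`
        ResolverIdleTimeout string `json:"resolverIdleTimeout,omitempty"`
        MaxDNSQueries       *int32 `json:"maxDNSQueries,omitempty"`
        FollowDNSRedirects  *bool  `json:"followDNSRedirects,omitempty"`
    }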
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
- 30 Jun, 2020 8 commits
Geoff Simmons authored
Ref gitlab issue #20
Geoff Simmons authored
Ref gitlab issue #20
Geoff Simmons authored
Uses VMOD dynamic, and requires that the getdns library is installed in the image running Varnish. This allows us to use dynamic.resolve(), in particular so that TTLs from DNS are honored.

Currently sets ttl to a hard-wired value of 30s. Since the TTLs for lookups are obtained from DNS, this actually sets the delay until lookups are retried after negative results (default 1h).

The next step is to test and extend BackendConfig support to configure properties of VMOD dynamic. That will make it possible to configure the ttl value (although we might stay with a much shorter ttl than 1h).

Partially addresses gitlab issue #20.
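For illustration only, a sketch of the kind of VCL this can generate, embedded here as a Go template string (the controller renders VCL from templates). The template fields and package name are hypothetical; dynamic.director() with a ttl parameter and .backend() are part of the VMOD dynamic interface:

    package vcl

    // externalNameTmpl is a hypothetical, trimmed-down version of the
    // VCL the controller could generate for an ExternalName Service.
    // ttl = 30s is the hard-wired value described above; actual DNS TTLs
    // are honored via the resolver.
    const externalNameTmpl = `
    import dynamic;

    sub vcl_init {
        new {{.Director}} = dynamic.director(port = "{{.Port}}", ttl = 30s);
    }

    sub vcl_backend_fetch {
        set bereq.backend = {{.Director}}.backend("{{.ExternalName}}");
    }
    `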
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
- 18 Jun, 2020 1 commit
Lars Fenneberg authored
- 12 Jun, 2020 5 commits
Geoff Simmons authored
These weren't necessarily being changed after an Endpoints update.
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
Geoff Simmons authored
The Ingress update may have followed an update for Endpoints.