1. 03 Aug, 2020 1 commit
  2. 31 Jul, 2020 1 commit
    • Refactor the mechanism for making TLS Secrets available to haproxy. · 0ed73905
      Geoff Simmons authored
      The haproxy container now runs the app k8s-crt-dnldr, and no longer
      runs http-faccess. See https://code.uplex.de/k8s/k8s-crt-dnldr
      
      k8s-crt-dnldr runs a k8s client that reads Secrets, filtered for
      TLS (type:kubernetes.io/tls). It provides a REST API with which a
      client can instruct it to write (PUT) or remove (DELETE) a pem
      file (concatenated crt and key) corresponding to a TLS Secret in
      the cluster. By default, these are written to /etc/ssl/private,
      where haproxy reads certificates. After the next haproxy reload
      following a write or delete, haproxy starts or stops using the
      certificate accordingly.
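
The pem file described above can be sketched as follows. This is only an illustration, not the k8s-crt-dnldr code: the function name pemBundle is hypothetical, and the sketch assumes the certificate and key have already been read from the Secret's tls.crt and tls.key fields (the standard keys of a kubernetes.io/tls Secret). The PEM contents in main are placeholders.

```go
package main

import (
	"bytes"
	"fmt"
)

// pemBundle (hypothetical name) concatenates a certificate and a private
// key into one PEM bundle, certificate first. A newline is added after a
// part only if that part does not already end with one, so the PEM
// boundary lines never run together.
func pemBundle(crt, key []byte) []byte {
	var buf bytes.Buffer
	for _, part := range [][]byte{crt, key} {
		if len(part) == 0 {
			continue
		}
		buf.Write(part)
		if !bytes.HasSuffix(part, []byte("\n")) {
			buf.WriteByte('\n')
		}
	}
	return buf.Bytes()
}

func main() {
	// Placeholder contents; real values come from the TLS Secret.
	crt := []byte("-----BEGIN CERTIFICATE-----\n(placeholder)\n-----END CERTIFICATE-----")
	key := []byte("-----BEGIN PRIVATE KEY-----\n(placeholder)\n-----END PRIVATE KEY-----\n")
	fmt.Print(string(pemBundle(crt, key)))
}
```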
      
      Once k8s-crt-dnldr has been instructed to store a Secret, it
      responds to Update and Delete events for the Secret by updating
      or deleting the file on its own. The controller currently sends
      commands to do so as well, but in practice the k8s-crt-dnldr has
      already changed the certificate itself (this is not an error).
      
      This means that viking Pods must have RBAC rights to read
      Secrets (the fact that these are filtered for TLS is not
      expressible in RBAC). That in turn means that viking Pods
      must be assigned a service account name, to get the RBAC
      role binding.
      
      The controller no longer needs RBAC write privileges for Secrets,
      and the "tls-cert" Secret with the hard-wired name is no longer
      necessary. The Secret volume that projects "tls-cert" into viking
      Pods has been removed.
      
      The port faccess in the headless Service for viking admin
      has been renamed to crt-dnldr.
      
      Addresses #36
  3. 23 Jul, 2020 1 commit
  4. 21 Jul, 2020 2 commits
  5. 20 Jul, 2020 1 commit
  6. 10 Jul, 2020 7 commits
  7. 09 Jul, 2020 4 commits
    • Internal renaming to reflect changing "readiness" to "configured". · f8220d1c
      Geoff Simmons authored
      VCL label and source file names changed.
    • Empty response bodies for readiness and configured checks. · 159f919a
      Geoff Simmons authored
      Update comments and fix up whitespace while we're here.
    • Varnish Pods are Ready when Varnish is running, even without an Ingress. · ba45234c
      Geoff Simmons authored
      Add the port "configured" to the headless Varnish admin Service,
      which responds with status 200 when an Ingress is configured, 503
      otherwise. This replaces the previous purpose of the Ready state,
      to determine if the Pods are currently implementing an Ingress.
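
A check against the "configured" port might look like the sketch below. It is an assumption-laden illustration rather than the project's test code (which is bash, in test/utils.sh): the names isConfigured and fakeAdmin are hypothetical, the URL is whatever address the port is reachable at, and the stand-in server only imitates the 200/503 behavior described above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// isConfigured (hypothetical helper) queries the "configured" endpoint and
// maps status 200 to true and 503 to false. Any other status is reported
// as an error, since the commit message only specifies those two.
func isConfigured(client *http.Client, url string) (bool, error) {
	resp, err := client.Get(url)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return true, nil
	case http.StatusServiceUnavailable:
		return false, nil
	default:
		return false, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
}

// fakeAdmin (hypothetical) stands in for the admin Service port so that the
// sketch runs without a cluster. It answers with an empty body.
func fakeAdmin(configured bool) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if configured {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	}))
}

func main() {
	for _, configured := range []bool{true, false} {
		srv := fakeAdmin(configured)
		ok, err := isConfigured(srv.Client(), srv.URL)
		srv.Close()
		fmt.Println(ok, err)
	}
}
```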
      
      This is actually a small change to the Varnish images and the admin
      Service, but a wide-ranging change for testing, since we now check
      the configured port before verifying a configuration (rather than
      wait for the Ready state). Common test code is now in the bash
      library test/utils.sh.
      
      This commit also includes a fix for the repeated test of the
      ExternalName example, which verifies that the changed IP addresses
      for ExternalName Services are picked up by VMOD dynamic. The test
      waits for the Ready state of the IngressBackends. The second time
      around, kubectl wait sometimes picked up previous versions of the
      Pods that were in the process of terminating. These of course never
      became Ready, and the wait timed out. Now we wait for those Pods
      to be deleted before proceeding with the second test.
  8. 07 Jul, 2020 1 commit
  9. 06 Jul, 2020 3 commits
    • Add vmod-dynamic · d0f2f5c3
      Tim Leers authored
    • Refactor the return status for sync operations in the main loop. · d5c1528b
      Geoff Simmons authored
      Previously all sync operations (add/update/delete for the resource
      types that the controller watches, and the cluster changes brought
      about for them if necessary) were only characterized by a Go error
      variable. The only conditions that mattered were nil or not-nil.
      
      On a nil return, success was logged, and a SyncSuccess event was
      generated for the resource that was synced. But this was done even
      if no change was required in the cluster, and the resource had
      nothing to do with Ingress or the viking application. This led
      to many superfluous Events.
      
      On a non-nil return, a warning Event was generated and the sync
      operation was re-queued, using the workqueue's rate-limiting delay.
      This was done regardless of the type of error. Since the initial delay
      is quite short (subsequent re-queues begin to back off the delay), it
      led to many re-queues, and many Events. But it is very common that a
      brief delay is predictable, when not all necessary information is
      available to the controller, so that the rapid retries just
      generated a lot of noise. In other cases, retries will not improve
      the situation -- an invalid config, for example, will still be
      invalid on the next attempt.
      
      This commit introduces pkg/update and the type Status, which classifies
      the result of a sync operation. All of the sync methods now return
      an object of this type, which in turn determines how the controller
      handles errors, logs results, and generates Events. The Status types
      are:
      
      Success: a cluster change was necessary and was executed successfully.
      The result is logged, and an Event with Reason SyncSuccess is generated
      (as before).
      
      Noop: no cluster change was necessary. The result is logged, but no
      Event is generated. This reduces Event generation considerably.
      
      Fatal: an unrecoverable error (retries won't help). The result is
      logged, a SyncFatalError warning Event is generated, but no retries
      are attempted.
      
      Recoverable: an error that might do better on retry. The result is
      logged, a SyncRecoverableError is generated, and the operation is
      re-queued with the rate-limiting delay (as before).
      
      Incomplete: a cluster change is necessary, but some information is
      missing. The result is logged, a SyncIncomplete warning Event is
      generated, and the operation is re-queued with a delay. The delay
      is currently hard-wired to 5s, but will be made configurable.
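
The classification above can be sketched in Go roughly as follows. This is not the contents of pkg/update: the Action struct, the actionFor function and all field names are hypothetical, and only the status names, Event reasons and the 5s delay come from the description. Treating SyncRecoverableError as a warning Event is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// StatusType classifies the result of a sync operation.
type StatusType uint8

const (
	Noop StatusType = iota
	Success
	Fatal
	Recoverable
	Incomplete
)

// Status is a sketch of the result returned by sync methods.
type Status struct {
	Type StatusType
	Msg  string
}

// Action (hypothetical) describes how the controller reacts to a Status.
type Action struct {
	EventReason string        // empty means no Event is generated
	Warning     bool          // true for a warning Event
	Requeue     bool          // re-queue the operation
	RateLimited bool          // use the workqueue's rate-limiting delay
	Delay       time.Duration // fixed delay when not rate-limited
}

// actionFor (hypothetical) maps each status type to the handling described
// in the commit message. Every result is logged, so logging is not modeled.
func actionFor(s Status) Action {
	switch s.Type {
	case Success:
		return Action{EventReason: "SyncSuccess"}
	case Fatal:
		return Action{EventReason: "SyncFatalError", Warning: true}
	case Recoverable:
		// Assumption: the Event is a warning, as for the other error types.
		return Action{EventReason: "SyncRecoverableError", Warning: true,
			Requeue: true, RateLimited: true}
	case Incomplete:
		return Action{EventReason: "SyncIncomplete", Warning: true,
			Requeue: true, Delay: 5 * time.Second}
	default: // Noop: no Event, no retry
		return Action{}
	}
}

func main() {
	for _, t := range []StatusType{Success, Noop, Fatal, Recoverable, Incomplete} {
		fmt.Printf("%d: %+v\n", t, actionFor(Status{Type: t}))
	}
}
```

Keeping the decision in one function means that later changes to which status is chosen for which result do not have to touch the code that logs, emits Events or re-queues.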
      
      We'll probably tweak some of the decisions about which status types
      are chosen for which results. But this has already improved the
      controller's error handling, and has considerably reduced its
      verbosity, with respect to both logging and event generation.
  10. 01 Jul, 2020 5 commits
  11. 30 Jun, 2020 8 commits
  12. 18 Jun, 2020 1 commit
  13. 12 Jun, 2020 5 commits