Where possible, use securityContext settings which will work with the
most restrictive Pod Security Admission policy level (as of Kube 1.25).
Where privileged containers are needed, use the namespace-level
annotation to allow them.
Also adjust some integration tests to make similar changes to allow the
integration tests to pass on test clusters which use restricted PSAs.
This makes it so that our service selector will match exactly the
YAML we specify instead of including an extra "kapp.k14s.io/app" key.
This will take us closer to the standard kubectl behavior which is
desirable since we want to avoid future bugs that only manifest when
kapp is not used.
Signed-off-by: Monis Khan <mok@vmware.com>
At a high level, it switches us to a distroless base container image, but that also includes several related bits:
- Add a writable /tmp but make the rest of our filesystems read-only at runtime.
- Condense our main server binaries into a single pinniped-server binary. This saves a bunch of space in
the image due to duplicated library code. The correct behavior is dispatched based on `os.Args[0]`, and
the `pinniped-server` binary is symlinked to `pinniped-concierge` and `pinniped-supervisor`.
- Strip debug symbols from our binaries. These aren't really useful in a distroless image anyway and all the
normal stuff you'd expect to work, such as stack traces, still does.
- Add a separate `pinniped-concierge-kube-cert-agent` binary with "sleep" and "print" functionality instead of
using builtin /bin/sleep and /bin/cat for the kube-cert-agent. This is split from the main server binary
because the loading/init time of the main server binary was too large for the tiny resource footprint we
established in our kube-cert-agent PodSpec. Using a separate binary eliminates this issue and the extra
binary adds only around 1.5MiB of image size.
- Switch the kube-cert-agent code to use a JSON `{"tls.crt": "<b64 cert>", "tls.key": "<b64 key>"}` format.
This is more robust to unexpected input formatting than the old code, which simply concatenated the files
with some extra newlines and split on whitespace.
- Update integration tests that made now-invalid assumptions about the `pinniped-server` image.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Previously, when triggering a Tilt reload via a *.go file change, a reload would
take ~13 seconds and we would see this error message in the Tilt logs for each
component.
Live Update failed with unexpected error:
command terminated with exit code 2
Falling back to a full image build + deploy
Now, Tilt should reload images a lot faster (~3 seconds) since we are running
the images as root.
Note! Reloading the Concierge component still takes ~13 seconds because there
are 2 containers running in the Concierge namespace that use the Concierge
image: the main Concierge app and the kube cert agent pod. Tilt can't live
reload both of these at once, so the reload takes longer and we see this error
message.
Will not perform Live Update because:
Error retrieving container info: can only get container info for a single pod; image target image:image/concierge has 2 pods
Falling back to a full image build + deploy
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I tried to follow a principle of encapsulation here - we can still default to
peeps making connections to 80/443 on a Service object, but internally we will
use 8080/8443.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>