ContainerImage.Pinniped

Author	SHA1	Message	Date
Ryan Richard	208a566bdf	Merge branch 'main' into dynamic_clients	2022-09-23 14:01:11 -07:00
Ryan Richard	1c296e5c4c	Implement the OIDCClientSecretRequest API This commit is a WIP commit because it doesn't include many tests for the new feature. Co-authored-by: Ryan Richard <richardry@vmware.com> Co-authored-by: Benjamin A. Petersen <ben@benjaminapetersen.me>	2022-09-21 15:15:07 -07:00
Ryan Richard	b564454bab	Make Pinniped compatible with Kube clusters which have enabled PSAs Where possible, use securityContext settings which will work with the most restrictive Pod Security Admission policy level (as of Kube 1.25). Where privileged containers are needed, use the namespace-level annotation to allow them. Also adjust some integration tests to make similar changes to allow the integration tests to pass on test clusters which use restricted PSAs.	2022-09-15 14:58:15 -07:00
Margo Crawford	8f4285dbff	Change group names Signed-off-by: Margo Crawford <margaretc@vmware.com>	2022-06-13 14:28:05 -07:00
Margo Crawford	889348e999	WIP aggregated api for oidcclientsecretrequest Signed-off-by: Margo Crawford <margaretc@vmware.com>	2022-06-09 13:47:19 -07:00
Ryan Richard	8d12c1b674	HTTP listener: default disabled and may only bind to loopback interfaces	2022-03-24 15:46:10 -07:00
Monis Khan	1e1789f6d1	Allow configuration of supervisor endpoints This change allows configuration of the http and https listeners used by the supervisor. TCP (IPv4 and IPv6 with any interface and port) and Unix domain socket based listeners are supported. Listeners may also be disabled. Binding the http listener to TCP addresses other than 127.0.0.1 or ::1 is deprecated. The deployment now uses https health checks. The supervisor is always able to complete a TLS connection with the use of a bootstrap certificate that is signed by an in-memory certificate authority. To support sidecar containers used by service meshes, Unix domain socket based listeners include ACLs that allow writes to the socket file from any runAsUser specified in the pod's containers. Signed-off-by: Monis Khan <mok@vmware.com>	2022-01-18 17:43:45 -05:00
Ryan Richard	cec9f3c4d7	Improve the selectors of Deployments and Services Fixes #801. The solution is complicated by the fact that the Selector field of Deployments is immutable. It would have been easy to just make the Selectors of the main Concierge Deployment, the Kube cert agent Deployment, and the various Services use more specific labels, but that would break upgrades. Instead, we make the Pod template labels and the Service selectors more specific, because those not immutable, and then handle the Deployment selectors in a special way. For the main Concierge and Supervisor Deployments, we cannot change their selectors, so they remain "app: app_name", and we make other changes to ensure that only the intended pods are selected. We keep the original "app" label on those pods and remove the "app" label from the pods of the Kube cert agent Deployment. By removing it from the Kube cert agent pods, there is no longer any chance that they will accidentally get selected by the main Concierge Deployment. For the Kube cert agent Deployment, we can change the immutable selector by deleting and recreating the Deployment. The new selector uses only the unique label that has always been applied to the pods of that deployment. Upon recreation, these pods no longer have the "app" label, so they will not be selected by the main Concierge Deployment's selector. The selector of all Services have been updated to use new labels to more specifically target the intended pods. For the Concierge Services, this will prevent them from accidentally including the Kube cert agent pods. For the Supervisor Services, we follow the same convention just to be consistent and to help future-proof the Supervisor app in case it ever has a second Deployment added to it. The selector of the auto-created impersonation proxy Service was also previously using the "app" label. There is no change to this Service because that label will now select the correct pods, since the Kube cert agent pods no longer have that label. It would be possible to update that selector to use the new more specific label, but then we would need to invent a way to pass that label into the controller, so it seemed like more work than was justified.	2021-09-14 13:35:10 -07:00
Matt Moyer	f0a1555aca	Fix broken "read only" fields added in v0.11.0. These fields were changed as a minor hardening attempt when we switched to Distroless, but I bungled the field names and we never noticed because Kapp doesn't apply API validations. This change fixes the field names so they act as was originally intended. We should also follow up with a change that validates all of our installation manifest in CI. Signed-off-by: Matt Moyer <moyerm@vmware.com>	2021-09-02 16:12:39 -05:00
Monis Khan	66ddcf98d3	Provide good defaults for NO_PROXY This change updates the default NO_PROXY for the supervisor to not proxy requests to the Kubernetes API and other Kubernetes endpoints such as Kubernetes services. It also adds https_proxy and no_proxy settings for the concierge with the same default. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-17 10:03:19 -04:00
Matt Moyer	58bbffded4	Switch to a slimmer distroless base image. At a high level, it switches us to a distroless base container image, but that also includes several related bits: - Add a writable /tmp but make the rest of our filesystems read-only at runtime. - Condense our main server binaries into a single pinniped-server binary. This saves a bunch of space in the image due to duplicated library code. The correct behavior is dispatched based on `os.Args[0]`, and the `pinniped-server` binary is symlinked to `pinniped-concierge` and `pinniped-supervisor`. - Strip debug symbols from our binaries. These aren't really useful in a distroless image anyway and all the normal stuff you'd expect to work, such as stack traces, still does. - Add a separate `pinniped-concierge-kube-cert-agent` binary with "sleep" and "print" functionality instead of using builtin /bin/sleep and /bin/cat for the kube-cert-agent. This is split from the main server binary because the loading/init time of the main server binary was too large for the tiny resource footprint we established in our kube-cert-agent PodSpec. Using a separate binary eliminates this issue and the extra binary adds only around 1.5MiB of image size. - Switch the kube-cert-agent code to use a JSON `{"tls.crt": "<b64 cert>", "tls.key": "<b64 key>"}` format. This is more robust to unexpected input formatting than the old code, which simply concatenated the files with some extra newlines and split on whitespace. - Update integration tests that made now-invalid assumptions about the `pinniped-server` image. Signed-off-by: Matt Moyer <moyerm@vmware.com>	2021-08-09 15:05:13 -04:00
Ryan Richard	f1e63c55d4	Add `https_proxy` and `no_proxy` settings for the Supervisor - Add new optional ytt params for the Supervisor deployment. - When the Supervisor is making calls to an upstream OIDC provider, use these variables if they were provided. - These settings are integration tested in the main CI pipeline by sometimes setting them on deployments in certain cases, and then letting the existing integration tests (e.g. TestE2EFullIntegration) provide the coverage, so there are no explicit changes to the integration tests themselves in this commit.	2021-07-07 12:50:13 -07:00
Ryan Richard	616211c1bc	deploy: wire API group suffix through YTT templates I didn't advertise this feature in the deploy README's since (hopefully) not many people will want to use it? Signed-off-by: Andrew Keesler <akeesler@vmware.com>	2021-01-19 17:23:06 -05:00
Andrew Keesler	af11d8cd58	Run Tilt images as root for faster reload Previously, when triggering a Tilt reload via a *.go file change, a reload would take ~13 seconds and we would see this error message in the Tilt logs for each component. Live Update failed with unexpected error: command terminated with exit code 2 Falling back to a full image build + deploy Now, Tilt should reload images a lot faster (~3 seconds) since we are running the images as root. Note! Reloading the Concierge component still takes ~13 seconds because there are 2 containers running in the Concierge namespace that use the Concierge image: the main Concierge app and the kube cert agent pod. Tilt can't live reload both of these at once, so the reload takes longer and we see this error message. Will not perform Live Update because: Error retrieving container info: can only get container info for a single pod; image target image:image/concierge has 2 pods Falling back to a full image build + deploy Signed-off-by: Andrew Keesler <akeesler@vmware.com>	2021-01-15 11:34:53 -05:00
Andrew Keesler	e17bc31b29	Pass CSRF cookie signing key from controller to cache This also sets the CSRF cookie Secret's OwnerReference to the Pod's grandparent Deployment so that when the Deployment is cleaned up, then the Secret is as well. Obviously this controller implementation has a lot of issues, but it will at least get us started. Signed-off-by: Andrew Keesler <akeesler@vmware.com>	2020-12-11 11:49:27 -05:00
Andrew Keesler	724c0d3eb0	Add YTT template value for setting log level This is helpful for us, amongst other users, because we want to enable "debug" logging whenever we deploy components for testing. See `a5643e3` for addition of log level. Signed-off-by: Andrew Keesler <akeesler@vmware.com>	2020-11-11 09:01:38 -05:00
Ryan Richard	05cf56a0fa	Merge pull request #180 from vmware-tanzu/limits Add CPU/memory limits to our deployments	2020-11-02 16:22:37 -08:00
Ryan Richard	05233963fb	Add CPU requests and limits to the Concierge and Supervisor deployments	2020-11-02 15:47:20 -08:00
Ryan Richard	781f86d18c	deploy: add memory limits This is the beginning of a change to add cpu/memory limits to our pods. We are doing this because some consumers require this, and it is generally a good practice. The limits == requests for "Guaranteed" QoS. Signed-off-by: Andrew Keesler <akeesler@vmware.com>	2020-11-02 14:57:39 -05:00
Andrew Keesler	fcea48c8f9	Run as non-root I tried to follow a principle of encapsulation here - we can still default to peeps making connections to 80/443 on a Service object, but internally we will use 8080/8443. Signed-off-by: Andrew Keesler <akeesler@vmware.com>	2020-11-02 12:51:15 -05:00
Ryan Richard	29e0ce5662	Configure name of the supervisor default TLS cert secret via ConfigMap Signed-off-by: Andrew Keesler <akeesler@vmware.com>	2020-10-28 11:56:50 -07:00
Ryan Richard	8b7c30cfbd	Supervisor listens for HTTPS on port 443 with configurable TLS certs - TLS certificates can be configured on the OIDCProviderConfig using the `secretName` field. - When listening for incoming TLS connections, choose the TLS cert based on the SNI hostname of the incoming request. - Because SNI hostname information on incoming requests does not include the port number of the request, we add a validation that OIDCProviderConfigs where the issuer hostnames (not including port number) are the same must use the same `secretName`. - Note that this approach does not yet support requests made to an IP address instead of a hostname. Also note that `localhost` is considered a hostname by SNI. - Add port 443 as a container port to the pod spec. - A new controller watches for TLS secrets and caches them in memory. That same in-memory cache is used while servicing incoming connections on the TLS port. - Make it easy to configure both port 443 and/or port 80 for various Service types using our ytt templates for the supervisor. - When deploying to kind, add another nodeport and forward it to the host on another port to expose our new HTTPS supervisor port to the host.	2020-10-26 17:03:26 -07:00
Ryan Richard	122f7cffdb	Make the supervisor healthz endpoint public Based on our experiences today with GKE, it will be easier for our users to configure Ingress health checks if the healthz endpoint is available on the same public port as the OIDC endpoints. Also add an integration test for the healthz endpoint now that it is public. Also add the optional `containers[].ports.containerPort` to the supervisor Deployment because the GKE docs say that GKE will look at that field while inferring how to invoke the health endpoint. See https://cloud.google.com/kubernetes-engine/docs/concepts/ingress#def_inf_hc	2020-10-21 15:24:58 -07:00
Andrew Keesler	fa5f653de6	Implement readinessProbe and livenessProbe for supervisor Signed-off-by: Ryan Richard <richardry@vmware.com>	2020-10-21 11:51:31 -07:00
Andrew Keesler	617c5608ca	Supervisor controllers apply custom labels to JWKS secrets Signed-off-by: Ryan Richard <richardry@vmware.com>	2020-10-15 12:40:56 -07:00
Ryan Richard	94f20e57b1	Concierge controllers add labels to all created resources	2020-10-15 10:14:23 -07:00
Ryan Richard	1301018655	Support installing concierge and supervisor into existing namespace - New optional ytt value called `into_namespace` means install into that preexisting namespace rather than creating a new namespace for each app - Also ensure that every resource that is created statically by our yaml at install-time by either app is labeled consistently - Also support adding custom labels to all of those resources from a new ytt value called `custom_labels`	2020-10-14 15:05:42 -07:00
Ryan Richard	354b922e48	Allow creation of different Service types in Supervisor ytt templates - Tiltfile and prepare-for-integration-tests.sh both specify the NodePort Service using `--data-value-yaml 'service_nodeport_port=31234'` - Also rename the namespaces used by the Concierge and Supervisor apps during integration tests running locally	2020-10-09 16:00:11 -07:00
Ryan Richard	f5a6a0bb1e	Move all three deployment dirs under a new top-level `deploy/` dir	2020-10-09 10:00:22 -07:00

29 Commits
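
Commit 58bbffded4 above describes the kube-cert-agent switching to a JSON `{"tls.crt": "<b64 cert>", "tls.key": "<b64 key>"}` format because the old approach of concatenating files and splitting on whitespace was fragile. The following is only a rough sketch of why that kind of encoding is more robust, not the project's actual code (Pinniped itself is written in Go). The function names `encode_cert_pair` and `decode_cert_pair` are hypothetical, and the use of standard (not URL-safe) base64 is an assumption.

```python
import base64
import json


def encode_cert_pair(cert_pem: bytes, key_pem: bytes) -> str:
    """Serialize a cert/key pair as JSON with base64-encoded values.

    Hypothetical helper: the key names come from the commit message,
    everything else is an illustrative assumption.
    """
    return json.dumps(
        {
            "tls.crt": base64.b64encode(cert_pem).decode("ascii"),
            "tls.key": base64.b64encode(key_pem).decode("ascii"),
        }
    )


def decode_cert_pair(payload: str) -> tuple[bytes, bytes]:
    """Parse the JSON payload and return (cert, key) as raw bytes.

    Raises an exception on malformed JSON, missing keys, or invalid
    base64 instead of silently mis-splitting the input.
    """
    doc = json.loads(payload)
    cert = base64.b64decode(doc["tls.crt"], validate=True)
    key = base64.b64decode(doc["tls.key"], validate=True)
    return cert, key
```

Because each value is base64-encoded inside a JSON string, newlines or spaces inside the certificate or key material cannot be confused with the separator between the two files, which is the failure mode a whitespace split is exposed to.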