ContainerImage.Pinniped/deploy/supervisor/deployment.yaml

246 lines
9.2 KiB
YAML
Raw Normal View History

#! Copyright 2020-2023 the Pinniped contributors. All Rights Reserved.
#! SPDX-License-Identifier: Apache-2.0
#@ load("@ytt:data", "data")
#@ load("@ytt:yaml", "yaml")
#@ load("helpers.lib.yaml",
#@ "defaultLabel",
#@ "labels",
#@ "deploymentPodLabel",
#@ "namespace",
#@ "defaultResourceName",
#@ "defaultResourceNameWithSuffix",
#@ "pinnipedDevAPIGroupWithPrefix",
#@ "getPinnipedConfigMapData",
#@ "hasUnixNetworkEndpoint",
#@ )
Improve the selectors of Deployments and Services Fixes #801. The solution is complicated by the fact that the Selector field of Deployments is immutable. It would have been easy to just make the Selectors of the main Concierge Deployment, the Kube cert agent Deployment, and the various Services use more specific labels, but that would break upgrades. Instead, we make the Pod template labels and the Service selectors more specific, because those not immutable, and then handle the Deployment selectors in a special way. For the main Concierge and Supervisor Deployments, we cannot change their selectors, so they remain "app: app_name", and we make other changes to ensure that only the intended pods are selected. We keep the original "app" label on those pods and remove the "app" label from the pods of the Kube cert agent Deployment. By removing it from the Kube cert agent pods, there is no longer any chance that they will accidentally get selected by the main Concierge Deployment. For the Kube cert agent Deployment, we can change the immutable selector by deleting and recreating the Deployment. The new selector uses only the unique label that has always been applied to the pods of that deployment. Upon recreation, these pods no longer have the "app" label, so they will not be selected by the main Concierge Deployment's selector. The selector of all Services have been updated to use new labels to more specifically target the intended pods. For the Concierge Services, this will prevent them from accidentally including the Kube cert agent pods. For the Supervisor Services, we follow the same convention just to be consistent and to help future-proof the Supervisor app in case it ever has a second Deployment added to it. The selector of the auto-created impersonation proxy Service was also previously using the "app" label. There is no change to this Service because that label will now select the correct pods, since the Kube cert agent pods no longer have that label. It would be possible to update that selector to use the new more specific label, but then we would need to invent a way to pass that label into the controller, so it seemed like more work than was justified.
2021-09-14 20:35:10 +00:00
#@ load("@ytt:template", "template")
#@ if not data.values.into_namespace:
---
apiVersion: v1
kind: Namespace
metadata:
name: #@ data.values.namespace
labels: #@ labels()
#@ end
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: #@ defaultResourceName()
namespace: #@ namespace()
labels: #@ labels()
---
apiVersion: v1
kind: ConfigMap
metadata:
name: #@ defaultResourceNameWithSuffix("static-config")
namespace: #@ namespace()
labels: #@ labels()
data:
#@yaml/text-templated-strings
pinniped.yaml: #@ yaml.encode(getPinnipedConfigMapData())
---
#@ if data.values.image_pull_dockerconfigjson and data.values.image_pull_dockerconfigjson != "":
apiVersion: v1
kind: Secret
metadata:
name: image-pull-secret
namespace: #@ namespace()
labels: #@ labels()
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: #@ data.values.image_pull_dockerconfigjson
#@ end
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: #@ defaultResourceName()
namespace: #@ namespace()
labels: #@ labels()
spec:
replicas: #@ data.values.replicas
selector:
Improve the selectors of Deployments and Services Fixes #801. The solution is complicated by the fact that the Selector field of Deployments is immutable. It would have been easy to just make the Selectors of the main Concierge Deployment, the Kube cert agent Deployment, and the various Services use more specific labels, but that would break upgrades. Instead, we make the Pod template labels and the Service selectors more specific, because those not immutable, and then handle the Deployment selectors in a special way. For the main Concierge and Supervisor Deployments, we cannot change their selectors, so they remain "app: app_name", and we make other changes to ensure that only the intended pods are selected. We keep the original "app" label on those pods and remove the "app" label from the pods of the Kube cert agent Deployment. By removing it from the Kube cert agent pods, there is no longer any chance that they will accidentally get selected by the main Concierge Deployment. For the Kube cert agent Deployment, we can change the immutable selector by deleting and recreating the Deployment. The new selector uses only the unique label that has always been applied to the pods of that deployment. Upon recreation, these pods no longer have the "app" label, so they will not be selected by the main Concierge Deployment's selector. The selector of all Services have been updated to use new labels to more specifically target the intended pods. For the Concierge Services, this will prevent them from accidentally including the Kube cert agent pods. For the Supervisor Services, we follow the same convention just to be consistent and to help future-proof the Supervisor app in case it ever has a second Deployment added to it. The selector of the auto-created impersonation proxy Service was also previously using the "app" label. There is no change to this Service because that label will now select the correct pods, since the Kube cert agent pods no longer have that label. It would be possible to update that selector to use the new more specific label, but then we would need to invent a way to pass that label into the controller, so it seemed like more work than was justified.
2021-09-14 20:35:10 +00:00
#! In hindsight, this should have been deploymentPodLabel(), but this field is immutable so changing it would break upgrades.
matchLabels: #@ defaultLabel()
template:
metadata:
Improve the selectors of Deployments and Services Fixes #801. The solution is complicated by the fact that the Selector field of Deployments is immutable. It would have been easy to just make the Selectors of the main Concierge Deployment, the Kube cert agent Deployment, and the various Services use more specific labels, but that would break upgrades. Instead, we make the Pod template labels and the Service selectors more specific, because those not immutable, and then handle the Deployment selectors in a special way. For the main Concierge and Supervisor Deployments, we cannot change their selectors, so they remain "app: app_name", and we make other changes to ensure that only the intended pods are selected. We keep the original "app" label on those pods and remove the "app" label from the pods of the Kube cert agent Deployment. By removing it from the Kube cert agent pods, there is no longer any chance that they will accidentally get selected by the main Concierge Deployment. For the Kube cert agent Deployment, we can change the immutable selector by deleting and recreating the Deployment. The new selector uses only the unique label that has always been applied to the pods of that deployment. Upon recreation, these pods no longer have the "app" label, so they will not be selected by the main Concierge Deployment's selector. The selector of all Services have been updated to use new labels to more specifically target the intended pods. For the Concierge Services, this will prevent them from accidentally including the Kube cert agent pods. For the Supervisor Services, we follow the same convention just to be consistent and to help future-proof the Supervisor app in case it ever has a second Deployment added to it. The selector of the auto-created impersonation proxy Service was also previously using the "app" label. There is no change to this Service because that label will now select the correct pods, since the Kube cert agent pods no longer have that label. It would be possible to update that selector to use the new more specific label, but then we would need to invent a way to pass that label into the controller, so it seemed like more work than was justified.
2021-09-14 20:35:10 +00:00
labels:
#! This has always included defaultLabel(), which is used by this Deployment's selector.
_: #@ template.replace(defaultLabel())
#! More recently added the more unique deploymentPodLabel() so Services can select these Pods more specifically
#! without accidentally selecting pods from any future Deployments which might also want to use the defaultLabel().
_: #@ template.replace(deploymentPodLabel())
spec:
securityContext:
runAsUser: #@ data.values.run_as_user
runAsGroup: #@ data.values.run_as_group
serviceAccountName: #@ defaultResourceName()
#@ if data.values.image_pull_dockerconfigjson and data.values.image_pull_dockerconfigjson != "":
imagePullSecrets:
- name: image-pull-secret
#@ end
containers:
- name: #@ defaultResourceName()
#@ if data.values.image_digest:
image: #@ data.values.image_repo + "@" + data.values.image_digest
#@ else:
image: #@ data.values.image_repo + ":" + data.values.image_tag
#@ end
imagePullPolicy: IfNotPresent
Switch to a slimmer distroless base image. At a high level, it switches us to a distroless base container image, but that also includes several related bits: - Add a writable /tmp but make the rest of our filesystems read-only at runtime. - Condense our main server binaries into a single pinniped-server binary. This saves a bunch of space in the image due to duplicated library code. The correct behavior is dispatched based on `os.Args[0]`, and the `pinniped-server` binary is symlinked to `pinniped-concierge` and `pinniped-supervisor`. - Strip debug symbols from our binaries. These aren't really useful in a distroless image anyway and all the normal stuff you'd expect to work, such as stack traces, still does. - Add a separate `pinniped-concierge-kube-cert-agent` binary with "sleep" and "print" functionality instead of using builtin /bin/sleep and /bin/cat for the kube-cert-agent. This is split from the main server binary because the loading/init time of the main server binary was too large for the tiny resource footprint we established in our kube-cert-agent PodSpec. Using a separate binary eliminates this issue and the extra binary adds only around 1.5MiB of image size. - Switch the kube-cert-agent code to use a JSON `{"tls.crt": "<b64 cert>", "tls.key": "<b64 key>"}` format. This is more robust to unexpected input formatting than the old code, which simply concatenated the files with some extra newlines and split on whitespace. - Update integration tests that made now-invalid assumptions about the `pinniped-server` image. Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-07-26 16:18:43 +00:00
command:
- pinniped-supervisor
- /etc/podinfo
- /etc/config/pinniped.yaml
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
allowPrivilegeEscalation: false
capabilities:
drop: [ "ALL" ]
#! seccompProfile was introduced in Kube v1.19. Using it on an older Kube version will result in a
#! kubectl validation error when installing via `kubectl apply`, which can be ignored using kubectl's
#! `--validate=false` flag. Note that installing via `kapp` does not complain about this validation error.
seccompProfile:
type: "RuntimeDefault"
resources:
requests:
#! If OIDCClient CRs are being used, then the Supervisor needs enough CPU to run expensive bcrypt
#! operations inside the implementation of the token endpoint for any authcode flows performed by those
#! clients, so for that use case administrators may wish to increase the requests.cpu value to more
#! closely align with their anticipated needs. Increasing this value will cause Kubernetes to give more
#! available CPU to this process during times of high CPU contention. By default, don't ask for too much
#! because that would make it impossible to install the Pinniped Supervisor on small clusters.
#! Aside from performing bcrypts at the token endpoint for those clients, the Supervisor is not a
#! particularly CPU-intensive process.
cpu: "100m" #! by default, request one-tenth of a CPU
memory: "128Mi"
limits:
#! By declaring a CPU limit that is not equal to the CPU request value, the Supervisor will be classified
#! by Kubernetes to have "burstable" quality of service.
#! See https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-burstable
#! If OIDCClient CRs are being used, and lots of simultaneous users have active sessions, then it is hard
#! pre-determine what the CPU limit should be for that use case. Guessing too low would cause the
#! pod's CPU usage to be throttled, resulting in poor performance. Guessing too high would allow clients
#! to cause the usage of lots of CPU resources. Administrators who have a good sense of anticipated usage
#! patterns may choose to set the requests.cpu and limits.cpu differently from these defaults.
cpu: "1000m" #! by default, throttle each pod's usage at 1 CPU
memory: "128Mi"
volumeMounts:
- name: config-volume
mountPath: /etc/config
readOnly: true
- name: podinfo
mountPath: /etc/podinfo
readOnly: true
#@ if hasUnixNetworkEndpoint():
- name: socket
mountPath: /pinniped_socket
readOnly: false #! writable to allow for socket use
#@ end
ports:
- containerPort: 8443
protocol: TCP
env:
#@ if data.values.https_proxy:
- name: HTTPS_PROXY
value: #@ data.values.https_proxy
#@ end
#@ if data.values.https_proxy and data.values.no_proxy:
- name: NO_PROXY
value: #@ data.values.no_proxy
#@ end
livenessProbe:
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
initialDelaySeconds: 2
timeoutSeconds: 15
periodSeconds: 10
failureThreshold: 5
readinessProbe:
httpGet:
path: /healthz
port: 8443
scheme: HTTPS
initialDelaySeconds: 2
timeoutSeconds: 3
periodSeconds: 10
failureThreshold: 3
volumes:
- name: config-volume
configMap:
name: #@ defaultResourceNameWithSuffix("static-config")
- name: podinfo
downwardAPI:
items:
- path: "labels"
fieldRef:
fieldPath: metadata.labels
- path: "namespace"
fieldRef:
fieldPath: metadata.namespace
- path: "name"
fieldRef:
fieldPath: metadata.name
#@ if hasUnixNetworkEndpoint():
- name: socket
emptyDir: {}
#@ end
tolerations:
- key: kubernetes.io/arch
effect: NoSchedule
operator: Equal
value: amd64 #! Allow running on amd64 nodes.
- key: kubernetes.io/arch
effect: NoSchedule
operator: Equal
value: arm64 #! Also allow running on arm64 nodes.
#! This will help make sure our multiple pods run on different nodes, making
#! our deployment "more" "HA".
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 50
podAffinityTerm:
labelSelector:
Improve the selectors of Deployments and Services Fixes #801. The solution is complicated by the fact that the Selector field of Deployments is immutable. It would have been easy to just make the Selectors of the main Concierge Deployment, the Kube cert agent Deployment, and the various Services use more specific labels, but that would break upgrades. Instead, we make the Pod template labels and the Service selectors more specific, because those not immutable, and then handle the Deployment selectors in a special way. For the main Concierge and Supervisor Deployments, we cannot change their selectors, so they remain "app: app_name", and we make other changes to ensure that only the intended pods are selected. We keep the original "app" label on those pods and remove the "app" label from the pods of the Kube cert agent Deployment. By removing it from the Kube cert agent pods, there is no longer any chance that they will accidentally get selected by the main Concierge Deployment. For the Kube cert agent Deployment, we can change the immutable selector by deleting and recreating the Deployment. The new selector uses only the unique label that has always been applied to the pods of that deployment. Upon recreation, these pods no longer have the "app" label, so they will not be selected by the main Concierge Deployment's selector. The selector of all Services have been updated to use new labels to more specifically target the intended pods. For the Concierge Services, this will prevent them from accidentally including the Kube cert agent pods. For the Supervisor Services, we follow the same convention just to be consistent and to help future-proof the Supervisor app in case it ever has a second Deployment added to it. The selector of the auto-created impersonation proxy Service was also previously using the "app" label. There is no change to this Service because that label will now select the correct pods, since the Kube cert agent pods no longer have that label. It would be possible to update that selector to use the new more specific label, but then we would need to invent a way to pass that label into the controller, so it seemed like more work than was justified.
2021-09-14 20:35:10 +00:00
matchLabels: #@ deploymentPodLabel()
topologyKey: kubernetes.io/hostname
---
apiVersion: v1
kind: Service
metadata:
#! If name is changed, must also change names.apiService in the ConfigMap above and spec.service.name in the APIService below.
name: #@ defaultResourceNameWithSuffix("api")
namespace: #@ namespace()
labels: #@ labels()
#! prevent kapp from altering the selector of our services to match kubectl behavior
annotations:
kapp.k14s.io/disable-default-label-scoping-rules: ""
spec:
type: ClusterIP
selector: #@ deploymentPodLabel()
ports:
- protocol: TCP
port: 443
targetPort: 10250
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
name: #@ pinnipedDevAPIGroupWithPrefix("v1alpha1.clientsecret.supervisor")
labels: #@ labels()
spec:
version: v1alpha1
group: #@ pinnipedDevAPIGroupWithPrefix("clientsecret.supervisor")
groupPriorityMinimum: 9900
versionPriority: 15
#! caBundle: Do not include this key here. Starts out null, will be updated/owned by the golang code.
service:
name: #@ defaultResourceNameWithSuffix("api")
namespace: #@ namespace()
port: 443