Also:
- Make our code generator script work with Go 1.17
- Make our update.sh script work on linux
- Update the patch versions of the old Kube versions that we were using
to generate code (see kube-versions.txt)
- Use our container images from ghcr instead of
projects.registry.vmware.com for codegen purposes
- Make it easier to debug in the future by passing "-v" to the Kube
codegen scripts
- Updated copyright years to make commit checks pass
The purpose of this change is to allow Helm to be used to deploy Pinniped
into the local KinD cluster for the local integration tests. That said,
the change allows any alternate deployment mechanism, I just happen
to be using it with Helm.
All default behavior is preserved. This won't change how anyone uses the
script today, it just allows me not to copy/paste the whole setup for the
integration tests.
Changes:
1) An option called `--alternate-deploy <path-to-deploy-script>` has been
added, that when enabled calls the specified script instead of using ytt
and kapp. The alternate deploy script is called with the app to deploy
and the tag of the docker image to use. We set the default value of
the alternate_deploy variable to undefined, and there is a check that
tests if the alternate deploy is defined. For the superivsor it looks
like this:
```
if [ "$alternate_deploy" != "undefined" ]; then
log_note "The Pinniped Supervisor will be deployed with $alternate_deploy pinniped-supervisor $tag..."
$alternate_deploy pinniped-supervisor $tag
else
normal ytt/kapp deploy
fi
```
2) Additional log_note entries have been added to enumerate all values passed
into the ytt/kapp deploy. Used while I was trying to reach parity in the integration
tests, but I think they are useful for debugging.
3) The manifests produced by ytt and written to /tmp are now named individually.
This is so an easy comparison can be made between manifests produced by a ytt/kapp
run of integration tests and manifests produced by helm run of the integration tests.
If something is not working I have been comparing the manifests after these runs to
find differences.
This change allows configuration of the http and https listeners
used by the supervisor.
TCP (IPv4 and IPv6 with any interface and port) and Unix domain
socket based listeners are supported. Listeners may also be
disabled.
Binding the http listener to TCP addresses other than 127.0.0.1 or
::1 is deprecated.
The deployment now uses https health checks. The supervisor is
always able to complete a TLS connection with the use of a bootstrap
certificate that is signed by an in-memory certificate authority.
To support sidecar containers used by service meshes, Unix domain
socket based listeners include ACLs that allow writes to the socket
file from any runAsUser specified in the pod's containers.
Signed-off-by: Monis Khan <mok@vmware.com>
This change updates the TLS config used by all pinniped components.
There are no configuration knobs associated with this change. Thus
this change tightens our static defaults.
There are four TLS config levels:
1. Secure (TLS 1.3 only)
2. Default (TLS 1.2+ best ciphers that are well supported)
3. Default LDAP (TLS 1.2+ with less good ciphers)
4. Legacy (currently unused, TLS 1.2+ with all non-broken ciphers)
Highlights per component:
1. pinniped CLI
- uses "secure" config against KAS
- uses "default" for all other connections
2. concierge
- uses "secure" config as an aggregated API server
- uses "default" config as a impersonation proxy API server
- uses "secure" config against KAS
- uses "default" config for JWT authenticater (mostly, see code)
- no changes to webhook authenticater (see code)
3. supervisor
- uses "default" config as a server
- uses "secure" config against KAS
- uses "default" config against OIDC IDPs
- uses "default LDAP" config against LDAP IDPs
Signed-off-by: Monis Khan <mok@vmware.com>
This should make it easier for us to to notice if something is wrong
with our service (especially in any future kubectl tests we add).
Signed-off-by: Monis Khan <mok@vmware.com>
Those images that are pulled from Dockerhub will cause pull failures
on some test clusters due to Dockerhub rate limiting.
Because we already have some images that we use for testing, and
because those images are already pre-loaded onto our CI clusters
to make the tests faster, use one of those images and always specify
PullIfNotPresent to avoid pulling the image again during the integration
test.
This required a weird hack because some of the Fosite tests (or a transitive dependency of them) depends on a newer version of gRPC that's incompatible with the Kubernetes runtime version we use. It wasn't as simple as just replacing the gRPC module with an older version, because in the latest versions of gRPC, they split out the "examples" packages into their own module. This new module name doesn't exist at the old version.
Ultimately, the workaround was to make a fake "examples" module locally. This module can be empty because we never actually depend on that code (it's only used in transitive dependency tests).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- The linux base64 command is different, so avoid using it at all.
On linux the default is to split the output into multiple lines,
which messes up the integration-test-env file. The flag used to
disable this behavior on linux ("-w0") does not exist on MacOS's
base64.
- On debian linux, the latest version of Docker from apt-get still
requires DOCKER_BUILDKIT=1 or else it barfs.
$PINNIPED_TEST_SUPERVISOR_UPSTREAM_OIDC_ISSUER_CA_BUNDLE was recently
changed to be a base64 encoded value, so this script does not need to
base64 encode the value itself anymore.
Avoid them because they can't be used in GoLand for running integration
tests in the UI, like running in the debugger.
Also adds optional PINNIPED_TEST_TOOLS_NAMESPACE because we need it
on the LDAP feature branch where we are developing the upcoming LDAP
support for the Supervisor.
- Rename the test/deploy/dex directory to test/deploy/tools
- Rename the dex namespace to tools
- Add a new ytt value called `pinny_ldap_password` for the tools
ytt templates
- This new value is not used on main at this time. We intend to use
it in the forthcoming ldap branch. We're defining it on main so
that the CI scripts can use it across all branches and PRs.
Signed-off-by: Ryan Richard <richardry@vmware.com>
A demo of running the Supervisor and Concierge on
a kind cluster. Can be used to quickly set up an
environment for manual testing.
Also added some missing copyright headers to other
hack scripts.
It takes a lot of manual steps to get ready to manually test the
impersonation proxy on a kind cluster, which makes it error prone,
so encapsulate them into a script to make it easier.
It also works on the slightly older MacOS Catalina.
This script is only used on development laptops, so hopefully
this will work for more laptop OS's now.
Signed-off-by: Ryan Richard <richardry@vmware.com>
Because otherwise `go test` will panic/crash your test if it takes
longer than 10 minutes, which is an annoying way for an integration
test to fail since it skips all of the t.Cleanup's.
This change adds a new virtual aggregated API that can be used by
any user to echo back who they are currently authenticated as. This
has general utility to end users and can be used in tests to
validate if authentication was successful.
Signed-off-by: Monis Khan <mok@vmware.com>
Makes it easy to deploy Pinniped under a different API group for manual
testing and iterating on integration tests on your laptop.
Signed-off-by: Ryan Richard <richardry@vmware.com>
We need this in CI when we want to configure Dex with the redirect URI for both
primary and secondary deploys at one time (since we only stand up Dex once).
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Previously, when triggering a Tilt reload via a *.go file change, a reload would
take ~13 seconds and we would see this error message in the Tilt logs for each
component.
Live Update failed with unexpected error:
command terminated with exit code 2
Falling back to a full image build + deploy
Now, Tilt should reload images a lot faster (~3 seconds) since we are running
the images as root.
Note! Reloading the Concierge component still takes ~13 seconds because there
are 2 containers running in the Concierge namespace that use the Concierge
image: the main Concierge app and the kube cert agent pod. Tilt can't live
reload both of these at once, so the reload takes longer and we see this error
message.
Will not perform Live Update because:
Error retrieving container info: can only get container info for a single pod; image target image:image/concierge has 2 pods
Falling back to a full image build + deploy
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
kubectl 1.20 prints "Kubernetes control plane" instead of "Kubernetes master".
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Prior to this we re-used the CLI testing client to test the authorize flow of the supervisor, but they really need to be separate upstream clients. For example, the supervisor client should be a non-public client with a client secret and a different callback endpoint.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Before, we did this in an init container, which meant if the Dex pod restarted we would have fresh certs, but our Tilt/bash setup didn't account for this.
Now, the certs are generated by a Job which runs once and saves the generated files into a Secret. This should be a bit more stable.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This change deploys a small Squid-based proxy into the `dex` namespace in our integration test environment. This lets us use the cluster-local DNS name (`http://dex.dex.svc.cluster.local/dex`) as the OIDC issuer. It will make generating certificates easier, and most importantly it will mean that our CLI can see Dex at the same name/URL as the supervisor running inside the cluster.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This check predates the API renaming we did. Now that our API groups have `concierge`/`supervisor` in the name, we don't need to maintain a specific set of `cp` commands and keep them in sync, so we don't really need this check.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
These only really make sense for aggregated API types where we need `conversion-gen` to do version conversion.
Signed-off-by: Matt Moyer <moyerm@vmware.com>