At a high level, it switches us to a distroless base container image, but that also includes several related bits:
- Add a writable /tmp but make the rest of our filesystems read-only at runtime.
- Condense our main server binaries into a single pinniped-server binary. This saves a bunch of space in
the image due to duplicated library code. The correct behavior is dispatched based on `os.Args[0]`, and
the `pinniped-server` binary is symlinked to `pinniped-concierge` and `pinniped-supervisor`.
- Strip debug symbols from our binaries. These aren't really useful in a distroless image anyway and all the
normal stuff you'd expect to work, such as stack traces, still does.
- Add a separate `pinniped-concierge-kube-cert-agent` binary with "sleep" and "print" functionality instead of
using builtin /bin/sleep and /bin/cat for the kube-cert-agent. This is split from the main server binary
because the loading/init time of the main server binary was too large for the tiny resource footprint we
established in our kube-cert-agent PodSpec. Using a separate binary eliminates this issue and the extra
binary adds only around 1.5MiB of image size.
- Switch the kube-cert-agent code to use a JSON `{"tls.crt": "<b64 cert>", "tls.key": "<b64 key>"}` format.
This is more robust to unexpected input formatting than the old code, which simply concatenated the files
with some extra newlines and split on whitespace.
- Update integration tests that made now-invalid assumptions about the `pinniped-server` image.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This may be a temporary fix. It switches the manual auth code prompt to use `promptForValue()` instead of `promptForSecret()`. The `promptForSecret()` function no longer supports cancellation (the v0.9.2 behavior) and the method of cancelling in `promptForValue()` is now based on running the blocking read in a background goroutine, which is allowed to block forever or leak (which is not important for our CLI use case).
This means that the authorization code is now visible in the user's terminal, but this is really not a big deal because of PKCE and the limited lifetime of an auth code.
The main goroutine now correctly waits for the "manual prompt" goroutine to clean up, which now includes printing the extra newline that would normally have been entered by the user in the manual flow.
The text of the manual login prompt is updated to be more concise and less scary (don't use the word "fail").
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test is asynchronously waiting for the controller to do something, and in some of our test environments it will take a bit longer than we'd previously allowed.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
this will hopefully fix some flakes where aws provisioned a host for the
load balancer but the tests weren't able to resolve it.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
It seems like page.ClearCookies() only clears cookies for the current
domain, so there doesn't seem to be a function to clear all browser
cookies. Instead, we'll just start a whole new browser each test.
They start fast enough that it shouldn't be a problem.
Our actual CLI code behaved correctly, but this test made some invalid assumptions about the "upstream" IDP we're testing. It assumed that the upstream didn't support `response_mode=form_post`, but Okta does. This means that when we end up on the localhost callback page, there are no URL query parameters.
Adjusting this regex makes the test pass as expected.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Using the same fake TTY trick we used to test LDAP login, this new subtest runs through the "manual"/"jump box" login flow. It runs the login with a `--skip-listen` flag set, causing the CLI to skip opening the localhost listener. We can then wait for the login URL to be printed, visit it with the browser and log in, and finally simulate "manually" copying the auth code from the browser and entering it into the waiting CLI prompt.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
For some reason our headless Chrome test setup behaves slightly differently on Linux and macOS hosts. On Linux, the emoji characters are not recognized as valid text, so they are URL encoded. This change updates the test to cope with both cases correctly.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This adds a new login flow that allows manually pasting the authorization code instead of receiving a browser-based callback.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is a new pacakge internal/oidc/provider/formposthtml containing a number of static files embedded using the relatively recent Go "//go:embed" functionality introduced in Go 1.16 (https://blog.golang.org/go1.16).
The Javascript and CSS files are minifiied and injected to make a single self-contained HTML response. There is a special Content-Security-Policy helper to calculate hash-based script-src and style-src rules.
This new code is covered by a new integration test that exercises the JS/HTML functionality in a real browser outside of the rest of the Supervisor.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test would occasionally flake for me when running locally. This change moves more of the assertions into the "eventually" loop, so they can temporarily fail as long as they converge on the expected values.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test did not tolerate this connection failing, which can happen for any number of flaky networking-related reasons. This change moves the connection setup into an "eventually" retry loop so it's allowed to fail temporarily as long as it eventually connects.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
TestSimultaneousLDAPRequestsOnSingleProvider proved to be unreliable
on AKS due to some kind of kubectl port-forward issue, so only
run the LDAP client's integration tests on Kind. They are testing
the integration between the client code and the OpenLDAP test server,
not testing anything about Kubernetes, so running only on Kind should
give us sufficient test coverage.
This fixes some rare test flakes caused by a data race inherent in the way we use `assert.Eventually()` with extra variables for followup assertions. This function is tricky to use correctly because it runs the passed function in a separate goroutine, and you have no guarantee that any shared variables are in a coherent state when the `assert.Eventually()` call returns. Even if you add manual mutexes, it's tricky to get the semantics right. This has been a recurring pain point and the cause of several test flakes.
This change introduces a new `library.RequireEventually()` that works by internally constructing a per-loop `*require.Assertions` and running everything on a single goroutine (using `wait.PollImmediate()`). This makes it very easy to write eventual assertions.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This change updates the impersonator to always authorize every
request instead of relying on the Kuberentes API server to perform
the check on the impersonated request. This protects us from
scenarios where we fail to correctly impersonate the user due to
some bug in our proxy logic. We still rely completely on the API
server to perform admission checks on the impersonated requests.
Signed-off-by: Monis Khan <mok@vmware.com>
This change updates the impersonation proxy code to run as a
distinct service account that only has permission to impersonate
identities. Thus any future vulnerability that causes the
impersonation headers to be dropped will fail closed instead of
escalating to the concierge's default service account which has
significantly more permissions.
Signed-off-by: Monis Khan <mok@vmware.com>
When anonymous authentication is disabled, the impersonation proxy
will no longer authenticate anonymous requests other than calls to
the token credential request API (this API is used to retrieve
credentials and thus must be accessed anonymously).
Signed-off-by: Benjamin A. Petersen <ben@benjaminapetersen.me>
Signed-off-by: Monis Khan <mok@vmware.com>
There was nothing to guarantee that _all_ Supervisor pods would be ready to handle this request. We saw a rare test flake where the LDAPIdentityProvider was marked as ready but one of the Supervisor pods didn't have it loaded yet and returned an HTTP 422 error (`Unprocessable Entity: No upstream providers are configured`).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The `require.Eventually()` function runs the body of the check in a separate goroutine, so it's not safe to use other `require` assertions as we did here. Our `library.RequireEventuallyWithoutError()` function does not spawn a goroutine, so it's safer to use here.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The LastUpdateTime is no longer updated on every resync. It only changes if the underlying status has changed, so that it effectively shows when the transition happened.
This change happened in ab750f48aa, but we missed this test. It only fails when it has been more than ten minutes since the CredentialIssuer transitioned into a healthy state, but that can happen in our long-running CI environments.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test felt overly complex and some of the cleanup logic wasn't 100% correct (it didn't clean up in all cases).
The new code is essentially the same flow but hopefully easier to read.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We see that occasionally kubectl returns 11 lines (probably related to https://github.com/kubernetes/kubernetes/issues/72628).
This test doesn't need to be so picky, so now it allows +/- one line from the expected count.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We had this one test that mutated the CredentialIssuer, which could cause the impersonation proxy to blip on one or both of the running concierge pods. This would sometimes break other concurrently running tests.
Instead, this bit of code is split into a separate non-concurrent test.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The new version has different behavior for the `nonce` claim, which is now omitted if it would be empty (see https://github.com/ory/fosite/pull/570).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is to allow the use of binary LDAP entry attributes as the UID.
For example, a user might like to configure AD’s objectGUID or maybe
objectSid attributes as the UID attribute.
This negatively impacts the readability of the UID when it did not come
from a binary value, but we're considering this an okay trade-off to
keep things simple for now. In the future, we may offer more
customizable encoding options for binary attributes.
These UIDs are currently only used in the downstream OIDC `sub` claim.
They do not effect the user's identity on the Kubernetes cluster,
which is only based on their mapped username and group memberships from
the upstream identity provider. We are not currently supporting any
special encoding for those username and group name LDAP attributes, so
their values in the LDAP entry must be ASCII or UTF-8 in order for them
to be interpreted correctly.
This test setup should tolerate when the TokenCredentialRequest API isn't quite ready to authenticate the user or issue a cert.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This check is no longer valid, because there can be ephemeral, recoverable errors that show as ErrorDuringSetup.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- This enhances our LDAP client code to make it possible to optionally
dial an LDAP server without TLS and then use StartTLS to upgrade
the connection to TLS.
- The controller for LDAPIdentityProviders is not using this option
yet. That will come in a future commit.
Previously, our controllers would automatically create a CredentialIssuer with a singleton name. The helpers we had for this also used "raw" client access and did not take advantage of the informer cache pattern.
With this change, the CredentialIssuer is always created at install time in the ytt YAML. The controllers now only update the existing CredentialIssuer status, and they do so using the informer cache as much as possible.
This change is targeted at only the kubecertagent controller to start. The impersonatorconfig controller will be updated in a following PR along with other changes.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- For testing purposes, we would like to ensure that when we connect
to the LDAP server we cannot accidentally avoid using TLS or StartTLS.
- Also enabled the openldap `memberOf` overlay in case we want to
support group search using `memberOf` in the future.
- This required changes to the docker.io/bitnami/openldap container
image, so we're using our own fork for now. Will submit a PR to
bitnami/openldap to see if they will accept it (or something similar)
upstream.