Even though a client may hold the leader election lock in the Kube
lease API, that does not mean it has had a chance to update its
internal state to reflect that. Thus we retry the checks in
checkOnlyLeaderCanWrite a few times to allow the client to catch up.
Signed-off-by: Monis Khan <mok@vmware.com>
We no longer have a transitive dependency on this older repository, so we don't need the replace directive anymore.
There is a new fork of this that we should move to (https://github.com/golang-jwt/jwt), but we can't easily do that until a couple of our direct dependencies upgrade.
This is a revert of d162cb9adf.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The Kube API server code that we use will cast inputs in an attempt
to see if they implement optional interfaces. This change adds a
simple wrapper struct to prevent such casts from causing us any
issues.
Signed-off-by: Monis Khan <mok@vmware.com>
This change updates the default NO_PROXY for the supervisor to not
proxy requests to the Kubernetes API and other Kubernetes endpoints
such as Kubernetes services.
It also adds https_proxy and no_proxy settings for the concierge
with the same default.
Signed-off-by: Monis Khan <mok@vmware.com>
In the upstream dynamiccertificates package, we rely on two pieces
of code:
1. DynamicServingCertificateController.newTLSContent which calls
- clientCA.CurrentCABundleContent
- servingCert.CurrentCertKeyContent
2. unionCAContent.VerifyOptions which calls
- unionCAContent.CurrentCABundleContent
This results in calls to our tlsServingCertDynamicCertProvider and
impersonationSigningCertProvider. If we Unset these providers, we
subtly break these consumers. At best this results in test slowness
and flakes while we wait for reconcile loops to converge. At worst,
it results in actual errors during runtime. For example, we
previously would Unset the impersonationSigningCertProvider on any
sync loop error (even a transient one caused by a network blip or
a conflict between writes from different replicas of the concierge).
This would cause us to transiently fail to issue new certificates
from the token credential require API. It would also cause us to
transiently fail to authenticate previously issued client certs
(which results in occasional Unauthorized errors in CI).
Signed-off-by: Monis Khan <mok@vmware.com>
This change updates the pinniped CLI entrypoint to prevent browser
processes that we spawn from polluting our std out stream.
For example, chrome will print the following message to std out:
Opening in existing browser session.
Which leads to the following incomprehensible error message from
kubectl:
Unable to connect to the server: getting credentials:
decoding stdout: couldn't get version/kind; json parse error:
json: cannot unmarshal string into Go value of type struct
{ APIVersion string "json:\"apiVersion,omitempty\"";
Kind string "json:\"kind,omitempty\"" }
This would only occur on the initial login when we opened the
browser. Since credentials would be cached afterwards, kubectl
would work as expected for future invocations as no browser was
opened.
I could not think of a good way to actually test this change. There
is a clear gap in our integration tests - we never actually launch a
browser in the exact same way a user does - we instead open a chrome
driver at the login URL as a subprocess of the integration test
binary and not the pinniped CLI. Thus even if the chrome driver was
writing to std out, we would not notice any issues.
It is also unclear if there is a good way to prevent future related
bugs since std out is global to the process.
Signed-off-by: Monis Khan <mok@vmware.com>
At a high level, it switches us to a distroless base container image, but that also includes several related bits:
- Add a writable /tmp but make the rest of our filesystems read-only at runtime.
- Condense our main server binaries into a single pinniped-server binary. This saves a bunch of space in
the image due to duplicated library code. The correct behavior is dispatched based on `os.Args[0]`, and
the `pinniped-server` binary is symlinked to `pinniped-concierge` and `pinniped-supervisor`.
- Strip debug symbols from our binaries. These aren't really useful in a distroless image anyway and all the
normal stuff you'd expect to work, such as stack traces, still does.
- Add a separate `pinniped-concierge-kube-cert-agent` binary with "sleep" and "print" functionality instead of
using builtin /bin/sleep and /bin/cat for the kube-cert-agent. This is split from the main server binary
because the loading/init time of the main server binary was too large for the tiny resource footprint we
established in our kube-cert-agent PodSpec. Using a separate binary eliminates this issue and the extra
binary adds only around 1.5MiB of image size.
- Switch the kube-cert-agent code to use a JSON `{"tls.crt": "<b64 cert>", "tls.key": "<b64 key>"}` format.
This is more robust to unexpected input formatting than the old code, which simply concatenated the files
with some extra newlines and split on whitespace.
- Update integration tests that made now-invalid assumptions about the `pinniped-server` image.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We accidentally missed this in the v0.10.0 release process. The new YAML field here should make it easier to automate this step, which seems like a really good idea.
This may be a temporary fix. It switches the manual auth code prompt to use `promptForValue()` instead of `promptForSecret()`. The `promptForSecret()` function no longer supports cancellation (the v0.9.2 behavior) and the method of cancelling in `promptForValue()` is now based on running the blocking read in a background goroutine, which is allowed to block forever or leak (which is not important for our CLI use case).
This means that the authorization code is now visible in the user's terminal, but this is really not a big deal because of PKCE and the limited lifetime of an auth code.
The main goroutine now correctly waits for the "manual prompt" goroutine to clean up, which now includes printing the extra newline that would normally have been entered by the user in the manual flow.
The text of the manual login prompt is updated to be more concise and less scary (don't use the word "fail").
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Remove all the "latest" links and replace them with our new shortcode so they point at the latest release in a more explicit way.
This also eliminates one of the sections in our Concierge and Supervisor install guides, since you're always installing a specific version.
- Provide instructions for installing with both kapp (one step) and kubectl (two steps for the Concierge).
- Minor wording changes. Mainly we are now a bit less verbose about reminding people they can choose a different version (once per page instead of in each step).
- When we give an example `kapp deploy` command, don't suggest `--yes` and `--diff-changes`.
Users can still use these but it seems overly verbose for an example command.
Signed-off-by: Matt Moyer <moyerm@vmware.com>