888 Commits

Author SHA1 Message Date
Andrew Keesler
7502190135
Fix some copy issues in the docs 2020-08-27 08:39:57 -04:00
Andrew Keesler
aea3f0f90d
Merge pull request #74 from ankeesler/public-readme
First draft of public README (and neighboring docs)
2020-08-26 18:22:39 -04:00
Andrew Keesler
f66f7f14f5
First draft of public README (and neighboring docs)
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-26 18:19:35 -04:00
Ryan Richard
d8bcea88a7
Merge pull request #70 from suzerain-io/self_test
Self test feature
2020-08-26 14:26:59 -07:00
Ryan Richard
2629a9c42f Empty commit to trigger PR CI pipeline 2020-08-26 09:17:08 -07:00
Ryan Richard
90fe733f94 Empty commit to trigger PR CI pipeline 2020-08-26 08:49:44 -07:00
Ryan Richard
5ed97f7f9e Merge branch 'main' into self_test 2020-08-25 19:02:27 -07:00
Ryan Richard
80153f9a80 Allow app to start despite failing to borrow the cluster signing key
- Controller and aggregated API server are allowed to run
- Keep retrying to borrow the cluster signing key in case the failure
  to get it was caused by a transient failure
- The CredentialRequest endpoint will always return an authentication
  failure as long as the cluster signing key cannot be borrowed
- Update which integration tests are skipped to reflect what should
  and should not work based on the cluster's capability under this
  new behavior
- Move CreateOrUpdateCredentialIssuerConfig() and related methods
  to their own file
- Update the CredentialIssuerConfig's Status every time we try to
  refresh the cluster signing key
2020-08-25 18:22:53 -07:00
Andrew Keesler
4306599396
Fix linter errors 2020-08-25 10:40:59 -04:00
Ryan Richard
6e59596285 Upon pod startup, update the Status of CredentialIssuerConfig
- Indicate the success or failure of the cluster signing key strategy
- Also introduce the concept of "capabilities" of an integration test
  cluster to allow the integration tests to be run against clusters
  that do or don't allow the borrowing of the cluster signing key
- Tests that are not expected to pass on clusters that lack the
  borrowing of the signing key capability are now ignored by
  calling the new library.SkipUnlessClusterHasCapability test helper
- Rename library.Getenv to library.GetEnv
- Add copyrights where they were missing
2020-08-24 18:07:34 -07:00
Matt Moyer
c2e6a1408d
Remove old generated directories from dependabot. (#72)
These never worked quite right, so let's disable them for now: #51

We can probably come up with some better solution now with the new codegen scripts, but I'll leave that for later.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 16:20:34 -05:00
Matt Moyer
4e08866e87
Merge pull request #71 from mattmoyer/multi-version-codegen
Generate API/client code for several Kubernetes versions.
2020-08-24 16:12:31 -05:00
Matt Moyer
cbd6dd3356 Use a symlink instead of directly mounting into GOPATH.
This supports CI better, where the original input dir isn't in GOPATH.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 15:58:52 -05:00
Matt Moyer
eb05e7a138 Reverse the order of this diff so it makes more sense.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 15:46:51 -05:00
Matt Moyer
22f1ca24d9 Remove old generated code from ./kubernetes directory.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 15:03:55 -05:00
Matt Moyer
8b36f2e8ae Convert code to use the new generated packages.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 14:42:27 -05:00
Matt Moyer
34d13f71c2 Add newly generated code.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 14:32:07 -05:00
Matt Moyer
1aef2f07d3 Add new ./apis directory and codegen scripts.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 14:32:07 -05:00
Andrew Keesler
142e9a1583
internal/certauthority: backdate certs even further
We are seeing between 1 and 2 minutes of difference between the current time
reported in the API server pod and the pinniped pods on one of our testing
environments. Hopefully this change makes our tests pass again.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-24 15:01:07 -04:00
Andrew Keesler
ed8b1be178
Revert "test/library: try another cert rest config"
Didn't fix CI. I didn't think it would.

I have never seen the integration tests fail like this locally, so I
have to imagine the failure has something to do with the environment
on which we are testing.

This reverts commit ba2e2f509a.
2020-08-24 11:52:47 -04:00
Ryan Richard
399e1d2eb8 Merge branch 'main' into self_test 2020-08-24 08:33:18 -07:00
Andrew Keesler
ba2e2f509a
test/library: try another cert rest config
We are getting these weird flakes in CI where the kube client that we
create with these helper functions doesn't work against the kube API.
The kube API tells us that we are unauthorized (401). Seems like something
is wrong with the keypair itself, but when I create a one-off kubeconfig
with the keypair, I get 200s from the API. Hmmm...I wonder what CI will
think of this change?

I also tried to align some naming in this package.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-24 11:01:37 -04:00
Ryan Richard
6d43d7ba19 Update the schema of CredentialIssuerConfig
- Move the current info from spec to status
- Add schema for new stuff that we will use in a future commit to status
- Regenerate the generated code
2020-08-21 17:00:42 -07:00
Ryan Richard
ace01c86de Rename PinnipedDiscoveryInfo to CredentialIssuerConfig
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-21 16:16:34 -07:00
Ryan Richard
d4b184a7d5 Allow aliases for the first argument of module.sh
- Makes it easier to guess/remember what the legal arguments are
- Also update the output a little to make it easier to tell
  when the command has succeeded
- And run tests using `-count 1` because cached test results are not
  very trustworthy
2020-08-21 16:15:48 -07:00
Andrew Keesler
76bd274fc4 Update the generated code
Mostly just fixes the imports

Signed-off-by: Ryan Richard <richardry@vmware.com>
2020-08-21 12:50:53 -07:00
Ryan Richard
0a805861ea Fix bug in code generator which prevented it from generating code
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-21 12:30:50 -07:00
Andrew Keesler
2b297c28d5
Get rid of TODO that was completed in ecde8fa8
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-21 10:38:28 -04:00
Ryan Richard
d0a9d8df33
pkg/config: force api.servingCertificate.renewBeforeSeconds to be positive
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 18:21:48 -04:00
Ryan Richard
88f3b41e71
deploy: add API cert config map values
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 17:14:16 -04:00
Andrew Keesler
89b6b9ee44
Merge pull request #68 from ankeesler/auto-rotate-ca
Use duration and renewBefore to control API cert rotation
2020-08-20 16:52:40 -04:00
Andrew Keesler
39c299a32d
Use duration and renewBefore to control API cert rotation
These configuration knobs are much more human-understandable than the
previous percentage-based threshold flag.

We now allow users to set the lifetime of the serving cert via a ConfigMap.
Previously this was hardcoded to 1 year.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 16:35:04 -04:00
Ryan Richard
3929fa672e Rename project 2020-08-20 10:54:15 -07:00
Andrew Keesler
43888e9e0a
Make CA age threshold delta more observable via more precision
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 11:42:29 -04:00
Andrew Keesler
a26d86044e
internal/mocks: fix go generate call
We need a way to validate that this generated code is up to date. I added
a long-term engineering TODO for this.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 10:48:50 -04:00
Andrew Keesler
5946c2920a
Merge pull request #66 from ankeesler/auto-rotate-ca
Auto-rotate TLS certificates of the aggregated API endpoints before they expire
2020-08-20 10:22:30 -04:00
Andrew Keesler
6b90dc8bb7
Auto-rotate serving certificate
The rotation is forced by a new controller that deletes the serving cert
secret, as other controllers will see this deletion and ensure that a new
serving cert is created.

Note that the integration tests now have an additional worst-case runtime of
60 seconds. This is because of the way that the aggregated API server code
reloads certificates. We will fix this in a future story. Then, the
integration tests should hopefully get much faster.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 10:03:36 -04:00
Matt Moyer
1b9a70d089
Switch back to an exec-based approach to grab the controller-manager CA. (#65)
This switches us back to an approach where we use the Pod "exec" API to grab the keys we need, rather than forcing our code to run on the control plane node. It will help us fail gracefully (or dynamically switch to alternate implementations) when the cluster is not self-hosted.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
2020-08-19 13:21:07 -05:00
Andrew Keesler
40d1360b74
hack/lib/codegen.sh: get rid of TODO about K8S_PKG_VERSION
See c43946c in the CI repo.
2020-08-18 13:18:41 -04:00
Ryan Richard
57578f16d4
Merge pull request #64 from suzerain-io/probes
Implement basic liveness and readiness probes
2020-08-18 09:19:24 -07:00
Ryan Richard
003aef75d2 For liveness and readiness, succeed quickly and fail slowly
- No reason to wait a long time before the first check, since our
  app should start quickly
2020-08-18 09:18:51 -07:00
Andrew Keesler
e3397c1c35
Hide codegen.sh in hack/lib
We don't want people to run codegen.sh directly, because it is meant
to be driven by hack/module.sh. To discourage this behavior, we will hide
codegen.sh away in hack/lib. I don't think this is actually what the
hack/lib directory is for, though...meh.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-18 11:06:59 -04:00
Andrew Keesler
c4ce97f1a5
Remove old hack/{update,verify}-codegen.sh scripts
We now use hack/module.sh codegen{,_verify}. See f95f585.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-18 10:56:47 -04:00
Andrew Keesler
f95f5857ef
Merge pull request #57 from suzerain-io/module-aware-codegen
`./hack/module.sh` learns `codegen` command
2020-08-18 10:11:05 -04:00
Andrew Keesler
cedd47b92e
hack/codegen.sh: fix stashing, symlinking, failure, and usage
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-18 09:50:07 -04:00
aram price
7fa8f7797a
hack/module.sh learns codegen_verify 2020-08-18 09:50:07 -04:00
aram price
a456daa0b2
./hack/module.sh learns codegen command
Runs code generation on a per-module basis. If `CONTAINED` is not set
the code generation is run in a container.

Mount point in docker is randomized to simulate Concourse.

Introduce K8S_PKG_VERSION to make room to build different versions
eventually.
2020-08-18 09:50:07 -04:00
Ryan Richard
ecde8fa8af Implement basic liveness and readiness probes
- Call the auto-generated /healthz endpoint of our aggregated API server
- Use http for liveness even though tcp seems like it might be
  more appropriate, because tcp probes cause TLS handshake errors
  to appear in our logs every few seconds
- Use conservative timeouts and retries on the liveness probe to avoid
  having our container get restarted when it is temporarily slow due
  to running in an environment under resource pressure
- Use less conservative timeouts and retries for the readiness probe
  so that an unhealthy pod is removed from the service sooner than
  its container would be restarted
- Tuning the settings for retries and timeouts seems to be a mysterious
  art, so these are just a first draft
2020-08-17 16:44:42 -07:00
Ryan Richard
29654c39a5 Update a CRD validation
- Allow both http and https because a user using `kubectl proxy` would
  want to use http, since the proxy upgrades requests from http to https
2020-08-17 16:29:21 -07:00
Ryan Richard
d8d49be5d9 Make an integration test more reliable
- It would sometimes fail with this error:
  namespaces is forbidden: User "tanzu-user-authentication@groups.vmware.com"
  cannot list resource "namespaces" in API group "" at the cluster scope
- Seems like it was because the RBAC rule added by the test needs a
  moment before it starts to take effect, so change the test to retry
  the API until it succeeds or fails after 3 seconds of trying.
2020-08-17 16:28:12 -07:00