Commit Graph

608 Commits

Author SHA1 Message Date
Ryan Richard
2b93fdf357 Fix a bug in the e2e tests
When the test was going to fail, a goroutine would accidentally block
on writing to an unbuffered channel, and the spawnTestGoroutine helper
would wait for that goroutine to end on cleanup, causing the test to
hang forever while it was trying to fail.
2022-02-07 11:57:54 -08:00
Margo Crawford
842ef38868 Ensure warning is on stderr and not stdout. 2022-01-20 13:48:50 -08:00
Margo Crawford
acd23c4c37 Separate test for access token refresh 2022-01-20 13:48:50 -08:00
Margo Crawford
38d184fe81 Integration test + making sure we get the session correctly in token handler 2022-01-20 13:48:50 -08:00
Margo Crawford
513c943e87 Keep all scopes except offline_access in integration test 2022-01-19 13:29:26 -08:00
Monis Khan
1e1789f6d1
Allow configuration of supervisor endpoints
This change allows configuration of the http and https listeners
used by the supervisor.

TCP (IPv4 and IPv6 with any interface and port) and Unix domain
socket based listeners are supported.  Listeners may also be
disabled.

Binding the http listener to TCP addresses other than 127.0.0.1 or
::1 is deprecated.

The deployment now uses https health checks.  The supervisor is
always able to complete a TLS connection with the use of a bootstrap
certificate that is signed by an in-memory certificate authority.

To support sidecar containers used by service meshes, Unix domain
socket based listeners include ACLs that allow writes to the socket
file from any runAsUser specified in the pod's containers.

Signed-off-by: Monis Khan <mok@vmware.com>
2022-01-18 17:43:45 -05:00
Ryan Richard
814399324f Merge branch 'main' into upstream_access_revocation_during_gc 2022-01-14 10:49:22 -08:00
Ryan Richard
91924ec685 Revert adding allowAccessTokenBasedRefresh flag to OIDCIdentityProvider
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-01-12 18:03:25 -08:00
Margo Crawford
683a2c5b23 WIP adding access token to storage upon login 2022-01-12 18:03:25 -08:00
Margo Crawford
2958461970 Addressing PR feedback
store issuer and subject in storage for refresh
Clean up some constants

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-01-10 11:03:37 -08:00
Margo Crawford
0cd086cf9c Check username claim is unchanged for oidc.
Also add integration tests for claims changing.
2022-01-10 11:03:37 -08:00
Monis Khan
c155c6e629
Clean up nits in AD code
- Make everything private
- Drop unused AuthTime field
- Use %q format string instead of "%s"
- Only rely on GetRawAttributeValues in AttributeUnchangedSinceLogin

Signed-off-by: Monis Khan <mok@vmware.com>
2021-12-17 08:53:44 -05:00
Margo Crawford
59d999956c Move ad specific stuff to controller
also make extra refresh attributes a separate field rather than part of
Extra

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:36 -08:00
Margo Crawford
65f3464995 Fix issue with very high integer value parsing, add unit tests
also add comment about urgent replication
2021-12-09 16:16:36 -08:00
Margo Crawford
ee4f725209 Incorporate PR feedback 2021-12-09 16:16:36 -08:00
Margo Crawford
ef5a04c7ce Check for locked users on ad upstream refresh
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:36 -08:00
Margo Crawford
f62e9a2d33 Active directory checks for deactivated user
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:36 -08:00
Margo Crawford
da9b4620b3 Active Directory checks whether password has changed recently during
upstream refresh

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:35 -08:00
Monis Khan
764a1ad7e4
tls: fix integration tests for long lived environments
This change updates the new TLS integration tests to:

1. Only create the supervisor default TLS serving cert if needed
2. Port forward the node port supervisor service since that is
   available in all environments

Signed-off-by: Monis Khan <mok@vmware.com>
2021-11-18 03:55:56 -05:00
Monis Khan
cd686ffdf3
Force the use of secure TLS config
This change updates the TLS config used by all pinniped components.
There are no configuration knobs associated with this change.  Thus
this change tightens our static defaults.

There are four TLS config levels:

1. Secure (TLS 1.3 only)
2. Default (TLS 1.2+ best ciphers that are well supported)
3. Default LDAP (TLS 1.2+ with less good ciphers)
4. Legacy (currently unused, TLS 1.2+ with all non-broken ciphers)

Highlights per component:

1. pinniped CLI
   - uses "secure" config against KAS
   - uses "default" for all other connections
2. concierge
   - uses "secure" config as an aggregated API server
   - uses "default" config as a impersonation proxy API server
   - uses "secure" config against KAS
   - uses "default" config for JWT authenticater (mostly, see code)
   - no changes to webhook authenticater (see code)
3. supervisor
   - uses "default" config as a server
   - uses "secure" config against KAS
   - uses "default" config against OIDC IDPs
   - uses "default LDAP" config against LDAP IDPs

Signed-off-by: Monis Khan <mok@vmware.com>
2021-11-17 16:55:35 -05:00
Margo Crawford
cb60a44f8a extract ldap refresh search into helper function
also added an integration test for refresh failing after updating the username attribute
2021-11-05 14:22:43 -07:00
Margo Crawford
b5b8cab717 Refactors:
- pull construction of authenticators.Response into searchAndBindUser
- remove information about the identity provider in the error that gets
  returned to users. Put it in debug instead, where it may show up in
  logs.

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-11-05 14:22:43 -07:00
Margo Crawford
c84329d7a4 Fix broken ldap_client_test 2021-11-05 14:22:43 -07:00
Margo Crawford
f988879b6e Addressing code review changes
- changed to use custom authenticators.Response rather than the k8s one
  that doesn't include space for a DN
- Added more checking for correct idp type in token handler
- small style changes

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-11-05 14:22:43 -07:00
Margo Crawford
722b5dcc1b Test for change to stored username or subject.
All of this is still done staticly.
2021-11-05 14:22:43 -07:00
Margo Crawford
19281313dd Basic upstream LDAP/AD refresh
This stores the user DN in the session data upon login and checks that
the entry still exists upon refresh. It doesn't check anything
else about the entry yet.
2021-11-05 14:22:42 -07:00
Monis Khan
1e17418585
TestSupervisorUpstreamOIDCDiscovery: include AdditionalAuthorizeParametersValid condition
Signed-off-by: Monis Khan <mok@vmware.com>
2021-10-25 10:21:51 -04:00
Ryan Richard
7ec0304472 Add offline_access scope for integration tests when using Dex 2021-10-19 12:25:51 -07:00
Ryan Richard
9e05d175a7 Add integration test: upstream refresh failure during downstream refresh 2021-10-13 15:12:19 -07:00
Monis Khan
266d64f7d1
Do not truncate x509 errors
Signed-off-by: Monis Khan <mok@vmware.com>
2021-09-29 09:38:22 -04:00
Ryan Richard
ddf5e566b0 Update a comment 2021-09-21 14:07:08 -07:00
Ryan Richard
f700246bfa Allow focused integration tests to be run from the GoLand UI again
This was broken recently by the improvements in #808.
2021-09-21 12:04:45 -07:00
Ryan Richard
fca183b203 Show DefaultStrategy as a new printer column for CredentialIssuer 2021-09-21 12:01:30 -07:00
Ryan Richard
1b2a116518 Merge branch 'main' into crd_printcolumns 2021-09-21 09:36:46 -07:00
Ryan Richard
4e98c1bbdb Tests use CertificatesV1 when available, otherwise use CertificatesV1beta1
CertificatesV1beta1 was removed in Kube 1.22, so the tests cannot
blindly rely on it anymore. Use CertificatesV1 whenever the server
reports that is available, and otherwise use the old
CertificatesV1beta1.

Note that CertificatesV1 was introduced in Kube 1.19.
2021-09-20 17:14:58 -07:00
Ryan Richard
0a31f45812 Update the AdditionalPrinterColumns of the CRDs, and add a test for it 2021-09-20 12:47:39 -07:00
Ryan Richard
04544b3d3c Update TestKubeCertAgent to use new "v3" label value 2021-09-15 11:09:07 -07:00
Margo Crawford
05f5bac405 ValidatedSettings is all or nothing
If either the search base or the tls settings is invalid, just
recheck everything.
2021-09-07 13:09:35 -07:00
Margo Crawford
27c1d2144a Make sure search base in the validatedSettings cache is properly updated when the bind secret changes 2021-09-07 13:09:35 -07:00
Monis Khan
0d285ce993
Ensure concierge and supervisor gracefully exit
Changes made to both components:

1. Logs are always flushed on process exit
2. Informer cache sync can no longer hang process start up forever

Changes made to concierge:

1. Add pre-shutdown hook that waits for controllers to exit cleanly
2. Informer caches are synced in post-start hook

Changes made to supervisor:

1. Add shutdown code that waits for controllers to exit cleanly
2. Add shutdown code that waits for active connections to become idle

Waiting for controllers to exit cleanly is critical as this allows
the leader election logic to release the lock on exit.  This reduces
the time needed for the next leader to be elected.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-30 20:29:52 -04:00
Monis Khan
ba80b691e1
test/integration: use short timeouts with distinct requests to prevent hangs
Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-27 16:10:36 -04:00
Monis Khan
5078cdbc90
test/integration: increase timeout on disruptive tests
Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-27 14:56:51 -04:00
Margo Crawford
43694777d5 Change some comments on API docs, fix lint error by ignoring it 2021-08-26 16:55:43 -07:00
Margo Crawford
2d32e0fa7d Merge branch 'main' of github.com:vmware-tanzu/pinniped into active-directory-identity-provider 2021-08-26 16:21:08 -07:00
Margo Crawford
6f221678df Change sAMAccountName env vars to userPrincipalName
and add E2E ActiveDirectory test
also fixed regexes in supervisor_login_test to be anchored to the
beginning and end
2021-08-26 16:18:05 -07:00
Monis Khan
d4a7f0b3e1
test/integration: mark certain tests as disruptive
This prevents them from running with any other test, including other
parallel tests.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-26 15:11:47 -04:00
Monis Khan
e2cf9f6b74
leader election test: approximate that followers have observed change
Instead of blindly waiting long enough for a disruptive change to
have been observed by the old leader and followers, we instead rely
on the approximation that checkOnlyLeaderCanWrite provides - i.e.
only a single actor believes they are the leader.  This does not
account for clients that were in the followers list before and after
the disruptive change, but it serves as a reasonable approximation.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-26 12:59:52 -04:00
Monis Khan
74daa1da64
test/integration: run parallel tests concurrently with serial tests
Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-26 12:59:52 -04:00
Ryan Richard
d20cab10b9 Replace one-off usages of busybox and debian images in integration tests
Those images that are pulled from Dockerhub will cause pull failures
on some test clusters due to Dockerhub rate limiting.

Because we already have some images that we use for testing, and
because those images are already pre-loaded onto our CI clusters
to make the tests faster, use one of those images and always specify
PullIfNotPresent to avoid pulling the image again during the integration
test.
2021-08-25 15:12:07 -07:00
Monis Khan
c71ffdcd1e
leader election: use better duration defaults
OpenShift has good defaults for these duration fields that we can
use instead of coming up with them ourselves:

e14e06ba8d/pkg/config/leaderelection/leaderelection.go (L87-L109)

Copied here for easy future reference:

// We want to be able to tolerate 60s of kube-apiserver disruption without causing pod restarts.
// We want the graceful lease re-acquisition fairly quick to avoid waits on new deployments and other rollouts.
// We want a single set of guidance for nearly every lease in openshift.  If you're special, we'll let you know.
// 1. clock skew tolerance is leaseDuration-renewDeadline == 30s
// 2. kube-apiserver downtime tolerance is == 78s
//      lastRetry=floor(renewDeadline/retryPeriod)*retryPeriod == 104
//      downtimeTolerance = lastRetry-retryPeriod == 78s
// 3. worst non-graceful lease acquisition is leaseDuration+retryPeriod == 163s
// 4. worst graceful lease acquisition is retryPeriod == 26s
if ret.LeaseDuration.Duration == 0 {
	ret.LeaseDuration.Duration = 137 * time.Second
}

if ret.RenewDeadline.Duration == 0 {
	// this gives 107/26=4 retries and allows for 137-107=30 seconds of clock skew
	// if the kube-apiserver is unavailable for 60s starting just before t=26 (the first renew),
	// then we will retry on 26s intervals until t=104 (kube-apiserver came back up at 86), and there will
	// be 33 seconds of extra time before the lease is lost.
	ret.RenewDeadline.Duration = 107 * time.Second
}
if ret.RetryPeriod.Duration == 0 {
	ret.RetryPeriod.Duration = 26 * time.Second
}

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-24 16:21:53 -04:00