Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: shirady The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
leelavg
left a comment
There was a problem hiding this comment.
didn't review the controller yet as pr is in draft.
| &nbv1.ObjectBucketClaim{}: { | ||
| Namespaces: map[string]cache.Config{corev1.NamespaceAll: {}}, | ||
| }, |
There was a problem hiding this comment.
even the returned OB/Secret/Configmap should be watched in other namespaces, isn't it?
There was a problem hiding this comment.
The secret and config map should be watched in other namespaces as well, yes -I added the change.
OB is cluster-scoped, so I didn't change.
a8fa7f2 to
d7c518a
Compare
From what I understand from @nb-ohad for testing we can install the CRD that we need.
Yes, you are correct. We would need to make changes. As we didn't want to be blocked at this point, it was suggested that we add the finalizer for testing purposes. |
159612a to
622dfff
Compare
e1d0de3 to
2f407e5
Compare
|
@leelavg @rewantsoni, Please take a look |
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
2f407e5 to
d45478a
Compare
Signed-off-by: shirady <[email protected]>
d45478a to
5d62d88
Compare
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
…rEndpoint Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
like in storage client controller Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
…erClient Signed-off-by: shirady <[email protected]>
will be used in deletion phase of the storage client before calling offboarding Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
rewantsoni
left a comment
There was a problem hiding this comment.
The PR looks good expect one comment regarding adding the storageClient label to the OBC
| ObcFinalizer = nbv1.ObjectBucketFinalizer | ||
| ObjectBucketClaimStatusPhaseFailed = "Failed" |
There was a problem hiding this comment.
Instead of creating a copy of the Phase and Finalizer, can we not use it from the pkg directly?
There was a problem hiding this comment.
I will use it directly, thanks.
For the status, currently I don't have it, but I raised a PR to have it -
noobaa/noobaa-operator#1823
| if r.obc.GetDeletionTimestamp().IsZero() && r.obc.Status.Phase == "" { | ||
| r.obc.Status.Phase = ObjectBucketClaimStatusPhaseFailed | ||
| } |
There was a problem hiding this comment.
| if r.obc.GetDeletionTimestamp().IsZero() && r.obc.Status.Phase == "" { | |
| r.obc.Status.Phase = ObjectBucketClaimStatusPhaseFailed | |
| } | |
| if r.obc.Status.Phase == "" { | |
| r.obc.Status.Phase = ObjectBucketClaimStatusPhaseFailed | |
| } |
We have a check for deletionTimestamp above
| if r.obc.Status.Phase == ObjectBucketClaimStatusPhaseFailed { | ||
| r.obc.Status.Phase = "" | ||
| } |
There was a problem hiding this comment.
Can this cause a deadlock? I am thinking of a scenario, if for some reason the OBC on the provider reached a failed state, we update the Status of OBC from the storage_client controller to a failed state, and then the OBC controller updates the Phase to "".
There was a problem hiding this comment.
From what I checked in the code, there no place where the status phase is Failed.
I added the status change since it was raised as a comment in the design to reflect it.
To be on the safe side I will not change the phase, and the way to debug it would be to check the logs and see the error.
and remove redandunt check of deletion timestamp Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
Signed-off-by: shirady <[email protected]>
|
@nb-ohad , please take a look |
| func NewProviderClientForStorageClient( | ||
| ctx context.Context, | ||
| storageProviderEndpoint string, | ||
| ) (*providerClient.OCSProviderClient, error) { | ||
| ocsProviderClient, err := providerClient.NewProviderClient(ctx, storageProviderEndpoint, OcsClientTimeout) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to create provider client with endpoint %v: %w", storageProviderEndpoint, err) | ||
| } | ||
| return ocsProviderClient, nil | ||
| } |
There was a problem hiding this comment.
This is a meaningless util, it does nothing but call another function. And it does not even allow us to configure the error message in accordance with the given context of the call site which means that the call site now needs to wrap the underlaying error with another error. I don't get why this is being introduced.
This function also limits all call site to share the same timeout value, which seems wrong. The next developer who will need a different value for timeout will now add the timeout as an argument (which I don't blame them), but then you cannot even use the timeout as a reason for the existence of this function.
In addition, now a reader need to have a context switch when reading the call site code with no apparent value.
It seems like this function was only introduced to "skip" typing the timeout constant in multiple places (which is kind of the wrong reason for refactoring). We should not create code that limits the usability of our libraries. If we wanted a fixed timeout, or a fixed error message, we could have just hard coded it into the NewProviderClient function itself, we chose not to do so and provide the caller the flexibility. This function seems to go directly against that design decision.
@shirady Could you please make a case for this function, or just revert the change.
CC: @leelavg
| predicate.NewPredicateFuncs(func(obj client.Object) bool { | ||
| return !obj.GetDeletionTimestamp().IsZero() | ||
| }), |
There was a problem hiding this comment.
This predicate means that we will not handle "deleting". In that case who should be handling the deleting phase? shouldn't this controller be owning that code path?
| r.obc.Name = req.Name | ||
| r.obc.Namespace = req.Namespace | ||
|
|
||
| r.log.Info("Starting reconcile iteration for OBC", "req", req) |
There was a problem hiding this comment.
I would suggest adding the NamespaceName from the request (the only thing that is actually on the request) as a WithValues on the logger, to make sure that every log message will have the context of the obc that it refers to. This also means there will be no need to add the req value in this log statement.
| if r.obc.Status.Phase != initialPhase { | ||
| statusErr = r.Client.Status().Update(r.ctx, &r.obc) | ||
| if statusErr != nil { | ||
| r.log.Error(statusErr, "Failed to update OBC status.") | ||
| } | ||
| } |
There was a problem hiding this comment.
This is wrong! By design of the controller pattern, each reconcile start with a conceptual empty status, because each reconcile runs all of the steps from step 0 every time. This also mean it is a bug/mistake to take the output of a previous reconcile and use it as input for the current reconcile.
In addition, the fact that the phase field on the status didn't change, don't not mean that other fields didn't change. This mean that the phase field cannot be a good indication for an skipping the update operation. Even if one makes a case that there are no other fields, I would argue that the next developer might add additional fields then we have a subtle bug here that will be hard to detect.
If you really want to run the update command only when the status change you need to condition on a deep equal operation, but I think this is an over kill as this controller should only run when there is a change on the obc so I don't think there is any harm in running the update operation on each reconcile.
| if r.obc.Status.Phase != initialPhase { | |
| statusErr = r.Client.Status().Update(r.ctx, &r.obc) | |
| if statusErr != nil { | |
| r.log.Error(statusErr, "Failed to update OBC status.") | |
| } | |
| } | |
| if statusErr := r.Client.Status().Update(r.ctx, &r.obc); statusErr != nil { | |
| r.log.Error(statusErr, "Failed to update OBC status.") | |
| } |
CC: @leelavg
| if !r.obc.GetDeletionTimestamp().IsZero() { | ||
| return r.deletionPhase(ocsProviderClient, storageClient) | ||
| } |
There was a problem hiding this comment.
This goes against the predicate that was set on the setupWithManager. This is one example of why that predicate need to get removed
| if !r.obc.GetDeletionTimestamp().IsZero() { | ||
| return r.deletionPhase(ocsProviderClient, storageClient) | ||
| } |
There was a problem hiding this comment.
If we have a branch based on GetDeletionTimestamp I would have structure the code to be more readable and apparent. this means an if ... else ... structure. And if you already decided that deletion phase gets its own function (instead of implemented here) then it make more sense to give the same treatment to the second code flow. I would also prefer that we handle the common case first (creation/update) and the uncommon case later (as a user reads code from top to bottom).
With all of these changes the code should look something like
if r.obc.GetDeletionTimeStamp.IsZero() {
r.handlerObcCreationOrUpdate(...)
} else {
r.handleObcDeletion(...)
}
| } | ||
|
|
||
| if _, err := ocsProviderClient.NotifyObcCreated(r.ctx, storageClient.Status.ConsumerID, &r.obc); err != nil { | ||
| r.log.Error(err, "failed to notify provider of OBC created/updated", "namespaced/name", client.ObjectKeyFromObject(&r.obc)) |
There was a problem hiding this comment.
The obc namespace name / object key, should already be on the logger
| r.log.Error(err, "failed to notify provider of OBC created/updated", "namespaced/name", client.ObjectKeyFromObject(&r.obc)) | |
| r.log.Error(err, "failed to notify provider of OBC created/updated") |
@shirady Please apply that to all of the log messages
| ) (ctrl.Result, error) { | ||
| r.log.Info("OBC deleted", "namespaced/name", client.ObjectKeyFromObject(&r.obc)) | ||
|
|
||
| obcNamespacedName := types.NamespacedName{Namespace: r.obc.Namespace, Name: r.obc.Name} |
There was a problem hiding this comment.
The Namespaced name already exist on the req, I am not sure why we need to get it from the obc.
And even if we do, we can get it using client.ObjectKeyFromObject(&r.obc)
| // getStorageClientFromStorageClass returns the StorageClient that owns the given StorageClass (via ownerReference). | ||
| func (r *obcReconcile) getStorageClientFromStorageClass(storageClassName string) (*v1alpha1.StorageClient, error) { | ||
| sc := &storagev1.StorageClass{} | ||
| if err := r.Get(r.ctx, types.NamespacedName{Name: storageClassName}, sc); err != nil { |
There was a problem hiding this comment.
Please prefer ObjectKeyFromObject to manually creating a NamespcedName / ObjectKey
| } | ||
|
|
||
| // getStorageClientFromStorageClass returns the StorageClient that owns the given StorageClass (via ownerReference). | ||
| func (r *obcReconcile) getStorageClientFromStorageClass(storageClassName string) (*v1alpha1.StorageClient, error) { |
There was a problem hiding this comment.
Why aren't we just creating a client index for SC by client? If we do that this entire function becomes redundant and we only need to have a simple list operation (with a limit of 1) to get the client that is associated with a given sc.
Describe the problem
Part of RHSTOR-6230
Continue of PRs red-hat-storage/ocs-operator#3692 by calling the Notify when an OBC is created and deleted.
Explain the changes
Note: to avoid direct import of lib-bucket-provision, I had to add them manually.
In case we decide to import it directly, it can be in the same style as others.
Notify.Notes:
Failed.Pending.ownerReferencesfrom the storageclass.Notifyand remove finalizer.make manifestsandmake bundle.Testing Instructions:
Manual Test:
kubectl apply -f ./noobaa-operator/deploy/obc/objectbucket.io_objectbucketclaims_crd.yamlownerReferencesof the storage client (you can copy from another storage class on the cluster) manually with a temp finalizer (else it is deleted).Example of storage class:
Example of OBC:
Note: OBC can be created in any namespace. If you want to test on another namespace - create the namespace (
kubectl create ns <namespacename>) and change thenamespacevalue in the yaml.Boundphase.Since we do not have the copy mechanism implemented yet, we will test only the OBC deleted (without additional resources):
a. Install the OB CRD (
kubectl apply -f noobaa-operator/deploy/obc/objectbucket.io_objectbuckets_crd.yaml)b. Change the clusterRole (use Ai generated file from the kubebuilder marks).