feat(ws): Implement workspace start + pause as backend APIs #340

andyatmiami · 2025-05-14T19:13:57Z

related: #298

- Added StartWorkspaceHandler + PauseWorkspaceHandler to handle the respective workspace actions
- Introduced new routes for starting and pausing workspaces in the API:
  - api/v1/workspaces/{namespace}/{name}/actions/start
  - api/v1/workspaces/{namespace}/{name}/actions/pause
- Created a new PauseStateEnvelope type for successful responses.
- Added tests for the new APIs, including success and error cases.
- Updated README/OpenAPI documentation to include the new endpoints.

Functionality tested/verified via generated Swagger client:

$ kubectl get workspace
NAME                           STATE
andy-test-direct-with-secret   Running
andy-test-pr-verify            Paused
andy-test-with-secret          Running
andy-test-with-secret-no-op    Running
jupyterlab-workspace           Paused

<interact with /start endpoint in Swagger UI>

$ kubectl get workspace
NAME                           STATE
andy-test-direct-with-secret   Running
andy-test-pr-verify            Running
andy-test-with-secret          Running
andy-test-with-secret-no-op    Running
jupyterlab-workspace           Paused

$ kubectl get events --sort-by='.lastTimestamp'
LAST SEEN   TYPE      REASON                             OBJECT                                     MESSAGE
...
44s         Normal    Scheduled                          pod/ws-andy-test-pr-verify-z5qfg-0         Successfully assigned default/ws-andy-test-pr-verify-z5qfg-0 to kind-control-plane
44s         Normal    SuccessfulCreate                   statefulset/ws-andy-test-pr-verify-z5qfg   create Pod ws-andy-test-pr-verify-z5qfg-0 in StatefulSet ws-andy-test-pr-verify-z5qfg successful
43s         Normal    Pulled                             pod/ws-andy-test-pr-verify-z5qfg-0         Container image "docker.io/kubeflownotebookswg/jupyter-scipy:v1.9.0" already present on machine
43s         Normal    Created                            pod/ws-andy-test-pr-verify-z5qfg-0         Created container: main
43s         Normal    Started                            pod/ws-andy-test-pr-verify-z5qfg-0         Started container main

google-oss-prow · 2025-05-14T19:14:02Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kimwnasptd for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ederign · 2025-05-14T19:54:00Z

/ok-to-test

harshad16

Great work 💯
Tested with cli:

related: kubeflow#298

- Added StartWorkspaceHandler + PauseWorkspaceHandler to handle the respective workspace actions
- Introduced new routes for starting and pausing workspaces in the API.
  - `api/v1/workspaces/{namespace}/{name}/actions/start`
  - `api/v1/workspaces/{namespace}/{name}/actions/pause`
- Created a new PauseStateEnvelope type for successful responses.
- Added tests for the new APIs, including success and error cases.
- Updated README/OpenAPI documentation to include the new endpoints.

Signed-off-by: Andy Stoneberg <[email protected]>

ederign

Some questions inline.

ederign · 2025-05-29T12:03:16Z

workspaces/backend/api/workspace_actions_handler.go

+		return
+	}
+
+	// Get the workspace to check its paused state


I believe this comment is outdated.

yes, sorry.. it's not doing a "check" here - more so querying the workspace to return its actual/persisted state.

ederign · 2025-05-29T12:03:42Z

workspaces/backend/api/workspace_actions_handler.go

+
+	// Get the workspace to check its paused state
+	workspace, err := a.repositories.Workspace.GetWorkspace(r.Context(), namespace, workspaceName)
+	if err != nil {


On this call, we are not checking anything, are we?

Is our intention to check this based on the action?

(and let's not forget about the async behavior, especially on start)

perhaps the phrase "checking" is misleading here - but we do interact with this queried workspace to return this value:

https://github.com/kubeflow/notebooks/pull/340/files/3237336e3a5b817c9e98779e55971b711ad4ea07#diff-1fb105758eb9e23f65387cd3c4f07d07030f7223c3243a7ee0d878815ee71f1fR93

Does this warrant/justify a "read" operation when the backend could potentially simply "infer" this value? I think it's still better to reflect the actual state after the operation by querying the resource..

But, I don't really follow your comments about "async" here... as mentioned in the other comment... setting the .spec.paused attribute IS instantaneous (which is all we are concerned about in this function)

admittedly the State of the workspace resource is subject to eventual consistency - but that isn't (in my view) a focus/concern of this API

Andy, thanks for the clarification on the chat. Let's follow up here!

So, for transparency here.. I was slightly incorrect in my comments above...

It seems the RawPatch command leveraged in the application code is itself asynchronously applied by k8s... so there is a race condition present here in this version of the code...

At the time we call this GetWorkspace function - we cannot guarantee that the Patch has been applied.. and as such the .spec.paused attribute might not have been updated yet...

I'm honestly not sure how to handle this...

Blocking and waiting seems kinda gross (to me) - but I also feel dirty hard-coding the .data.paused value we return in the PausedState response struct...
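
One possible way to frame the dilemma above, sketched with a toy in-memory store rather than a real Kubernetes client: if the write call itself hands back the object as persisted, the handler can report `paused` from that return value, with neither a second read nor a hard-coded value. `Workspace`, `store`, and `PatchPaused` are hypothetical names for illustration only, and this says nothing about how the RawPatch call in the PR actually behaves.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Workspace is a stand-in for the real resource; only the paused flag matters here.
type Workspace struct {
	Name   string
	Paused bool
}

var errNotFound = errors.New("workspace not found")

// store is a hypothetical in-memory repository used only for this sketch.
type store struct {
	mu         sync.Mutex
	workspaces map[string]Workspace
}

// PatchPaused sets the paused flag and returns the object as persisted,
// so the caller can build its response without a follow-up read.
func (s *store) PatchPaused(name string, paused bool) (Workspace, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	ws, ok := s.workspaces[name]
	if !ok {
		return Workspace{}, errNotFound
	}
	ws.Paused = paused
	s.workspaces[name] = ws
	return ws, nil
}

func main() {
	s := &store{workspaces: map[string]Workspace{
		"andy-test-pr-verify": {Name: "andy-test-pr-verify", Paused: true},
	}}

	// "start" maps to paused=false; the response value comes from the write itself.
	ws, err := s.PatchPaused("andy-test-pr-verify", false)
	if err != nil {
		panic(err)
	}
	fmt.Printf("paused=%t\n", ws.Paused) // prints: paused=false
}
```

Whether the real repository layer can return the patched object this way is an open question for this PR; the sketch only shows the shape of the idea.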

We should definitely not block. For the UI, maybe we can use the status to notify the user that something is being processed.

I would not hardcode the value because, for instance, on start, things can fail, and a hardcoded value could then be inconsistent.

Why don't we return 202 and something like:
{
  "data": {
    "status": "accepted",
    "message": "bla",
    ...
  }
}

ederign · 2025-05-29T12:05:33Z

workspaces/backend/api/workspace_actions_handler.go

+	}
+
+	// Return 200 OK with pause state
+	err = a.WriteJSON(w, http.StatusOK, PauseStateEnvelope{


One question here: I assume Pause/Start actions are not instantaneous. Shall we just return a 202 Accepted?

well... setting the spec.paused property itself is instantaneous...

I generally think of 202 meaning the API request itself has not been processed...

whereas this case is "the backend k8s state is now churning to reach a state of eventual consistency"

Andy, thanks for the clarification on the chat. Let's follow up here!

see comment above: #340 (comment)

202 might be the safer option - but we still need to determine if/how we want to return the paused attribute in the response

github-project-automation bot added this to Kubeflow Notebooks May 14, 2025

github-project-automation bot moved this to Needs Triage in Kubeflow Notebooks May 14, 2025

google-oss-prow bot added the do-not-merge/work-in-progress label May 14, 2025

google-oss-prow bot requested review from kimwnasptd and thesuperzapper May 14, 2025 19:14

google-oss-prow bot added the size/XL label May 14, 2025

google-oss-prow bot added the ok-to-test label May 14, 2025

andyatmiami force-pushed the feat/workspace-start-api branch 3 times, most recently from fb2e2fb to f22b3d3 Compare May 15, 2025 20:54

harshad16 reviewed May 20, 2025

andyatmiami mentioned this pull request May 21, 2025

feat(ws): Implement pause workspace functionality as backend API #328

Closed

andyatmiami force-pushed the feat/workspace-start-api branch from f22b3d3 to 8d5e467 Compare May 21, 2025 18:05

andyatmiami force-pushed the feat/workspace-start-api branch from 8d5e467 to 3237336 Compare May 21, 2025 18:05

andyatmiami marked this pull request as ready for review May 21, 2025 18:08

google-oss-prow bot removed the do-not-merge/work-in-progress label May 21, 2025

andyatmiami changed the title ~~feat(ws): Add start workspace functionality to backend API~~ feat(ws): Implement workspace start + pause as backend APIs May 21, 2025

andyatmiami requested a review from harshad16 May 21, 2025 18:10

ederign suggested changes May 29, 2025

google-oss-prow bot assigned ederign May 29, 2025
