Skip to content

Improve UX for reconnecting after server restart#548

Open
Darshan808 wants to merge 27 commits intojupyterlab:mainfrom
Darshan808:reconnect-automatically
Open

Improve UX for reconnecting after server restart#548
Darshan808 wants to merge 27 commits intojupyterlab:mainfrom
Darshan808:reconnect-automatically

Conversation

@Darshan808
Copy link
Member

@Darshan808 Darshan808 commented Feb 23, 2026

Fixes #535

Description

Jupyter_server_ydoc:

  • Store session IDs and collaboration version in .jupyter/collaboration_sessions.json on each connection
  • On reconnect with an old session ID, check compatibility before rejecting, allow silent reconnect if the root directory and package version are unchanged
  • Only close with 1003 when reconnection would be unsafe (different directory or version mismatch)

User Facing Changes

Most server restarts (same directory, same version) no longer interrupt the user at all.

@github-actions
Copy link
Contributor

Binder 👈 Launch a Binder on branch Darshan808/jupyter-collaboration/reconnect-automatically

@Darshan808 Darshan808 added the enhancement New feature or request label Feb 23, 2026
Copy link
Member

@krassowski krassowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this goes in the right direction. The frontend changes fallback to unknown_session so it should be a fine upgrade experience for existing serves (whether jupyter-server-ydoc of jupyverse)

@Darshan808 Darshan808 closed this Mar 11, 2026
@Darshan808 Darshan808 reopened this Mar 11, 2026
@Darshan808 Darshan808 marked this pull request as ready for review March 11, 2026 07:20
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves real-time collaboration reconnection behavior after a server restart by persisting recent collaboration session IDs/versions server-side and by showing a translated, actionable client dialog only when reconnection is unsafe.

Changes:

  • Persist recent server session IDs (and package version) under .jupyter/collaboration_sessions.json and use them to decide whether an old client sessionId can safely reconnect.
  • Update the WebSocket close payload for unsafe reconnects / initialization errors to structured JSON and handle it client-side with translated dialogs and an optional “Reload” action.
  • Add unit tests for session-store compatibility logic (Python) and close-payload handling (TS).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
projects/jupyter-server-ydoc/jupyter_server_ydoc/utils.py Adds session store + compatibility check helpers and exposes SERVER_SESSION/YDOC_SERVER_VERSION.
projects/jupyter-server-ydoc/jupyter_server_ydoc/handlers.py Persists current session on connect and conditionally closes with structured JSON payload when reconnect is unsafe.
packages/docprovider/src/yprovider.ts Parses 1003 close payload JSON and presents translated dialogs with optional reload.
packages/docprovider/src/tokens.ts Defines ISessionClosePayload payload shape for close-reason handling.
tests/test_session_store.py Adds Python unit tests for session compatibility decisions.
packages/docprovider/src/__tests__/sessionClose.spec.ts Adds Jest tests validating dialog + reload behavior for 1003 close payloads.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Member

@krassowski krassowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. The UI test failures are unrelated, due to a new JupyterLab version release which fixed incorrect fonts in some input elements.

@krassowski
Copy link
Member

@davidbrochart does it look good to you too?

Copy link
Collaborator

@davidbrochart davidbrochart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to use an SQLite database to store the sessions?

@krassowski
Copy link
Member

Would it be better to use an SQLite database to store the sessions?

On one hand SQLite is already a dependency because pycrdt-store and jupyter-server-file-id use it, on the other hand this is such a simple structure that I am not sure if it is better. No strong opinion here.

Copy link
Collaborator

@davidbrochart davidbrochart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this PR introduces a new .jupyter directory. I'm wondering if we could take advantage of the YStore, which is by default an SQLite database in .jupyter_ystore.db?

Comment on lines +7 to +12
import json
import uuid
from datetime import datetime, timezone
from ._version import __version__ # noqa
import asyncio
import os
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these imports are not ordered, it looks like the linter doesn't run on this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the linter isn't configured for these. I just raised a PR #553


# Collaboration package version changed → reject
if previous_version != current_version:
return False, "version_mismatch"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have shared document versions, e.g. the YNotebook version. Should we use it or is it the jupyter-collaboration package version that we want to check?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is definitely jupyter-collaboration (specifically jupyter-server-ydoc and jupyter-docprovider but collaboration metapackage is a good proxy) that we want to check as changes sometimes come with changes to custom messages sent yjs protocol or can come with changes to yjs version itself.

Maybe we should also consider versions of YNotebook and other documents. I am not sure how easy it would be to add here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can get the document version
document_version = getattr(self.room._document, "version", None).
I made it optional so the connection is still allowed if the previous session didn’t store a document version.

@krassowski
Copy link
Member

It seems that this PR introduces a new .jupyter directory.

It is not new in the sense that both JupyterLab Desktop and jupyter-ai use .jupyter already. I am also planning to align usage of hidden files with this pattern in jupyter-lsp It is new use in this package, yes.

I'm wondering if we could take advantage of the YStore, which is by default an SQLite database in .jupyter_ystore.db?

YStore is configurable and some deployments do not use the SQLite database variant.

@krassowski
Copy link
Member

Hmm the tests on macos seem to hit the limit of 20 minutes:

image

but I did not see that on recent merge to main:

image

I see that the test itself "passes":

image

but then it hangs. This hints at a potential issue with asyncio cleanup of tasks in this PR

Copy link
Member

@krassowski krassowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like AnyioPath or Locking mechanism could be to blame. I will ask copilot for a pass.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@krassowski
Copy link
Member

Hmm with the latest change the tests are failing and timing out. I guess it tells us that the locking is problematic.

@Darshan808
Copy link
Member Author

Not sure where things are going wrong!
These tests are stalling locally as well.

async def test_room_concurrent_initialization(
jp_serverapp,
rtc_create_file,
rtc_connect_doc_client,
):
file_format = "text"
file_type = "file"
file_path = "dummy.txt"
await rtc_create_file(file_path)
async def connect(file_format, file_type, file_path):
websocket, room_name = await rtc_connect_doc_client(file_format, file_type, file_path)
async with websocket:
pass
t0 = time()
async with create_task_group() as tg:
tg.start_soon(connect, file_format, file_type, file_path)
tg.start_soon(connect, file_format, file_type, file_path)
t1 = time()
delta = t1 - t0
assert delta < 0.6
await cleanup(jp_serverapp)
async def test_room_sequential_opening(
jp_serverapp,
rtc_create_file,
rtc_connect_doc_client,
):
file_format = "text"
file_type = "file"
file_path = "dummy.txt"
await rtc_create_file(file_path)
async def connect(file_format, file_type, file_path):
t0 = time()
websocket, room_name = await rtc_connect_doc_client(file_format, file_type, file_path)
async with websocket:
pass
t1 = time()
return t1 - t0
dt = await connect(file_format, file_type, file_path)
assert dt < 1
dt = await connect(file_format, file_type, file_path)
assert dt < 1
await cleanup(jp_serverapp)

Looking into it 👀

return encoded_path.split("/")[-1]


async def _get_jupyter_session_store(root_dir: str) -> Path:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not async. Could you remind me why we don't use async file operations?

Copy link
Member

@krassowski krassowski Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was changed back in 14e7ae3 because the tests are failing and timing out. It appears that AnyioPath does not play well with the event loop here. If there is an async implementation that can work here I see no reason to not use it but given that we are looking at a tiny JSON file I do not consider it blocking myself as the blocking time would be minimal, right?

(I think this might be down to interaction between AnyioPath and locking, especially problematic in pytest context)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that AnyioPath does not play well with the event loop here.

Maybe because we didn't use AnyIO's pytest plugin?

Usually all file operations in a server should be async, but if you don't want to do that here then at least the added async functions that are not actually async (they don't await) should be converted back to sync.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. @Darshan808 can you explore if anyio pytest plugin can help here and if not just drop the async?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve the UX for reconnecting after server restart

4 participants