Description
Note: This issue description was AI-generated based on internal issue #140. Please review, validate, and adjust the details as needed.
Background
The current coderun agent sends code to IBM Quantum Platform (IQP) and then waits for a response before returning. This synchronous approach blocks the application while waiting for quantum job execution, which can take significant time depending on queue wait times and circuit complexity.
Problem Statement
The synchronous API call pattern has several critical issues:
- Timeout Failures: Jobs fail after 70-100 seconds in cloud deployments when running on real quantum hardware
- Application Blocking: The entire application is blocked while waiting for IQP job completion
- No Progress Tracking: Users have no visibility into job status or queue position
- Poor Scalability: Cannot handle multiple concurrent jobs efficiently
- Bad User Experience: The UI freezes during long-running quantum executions, which is especially problematic on real hardware where queue times can be significant
Proposed Solution
Implement an asynchronous pattern where:
- Code is sent to IQP and the call returns immediately with a job ID
- Job status can be queried later using the job ID
- Results are retrieved when the job completes
This follows the standard async job pattern:
- Submit: Send code → Receive job ID
- Poll: Query job status using job ID
- Retrieve: Get results when job is complete
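The three steps above can be sketched with the standard library alone. This is a minimal illustration, not the coderun agent's actual code: `fake_iqp_execute`, the in-memory `_jobs` table, and the status strings are all assumptions made for the example, and a real service would store IQP job IDs rather than in-process futures.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical in-memory job table: job_id -> Future. A real service would
# persist IQP job IDs instead of holding futures in process memory.
_executor = ThreadPoolExecutor(max_workers=4)
_jobs: dict[str, Future] = {}


def fake_iqp_execute(code: str) -> dict:
    """Stand-in for the slow IQP call; sleeps instead of running a circuit."""
    time.sleep(0.2)
    return {"echo": code}


def submit(code: str) -> str:
    """Submit: start the work in the background and return a job ID at once."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = _executor.submit(fake_iqp_execute, code)
    return job_id


def status(job_id: str) -> str:
    """Poll: report the job state without blocking the caller."""
    future = _jobs[job_id]
    if not future.done():
        return "running"
    return "failed" if future.exception() else "completed"


def result(job_id: str) -> dict:
    """Retrieve: return the result, or raise if the job is not finished."""
    future = _jobs[job_id]
    if not future.done():
        raise RuntimeError(f"job {job_id} is not finished yet")
    return future.result()
```

The caller gets the job ID back before any work has finished, which is the property that removes the blocking request from the timeout path.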
Implementation Approach
Backend Changes (api/coderun-agent/)
Current Flow:
    @app.post("/run")
    def run_code(code: str):
        # Execute code on IQP
        result = execute_on_iqp(code)  # Blocks here
        return result
Proposed Async Flow:
    @app.post("/run")
    async def run_code(code: str):
        # Submit job to IQP
        job_id = await submit_to_iqp(code)
        return {"job_id": job_id, "status": "submitted"}

    @app.get("/job/{job_id}/status")
    async def get_job_status(job_id: str):
        status = await check_iqp_job_status(job_id)
        return {"job_id": job_id, "status": status}

    @app.get("/job/{job_id}/result")
    async def get_job_result(job_id: str):
        result = await fetch_iqp_job_result(job_id)
        return {"job_id": job_id, "result": result}
Frontend Changes
Update execution nodes to:
- Submit job and receive job ID
- Poll for job status
- Retrieve results when complete
- Display job status to user
Implementation Steps
- Update Coderun Agent API:
  - Modify /run endpoint to return job ID immediately
  - Add /job/{job_id}/status endpoint for status queries
  - Add /job/{job_id}/result endpoint for result retrieval
  - Implement job tracking/storage mechanism
  - Handle IQP async API calls
- Update Frontend:
  - Modify execution nodes to handle async pattern
  - Implement job status polling
  - Add UI indicators for job status (queued, running, completed)
  - Handle job result retrieval
  - Add error handling for failed jobs
- Testing:
  - Test with various circuit complexities
  - Test with multiple concurrent jobs
  - Test error scenarios (timeouts, failures)
  - Validate job status transitions
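For the status-transition testing step, one possible approach is a small transition table that a test can run against the statuses a client observed while polling. The sketch below is an assumption, not part of the existing codebase: the `JobStatus` enum, the `ALLOWED` table, and both helper functions are hypothetical, and the statuses IQP actually reports may differ from the four names used in this issue.

```python
from enum import Enum


class JobStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table for statuses observed by polling. A poll can miss
# the "running" phase entirely, so queued -> completed is allowed. Terminal
# states (completed, failed) have no outgoing transitions.
ALLOWED = {
    JobStatus.QUEUED: {JobStatus.RUNNING, JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.RUNNING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def is_valid_transition(old: JobStatus, new: JobStatus) -> bool:
    """Seeing the same status twice in a row is fine; otherwise consult the table."""
    return new == old or new in ALLOWED[old]


def validate_history(history: list[JobStatus]) -> bool:
    """Check every consecutive pair in a sequence of observed statuses."""
    return all(is_valid_transition(a, b) for a, b in zip(history, history[1:]))
```

A test could record the status returned by each poll and assert `validate_history(...)` at the end, which would catch a job that appears to move backwards or leave a terminal state.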
Files to Modify
Backend:
- api/coderun-agent/agent.py - Update endpoints for async pattern
Frontend:
- components/nodes/execution-node.tsx - Handle async execution
- components/nodes/runtime-node.tsx - Handle async runtime calls
- lib/api-service.ts - Add job status/result methods
Acceptance Criteria
- Code submission returns immediately with job ID
- Job status can be queried using job ID
- Results can be retrieved when job completes
- UI shows job status (queued, running, completed, failed)
- Multiple jobs can be submitted concurrently
- Error handling for failed jobs
- Documentation updated
Technical Considerations
- Use Qiskit Runtime's async job submission capabilities
- Implement appropriate polling intervals (avoid excessive API calls)
- Consider job result caching
- Handle job expiration/cleanup
- Implement proper error handling for network issues
- Consider adding job cancellation capability
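To illustrate the polling-interval point, here is a rough sketch of polling with exponential backoff and an overall timeout. Everything in it is an assumption for the example: `poll_until_done`, its parameter names and defaults, and the terminal status strings are hypothetical, and `get_status` stands in for whatever function ends up calling the status endpoint.

```python
import time
from typing import Callable

# Assumed terminal status names, taken from the UI states listed in this issue.
TERMINAL = {"completed", "failed"}


def poll_until_done(
    get_status: Callable[[], str],
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    factor: float = 2.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status until a terminal status, waiting longer between calls.

    The delay starts at initial_delay, is multiplied by factor after each
    poll, and is capped at max_delay. Raises TimeoutError once the overall
    timeout has elapsed without a terminal status.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(delay)
        delay = min(delay * factor, max_delay)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. The cap on the delay bounds how stale the displayed status can get during long queue waits.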
Benefits
Implementing async job submission will:
- Eliminate Timeout Errors: Return immediately with a job ID instead of blocking for 70-100+ seconds
- Improve Responsiveness: Application remains responsive during long queue waits on real quantum hardware
- Enable Long-Running Jobs: Properly handle jobs that take minutes or hours to complete
- Better User Experience: Users can continue working while jobs execute
- Support Multiple Jobs: Submit and track multiple concurrent quantum jobs