Commit 290741e

feat: Add --batch option for single Ctrl+C termination (pdsh -b compatibility) (#102)
* feat: Add --batch option for single Ctrl+C termination (pdsh -b compatibility)

Add --batch / -b option that changes Ctrl+C behavior to immediately terminate all parallel jobs with a single press. This improves automation and CI/CD integration by providing immediate termination without requiring confirmation.

Changes:
- Add --batch / -b CLI flag to Cli struct
- Pass batch flag through ExecuteCommandParams to ParallelExecutor
- Implement two-stage Ctrl+C handling in stream mode:
  * Default: First Ctrl+C shows status, second terminates (within 1s)
  * Batch mode: Single Ctrl+C immediately terminates all jobs
- Update TUI mode signature to accept batch parameter (reserved for future use)
- Add CLI help examples for batch mode usage
- Update README.md with Batch Mode section and examples
- Document signal handling implementation in ARCHITECTURE.md

The default behavior provides visibility into execution progress before termination, while batch mode is optimized for scripts and non-interactive environments.

Implements: #95

* fix: Address PR #102 review issues for signal handling

This commit fixes all HIGH and MEDIUM severity issues identified in PR #102 review:

HIGH SEVERITY FIXES:

1. Signal Handler Race Condition - Time Window Reset Logic Bug
   - Fixed: When time window expires (>1 second) and user presses Ctrl+C again, the code now displays running/completed status in the reset path
   - Location: src/executor/parallel.rs lines 701-713 (stream mode)
   - Impact: Consistent user experience across all Ctrl+C press scenarios

2. Inconsistent Signal Handling Across Execution Modes
   - Fixed: Added complete signal handling to normal execute() method
   - Previously only handle_stream_mode() had signal handling
   - Location: src/executor/parallel.rs lines 172-280 (execute method)
   - Impact: Batch mode and two-stage Ctrl+C now work in all execution modes

MEDIUM SEVERITY FIXES:

3. Missing Exit Code Handling After Signal Termination
   - Fixed: All signal terminations now exit with code 130 (SIGINT standard)
   - Applied to both batch and non-batch modes in all execution paths
   - Location: Multiple locations in src/executor/parallel.rs
   - Impact: Scripts can now distinguish user interruption from command failure

4. No Documentation Conflict Warning for TUI Mode
   - Fixed: Updated CLI help text to clarify TUI mode ignores batch flag
   - Location: src/cli.rs line 103
   - Impact: Clear user expectations for TUI mode behavior

5. Documentation Mismatch in ARCHITECTURE.md
   - Fixed: Updated pseudocode and documentation to match actual implementation
   - Added exit code handling details and implementation coverage notes
   - Location: ARCHITECTURE.md lines 303-394
   - Impact: Accurate documentation for future maintainers

TESTING:
- All tests pass (cargo test)
- No clippy warnings (cargo clippy -- -D warnings)
- Code properly formatted (cargo fmt --check)

Changes:
- ARCHITECTURE.md: Updated signal handling documentation with accurate pseudocode
- src/cli.rs: Added TUI mode note to batch flag help text
- src/executor/parallel.rs: Added signal handling to execute(), fixed reset path status display, added exit code 130

* fix: Add signal handling to file transfer and file output mode operations

This commit adds proper Ctrl+C signal handling to all file transfer methods and the file output mode handler:
- upload_file: Added tokio::select! with two-stage Ctrl+C handling
- download_file: Added tokio::select! with two-stage Ctrl+C handling
- download_files: Added tokio::select! with two-stage Ctrl+C handling
- handle_file_mode: Refactored from simple while loop to tokio::select! pattern

All methods now support:
- Batch mode: Single Ctrl+C immediately terminates with exit code 130
- Non-batch mode: First Ctrl+C shows status, second terminates
- Proper abort of pending task handles on termination
- Status reporting showing running/completed task counts

These changes complete the signal handling improvements from PR #102, ensuring all parallel execution paths can be gracefully interrupted.
1 parent 55fd38d commit 290741e

7 files changed (+560, -20 lines)
ARCHITECTURE.md

Lines changed: 93 additions & 0 deletions
@@ -300,6 +300,99 @@ let tasks: Vec<JoinHandle<Result<ExecutionResult>>> = nodes
 - Buffered I/O for output collection
 - Early termination on critical failures
 
+**Signal Handling (Added 2025-12-16, Issue #95; Updated 2025-12-16, PR #102):**
+
+The executor supports two modes for handling Ctrl+C (SIGINT) signals during parallel execution:
+
+1. **Default Mode (Two-Stage)**:
+   - First Ctrl+C: Displays status (running/completed job counts)
+   - Second Ctrl+C (within 1 second): Terminates all jobs immediately with exit code 130
+   - Time window reset: If >1 second passes, next Ctrl+C restarts the sequence and shows status again
+   - Provides users visibility into execution progress before termination
+
+2. **Batch Mode (`--batch` / `-b`)**:
+   - Single Ctrl+C: Immediately terminates all jobs with exit code 130
+   - Optimized for non-interactive environments (CI/CD, scripts)
+   - Compatible with pdsh `-b` option for tool compatibility
+
+**Exit Code Handling:**
+- Normal completion: Exit code determined by ExitCodeStrategy (MainRank/RequireAllSuccess/etc.)
+- Signal termination (Ctrl+C): Always exits with code 130 (standard SIGINT exit code)
+- This ensures scripts can detect user interruption vs. command failure
+
+**Implementation Coverage:**
+Signal handling is implemented in both execution modes:
+- `execute()` method (normal/progress bar mode) - lines 172-280
+- `handle_stream_mode()` method (stream mode) - lines 714-838
+- TUI mode has its own quit handling (q or Ctrl+C) and ignores the batch flag
+
+Implementation is in `executor/parallel.rs` using `tokio::select!` to handle signals alongside normal execution:
+
+```rust
+loop {
+    tokio::select! {
+        _ = signal::ctrl_c() => {
+            if self.batch {
+                // Batch mode: terminate immediately
+                eprintln!("\nReceived Ctrl+C (batch mode). Terminating all jobs...");
+                for handle in pending_handles.drain(..) {
+                    handle.abort();
+                }
+                // Exit with SIGINT exit code (130)
+                std::process::exit(130);
+            } else {
+                // Two-stage mode: first shows status, second terminates
+                if !first_ctrl_c {
+                    first_ctrl_c = true;
+                    ctrl_c_time = Some(std::time::Instant::now());
+                    eprintln!("\nReceived Ctrl+C. Press Ctrl+C again within 1 second to terminate.");
+
+                    // Show status
+                    let running_count = pending_handles.len();
+                    let completed_count = self.nodes.len() - running_count;
+                    eprintln!("Status: {} running, {} completed", running_count, completed_count);
+                } else {
+                    // Second Ctrl+C: check time window
+                    if let Some(first_time) = ctrl_c_time {
+                        if first_time.elapsed() <= Duration::from_secs(1) {
+                            // Within time window: terminate
+                            eprintln!("Received second Ctrl+C. Terminating all jobs...");
+                            for handle in pending_handles.drain(..) {
+                                handle.abort();
+                            }
+                            // Exit with SIGINT exit code (130)
+                            std::process::exit(130);
+                        } else {
+                            // Time window expired: reset and show status again
+                            first_ctrl_c = true;
+                            ctrl_c_time = Some(std::time::Instant::now());
+                            eprintln!("\nReceived Ctrl+C. Press Ctrl+C again within 1 second to terminate.");
+
+                            // Show current status
+                            let running_count = pending_handles.len();
+                            let completed_count = self.nodes.len() - running_count;
+                            eprintln!("Status: {} running, {} completed", running_count, completed_count);
+                        }
+                    }
+                }
+            }
+        }
+        // Wait for all tasks to complete
+        results = join_all(pending_handles.iter_mut()) => {
+            return self.collect_results(results);
+        }
+    }
+
+    // Small sleep to avoid busy waiting
+    tokio::time::sleep(Duration::from_millis(50)).await;
+}
+```
+
+The batch flag is passed through the executor chain:
+- CLI `--batch` flag → `ExecuteCommandParams.batch` → `ParallelExecutor.batch`
+- Applied in both normal mode (`execute()`) and stream mode (`handle_stream_mode()`)
+- TUI mode maintains its own quit handling and ignores this flag
+
 ### 4. SSH Client (`ssh/client/*`, `ssh/tokio_client/*`)
 
 **SSH Client Module Structure (Refactored 2025-10-17):**

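The two-stage decision documented in the ARCHITECTURE.md hunk above can be separated from the async loop and expressed as a small state machine. The following is a minimal sketch under stated assumptions, not the code from `src/executor/parallel.rs`: the names `CtrlCState`, `Action`, and `on_press` are hypothetical, and the press time is passed in explicitly so the logic can be exercised without real signals.

```rust
use std::time::{Duration, Instant};

/// What the caller should do after observing a Ctrl+C press.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Print running/completed counts and keep executing.
    ShowStatus,
    /// Abort all pending tasks and exit with code 130.
    Terminate,
}

/// Hypothetical helper (not part of bssh): remembers the last press
/// that only displayed status.
struct CtrlCState {
    batch: bool,
    window: Duration,
    last_press: Option<Instant>,
}

impl CtrlCState {
    fn new(batch: bool) -> Self {
        Self {
            batch,
            window: Duration::from_secs(1), // the 1-second window from the docs
            last_press: None,
        }
    }

    /// Decide what to do for a press observed at `now`.
    fn on_press(&mut self, now: Instant) -> Action {
        if self.batch {
            // Batch mode: a single press terminates.
            return Action::Terminate;
        }
        match self.last_press {
            // Second press inside the window: terminate.
            Some(prev) if now.duration_since(prev) <= self.window => Action::Terminate,
            // First press, or the window expired: restart the sequence.
            _ => {
                self.last_press = Some(now);
                Action::ShowStatus
            }
        }
    }
}

fn main() {
    let t0 = Instant::now();
    let mut state = CtrlCState::new(false);
    // First press shows status.
    assert_eq!(state.on_press(t0), Action::ShowStatus);
    // 1.5 s later the window has expired, so status is shown again.
    assert_eq!(state.on_press(t0 + Duration::from_millis(1500)), Action::ShowStatus);
    // 0.5 s after that press is inside the new window: terminate.
    assert_eq!(state.on_press(t0 + Duration::from_millis(2000)), Action::Terminate);
}
```

Folding the "first press" and "window expired" cases into one match arm gives the same behavior as the two separate branches in the documented pseudocode, since both record the press time and show status.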
README.md

Lines changed: 26 additions & 0 deletions
@@ -330,6 +330,32 @@ bssh -C production "df -h" > disk-usage.log
 CI=true bssh -C production "command"
 ```
 
+### Batch Mode (Ctrl+C Handling)
+
+bssh provides two modes for handling Ctrl+C during parallel execution:
+
+**Default (Two-Stage)**:
+- First Ctrl+C: Shows status (running/completed counts)
+- Second Ctrl+C (within 1 second): Terminates all jobs
+
+**Batch Mode (`-b` / `--batch`)**:
+- Single Ctrl+C: Immediately terminates all jobs
+- Useful for non-interactive scripts and CI/CD pipelines
+
+```bash
+# Default behavior (two-stage Ctrl+C)
+bssh -C production "long-running-command"
+# Ctrl+C once: shows status
+# Ctrl+C again (within 1s): terminates
+
+# Batch mode (immediate termination)
+bssh -C production -b "long-running-command"
+# Ctrl+C once: immediately terminates all jobs
+
+# Useful for automation
+bssh -H nodes --batch --stream "deployment-script.sh"
+```
+
 ### Built-in Commands
 ```bash
 # Test connectivity to hosts

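The commit message states that every signal termination exits with code 130 so that scripts can tell user interruption from command failure. An automation wrapper could act on that as in the sketch below. The names `Outcome` and `classify_exit` are hypothetical and not part of bssh. A remote command that itself returns 130 would be indistinguishable under this scheme.

```rust
/// Exit code the commit assigns to Ctrl+C termination
/// (the shell convention of 128 + SIGINT's signal number, 2).
const SIGINT_EXIT_CODE: i32 = 130;

/// Hypothetical classification a wrapper script might use.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Success,
    /// The run was interrupted by Ctrl+C.
    Interrupted,
    /// Any other non-zero exit code.
    Failed(i32),
}

/// Map a process exit code to an outcome.
fn classify_exit(code: i32) -> Outcome {
    match code {
        0 => Outcome::Success,
        SIGINT_EXIT_CODE => Outcome::Interrupted,
        other => Outcome::Failed(other),
    }
}

fn main() {
    for code in [0, 1, 130] {
        println!("exit code {code} -> {:?}", classify_exit(code));
    }
}
```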

src/app/dispatcher.rs

Lines changed: 1 addition & 0 deletions
@@ -403,6 +403,7 @@ async fn handle_exec_command(cli: &Cli, ctx: &AppContext, command: &str) -> Resu
         require_all_success: cli.require_all_success,
         check_all_nodes: cli.check_all_nodes,
         sudo_password,
+        batch: cli.batch,
     };
     execute_command(params).await
 }

src/cli.rs

Lines changed: 8 additions & 1 deletion
@@ -23,7 +23,7 @@ use std::path::PathBuf;
     before_help = "\n\nBroadcast SSH - Parallel command execution across cluster nodes",
     about = "Broadcast SSH - SSH-compatible parallel command execution tool",
     long_about = "bssh is a high-performance SSH client with parallel execution capabilities.\nIt can be used as a drop-in replacement for SSH (single host) or as a powerful cluster management tool (multiple hosts).\n\nThe tool provides secure file transfer using SFTP and supports SSH keys, SSH agent, and password authentication.\nIt automatically detects Backend.AI multi-node session environments.\n\nOutput Modes:\n- TUI Mode (default): Interactive terminal UI with real-time monitoring (auto-enabled in terminals)\n- Stream Mode (--stream): Real-time output with [node] prefixes\n- File Mode (--output-dir): Save per-node output to timestamped files\n- Normal Mode: Traditional output after all nodes complete\n\nSSH Configuration Support:\n- Reads standard SSH config files (defaulting to ~/.ssh/config)\n- Supports Host patterns, HostName, User, Port, IdentityFile, StrictHostKeyChecking\n- ProxyJump, and many other SSH configuration directives\n- CLI arguments override SSH config values following SSH precedence rules",
-    after_help = "EXAMPLES:\n SSH Mode:\n bssh user@host # Interactive shell\n bssh [email protected] \"uptime\" # Execute command\n bssh -p 2222 -i ~/.ssh/key user@host # Custom port and key\n bssh -F ~/.ssh/myconfig webserver # Use custom SSH config\n\n Port Forwarding:\n bssh -L 8080:example.com:80 user@host # Local forward: localhost:8080 → example.com:80\n bssh -R 8080:localhost:80 user@host # Remote forward: remote:8080 → localhost:80\n bssh -D 1080 user@host # SOCKS5 proxy on localhost:1080\n bssh -L 3306:db:3306 -R 80:web:80 user@host # Multiple forwards\n bssh -D *:1080/4 user@host # SOCKS4 proxy on all interfaces\n\n Multi-Server Mode:\n bssh -C production \"systemctl status\" # Execute on cluster (TUI mode auto-enabled)\n bssh -H \"web1,web2,web3\" \"df -h\" # Execute on multiple hosts\n bssh -H \"web1,web2,web3\" -f \"web1\" \"df -h\" # Filter to web1 only\n bssh -C production -f \"web*\" \"uptime\" # Filter cluster nodes\n bssh --parallel 20 -H web* \"apt update\" # Increase parallelism\n\n Host Exclusion (--exclude):\n bssh -H \"node1,node2,node3\" --exclude \"node2\" \"uptime\" # Exclude single host\n bssh -C production --exclude \"web1,web2\" \"apt update\" # Exclude multiple hosts\n bssh -C production --exclude \"db*\" \"systemctl restart\" # Exclude with wildcard pattern\n bssh -C production --exclude \"*-backup\" \"df -h\" # Exclude backup nodes\n\n Output Modes:\n bssh -C prod \"apt-get update\" # TUI mode (default, interactive monitoring)\n bssh -C prod --stream \"tail -f log\" # Stream mode (real-time with [node] prefixes)\n bssh -C prod --output-dir ./logs \"ps\" # File mode (save to timestamped files)\n bssh -C prod \"uptime\" | tee log.txt # Normal mode (auto-detected when piped)\n\n TUI Mode Controls (when in TUI):\n 1-9 Jump to node detail view\n s Enter split view (2-4 nodes)\n d Enter diff view (compare nodes)\n f Toggle auto-scroll\n ↑/↓ Scroll output\n ←/→ Switch nodes\n Esc Return to summary\n ? Show help\n q Quit\n\n File Operations:\n bssh -C staging upload file.txt /tmp/ # Upload to cluster\n bssh -H host1,host2 download /etc/hosts ./backups/\n\n Other Commands:\n bssh list # List configured clusters\n bssh -C production ping # Test connectivity\n bssh -H hosts interactive # Interactive mode\n\n SSH Config Example (~/.ssh/config):\n Host web*\n HostName web.example.com\n User webuser\n Port 2222\n IdentityFile ~/.ssh/web_key\n StrictHostKeyChecking yes\n\nDeveloped and maintained as part of the Backend.AI project.\nFor more information: https://github.com/lablup/bssh"
+    after_help = "EXAMPLES:\n SSH Mode:\n bssh user@host # Interactive shell\n bssh [email protected] \"uptime\" # Execute command\n bssh -p 2222 -i ~/.ssh/key user@host # Custom port and key\n bssh -F ~/.ssh/myconfig webserver # Use custom SSH config\n\n Port Forwarding:\n bssh -L 8080:example.com:80 user@host # Local forward: localhost:8080 → example.com:80\n bssh -R 8080:localhost:80 user@host # Remote forward: remote:8080 → localhost:80\n bssh -D 1080 user@host # SOCKS5 proxy on localhost:1080\n bssh -L 3306:db:3306 -R 80:web:80 user@host # Multiple forwards\n bssh -D *:1080/4 user@host # SOCKS4 proxy on all interfaces\n\n Multi-Server Mode:\n bssh -C production \"systemctl status\" # Execute on cluster (TUI mode auto-enabled)\n bssh -H \"web1,web2,web3\" \"df -h\" # Execute on multiple hosts\n bssh -H \"web1,web2,web3\" -f \"web1\" \"df -h\" # Filter to web1 only\n bssh -C production -f \"web*\" \"uptime\" # Filter cluster nodes\n bssh --parallel 20 -H web* \"apt update\" # Increase parallelism\n\n Host Exclusion (--exclude):\n bssh -H \"node1,node2,node3\" --exclude \"node2\" \"uptime\" # Exclude single host\n bssh -C production --exclude \"web1,web2\" \"apt update\" # Exclude multiple hosts\n bssh -C production --exclude \"db*\" \"systemctl restart\" # Exclude with wildcard pattern\n bssh -C production --exclude \"*-backup\" \"df -h\" # Exclude backup nodes\n\n Output Modes:\n bssh -C prod \"apt-get update\" # TUI mode (default, interactive monitoring)\n bssh -C prod --stream \"tail -f log\" # Stream mode (real-time with [node] prefixes)\n bssh -C prod --output-dir ./logs \"ps\" # File mode (save to timestamped files)\n bssh -C prod \"uptime\" | tee log.txt # Normal mode (auto-detected when piped)\n\n Batch Mode (Ctrl+C Handling):\n bssh -C prod \"long-running-command\" # Default: first Ctrl+C shows status, second terminates\n bssh -C prod -b \"long-command\" # Batch mode: single Ctrl+C terminates immediately\n bssh -H nodes --batch --stream \"cmd\" # Useful for CI/CD and non-interactive scripts\n\n TUI Mode Controls (when in TUI):\n 1-9 Jump to node detail view\n s Enter split view (2-4 nodes)\n d Enter diff view (compare nodes)\n f Toggle auto-scroll\n ↑/↓ Scroll output\n ←/→ Switch nodes\n Esc Return to summary\n ? Show help\n q Quit\n\n File Operations:\n bssh -C staging upload file.txt /tmp/ # Upload to cluster\n bssh -H host1,host2 download /etc/hosts ./backups/\n\n Other Commands:\n bssh list # List configured clusters\n bssh -C production ping # Test connectivity\n bssh -H hosts interactive # Interactive mode\n\n SSH Config Example (~/.ssh/config):\n Host web*\n HostName web.example.com\n User webuser\n Port 2222\n IdentityFile ~/.ssh/web_key\n StrictHostKeyChecking yes\n\nDeveloped and maintained as part of the Backend.AI project.\nFor more information: https://github.com/lablup/bssh"
 )]
 pub struct Cli {
     /// SSH destination in format: [user@]hostname[:port] or ssh://[user@]hostname[:port]
@@ -104,6 +104,13 @@ pub struct Cli {
     )]
     pub sudo_password: bool,
 
+    #[arg(
+        short = 'b',
+        long = "batch",
+        help = "Batch mode: single Ctrl+C immediately terminates all jobs\nDisables two-stage Ctrl+C handling (status display on first press)\nUseful for non-interactive scripts and CI/CD pipelines\nNote: TUI mode has its own quit handling (q or Ctrl+C) and ignores this flag"
+    )]
+    pub batch: bool,
+
     #[arg(
         short = 'J',
         long = "jump-host",

src/commands/exec.rs

Lines changed: 4 additions & 1 deletion
@@ -45,6 +45,7 @@ pub struct ExecuteCommandParams<'a> {
     pub require_all_success: bool,
     pub check_all_nodes: bool,
     pub sudo_password: Option<Arc<SudoPassword>>,
+    pub batch: bool,
 }
 
 pub async fn execute_command(params: ExecuteCommandParams<'_>) -> Result<()> {
@@ -174,6 +175,7 @@ async fn execute_command_with_forwarding(params: ExecuteCommandParams<'_>) -> Re
     // Execute the actual command
     let result = execute_command_without_forwarding(ExecuteCommandParams {
         port_forwards: None, // Remove forwarding from params to avoid recursion
+        batch: params.batch,
         ..params
     })
     .await;
@@ -209,7 +211,8 @@ async fn execute_command_without_forwarding(params: ExecuteCommandParams<'_>) ->
         .with_timeout(params.timeout)
         .with_connect_timeout(params.connect_timeout)
         .with_jump_hosts(params.jump_hosts.map(|s| s.to_string()))
-        .with_sudo_password(params.sudo_password);
+        .with_sudo_password(params.sudo_password)
+        .with_batch_mode(params.batch);
 
     // Set keychain usage if on macOS
     #[cfg(target_os = "macos")]
