CLI on GPU Instance
Command-line tool for checkpointing, replaying, and tracking commands on GPU instances.
Overview
gg is a command-line tool that runs inside your rented GPU instance. It wraps any shell command with automatic checkpointing — every command you run is saved with its full output, exit code, and timing. If your instance restarts, gg replay re-runs unfinished commands automatically.
Key features:
- Checkpoint every command with full log capture
- Replay unfinished commands after instance restart
- Sync checkpoints to the GPUniq backend
- Browse and search command history
Installation
gg is included in the GPUniq Python package:
pip install GPUniq
This installs both the Python SDK and the gg CLI. Requires Python 3.8+.
Quick Start
# 1. Initialize with your API token (shown in dashboard after rental)
gg init <your-gg-token>
# 2. Run any command — it gets checkpointed automatically
gg run python train.py --epochs 50
# 3. Check your command history
gg list
# 4. View logs from a checkpoint
gg logs abc12
# 5. After instance restart — replay unfinished commands
gg replay
Commands
gg init
Initialize gg on your instance. Run this once after connecting.
gg init <token>
- token — your GG token, shown in the dashboard after renting an instance
- Verifies the token with the GPUniq backend
- Creates the
.ggdirectory at/workspace/volume/.gg - Stores config, checkpoint manifest, and logs
You only need to run gg init once per instance. The config persists across sessions.
gg run
Execute a command with automatic checkpointing.
gg run <command>
Examples:
gg run python train.py --epochs 100 --lr 0.001
gg run bash setup.sh
gg run nvidia-smi
Shorthand — if the first argument is not a known subcommand (init, list, logs, replay, status), gg treats the entire input as a command:
# These are equivalent:
gg run python train.py
gg python train.py
What happens when you run a command:
- Command starts in a PTY with full terminal support (colors, progress bars)
- Output is teed to both your terminal and a log file
- On completion, a checkpoint is saved locally with:
- Unique checkpoint ID
- Command text, exit code, status
- Start/end timestamps, duration
- Working directory, log file path
- Checkpoint is synced to the GPUniq backend
- Summary is printed with checkpoint ID
Output:
[gg] command : python train.py --epochs 100
[gg] exit_code : 0
[gg] status : completed
[gg] duration : 3421s
[gg] checkpoint : a1b2c3d4
Log files are capped at 100 MB per command. Output beyond that limit is truncated.
gg list
Show all saved checkpoints.
gg list
Displays a formatted table with:
- Checkpoint ID (short prefix)
- Command
- Status (
completed,failed,killed) - Exit code
- Duration
- Sync status
gg logs
View the full output log of a checkpoint.
gg logs <checkpoint_id> [--tail N]
- checkpoint_id — full ID or short prefix (first few characters)
- --tail N — show only the last N lines (default: full log)
Examples:
# Full log
gg logs a1b2c3
# Last 50 lines
gg logs a1b2 --tail 50
gg replay
Re-run all commands that were interrupted (status: running or killed).
gg replay
Use this after an instance restart to resume training or setup scripts. Each replayed command creates a new checkpoint.
gg status
Show current gg configuration and statistics.
gg status
Displays:
- Instance name and task ID
- API endpoint
- Number of saved checkpoints
- Total log size on disk
How It Works
Storage Layout
All gg data is stored under /workspace/volume/.gg/:
/workspace/volume/.gg/
config.json # Token, API URL, instance info
checkpoints.json # Checkpoint manifest (all runs)
logs/
<checkpoint_id>.log # Output log per command
Since /workspace/volume/ is on a persistent volume, your checkpoints survive instance restarts.
Typical Workflow
# Connect to your instance
ssh root@<host> -p <port>
# First time: initialize gg
gg init eyJhbGciOi...
# Run your setup
gg run apt-get update && apt-get install -y vim htop
gg run pip install torch transformers wandb
# Start training
gg run python train.py --config config.yaml
# Check history
gg list
# If instance restarts, reconnect and:
gg replay
Notes
gguses a PTY, so commands behave exactly as in a normal terminal — interactive prompts, colors, and progress bars all work- SIGINT (Ctrl+C) and SIGTERM are forwarded to the running process
- Checkpoint sync to the backend is fire-and-forget — if the network is unavailable, checkpoints are still saved locally
- The
ggtoken is separate from your API key. It is instance-specific and shown in the dashboard after you rent a GPU
Last updated 6 days ago
Built with Documentation.AI