Python SDK
Install and use the GPUniq Python SDK to control GPU instances, volumes, LLM models, and more programmatically.
Installation
pip install GPUniq
Requires Python 3.8+. Only dependency: requests.
Initialize Client
from gpuniq import GPUniq
client = GPUniq(api_key="gpuniq_your_key")
Get your API key from LLM API Keys in the dashboard.
Configuration
client = GPUniq(
    api_key="gpuniq_your_key",
    base_url="https://api.gpuniq.com/v1",  # default
    timeout=120,  # seconds, default 60
)
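To keep the key out of source code, the constructor arguments can be assembled from environment variables. The following is a minimal sketch; the variable names GPUNIQ_API_KEY, GPUNIQ_BASE_URL, and GPUNIQ_TIMEOUT and the helper client_kwargs_from_env are hypothetical, and the SDK is not documented here as reading them itself.

```python
import os
from typing import Any, Dict, Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build keyword arguments for GPUniq(...) from environment variables.

    The variable names are hypothetical; they are a convention of this
    sketch, not something the SDK is known to read on its own.
    """
    env = os.environ if env is None else env
    api_key = env.get("GPUNIQ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GPUNIQ_API_KEY is not set")

    kwargs: Dict[str, Any] = {"api_key": api_key}
    if env.get("GPUNIQ_BASE_URL"):
        kwargs["base_url"] = env["GPUNIQ_BASE_URL"]
    if env.get("GPUNIQ_TIMEOUT"):
        kwargs["timeout"] = int(env["GPUNIQ_TIMEOUT"])  # seconds
    return kwargs


# Usage (requires the SDK to be installed):
#   from gpuniq import GPUniq
#   client = GPUniq(**client_kwargs_from_env())
```

Only the arguments that are actually set are passed on, so the SDK's own defaults for base_url and timeout still apply.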
Modules
The SDK provides access to all platform features through module attributes:
| Module | Description |
|---|---|
| client.marketplace | Browse GPUs, create orders |
| client.gpu_cloud | Deploy by GPU type |
| client.burst | Multi-GPU burst deployments |
| client.instances | Manage running instances |
| client.volumes | Persistent storage |
| client.llm | LLM chat completions |
| client.payments | Deposits and billing history |
| client.settings | SSH keys and Telegram |
Marketplace
# List GPUs with filters
gpus = client.marketplace.list(
    gpu_model=["RTX 4090", "A100"],
    min_vram_gb=24,
    verified_only=True,
    sort_by="price-low",
    page=1,
    page_size=20,
)
# Get marketplace statistics
stats = client.marketplace.statistics()
# Get agent details
agent = client.marketplace.get_agent(agent_id=123)
# Check availability
avail = client.marketplace.check_availability(agent_id=123)
# Create order
order = client.marketplace.create_order(
    agent_id=123,
    pricing_type="hour",
    docker_image="pytorch/pytorch:latest",
    ssh_key_ids=[1, 2],
    disk_gb=100,
    volume_id=9,
)
# Async order (returns job_id for polling)
job = client.marketplace.create_order_async(agent_id=123, pricing_type="hour")
status = client.marketplace.get_order_status(job["job_id"])
GPU Dex-Cloud
# List available GPU types
gpus = client.gpu_cloud.list_instances(search="4090")
# Check pricing
pricing = client.gpu_cloud.pricing("RTX_4090", gpu_count=2, disk_gb=100)
# Deploy
deploy = client.gpu_cloud.deploy(
    gpu_name="RTX_4090",
    gpu_count=1,
    docker_image="pytorch/pytorch:latest",
    disk_gb=100,
    volume_id=9,
)
GPU Burst
# Create burst order with fallback GPUs
order = client.burst.create_order(
    docker_image="pytorch/pytorch:latest",
    primary_gpu="RTX_4090",
    gpu_count=8,
    extra_gpus=[
        {"gpu_name": "RTX_3090", "max_price": 0.5},
        {"gpu_name": "A100", "max_price": 1.2},
    ],
    disk_gb=200,
)
# Manage orders
orders = client.burst.list_orders()
client.burst.start_order(order_id=1)
client.burst.stop_order(order_id=1)
client.burst.delete_order(order_id=1)
# Cost estimate (without creating order)
estimate = client.burst.estimate(
    docker_image="pytorch/pytorch:latest",
    primary_gpu="RTX_4090",
    gpu_count=8,
)
# Billing history
txns = client.burst.transactions(order_id=1)
runs = client.burst.runs(order_id=1)
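The extra_gpus argument of client.burst.create_order is a list of {"gpu_name", "max_price"} dictionaries. When price caps live in a config file or a plain mapping, a small helper can produce that shape. This is a sketch: build_extra_gpus is a hypothetical name, and sorting by price cap is a choice made here, since this page does not say whether the order of fallbacks matters to the platform.

```python
from typing import Dict, List, Union


def build_extra_gpus(price_caps: Dict[str, float]) -> List[Dict[str, Union[str, float]]]:
    """Turn {"GPU_NAME": max_price} into the list-of-dicts shape used by extra_gpus.

    Entries are ordered from the lowest price cap to the highest (an
    arbitrary but deterministic choice made by this sketch).
    """
    for name, cap in price_caps.items():
        if cap <= 0:
            raise ValueError(f"max_price for {name} must be positive, got {cap}")
    return [
        {"gpu_name": name, "max_price": cap}
        for name, cap in sorted(price_caps.items(), key=lambda item: item[1])
    ]


extra_gpus = build_extra_gpus({"A100": 1.2, "RTX_3090": 0.5})
print(extra_gpus)
```

The result can be passed directly as extra_gpus=extra_gpus in the create_order call.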
Instances
# List active instances
instances = client.instances.list(page=1, page_size=20)
archived = client.instances.list_archived()
# Instance details
details = client.instances.get(task_id=456)
# Lifecycle
client.instances.start(task_id=456)
client.instances.stop(task_id=456)
client.instances.delete(task_id=456)
# Rename
client.instances.rename(task_id=456, name="my-training-run")
# Logs and SLA
logs = client.instances.logs(task_id=456)
sla = client.instances.sla(task_id=456)
# SSH keys per instance
client.instances.attach_ssh_key(task_id=456, ssh_key_id=1)
client.instances.detach_ssh_key(task_id=456, key_id=1)
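List calls such as client.instances.list take page and page_size, so walking every result means requesting pages until one comes back short. The sketch below shows that loop in isolation. It assumes a fetch_page callable that returns a plain list of items; the actual response shape of the SDK's list methods is not described on this page, so extracting the item list from the response is left to the wrapper you pass in.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(fetch_page: Callable[[int, int], List[T]], page_size: int = 20) -> Iterator[T]:
    """Yield items from successive pages until a page comes back short or empty."""
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        if len(items) < page_size:
            return
        page += 1


# Stand-in for the API: 45 fake instances served in pages.
_fake = [{"task_id": i} for i in range(45)]


def fake_fetch(page: int, page_size: int) -> List[dict]:
    start = (page - 1) * page_size
    return _fake[start:start + page_size]


all_items = list(iter_pages(fake_fetch, page_size=20))
print(len(all_items))
```

With a real client, fetch_page would wrap client.instances.list(page=page, page_size=page_size) and return whatever list the response carries.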
Volumes
# Create
vol = client.volumes.create(name="my-dataset", size_limit_gb=50)
# List
volumes = client.volumes.list()
# Upload / download files
client.volumes.upload(volume_id=1, file_path="/local/data.tar.gz", subpath="datasets/")
files = client.volumes.list_files(volume_id=1, subpath="datasets/")
client.volumes.download_to(
    volume_id=1,
    remote_path="datasets/data.tar.gz",
    local_path="./data.tar.gz",
)
# Delete
client.volumes.delete_file(volume_id=1, path="datasets/old.tar.gz")
client.volumes.delete(volume_id=1)
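client.volumes.upload takes one file at a time, so uploading a directory tree means computing a file_path and subpath for each file. The following sketch does only that planning step. plan_uploads is a hypothetical helper, and it assumes that subpath names the remote directory (as the "datasets/" example suggests) and that an empty subpath means the volume root; neither is confirmed on this page.

```python
from pathlib import Path
from typing import List, Tuple


def plan_uploads(local_dir: str, remote_prefix: str = "") -> List[Tuple[str, str]]:
    """Return (file_path, subpath) pairs for every file under local_dir.

    subpath is the remote directory (with a trailing slash) that mirrors the
    file's location relative to local_dir.
    """
    root = Path(local_dir)
    prefix = remote_prefix.strip("/")
    plan = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel_dir = path.relative_to(root).parent.as_posix()  # "." for top-level files
        parts = [part for part in (prefix, rel_dir) if part and part != "."]
        subpath = "/".join(parts) + "/" if parts else ""
        plan.append((str(path), subpath))
    return plan


# Usage with a real client (volume_id is a placeholder):
#   for file_path, subpath in plan_uploads("./data", "datasets"):
#       client.volumes.upload(volume_id=1, file_path=file_path, subpath=subpath)
```

Keeping the planning separate from the upload calls makes it easy to print and review the plan before any data is transferred.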
LLM
GPUniq gives you Claude (Opus 4.7, Sonnet 4.6, Haiku 4.5), the GPT-5 family, Gemini 3, Grok 4, and 30+ open-source models under one API key. Chat and image generation are both billed directly in USD from your account balance (user.balance); there are no token pools and no separate top-ups.
For tools that expect the OpenAI protocol (Claude Code via LiteLLM, Cursor, Continue.dev), use the drop-in OpenAI-compatible endpoint at https://api.gpuniq.com/v1/openai instead.
Chat
# One-shot
reply = client.llm.chat("claude-haiku-4-5", "Explain transformers")
print(reply)
# Full completion with history and parameters
data = client.llm.chat_completion(
    messages=[
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    model="claude-haiku-4-5",
    temperature=0.3,
    max_tokens=1000,
)
# data = {"content": "...", "model": "...", "tokens_used": 42,
#         "cost_usd": 0.0003, "balance_usd": 42.17, "finish_reason": "stop"}
# What's available
models = client.llm.models() # list of text-model slugs
default = client.llm.default_model() # platform default
catalog = client.llm.model_catalog() # full catalog with pricing metadata
# Current USD balance (what pays for chat + images)
print(client.llm.balance())
# Persistent chat sessions (multi-turn history kept server-side)
session = client.llm.create_chat_session(model="claude-haiku-4-5", title="Research")
client.llm.send_message(chat_id=session["id"], message="Hello!")
client.llm.list_chat_sessions()
client.llm.delete_chat_session(chat_id=session["id"])
# Usage history
client.llm.usage_history(limit=50)
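If you prefer to keep multi-turn history on the client rather than in a server-side chat session, the messages list and the cost_usd field of each chat_completion response are enough to build on. The class below is a sketch under those assumptions. Conversation is a hypothetical helper, the completion function is injected so the example runs offline, and the "assistant" role for replies follows the common chat-message convention rather than anything stated on this page.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Keep client-side message history and a running cost total."""

    def __init__(self, complete: Callable[[List[Message]], dict], system: str = "") -> None:
        self._complete = complete
        self.messages: List[Message] = []
        self.total_cost_usd = 0.0
        if system:
            self.messages.append({"role": "system", "content": system})

    def ask(self, text: str) -> str:
        """Send one user turn, record the reply, and add its cost to the total."""
        self.messages.append({"role": "user", "content": text})
        data = self._complete(self.messages)
        # Assumption: replies are stored under the conventional "assistant" role.
        self.messages.append({"role": "assistant", "content": data["content"]})
        self.total_cost_usd += float(data.get("cost_usd", 0.0))
        return data["content"]


# Stand-in for client.llm.chat_completion so the sketch runs offline.
def fake_complete(messages: List[Message]) -> dict:
    return {"content": f"echo: {messages[-1]['content']}", "cost_usd": 0.0003}


chat = Conversation(fake_complete, system="You are a terse assistant.")
print(chat.ask("Hello!"))
```

With a real client, the injected function could be lambda msgs: client.llm.chat_completion(messages=msgs, model="claude-haiku-4-5").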
Image generation
Text-to-image and image-to-image through Nano Banana, Nano Banana Pro / 4K, and Grok 4 Image. See the LLM & Image API page for model slugs and pricing.
Synchronous — fast single images
result = client.llm.generate_image(
    "a red cat astronaut on Mars",
    model="nano-banana",
    n=1,
    size="1024x1024",
    save_to="cat.png",  # decodes b64_json and writes to disk
)
print(result["saved_paths"]) # → ['cat.png']
print(result["cost_usd"], result["balance_usd"])
Async + poll — recommended for Nano Banana
Higher-resolution generations run as background jobs, so they are not bound by the proxy's ~100s read timeout. generate_image_async starts the job and handles polling for you.
result = client.llm.generate_image_async(
    "isometric cyberpunk city at dusk",
    model="nano-banana-pro",
    size="2048x2048",
    save_to="city.png",
    on_progress=lambda status, _payload: print("→", status),
)
Image-to-image / editing
Pass references as local paths, data: URLs, https:// URLs, raw bytes, or bare base64 — the SDK inlines local files as data URLs automatically.
client.llm.generate_image(
    "same cat but in Tokyo at night, neon reflections",
    model="nano-banana-pro",
    input_images=["cat.png", "reference/mood_board.jpg"],
    size="2048x2048",
    save_to="cat_tokyo.png",
)
Low-level job control
If you need your own polling / cancellation UI:
import time

job = client.llm.start_image_job("abstract painting of a neural network", model="nano-banana")
while True:
    status = client.llm.get_image_job(job["job_id"])
    if status["status"] in ("completed", "failed"):
        break
    time.sleep(3)  # wait between polls
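The loop above never gives up if a job stalls. A variant with a deadline might look like the following sketch. wait_for_job is a hypothetical helper; it relies only on the "status" field and the "completed" / "failed" values shown above, and the "queued" and "running" states in the demo are made up for illustration.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    interval: float = 3.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll get_status(job_id) until the job is terminal or the timeout passes."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if status["status"] in ("completed", "failed"):
            return status
        # Stop if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"job {job_id} still {status['status']!r} after {timeout}s")
        sleep(interval)


# Offline demo: a fake job that completes on the third poll.
_states = iter(["queued", "running", "completed"])
result = wait_for_job(lambda _id: {"status": next(_states)}, "job-1", sleep=lambda s: None)
print(result["status"])
```

With a real client the call would be wait_for_job(client.llm.get_image_job, job["job_id"]). The sleep and clock parameters exist so the helper can be tested without real delays.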
save_to semantics
- Single image + filename → writes to that path, returns [path].
- Single image + directory → writes image_1.png inside it.
- Multiple images + filename → appends _1, _2, … before the extension.
- Multiple images + directory → writes image_1.png, image_2.png, …
- Omit save_to to keep base64 payloads in result["images"].
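The naming rules above can be expressed as a small pure function, which is handy for predicting output paths before a generation runs. This is an illustrative sketch, not the SDK's implementation. resolve_save_paths is a hypothetical name, and how the SDK decides that save_to is a directory is not stated here; the sketch assumes an existing directory or a path ending in a separator.

```python
import os
from typing import List


def resolve_save_paths(save_to: str, n_images: int) -> List[str]:
    """Map a save_to value and an image count to output paths, per the rules above."""
    # Assumption: an existing directory, or a path ending in a separator, is a directory.
    is_dir = os.path.isdir(save_to) or save_to.endswith(("/", os.sep))
    if is_dir:
        return [os.path.join(save_to, f"image_{i}.png") for i in range(1, n_images + 1)]
    if n_images == 1:
        return [save_to]
    # Multiple images + filename: insert _1, _2, ... before the extension.
    stem, ext = os.path.splitext(save_to)
    return [f"{stem}_{i}{ext}" for i in range(1, n_images + 1)]


print(resolve_save_paths("cat.png", 1))
print(resolve_save_paths("cat.png", 3))
print(resolve_save_paths("out/", 2))
```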
Payments
# Deposit
deposit = client.payments.deposit(amount=1000, payment_system="yookassa")
print(deposit["confirmation_url"])
# History
history = client.payments.history()
spending = client.payments.spending_history()
Settings
# SSH keys
keys = client.settings.list_ssh_keys()
new_key = client.settings.create_ssh_key(key_name="laptop", public_key="ssh-rsa AAAA...")
client.settings.delete_ssh_key(key_id=1)
# Telegram notifications
client.settings.link_telegram(telegram_username="myuser")
Error Handling
from gpuniq import (
    GPUniq,
    GPUniqError,
    AuthenticationError,
    RateLimitError,
    NotFoundError,
)
client = GPUniq(api_key="gpuniq_your_key")
try:
    instances = client.instances.list()
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after}s")
except NotFoundError:
    print("Resource not found")
except GPUniqError as e:
    print(f"Error: {e.message} (code={e.error_code}, status={e.http_status})")
The SDK automatically retries on rate limit (429) up to 3 times with Retry-After backoff.
Backward Compatibility
v1.x code continues to work:
import gpuniq
client = gpuniq.init("gpuniq_your_key")
response = client.request("claude-haiku-4-5", "Hello!")