
Python SDK

Install and use the GPUniq Python SDK to control GPU instances, volumes, LLM models, and more programmatically.

Installation

pip install GPUniq

Requires Python 3.8+. Only dependency: requests.

Initialize Client

from gpuniq import GPUniq

client = GPUniq(api_key="gpuniq_your_key")

Get your API key from LLM API Keys in the dashboard.

Configuration

client = GPUniq(
    api_key="gpuniq_your_key",
    base_url="https://api.gpuniq.com/v1",  # default
    timeout=120,                            # seconds, default 60
)
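To keep the key out of source control, the constructor arguments can be assembled from environment variables. The following is a minimal sketch: the variable names GPUNIQ_API_KEY, GPUNIQ_BASE_URL and GPUNIQ_TIMEOUT, and the helper client_kwargs_from_env, are assumptions made for illustration. The SDK is not known to read them on its own.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for GPUniq(...) from environment variables.

    The variable names are hypothetical; they are not defined by the SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("GPUNIQ_API_KEY")
    if not api_key:
        raise RuntimeError("GPUNIQ_API_KEY is not set")

    kwargs = {"api_key": api_key}
    # Optional overrides; when absent, the SDK defaults apply.
    if env.get("GPUNIQ_BASE_URL"):
        kwargs["base_url"] = env["GPUNIQ_BASE_URL"]
    if env.get("GPUNIQ_TIMEOUT"):
        kwargs["timeout"] = int(env["GPUNIQ_TIMEOUT"])
    return kwargs
```

With the SDK installed, this would be used as client = GPUniq(**client_kwargs_from_env()).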

Modules

The SDK provides access to all platform features through module attributes:

Module               Description
client.marketplace   Browse GPUs, create orders
client.gpu_cloud     Deploy by GPU type
client.burst         Multi-GPU burst deployments
client.instances     Manage running instances
client.volumes       Persistent storage
client.llm           LLM chat completions
client.payments      Deposits and billing history
client.settings      SSH keys and Telegram

Marketplace

# List GPUs with filters
gpus = client.marketplace.list(
    gpu_model=["RTX 4090", "A100"],
    min_vram_gb=24,
    verified_only=True,
    sort_by="price-low",
    page=1,
    page_size=20,
)

# Get marketplace statistics
stats = client.marketplace.statistics()

# Get agent details
agent = client.marketplace.get_agent(agent_id=123)

# Check availability
avail = client.marketplace.check_availability(agent_id=123)

# Create order
order = client.marketplace.create_order(
    agent_id=123,
    pricing_type="hour",
    docker_image="pytorch/pytorch:latest",
    ssh_key_ids=[1, 2],
    disk_gb=100,
    volume_id=9,
)

# Async order (returns job_id for polling)
job = client.marketplace.create_order_async(agent_id=123, pricing_type="hour")
status = client.marketplace.get_order_status(job["job_id"])

GPU Dex-Cloud

# List available GPU types
gpus = client.gpu_cloud.list_instances(search="4090")

# Check pricing
pricing = client.gpu_cloud.pricing("RTX_4090", gpu_count=2, disk_gb=100)

# Deploy
deploy = client.gpu_cloud.deploy(
    gpu_name="RTX_4090",
    gpu_count=1,
    docker_image="pytorch/pytorch:latest",
    disk_gb=100,
    volume_id=9,
)

GPU Burst

# Create burst order with fallback GPUs
order = client.burst.create_order(
    docker_image="pytorch/pytorch:latest",
    primary_gpu="RTX_4090",
    gpu_count=8,
    extra_gpus=[
        {"gpu_name": "RTX_3090", "max_price": 0.5},
        {"gpu_name": "A100", "max_price": 1.2},
    ],
    disk_gb=200,
)

# Manage orders
orders = client.burst.list_orders()
client.burst.start_order(order_id=1)
client.burst.stop_order(order_id=1)
client.burst.delete_order(order_id=1)

# Cost estimate (without creating order)
estimate = client.burst.estimate(
    docker_image="pytorch/pytorch:latest",
    primary_gpu="RTX_4090",
    gpu_count=8,
)

# Billing history
txns = client.burst.transactions(order_id=1)
runs = client.burst.runs(order_id=1)

Instances

# List active instances
instances = client.instances.list(page=1, page_size=20)
archived = client.instances.list_archived()

# Instance details
details = client.instances.get(task_id=456)

# Lifecycle
client.instances.start(task_id=456)
client.instances.stop(task_id=456)
client.instances.delete(task_id=456)

# Rename
client.instances.rename(task_id=456, name="my-training-run")

# Logs and SLA
logs = client.instances.logs(task_id=456)
sla = client.instances.sla(task_id=456)

# SSH keys per instance
client.instances.attach_ssh_key(task_id=456, ssh_key_id=1)
client.instances.detach_ssh_key(task_id=456, key_id=1)

Volumes

# Create
vol = client.volumes.create(name="my-dataset", size_limit_gb=50)

# List
volumes = client.volumes.list()

# Upload / download files
client.volumes.upload(volume_id=1, file_path="/local/data.tar.gz", subpath="datasets/")
files = client.volumes.list_files(volume_id=1, subpath="datasets/")
client.volumes.download_to(
    volume_id=1,
    remote_path="datasets/data.tar.gz",
    local_path="./data.tar.gz",
)

# Delete
client.volumes.delete_file(volume_id=1, path="datasets/old.tar.gz")
client.volumes.delete(volume_id=1)

LLM

GPUniq gives you Claude (Opus 4.7, Sonnet 4.6, Haiku 4.5), the GPT-5 family, Gemini 3, Grok 4, and 30+ open-source models under one API key. Chat and image generation are both billed directly in USD from your account balance (user.balance); there are no token pools and no separate top-ups.

For tools that expect the OpenAI protocol (Claude Code via LiteLLM, Cursor, Continue.dev), use the drop-in OpenAI-compatible endpoint at https://api.gpuniq.com/v1/openai instead.

Chat

# One-shot
reply = client.llm.chat("claude-haiku-4-5", "Explain transformers")
print(reply)

# Full completion with history and parameters
data = client.llm.chat_completion(
    messages=[
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user",   "content": "Hello!"},
    ],
    model="claude-haiku-4-5",
    temperature=0.3,
    max_tokens=1000,
)
# data = {"content": "...", "model": "...", "tokens_used": 42,
#         "cost_usd": 0.0003, "balance_usd": 42.17, "finish_reason": "stop"}

# What's available
models   = client.llm.models()            # list of text-model slugs
default  = client.llm.default_model()     # platform default
catalog  = client.llm.model_catalog()     # full catalog with pricing metadata

# Current USD balance (what pays for chat + images)
print(client.llm.balance())

# Persistent chat sessions (multi-turn history kept server-side)
session = client.llm.create_chat_session(model="claude-haiku-4-5", title="Research")
client.llm.send_message(chat_id=session["id"], message="Hello!")
client.llm.list_chat_sessions()
client.llm.delete_chat_session(chat_id=session["id"])

# Usage history
client.llm.usage_history(limit=50)
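Because each chat_completion result carries tokens_used and cost_usd (see the commented response shape above), per-run spend can be totalled client-side. This is a small sketch over plain dicts; the helper name summarize_usage is hypothetical and not part of the SDK.

```python
def summarize_usage(responses):
    """Total tokens and USD cost across a list of chat_completion-style result dicts."""
    total_tokens = sum(r.get("tokens_used", 0) for r in responses)
    total_cost = sum(r.get("cost_usd", 0.0) for r in responses)
    return {
        "calls": len(responses),
        "tokens_used": total_tokens,
        "cost_usd": round(total_cost, 6),  # trim float noise from summing small amounts
    }
```

Collect the dicts returned by client.llm.chat_completion in a list and pass that list in once the run is finished.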

Image generation

Text-to-image and image-to-image through Nano Banana, Nano Banana Pro / 4K, and Grok 4 Image. See the LLM & Image API page for model slugs and pricing.

Synchronous — fast single images

result = client.llm.generate_image(
    "a red cat astronaut on Mars",
    model="nano-banana",
    n=1,
    size="1024x1024",
    save_to="cat.png",                 # decodes b64_json and writes to disk
)
print(result["saved_paths"])           # → ['cat.png']
print(result["cost_usd"], result["balance_usd"])

Higher-resolution generations can take longer than the proxy's ~100s read timeout, so they run as background jobs rather than as a single request. generate_image_async submits the job and handles polling for you.

result = client.llm.generate_image_async(
    "isometric cyberpunk city at dusk",
    model="nano-banana-pro",
    size="2048x2048",
    save_to="city.png",
    on_progress=lambda status, _payload: print("→", status),
)

Image-to-image / editing

Pass references as local paths, data: URLs, https:// URLs, raw bytes, or bare base64 — the SDK inlines local files as data URLs automatically.

client.llm.generate_image(
    "same cat but in Tokyo at night, neon reflections",
    model="nano-banana-pro",
    input_images=["cat.png", "reference/mood_board.jpg"],
    size="2048x2048",
    save_to="cat_tokyo.png",
)
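For readers unfamiliar with data URLs, the inlining step for a local file can be pictured roughly as follows. This is an illustrative sketch built only on the standard library; to_data_url is a hypothetical name, and the SDK's internal implementation may differ.

```python
import base64
import mimetypes


def to_data_url(path):
    """Encode a local file as a data: URL (illustrative, not the SDK's own helper)."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        mime = "application/octet-stream"  # fallback when the extension is unknown
    with open(path, "rb") as fh:
        payload = base64.b64encode(fh.read()).decode("ascii")
    return f"data:{mime};base64,{payload}"
```

Since the SDK does this automatically for local paths, a helper like this is only needed when building request payloads by hand.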

Low-level job control

If you need your own polling / cancellation UI:

import time

job = client.llm.start_image_job("abstract painting of a neural network", model="nano-banana")
while True:
    # Poll until the job reaches a terminal state
    status = client.llm.get_image_job(job["job_id"])
    if status["status"] in ("completed", "failed"):
        break
    time.sleep(3)  # pause between polls
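The loop above polls indefinitely. A bounded variant with an overall deadline might look like the sketch below. The helper wait_for_job, its parameters and its defaults are assumptions for illustration. Only the terminal states "completed" and "failed" come from the example above.

```python
import time


def wait_for_job(get_status, job_id, interval=3.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(job_id) until the job completes or fails, or the deadline passes."""
    deadline = clock() + timeout
    while True:
        status = get_status(job_id)
        if status["status"] in ("completed", "failed"):
            return status
        # Stop if the next sleep would run past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(
                f"job {job_id} still {status['status']!r} after {timeout}s"
            )
        sleep(interval)
```

With the SDK, this would be called as wait_for_job(client.llm.get_image_job, job["job_id"]). The sleep and clock parameters exist so a UI or test can substitute its own timing.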

save_to semantics

  • Single image + filename → writes to that path, returns [path].
  • Single image + directory → writes image_1.png inside it.
  • Multiple images + filename → appends _1, _2, … before the extension.
  • Multiple images + directory → writes image_1.png, image_2.png, …
  • Omit save_to to keep base64 payloads in result["images"].
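The naming rules above can be expressed as a small pure function. This is a sketch of the documented behaviour, not the SDK's code. The name resolve_save_paths is hypothetical, and so is the rule for telling a directory from a filename (a trailing separator or an existing directory).

```python
import os


def resolve_save_paths(save_to, count):
    """Return the output paths implied by the save_to rules for `count` images."""
    # Assumption: a trailing separator or an existing directory means "directory".
    is_dir = save_to.endswith(("/", os.sep)) or os.path.isdir(save_to)
    if is_dir:
        return [os.path.join(save_to, f"image_{i}.png") for i in range(1, count + 1)]
    if count == 1:
        return [save_to]  # single image + filename: use the path as given
    stem, ext = os.path.splitext(save_to)
    # Multiple images + filename: insert _1, _2, ... before the extension.
    return [f"{stem}_{i}{ext}" for i in range(1, count + 1)]
```

For example, resolve_save_paths("cat.png", 3) yields cat_1.png, cat_2.png and cat_3.png.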

Payments

# Deposit
deposit = client.payments.deposit(amount=1000, payment_system="yookassa")
print(deposit["confirmation_url"])

# History
history = client.payments.history()
spending = client.payments.spending_history()

Settings

# SSH keys
keys = client.settings.list_ssh_keys()
new_key = client.settings.create_ssh_key(key_name="laptop", public_key="ssh-rsa AAAA...")
client.settings.delete_ssh_key(key_id=1)

# Telegram notifications
client.settings.link_telegram(telegram_username="myuser")

Error Handling

from gpuniq import (
    GPUniq,
    GPUniqError,
    AuthenticationError,
    RateLimitError,
    NotFoundError,
)

client = GPUniq(api_key="gpuniq_your_key")

try:
    instances = client.instances.list()
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after}s")
except NotFoundError:
    print("Resource not found")
except GPUniqError as e:
    print(f"Error: {e.message} (code={e.error_code}, status={e.http_status})")

The SDK automatically retries rate-limited requests (HTTP 429) up to 3 times, waiting for the interval given in the Retry-After header between attempts.
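When the built-in retries are not enough, for example in a long batch job, the same pattern can be layered on top of the SDK. The sketch below is self-contained: RateLimited is a local stand-in for the SDK's RateLimitError so the example runs without the package, the retry_after attribute is taken from the error-handling example above, and the helper name and defaults are assumptions.

```python
import time


class RateLimited(Exception):
    """Stand-in for gpuniq.RateLimitError so the sketch runs without the SDK."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(fn, max_retries=3, retry_on=RateLimited, sleep=time.sleep):
    """Call fn(); on a rate-limit error wait retry_after seconds, up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            # Fall back to 1 second if the error carries no retry_after value.
            sleep(getattr(exc, "retry_after", None) or 1)
```

With the SDK, a call would look like call_with_retry(lambda: client.instances.list(), retry_on=RateLimitError).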

Backward Compatibility

v1.x code continues to work:

import gpuniq

client = gpuniq.init("gpuniq_your_key")
response = client.request("claude-haiku-4-5", "Hello!")