Decentralized AI Inference Marketplace — Swan Chain's Next Evolution
Overview
Swan 2.0 marks Swan Chain's evolution from a UBI-subsidized computing network into a market-driven AI inference marketplace. Built as the Inference Cloud, it connects AI model consumers with GPU providers through a decentralized coordination layer, enabling anyone to access AI models via a single API key — or earn stablecoin revenue by sharing their GPU resources.
Swan Inference provides a drop-in replacement for OpenAI's API. Any existing OpenAI SDK or integration works with Swan Inference by changing the base URL and API key.
Supported Endpoints
| Endpoint | Description |
| --- | --- |
| /v1/chat/completions | Chat-based text generation (streaming supported) |
| /v1/embeddings | Text embeddings |
| /v1/images/generations | Image generation |
| /v1/audio/transcriptions | Audio-to-text transcription |
| /v1/models | List available models |
Example Request
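Because Swan Inference is OpenAI-compatible, a request body is identical to what the OpenAI SDK sends to /v1/chat/completions. A minimal sketch follows; the base URL and model id are placeholders (assumptions), not official values:

```python
import json

BASE_URL = "https://inference.swanchain.io/v1"  # placeholder base URL (assumption)
API_KEY = "sk-..."  # your Swan Inference API key

# The same JSON payload an OpenAI SDK would send to /v1/chat/completions.
payload = {
    "model": "llama-3.1-8b-instruct",  # example model id (assumption)
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = json.dumps(payload)
# Send with any HTTP client, e.g.:
# requests.post(f"{BASE_URL}/chat/completions", headers=headers, data=body)
```

With the official OpenAI SDK, the equivalent is constructing the client with `base_url=BASE_URL` and `api_key=API_KEY`; no other code changes are needed.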
Tiered Model Catalog
Swan Inference organizes models into hardware-based tiers with per-token pricing set at 50–66% below comparable centralized providers.
Swan 2.0 accepts both stablecoins and SWAN as payment. Users who pay with SWAN receive a 20% discount, creating organic buy pressure.
| Payment Method | Discount | Example (1M Tier B output tokens) |
| --- | --- | --- |
| USDC/USDT | 0% (base price) | $0.03 |
| SWAN | 20% discount | $0.024 (saves $0.006) |
Pay-with-SWAN flow:
1. Consumer sends an inference request with payment: "SWAN"
2. The SWAN amount is calculated from the base USD price minus the 20% discount
3. SWAN is deducted from the user's prepaid balance
4. The provider receives 95% in their preferred currency (USDC or SWAN)
5. 5% goes to the Growth Fund
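The arithmetic of the flow above can be sketched as follows. This assumes the 95/5 split applies to the discounted amount; the actual contract math may differ:

```python
def settle_swan_payment(base_usd: float, swan_usd_price: float):
    """Pay-with-SWAN settlement sketch (assumed formulas)."""
    discounted_usd = base_usd * (1 - 0.20)           # 20% SWAN discount
    swan_charged = discounted_usd / swan_usd_price   # deducted from prepaid SWAN balance
    provider_usd = discounted_usd * 0.95             # 95% provider share
    growth_fund_usd = discounted_usd * 0.05          # 5% Growth Fund share
    return swan_charged, provider_usd, growth_fund_usd

# 1M Tier B output tokens: base price $0.03, paid in SWAN
# (illustrative SWAN price of $1.00 for readability)
swan, provider, fund = settle_swan_payment(0.03, swan_usd_price=1.0)
```

At an assumed $1.00 SWAN price, the consumer is charged 0.024 SWAN, the provider receives $0.0228, and the Growth Fund receives $0.0012.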
Provider-First Revenue Split (95/5)
During the bootstrap phase, Swan adopts an aggressive provider-first model to attract quality hardware:
| Recipient | Share | Purpose |
| --- | --- | --- |
| Computing Provider | 95% | Direct payout in stablecoins (USDC/USDT) |
| Growth Fund | 5% | Provider recruitment, integrations, dev tooling, liquidity |
| Protocol Treasury | 0% | Deferred until network reaches sustainability threshold |
The Growth Fund is reinvested into network expansion — provider onboarding bounties, DEX liquidity seeding, and integration grants. Spending is reported monthly with transaction hashes.
Dynamic Revenue Split Schedule
As network revenue grows, the split adjusts through governance votes:
| Phase | Daily Revenue | Provider | Growth Fund | Treasury |
| --- | --- | --- | --- | --- |
| Bootstrap | < $100 | 95% | 5% | 0% |
| Growth | $100 – $1,000 | 90% | 5% | 5% |
| Maturity | $1,000 – $10,000 | 85% | 3% + 2% burn | 10% |
| Scale | > $10,000 | 80% | 2% + 3% burn | 15% |
Each phase transition requires a governance vote with a 7-day voting period. Revenue thresholds are measured as a 30-day rolling average.
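The schedule can be encoded as a simple lookup. Note that in practice each transition requires a governance vote, so this function describes the target schedule rather than an automatic rule; the handling of exact threshold values is an assumption:

```python
def revenue_split(daily_revenue_usd: float):
    """Return (provider, growth_fund, burn, treasury) shares for a given
    30-day rolling average of daily revenue, per the phase schedule."""
    if daily_revenue_usd < 100:
        return (0.95, 0.05, 0.00, 0.00)   # Bootstrap
    if daily_revenue_usd < 1_000:
        return (0.90, 0.05, 0.00, 0.05)   # Growth
    if daily_revenue_usd < 10_000:
        return (0.85, 0.03, 0.02, 0.10)   # Maturity
    return (0.80, 0.02, 0.03, 0.15)       # Scale
```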
Quality Assurance
Benchmarks
The benchmark worker runs periodically (default: every 24 hours) to verify provider quality. Benchmark results expire after 30 days — providers that miss benchmarks for 30+ days lose qualification and must re-benchmark to resume receiving traffic.
| Test | Pass Threshold |
| --- | --- |
| Math Accuracy | ≥ 50% |
| Code Generation | ≥ 50% |
| Response Latency | ≤ 5000ms |
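A provider's benchmark result can be checked against these thresholds as follows; the assumption that all three tests must pass together is ours:

```python
def passes_benchmark(math_accuracy: float, code_accuracy: float, latency_ms: float) -> bool:
    """Apply the pass thresholds from the table (all must hold)."""
    return (
        math_accuracy >= 0.50
        and code_accuracy >= 0.50
        and latency_ms <= 5000
    )
```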
Slashing Conditions
| Condition | Consequence |
| --- | --- |
| Benchmark failure (1st) | Warning + 24h suspension from task routing |
| Consecutive failure (2nd) | 10% collateral slashed |
| Consecutive failure (3rd) | 30% collateral slashed + network removal |
| Benchmark results expired (> 30 days) | Loses qualification until re-benchmarked |
| Inference success rate < 80% | Deprioritized in request routing |
| Uptime < 90% (30-day rolling) | Deprioritized in request routing |
New Provider Grace Period: Providers registered within the last 7 days are exempt from uptime and success-rate deprioritization. This gives new providers time to build history without being penalized for insufficient data. Benchmark requirements and probation still apply during the grace period.
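The deprioritization rules, including the grace-period exemption, can be sketched like this; the exact semantics of the platform's routing logic are an assumption:

```python
from datetime import datetime, timedelta

def is_deprioritized(success_rate: float, uptime_30d: float,
                     registered_at: datetime, now: datetime) -> bool:
    """Routing deprioritization per the table above, honoring the
    7-day new-provider grace period."""
    if now - registered_at < timedelta(days=7):
        return False  # grace period: no uptime/success-rate penalty
    return success_rate < 0.80 or uptime_30d < 0.90
```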
Health Monitoring
Automatic health checks with configurable thresholds for WebSocket and external endpoints
Circuit breaker to prevent cascading failures
Load balancing with configurable strategies (round-robin, least-connections, or health-aware)
Model warmup to pre-load models and reduce cold-start latency
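As an illustration of the circuit-breaker pattern mentioned above, here is a minimal sketch; the thresholds, timeout, and half-open behavior are illustrative assumptions, not Swan's actual implementation:

```python
class CircuitBreaker:
    """Open the circuit after repeated failures; retry after a cooldown."""

    def __init__(self, failure_threshold: int = 5, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # timestamp when the circuit opened

    def allow_request(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.reset_timeout:
            # Half-open: allow a trial request through after the cooldown.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now  # trip the breaker

    def record_success(self) -> None:
        self.failures = 0
```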
Provider Leaderboard
Providers are ranked by a performance-based leaderboard using availability metrics, success rates, and latency.
Provider Onboarding
Hardware Tiers
To receive inference traffic and earn revenue, providers must meet minimum hardware requirements:
| Tier | Min VRAM | Example Hardware | Models | Status |
| --- | --- | --- | --- | --- |
| S | 38GB+ | L40S, A100, H100 | 70B premium models | Recruiting new providers |
| A | 24GB | RTX 4090, 3090, A6000 | 24B–32B agent models | Activate idle inventory |
| B | 12GB | RTX 4070 Ti, 3080 Ti | 8B–12B free tier | Some current providers |
| C | 8GB | RTX 3070, 4060 | Embedding, Whisper | Lowest qualifying tier |
| macOS | 16GB+ unified | Apple Silicon M1/M2/M3/M4 | 8B–12B via Ollama | Entry-level providers |
| Rejected | < 8GB or legacy | TESLA P4, GTX 1050 Ti | None | No rewards |
Legacy GPUs (TESLA P4, GTX 1050 Ti) served the network well during the ZK-task era but cannot serve AI inference workloads at acceptable quality.
macOS Support: Apple Silicon Macs can serve as providers using Ollama as the inference backend. While datacenter GPUs offer higher throughput, Macs are a low-friction entry point for new providers — no Docker, NVIDIA drivers, or Linux required.
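The tier thresholds above reduce to a simple classification by memory; the exact boundary handling is an assumption:

```python
def hardware_tier(memory_gb: float, apple_silicon: bool = False) -> str:
    """Classify hardware into Swan's tiers using VRAM (or unified memory)."""
    if apple_silicon:
        return "macOS" if memory_gb >= 16 else "Rejected"
    if memory_gb >= 38:
        return "S"
    if memory_gb >= 24:
        return "A"
    if memory_gb >= 12:
        return "B"
    if memory_gb >= 8:
        return "C"
    return "Rejected"
```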
Requirements
Linux (NVIDIA GPU):
GPU meeting at least Tier C requirements (≥ 8GB VRAM)
Docker 24.0+ with NVIDIA Container Toolkit
Inference server: SGLang (recommended), vLLM, or Ollama
macOS (Apple Silicon):
Apple Silicon Mac (M1/M2/M3/M4) with 16GB+ unified memory
Ollama installed (brew install ollama)
Both platforms:
Swan's computing-provider agent installed
No public IP, domain, or SSL setup required — providers connect via WebSocket behind NAT/firewall
Earnings Settlement
Swan Inference uses a MerkleDistributor smart contract for gas-efficient batch settlement:
1. The platform aggregates provider earnings into daily settlement batches
2. A Merkle tree is computed from all provider balances
3. The Merkle root is submitted on-chain
4. Providers claim their earnings by submitting a Merkle proof
This approach minimizes gas costs by settling many provider payments in a single on-chain transaction.
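The root computation can be sketched as follows. The hash function (SHA-256 here, where an on-chain contract would typically use keccak256) and the leaf encoding are illustrative assumptions, not the MerkleDistributor's actual scheme:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf(address: str, amount_wei: int) -> bytes:
    # Illustrative leaf encoding; a real contract would likely hash
    # an ABI-encoded (address, amount) pair with keccak256.
    return h(f"{address}:{amount_wei}".encode())

def merkle_root(leaves: list) -> bytes:
    """Pairwise-hash up the tree, duplicating the last node on odd levels."""
    nodes = list(leaves)
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])
        nodes = [h(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

# Daily batch: aggregate balances, compute one root, submit it on-chain.
balances = {"0xProviderA": 120, "0xProviderB": 80}
root = merkle_root([leaf(a, amt) for a, amt in sorted(balances.items())])
```

Only `root` goes on-chain; each provider later proves membership of their (address, amount) leaf to claim, so one transaction settles the whole batch.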
Smart Contracts
| Contract | Address (Swan Chain Mainnet) |
| --- | --- |
| ProviderCollateral | 0x557f306f917009cf83c32b8b32a79202e79948e5 |
| SWAN Token | 0xAF90ac6428775E1Be06BAFA932c2d80119a7bd02 |
Swan Chain Mainnet operates on Chain ID 254 with RPC at https://mainnet-rpc01.swanchain.io. See Network Info for full details.
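When wiring up a client, the chain ID can be verified with a standard eth_chainId JSON-RPC call. The sketch below only constructs the request; sending it requires any HTTP client:

```python
import json

# Network parameters from the docs above.
SWAN_CHAIN_ID = 254
SWAN_RPC = "https://mainnet-rpc01.swanchain.io"

# Standard Ethereum JSON-RPC request; eth_chainId returns a hex string.
request = {"jsonrpc": "2.0", "method": "eth_chainId", "params": [], "id": 1}
expected_chain_id = hex(SWAN_CHAIN_ID)  # the node should answer "0xfe"

body = json.dumps(request)
# e.g. requests.post(SWAN_RPC, data=body,
#                    headers={"Content-Type": "application/json"})
```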
UBI Sunset (SIP-003)
Swan 2.0 eliminates UBI entirely per SIP-003. Providers earn solely from inference revenue (95% of fees). The taper schedule:
| Period | UBI Level | SWAN/day | Notes |
| --- | --- | --- | --- |
| Swan 2.0 Launch (Mar 16 – Apr 9) | 100% | 58,369 | Platform goes live, governance vote Apr 1–7 |
| Month 1 post-vote (Apr 10 – May 9) | 50% | 29,185 | Contribution-weighted; legacy hardware earns zero |
| Month 2 post-vote (May 10 – Jun 9) | 20% | 11,674 | Providers earn primarily from inference |
| Month 3+ (Jun 10 onwards) | 0% | 0 | UBI permanently off. Inference revenue only. |
Why Stop UBI
Under Swan 1.0, 75% of daily UBI went to providers with 0% uptime. SIP-003 redirects all incentives toward GPUs that actually serve inference. Inference revenue matches the best current UBI payout once total network revenue reaches just $25/day.
Safety Valve
If the network cannot sustain minimum viable provider economics ($50/day revenue) by Month 3, governance can vote to extend UBI at 25% (contribution-weighted only) for 3 additional months.
SWAN Token Utility
After UBI stops, SWAN token utility is:
| Utility | Description |
| --- | --- |
| Pay-with-SWAN | 20% inference discount for consumers — creates organic buy pressure |
| Provider Collateral | Required deposit to join the network |
| Governance | Vote on protocol parameters, revenue splits, and phase transitions |