

# LLM Cascade

HYRE uses a multi-model cascade to generate AI insights. Models are tried in priority order; if one fails (timeout, rate limit, content policy block), the next model is tried automatically. If all models fail, the raw data is returned with `insight: null` and an HTTP 206 status.
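A minimal sketch of the fallback loop, in TypeScript (the types, names, and structure are illustrative assumptions; the actual implementation is not published):

```typescript
// Illustrative sketch of the cascade loop. CascadeModel and callModel
// are hypothetical, not HYRE's actual code.
interface CascadeModel {
  name: string;
  timeoutMs: number;
}

// Placeholder for the provider call (OpenRouter / Venice AI). Assumed to
// reject on timeout, HTTP 429/5xx, or a content policy block, and to
// resolve null on an empty response.
async function callModel(model: CascadeModel, prompt: string): Promise<string | null> {
  throw new Error("provider-specific request goes here");
}

async function generateInsight(models: CascadeModel[], prompt: string) {
  for (const model of models) {
    try {
      const insight = await callModel(model, prompt);
      if (insight) return { insight, model_used: model.name };
      // Empty response: fall through to the next model.
    } catch {
      // Timeout, rate limit, server error, or policy block: fall through.
    }
  }
  // All models failed: the caller serves raw data with insight: null and HTTP 206.
  return { insight: null, model_used: null };
}
```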

## Cascade Order

### Primary: OpenRouter

| Priority | Model | Timeout | Notes |
|----------|-------|---------|-------|
| 1 | DeepSeek V3.2 | 10s | Primary. Fast, high-quality JSON output. |
| 2 | DeepSeek V3 (Chat) | 8s | Fallback. Slightly older model version. |
| 3 | GLM 4.5 Air (Free) | 8s | Free tier. Good for simple enrichment. |
| 4 | Claude 3.5 Haiku | 8s | Premium fallback. Highest quality reasoning. |

### Secondary: Venice AI

If all OpenRouter models fail, HYRE falls back to Venice AI:
| Priority | Model | Timeout | Notes |
|----------|-------|---------|-------|
| 5 | Venice DeepSeek V3.2 | 10s | Same model, different provider. |
| 6 | Venice GLM 4.7 Flash | 8s | Fast, lightweight. |
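Expressed as data, the combined order from the two tables above might look like this (reusing the hypothetical `CascadeModel` shape from the sketch earlier; the model ID strings are placeholders, not the actual provider identifiers):

```typescript
// Combined cascade order from both tables. IDs are illustrative.
const CASCADE: CascadeModel[] = [
  { name: "deepseek-v3.2", timeoutMs: 10_000 },        // 1. OpenRouter
  { name: "deepseek-chat", timeoutMs: 8_000 },         // 2. OpenRouter
  { name: "glm-4.5-air-free", timeoutMs: 8_000 },      // 3. OpenRouter
  { name: "claude-3.5-haiku", timeoutMs: 8_000 },      // 4. OpenRouter
  { name: "venice/deepseek-v3.2", timeoutMs: 10_000 }, // 5. Venice AI
  { name: "venice/glm-4.7-flash", timeoutMs: 8_000 },  // 6. Venice AI
];
```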

## Chat Agent (Playground)

The Playground chat agent uses a separate model:
| Model | Provider | Use Case |
|-------|----------|----------|
| Gemini 2.5 Flash-Lite | Google AI | Conversation flow, tool selection, response summarization |

## Failure Modes

| Failure | Behavior |
|---------|----------|
| Timeout (exceeds the model's 8-10s limit) | Abort and try next model |
| HTTP 429 (rate limit) | Skip and try next model |
| HTTP 5xx (server error) | Skip and try next model |
| Content policy block | Skip and try next model |
| Empty response | Skip and try next model |
| All models fail | Return raw data, `insight: null`, HTTP 206 |
HTTP 206 (Partial Content) indicates the data was fetched successfully but the LLM enrichment failed. The `data` field contains the full upstream data. The `signal` field will be absent or set to a default value.
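Clients can branch on the status code. A sketch of the consumer side (the endpoint URL below is a placeholder, not a documented route):

```typescript
// Sketch of client-side handling: 200 means enriched, 206 means raw
// data only. The URL is illustrative.
const res = await fetch("https://api.example.com/defi/tvl");
const body = await res.json();

if (res.status === 206) {
  // Enrichment failed: body.insight is null, body.signal may be absent
  // or a default, but body.data still holds the full upstream data.
  console.warn("Partial content: raw data without AI insight");
}
console.log(body.data);
```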

## LLM Call Configuration

Every LLM call uses these parameters:
```json
{
  "temperature": 0.3,
  "max_tokens": 800,
  "response_format": { "type": "json_object" }
}
```
- **Low temperature (0.3):** Prioritizes consistent, factual output over creative variation.
- **JSON mode:** Forces the model to return valid JSON, parsed into the response envelope.
- **800 token limit:** Keeps insights concise (1-2 sentences) and response times fast.
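For context, here is how these fixed parameters might be merged into a provider request. OpenRouter exposes an OpenAI-compatible chat-completions API; the model ID, environment variable name, and prompt variables below are placeholders:

```typescript
// Sketch: fixed parameters spread into an OpenAI-compatible
// chat-completions request. Model ID, env var, and prompt contents
// are placeholders.
const systemPrompt = "..."; // segment-specific system prompt
const upstreamData = {};    // raw data being enriched

const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "deepseek/deepseek-chat", // placeholder model ID
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: JSON.stringify(upstreamData) },
    ],
    temperature: 0.3,
    max_tokens: 800,
    response_format: { type: "json_object" },
  }),
});
```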

## System Prompts

Each endpoint segment has a dedicated system prompt that instructs the LLM:
| Segment | Signal Vocabulary | Prompt Focus |
|---------|-------------------|--------------|
| Trenches | `snipe`, `watch`, `avoid` | Token risk assessment, sniper detection, dev behavior |
| Traders | `follow`, `ignore` | Wallet profitability, copy-trade worthiness |
| LPs | `add_liquidity`, `rebalance`, `hold`, `remove` | Pool APR sustainability, IL risk, range optimization |
| DeFi | `high_yield`, `medium_yield`, `low_yield`, `risky` | TVL trends, yield opportunity assessment |
| deBridge | `execute`, `wait`, `avoid` (quote) / `migrate`, `stay`, `wait` (yield) | Bridge cost efficiency, cross-chain yield comparison |
| Nansen | `follow`, `ignore`, `accumulate`, `distribute` | Smart money flow interpretation, wallet classification |
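One plausible way to wire this up is a per-segment lookup that pins the allowed signal vocabulary into the system prompt. The keys, wording, and helper below are hypothetical; only the vocabularies come from the table above:

```typescript
// Sketch: per-segment prompt metadata. Wording is illustrative,
// not HYRE's actual prompt text.
const SEGMENT_PROMPTS: Record<string, { signals: string[]; focus: string }> = {
  trenches: {
    signals: ["snipe", "watch", "avoid"],
    focus: "token risk assessment, sniper detection, dev behavior",
  },
  traders: {
    signals: ["follow", "ignore"],
    focus: "wallet profitability, copy-trade worthiness",
  },
  // ...remaining segments follow the same pattern as the table above
};

function systemPromptFor(segment: string): string {
  const { signals, focus } = SEGMENT_PROMPTS[segment];
  return (
    `You are an analyst focused on ${focus}. ` +
    `Return JSON with "insight", "signal", and "confidence". ` +
    `"signal" must be one of: ${signals.join(", ")}.`
  );
}
```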

## Response Format

The LLM returns JSON matching this structure:
```json
{
  "insight": "Solana TVL surged 12% this week, driven by...",
  "signal": "high_yield",
  "confidence": 0.87
}
```
The `enrich()` function merges this with the raw data:

```json
{
  "data": { ... },
  "insight": "Solana TVL surged 12% this week, driven by...",
  "signal": "high_yield",
  "confidence": 0.87,
  "sources": ["defillama"],
  "model_used": "deepseek-v3.2",
  "latency_ms": 342,
  "timestamp": "2026-04-17T10:30:00.000Z"
}
```
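A minimal sketch of that merge step (the signature and names are assumptions; only the envelope fields come from the example above):

```typescript
// Sketch of enrich(): wrap raw upstream data in the response envelope,
// merging in the parsed LLM output. Signature and names are assumed.
interface LlmOutput {
  insight: string;
  signal?: string;
  confidence?: number;
}

function enrich(
  data: unknown,
  llm: LlmOutput | null,
  sources: string[],
  modelUsed: string | null,
  latencyMs: number,
) {
  return {
    data,
    insight: llm?.insight ?? null, // null when every model failed (HTTP 206)
    signal: llm?.signal,           // absent or default on failure
    confidence: llm?.confidence,
    sources,
    model_used: modelUsed,
    latency_ms: latencyMs,
    timestamp: new Date().toISOString(),
  };
}
```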