Products
Five ways to put AI to work
Local AI infrastructure, tested agents, custom ML models trained on your data, a standalone safety engine for any LLM, and a P2P mesh that scales across whatever devices you already have. Everything runs on your hardware. Nothing touches the cloud.
RigRun
Complete AI server. One GPU. Zero cloud dependency.
122B Model Inference
Run a 122-billion parameter model on a single GPU. 105 tokens/second, 1M-token context window (YaRN-scaled). Zero per-token costs.
5-Layer Safety Stack
Prompt injection detection, action gating, trajectory anomaly detection, learned classification, and spillage prevention. Adversarial prompt injection blocked at confidence ≥ 0.7 across all test vectors.
Self-Improving
Overnight training pipeline learns from every conversation. DPO preference optimization on your own data. The model gets better at your specific workflows.
Rolling Memory — Effectively Infinite Context
Verified 2026-04-11
Beyond the 1M-token native window, Rolling Memory v4 extends effective context indefinitely through per-session disk-backed verbatim, summary, and embedding tiers. Latency stays flat with depth because retrieval keeps each forward pass roughly the same size regardless of total context length.
Test rig: needle-in-haystack 100% pass through 10M chars (8.9–9.1s), 5/5 fact consistency, 10/10 multi-turn retrieval, 17/17 smoke tests passed. Drop-in OpenAI/Anthropic-compatible reverse proxy on port 8096 with per-session isolation, dashboard, and Prometheus metrics.
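For orientation, a minimal Go client call through the proxy, assuming the standard OpenAI-compatible /v1/chat/completions route. The model identifier and session header are illustrative, not RigRun's documented contract.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Standard OpenAI-style chat completion payload; the model name
	// here is a hypothetical identifier.
	body, _ := json.Marshal(map[string]any{
		"model": "qwen3.5-122b",
		"messages": []map[string]string{
			{"role": "user", "content": "Summarize yesterday's design discussion."},
		},
	})

	req, _ := http.NewRequest("POST",
		"http://localhost:8096/v1/chat/completions", bytes.NewReader(body))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("X-Session-ID", "design-review") // hypothetical per-session isolation key

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Choices []struct {
			Message struct {
				Content string `json:"content"`
			} `json:"message"`
		} `json:"choices"`
	}
	json.NewDecoder(resp.Body).Decode(&out)
	fmt.Println(out.Choices[0].Message.Content)
}
```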
RTX PRO 6000 Blackwell /// Qwen3.5-122B /// 105 tok/s /// 1M token context /// Runs on your hardware
Companion Apps
Talk to your stack from any surface
Native desktop and mobile clients that speak directly to your RigRun server. Both ship with the same OpenAI-compatible API contract auto-generated from the Go server's OpenAPI spec.
RigRun Desktop
Pre-release
Electron + Next.js 14 + React. Bundles its own RigRun Go backend so a single installer ships the server and the UI together. WebAssembly LLM in the renderer (via @mlc-ai/web-llm) provides a second inference path independent of the Go backend.
RigRun Mobile
Not yet released
Flutter native app for iOS and Android. Riverpod state, Hive storage, Dio HTTP, freezed models, flutter_secure_storage for credentials. Connects to your RigRun server over Tailscale or any private mesh — your data never touches a third-party cloud.
Both apps are in active development. Desktop pre-release builds available on request. Mobile public release timing TBD.
Pricing
Technical preview now. Production licensing soon.
Founding Access
Three founding seats. 90 days of unlimited inference on the same stack we use to ship Thornveil's products. Hand-selected previewers who shape the roadmap and lock in pricing when paid access opens.
- Unlimited inference on Qwen3.5-122B (no metering)
- All 6 domain agents included
- OpenAI-compatible API endpoint
- Direct line to the builder
- Roadmap input
- Locked-in pricing when paid tiers open
- Your conversations are preserved — when you bring RigRun in-house, your local copy ships pre-trained on your own usage
RigRun License
Deploy on your own hardware. Your data never leaves your building. Pricing finalizes after the Founding Access program closes.
- Full server binary + desktop app
- Unlimited local inference
- 5-layer safety proxy + routing
- Self-training pipeline
- Agent Factory (unlimited agents)
- Self-regulating inference engine (7 autonomous optimization layers)
- 1 year of updates + support
Enterprise
Multi-node, classified environments, mesh networking.
- Everything in RigRun License
- Mycelium mesh networking
- Classification routing (CUI–TS)
- On-premise installation support
- Custom agent development
- Dedicated technical support
Domain Expert Agents
Built by the Agent Factory. 14-step pipeline. Included with every Founding Access seat. Custom agents available as standalone projects — contact us for scoping.
Code Reviewer
9 tools
Finds real bugs, security vulnerabilities, and performance bottlenecks. Produces structured JSON reviews with severity ratings, file:line locations, impact analysis, and concrete fix suggestions.
Structured JSON with verdict, severity, file:line, impact, fix suggestion
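To make that shape concrete, a hypothetical Go struct mirroring the output; field names are illustrative, not the agent's actual schema.

```go
package review

// ReviewFinding is a hypothetical shape for one finding; the real
// agent's JSON schema may differ.
type ReviewFinding struct {
	Severity string `json:"severity"` // e.g. "critical", "major", "minor"
	Location string `json:"location"` // file:line, e.g. "auth/session.go:88"
	Impact   string `json:"impact"`   // what goes wrong if unfixed
	Fix      string `json:"fix"`      // concrete suggested change
}

type Review struct {
	Verdict  string          `json:"verdict"` // e.g. "approve", "request_changes"
	Findings []ReviewFinding `json:"findings"`
}
```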
Security Auditor
11 tools
Comprehensive code audits mapped to CWE, OWASP Top 10, and CVSS 3.1 scores. Traces data flow from source to sink. Checks auth, authz, crypto, and dependencies.
Audit report with vulnerability table, data flow maps, attack paths, remediation
Documentation Writer
10 tools
Reads actual code before writing anything. Produces READMEs, API docs, architecture overviews, and setup guides. Every claim traceable to specific file:line references.
Markdown documentation with file:line citations and verification steps
Sprint Planner
10 tools
Analyzes codebases and git history to create actionable sprint plans. P0/P1/P2 priority. Each task 30-120 minutes with concrete steps and verifiable done-when criteria.
JSON sprint plan with priority, duration, steps, and completion criteria
SBIR Proposal Writer
7 tools
Writes SBIR/STTR proposals for DoD, DHS, and NSF. Leads with the agency problem. Quantifies with benchmarks. TAM/SAM/SOM analysis. Work plans with milestones.
Complete proposal sections with compliance matrix and evaluation alignment
Patent Drafter
10 tools
Drafts provisional patent applications for software and AI/ML inventions. 15-25 claims with proper legal language. Analyzes prior art. Handles Alice Corp. eligibility.
Complete provisional application with claims, specification, and abstract
Powered by Agent Factory
How agents are made
Every agent goes through a 14-step manufacturing pipeline. No manual prompt engineering. No guesswork.
Need something specific?
The Agent Factory can build agents for any domain. Legal research, financial analysis, medical literature review, real estate, HR policy, marketing copy. Describe what you need.
HawkStack
Custom ML models trained on your data. Thornveil predicts performance before training, builds sub-2M-parameter models that beat 50M+ competitors, and deploys to edge hardware. Six domains validated. One architecture.
Proven Domains
Predict Before Training
Topology analysis predicts the performance ceiling, optimal architecture, and training recipe before a single model is trained. You know what's achievable before paying for compute.
Tiny Models Beat Giants
Custom WEM (Weighted Expert Mixture) backbone with domain-adaptive receptive field branches. The result: models under 2M parameters that run on microcontrollers and SBCs (Jetson, Kria, COTS-ruggedized) while matching models 30x their size.
Performance Commitment
Deliverable acceptance is tied to the predicted performance ceiling: if the predicted ceiling is 92% mAP and the delivered model achieves 85%, it has not met the bar. Thornveil only commits to delivering what the topology math supports.
6 domains /// 4 modalities /// 1 architecture /// Patent pending
Service Tiers
Your data. Your model. Your hardware.
Topology Audit
Know what's achievable before you commit. Send us your dataset characteristics — Thornveil predicts the ceiling.
- 3-parameter topology analysis
- Predicted performance ceiling
- Recommended architecture
- Estimated training time
- Dataset sufficiency assessment
Custom Model Build
Full model design, training, and validation. Deliverable: trained model (usually <100KB), training code, deployment guide.
- Everything in Topology Audit
- WEM backbone with custom RF branches
- SGDR cosine-restart training with loss surface basin analysis
- Prototypical network heads for imbalanced classes
- Trained model + training code
- Performance commitment vs ceiling
Edge Deployment
Full build plus production deployment. ONNX/TensorRT/CoreML export, latency benchmarking, drift detection.
- Everything in Custom Model Build
- ONNX / TensorRT / CoreML export
- Target hardware benchmarking
- Adaptive inference pipeline
- Monitoring + drift detection
- Integration support
Target markets: defense/intelligence, medical devices, industrial inspection, wildlife conservation.
ROI story: a $15K model that runs on a COTS MCU saves $500K/year in GPU deployment costs across your fleet.
Pyros Safety Engine
A 17,000-line pure-Go engine that wraps any LLM in a 7-pillar safety pipeline. Drop it in front of RigRun, OpenAI, Anthropic, vLLM, llama.cpp, or Ollama — Pyros doesn't care which model is downstream.
Why this exists
Most LLM safety stacks are Python wrappers around a moderation API call. Pyros is a separate process that sits between your application and the model, written in pure Go with no Python dependency. It implements the actual safety algorithms from the literature — SmoothLLM, isotonic calibration, EvoPrompt, negative-selection AIS, PID homeostasis — instead of forwarding the question to a hosted classifier.
Pyros does not require RigRun. It runs as a standalone HTTP service on port 3100 and accepts any backend that speaks the OpenAI-compatible chat completions format.
The seven pillars
Each pillar consolidates 2–5 algorithmic packages into a single layer in the pipeline. The pipeline is variable-width — easy queries skip expensive pillars; a sketch of the short-circuit mechanic follows the pillar list.
Oracle
Predictive intel
Forecasts request load via time-series, UCB1-routes by difficulty, emits skip signals so downstream pillars can short-circuit easy queries.
Tribunal
Adversarial verification
6-feature hallucination probe with weighted scoring. Optional reward-model integration. Courtroom-style structured verification workflow.
Fortress
Pre-inference defense
Prompt injection blocked at confidence ≥ 0.7. SmoothLLM stability check (Robey 2023). Negative-selection AIS detectors learn your traffic's baseline.
MindEye
Metacognition
Knowledge-graph context injection. Epigenetic temperature/token suggestions. Pure-Go isotonic calibration writing a QualityScore on every response.
Forge
Self-improvement
EvoPrompt crossover for prompt evolution. MDL prompt compression. DICE training-exemplar capture. The pipeline gets sharper on your workload over time.
Crucible
Operational safety
Circuit breakers, OpenTelemetry trace spans, Shannon-entropy criticality scoring, pheromone-style stigmergic blackboard for inter-agent signaling.
Singularity
Homeostatic regulation
PID controller over runtime vitals with anti-windup. Superposition activation for parallel candidates. Discrete-event digital-twin simulator.
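What variable-width execution means in practice: each pillar can emit a skip signal that short-circuits everything downstream. A minimal sketch assuming a simplified pillar interface; the types are illustrative, not Pyros's actual internals.

```go
package pipeline

import "fmt"

// Pillar is a hypothetical simplified interface; the real pipeline
// carries far more state per request.
type Pillar interface {
	Name() string
	Run(req *Request) (skipRest bool, err error)
}

type Request struct {
	Prompt     string
	Difficulty float64 // e.g. set by Oracle's routing estimate
}

// runPipeline walks the pillars in order and stops early when one
// signals that the remaining (expensive) pillars can be skipped.
func runPipeline(pillars []Pillar, req *Request) error {
	for _, p := range pillars {
		skipRest, err := p.Run(req)
		if err != nil {
			return fmt.Errorf("%s: %w", p.Name(), err)
		}
		if skipRest {
			break // easy query: short-circuit downstream pillars
		}
	}
	return nil
}
```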
Every feature has a file:line citation
Pyros doesn't have marketing claims. It has code. Each feature below points at the specific file and line in the source where the algorithm lives.
Prompt injection blocking
pillar/fortress.go:47
Blocks at confidence ≥ 0.7 before the request reaches the model. SmoothLLM perturbation + supermajority vote layered on top.
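The SmoothLLM recipe is perturb, score, vote. A minimal sketch against the stated 0.7 threshold; the classifier callback, perturbation rate, and vote margin are assumptions, not Fortress's actual code.

```go
package fortress

import "math/rand"

const blockThreshold = 0.7

// perturb randomly replaces a small fraction of characters,
// following the SmoothLLM idea that injection payloads are
// brittle to noise. The 5% rate is an assumption.
func perturb(prompt string, q float64) string {
	b := []byte(prompt)
	for i := range b {
		if rand.Float64() < q {
			b[i] = byte('a' + rand.Intn(26))
		}
	}
	return string(b)
}

// Block returns true when a supermajority of perturbed copies
// score at or above the 0.7 confidence threshold. score is an
// assumed injection classifier returning a value in [0,1].
func Block(prompt string, score func(string) float64, n int) bool {
	hits := 0
	for i := 0; i < n; i++ {
		if score(perturb(prompt, 0.05)) >= blockThreshold {
			hits++
		}
	}
	return hits*3 >= n*2 // two-thirds supermajority (assumed margin)
}
```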
Hallucination probe
hallucination/probe.go:70
6 weighted features: 4-gram repetition, entity consistency, numeric density, hedge density, entropy variance, confidence gap.
Isotonic calibration
calibration/calibration.go:226
Pure-Go PAVA implementation, validated against sklearn 1.4. Honestly labeled "QualityScore" — not a calibrated probability.
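PAVA (pool adjacent violators) is compact enough to sketch. A minimal unweighted version, for illustration only:

```go
package calibration

// PAVA fits the least-squares non-decreasing sequence to y by
// merging adjacent blocks whenever monotonicity is violated.
func PAVA(y []float64) []float64 {
	n := len(y)
	vals := make([]float64, 0, n) // block means
	sizes := make([]int, 0, n)    // block sizes
	for _, v := range y {
		vals = append(vals, v)
		sizes = append(sizes, 1)
		// Merge backwards while the previous block's mean exceeds
		// the current one.
		for len(vals) > 1 && vals[len(vals)-2] > vals[len(vals)-1] {
			m, s := len(vals)-2, len(vals)-1
			total := vals[m]*float64(sizes[m]) + vals[s]*float64(sizes[s])
			sizes[m] += sizes[s]
			vals[m] = total / float64(sizes[m])
			vals, sizes = vals[:s], sizes[:s]
		}
	}
	// Expand block means back to per-point fitted values.
	out := make([]float64, 0, n)
	for i, v := range vals {
		for j := 0; j < sizes[i]; j++ {
			out = append(out, v)
		}
	}
	return out
}
```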
Constant-time auth
server_auth.go:53
Mandatory bearer token for non-localhost binds. Compared via subtle.ConstantTimeCompare. Timing-attack hardened by default.
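The standard-library primitive is crypto/subtle. A sketch of the middleware pattern; the handler wiring is illustrative. Note that ConstantTimeCompare returns immediately on a length mismatch, so only the token bytes themselves are timing-protected.

```go
package middleware

import (
	"crypto/subtle"
	"net/http"
	"strings"
)

// requireBearer rejects requests whose Authorization header does not
// carry the expected token, comparing in constant time so an
// attacker cannot learn the token byte-by-byte from response timing.
func requireBearer(token string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
		if subtle.ConstantTimeCompare([]byte(got), []byte(token)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```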
Admission + graceful shutdown
pyros.go:138
Sized inbound semaphore, in-flight WaitGroup, async PostWorker drained before persistence so bookkeeping writes never get lost.
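A minimal sketch of the admission-plus-drain pattern using a channel semaphore and sync.WaitGroup; Pyros's actual bookkeeping is richer than this.

```go
package admission

import "sync"

type Admission struct {
	slots chan struct{}
	wg    sync.WaitGroup
}

func New(max int) *Admission {
	return &Admission{slots: make(chan struct{}, max)}
}

// Acquire blocks until an inbound slot is free, then registers the
// request as in-flight.
func (a *Admission) Acquire() {
	a.slots <- struct{}{}
	a.wg.Add(1)
}

func (a *Admission) Release() {
	a.wg.Done()
	<-a.slots
}

// Drain waits for every in-flight request before shutdown proceeds,
// so post-processing writes are never lost.
func (a *Admission) Drain() {
	a.wg.Wait()
}
```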
Audit chain
audit/file_sink.go
JSON-Lines audit sink with mutex serialization. Pluggable backend interface — file sink ships, syslog/network sinks slot in.
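The core of a JSON-Lines sink is small. A sketch; the method set shown here is illustrative, not the shipped interface.

```go
package audit

import (
	"encoding/json"
	"os"
	"sync"
)

type FileSink struct {
	mu sync.Mutex
	f  *os.File
}

func NewFileSink(path string) (*FileSink, error) {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o600)
	if err != nil {
		return nil, err
	}
	return &FileSink{f: f}, nil
}

// Write appends one event per line; the mutex keeps concurrent
// writers from interleaving partial records.
func (s *FileSink) Write(event any) error {
	line, err := json.Marshal(event)
	if err != nil {
		return err
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	_, err = s.f.Write(append(line, '\n'))
	return err
}
```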
Why pure Go
No Python.
No CGO.
One binary.
Pyros reimplements the algorithms it needs from primary sources rather than importing sklearn, numpy, chromadb, or simpy. PAVA isotonic regression. UCB1 bandit. Cosine vector store. Discrete-event simulator. MDL compressor. EvoPrompt crossover.
Ships as a single statically-linked artifact. Drops onto an air-gapped system without an installer. PAVA implementation is bit-for-bit validated against sklearn 1.4 in calibration/pava_reference_test.go.
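UCB1 is a fair example of how small these in-tree reimplementations can be. A textbook sketch, not Pyros's actual code:

```go
package bandit

import "math"

// PickArm returns the arm maximizing mean reward plus the UCB1
// exploration bonus sqrt(2 ln N / n_i). rewards holds cumulative
// reward per arm; total is the total number of plays so far.
func PickArm(counts []int, rewards []float64, total int) int {
	best, bestScore := 0, math.Inf(-1)
	for i, n := range counts {
		if n == 0 {
			return i // play every arm once before comparing
		}
		score := rewards[i]/float64(n) +
			math.Sqrt(2*math.Log(float64(total))/float64(n))
		if score > bestScore {
			best, bestScore = i, score
		}
	}
	return best
}
```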
Research grounding
Built on the literature, not on vibes
Works with anything that speaks chat completions
Provider packages ship for Ollama, vLLM, and llama.cpp. Anything else that speaks the OpenAI-compatible chat completions format slots in through the same adapter.
Want to put Pyros in front of your stack?
Pyros is in early access. If you're running an LLM in production and want a real safety perimeter — not a moderation API call — get in touch.
Mycelium Mesh
A peer-to-peer mesh network that turns heterogeneous devices — phones, laptops, workstations, servers — into nodes in a distributed AI inference system. Built for defense, enterprise, and edge environments where centralized cloud AI is not viable. No center, no config, no exfiltration, no connectivity requirement, no capability floor.
Why this exists
Most distributed inference systems assume a datacenter — InfiniBand, RDMA, dedicated edge servers, or trusted hardware enclaves. Mycelium assumes none of that. It's built for the WiFi you already have, the laptops your team already owns, and nodes you don't fully trust. Designed for DoD contested communications, disaster response, privacy-sensitive environments, and developing regions where the centralized cloud is unreliable, expensive, or hostile.
Three core innovations are backed by USPTO provisional THRN-022 (40 claims): distributed expert routing with predictive prefetching, post-generation style normalization for cross-model consistency, and a 3-layer proof-of-inference protocol that catches lazy or compromised nodes.
Three core innovations
Each is implemented in source. Each has a USPTO provisional claim attached. Each one solves a problem that has stopped prior distributed-MoE work from running on consumer hardware.
Distributed MoE Expert Routing
The mesh expert system tracks which peers host which model experts via the gossip protocol. When a query arrives, the expert registry locates the best peer for each required expert computation by load and latency. An expert proxy dispatches computation requests with predictive prefetching — firing async requests during the attention computation window to hide network latency.
mesh/expert_proxy.go
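A sketch of the prefetching idea: fire each expert request on its own goroutine, do the attention computation, then collect. The callback and result types are illustrative; the real proxy's scheduling is more involved.

```go
package mesh

type ExpertResult struct {
	ExpertID string
	Output   []float32
}

// Prefetch dispatches each predicted expert call asynchronously and
// returns a channel to collect results after the attention
// computation finishes, so network latency overlaps with compute.
func Prefetch(experts []string, call func(id string) ExpertResult) <-chan ExpertResult {
	out := make(chan ExpertResult, len(experts))
	for _, id := range experts {
		go func(id string) { out <- call(id) }(id)
	}
	return out
}

// Typical flow (names hypothetical):
//   results := Prefetch(predictedExperts, callPeer)
//   runAttention() // network latency hides behind this compute
//   for range predictedExperts { merge(<-results) }
```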
Cross-Model Style Consistency
Different nodes may run different model sizes (0.6B to 122B parameters). The style normalizer applies post-generation processing to ensure consistent tone, formatting, and verbosity regardless of which node generated the response. Hedging removal, preamble stripping, code-block normalization, verbosity control. Seamless escalation uses entropy monitoring to detect when a small model is uncertain and transparently hands off mid-generation to a larger peer.
mesh/style.go
Proof-of-Inference (PoI)
Three-layer cryptographic verification ensuring mesh nodes perform honest computation. Behavioral fingerprinting catches model substitution. Merkle execution traces catch computation shortcuts. Economic reputation with 1% spot-check rate makes persistent dishonesty unprofitable. Combined verdict: 0.3·fingerprint + 0.3·trace + 0.4·reputation.
mesh/poi.go
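The combined verdict is simple enough to state in code. A sketch using the weights from the text; the score fields and the spot-check wiring are illustrative.

```go
package mesh

import "math/rand"

type PoIScores struct {
	Fingerprint float64 // behavioral fingerprint match, in [0,1]
	Trace       float64 // Merkle execution-trace validity, in [0,1]
	Reputation  float64 // economic reputation, in [0,1]
}

// Verdict applies the stated 0.3/0.3/0.4 weighting.
func Verdict(s PoIScores) float64 {
	return 0.3*s.Fingerprint + 0.3*s.Trace + 0.4*s.Reputation
}

// SpotCheck fires on ~1% of responses, triggering full
// re-verification so persistent dishonesty stays unprofitable.
func SpotCheck() bool {
	return rand.Float64() < 0.01
}
```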
Architecture
An 8-layer P2P stack
Each layer is a standalone Go package. The full stack runs as a single self-contained inference server on every node. Hardware detection at the bottom assigns roles automatically; the application layer at the top exposes a drop-in OpenAI-compatible API.
The mesh layer alone is 19 Go files / 11K LOC covering DHT, gossip, expert proxy, NAT traversal, load balancing, proof-of-inference, and style normalization. Phase 1 reuses 32 packages extracted from RigRun (auth, router, backend, security, training, memory, RAG).
Hardware tiers
Mycelium auto-detects hardware capability and assigns each node a role. A phone running a 1B model is just as much a mesh participant as a multi-GPU fortress server — they just route different workloads.
Discovery + propagation protocols
How nodes find each other, share trained adapters, and distribute model files across the mesh.
Discovery (mDNS + DHT + STUN)
LAN: mDNS advertisement under _mycelium._tcp.local. with capability metadata. DHT: Kademlia 160-bit node IDs, k=20 buckets. WAN: UPnP port mapping + STUN (RFC 5389) for NAT traversal. Manual peer registration available for cross-subnet connectivity.
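The Kademlia piece is textbook. A sketch of XOR-distance bucket selection over 160-bit IDs (k=20 is the per-bucket capacity); not Mycelium's actual routing-table code.

```go
package dht

import "math/bits"

const IDBytes = 20 // 160-bit node IDs

type NodeID [IDBytes]byte

// BucketIndex returns which k-bucket peer b belongs to relative to
// a: the position of the highest differing bit of the XOR distance.
// Returns -1 when the IDs are identical.
func BucketIndex(a, b NodeID) int {
	for i := 0; i < IDBytes; i++ {
		if x := a[i] ^ b[i]; x != 0 {
			return (IDBytes-1-i)*8 + bits.Len8(x) - 1
		}
	}
	return -1
}
```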
Adapter Gossip Protocol (AGP)
Nodes share locally-trained LoRA/QLoRA adapters via epidemic gossip. Announcements propagate with fanout=3 every 2 minutes. Adapters carry loss metrics, dataset size, and weight hashes. Receiving nodes evaluate against local test sets. BitTorrent-style chunked transfer (64MB chunks, SHA-256 verified).
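One gossip round under the stated parameters looks roughly like this; peer selection and the announce callback are illustrative.

```go
package mesh

import (
	"math/rand"
	"time"
)

const (
	fanout   = 3
	interval = 2 * time.Minute
)

// GossipLoop announces to fanout random peers every interval.
// Receivers re-gossip, giving epidemic propagation across the mesh.
func GossipLoop(peers func() []string, announce func(peer string)) {
	for range time.Tick(interval) {
		ps := peers()
		rand.Shuffle(len(ps), func(i, j int) { ps[i], ps[j] = ps[j], ps[i] })
		for i := 0; i < fanout && i < len(ps); i++ {
			go announce(ps[i])
		}
	}
}
```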
Model Distribution Protocol (MDP)
Full model files distributed across the mesh. Chunked transfer with parallel downloads (4 concurrent, 64MB per chunk), SHA-256 per-chunk and full-file verification, resume support for interrupted transfers.
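A sketch of the stated transfer parameters: 4 concurrent workers, 64 MB chunks, per-chunk SHA-256. The fetch callback stands in for the real wire protocol.

```go
package mesh

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

const chunkSize = 64 << 20 // 64 MB per chunk

// FetchModel downloads nchunks chunks with at most 4 in flight,
// verifying each against its expected SHA-256 hex digest.
func FetchModel(nchunks int, want []string, fetch func(i int) []byte) ([][]byte, error) {
	chunks := make([][]byte, nchunks)
	errs := make([]error, nchunks)
	sem := make(chan struct{}, 4) // 4 parallel downloads
	var wg sync.WaitGroup
	for i := 0; i < nchunks; i++ {
		wg.Add(1)
		sem <- struct{}{}
		go func(i int) {
			defer wg.Done()
			defer func() { <-sem }()
			data := fetch(i)
			sum := sha256.Sum256(data)
			if hex.EncodeToString(sum[:]) != want[i] {
				errs[i] = fmt.Errorf("chunk %d: hash mismatch", i)
				return
			}
			chunks[i] = data
		}(i)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return chunks, nil
}
```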
Project status
Five of six phases complete. Phase 6 (production hardening) is in progress — end-to-end integration tests across multi-node deployments.
Core Protocol
Complete
Query Routing
Complete
Distributed Inference
Complete
Consistency
Complete
WAN Scaling
Complete
Production Hardening
In Progress (security baseline)
Want to deploy a mesh?
Mycelium licensing is available for defense, enterprise, and research deployments. Tell Thornveil what you're trying to put on the mesh and we'll set up a scoping conversation.