# Rust/WASM Runtime

> The runtime is the verified execution surface: memory transfer, model loading, tokenizer dispatch, quantized tensor access, generation, diagnostics, and explicit error states.

- Site: TinyRustLM
- Canonical: https://TinyRustLM.mirust.com/runtime/
- Version: 1.3.0
- Updated UTC: 2026-06-29T22:59:10Z

## No-crate implementation posture

The runtime's current direction is a Rust/WASM target with no third-party crates, because the strongest supply-chain control is the smallest possible dependency graph. Unsafe code is confined to the unavoidable WASM memory boundary, and every unsafe operation there must be paired with validation.

- No ML framework
- No tokenizer dependency
- No general-purpose web framework
- Isolated unsafe boundary
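
The isolated unsafe boundary described above can be sketched as follows. This is a minimal illustration, not the TinyRustLM source: `BoundaryError`, `MAX_TRANSFER_BYTES`, `validate_region`, and `borrow_region` are hypothetical names, and the 16 MiB transfer limit is an assumed value. The idea is that a safe function checks the pointer and length first, and a single `unsafe` function builds the slice only after that check passes.

```rust
/// Hypothetical error states for the memory boundary.
#[derive(Debug, PartialEq, Eq)]
pub enum BoundaryError {
    NullPointer,
    ZeroLength,
    TooLarge,
    OutOfBounds,
}

/// Assumed upper bound on one transfer; the real limit is not given in the source.
pub const MAX_TRANSFER_BYTES: usize = 16 * 1024 * 1024;

/// Check a (ptr, len) pair against the size of linear memory.
/// This function is safe: it never reads or writes memory.
pub fn validate_region(ptr: usize, len: usize, memory_size: usize) -> Result<(), BoundaryError> {
    if ptr == 0 {
        return Err(BoundaryError::NullPointer);
    }
    if len == 0 {
        return Err(BoundaryError::ZeroLength);
    }
    if len > MAX_TRANSFER_BYTES {
        return Err(BoundaryError::TooLarge);
    }
    // checked_add rejects ranges that would wrap around the address space.
    let end = ptr.checked_add(len).ok_or(BoundaryError::OutOfBounds)?;
    if end > memory_size {
        return Err(BoundaryError::OutOfBounds);
    }
    Ok(())
}

/// The only unsafe entry point in this sketch.
///
/// # Safety
/// The caller must guarantee that `ptr..ptr + len` is initialized, readable,
/// and not mutated for the lifetime `'a`.
pub unsafe fn borrow_region<'a>(
    ptr: *const u8,
    len: usize,
    memory_size: usize,
) -> Result<&'a [u8], BoundaryError> {
    validate_region(ptr as usize, len, memory_size)?;
    // Validation passed, so the slice is built only for an in-bounds, non-empty range.
    Ok(unsafe { core::slice::from_raw_parts(ptr, len) })
}
```

On a WASM target, `memory_size` would presumably be derived from the current linear-memory page count. Keeping `validate_region` safe means the checks can be unit-tested natively without touching raw pointers.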

## Generation loop

Generation uses preallocated runtime structures: model-owned scratch arenas, reusable logits, fixed-buffer sampling, bounded decoding, and clean request state. Because these structures are sized up front and reused across requests, the hot path behaves predictably and each failure state can be tested.

- ForwardScratch
- Reusable logits
- Fixed top-k candidate cap
- 64 KiB output cap
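
A bounded generation loop along these lines might look like the sketch below. It is an illustration under stated assumptions, not the runtime's actual code: `TopK`, `BoundedOutput`, `Stop`, and `generate` are hypothetical names, `TOP_K_CAP = 8` is an assumed value (the source says only that the cap is fixed), and the loop picks the best candidate greedily so the example stays deterministic where a real sampler would draw from the candidates. The `forward` closure stands in for the model's forward pass writing into a reusable logits buffer.

```rust
pub const TOP_K_CAP: usize = 8; // assumed value
pub const OUTPUT_CAP_BYTES: usize = 64 * 1024; // 64 KiB output cap from the list above

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Candidate { pub token: u32, pub logit: f32 }

/// Fixed-size candidate buffer: filling it never allocates.
pub struct TopK { items: [Candidate; TOP_K_CAP], len: usize }

impl TopK {
    pub fn new() -> Self {
        let empty = Candidate { token: 0, logit: f32::NEG_INFINITY };
        Self { items: [empty; TOP_K_CAP], len: 0 }
    }
    /// Keep the `k` largest logits in descending order; `k` is clamped to the cap.
    pub fn fill(&mut self, logits: &[f32], k: usize) {
        let k = k.min(TOP_K_CAP);
        self.len = 0;
        for (token, &logit) in logits.iter().enumerate() {
            if logit.is_nan() { continue; } // NaN logits are never candidates
            // Find the insertion position in the sorted prefix.
            let mut pos = self.len;
            while pos > 0 && self.items[pos - 1].logit < logit { pos -= 1; }
            if pos >= k { continue; }
            // Shift smaller entries down, dropping the last one when full.
            let mut i = if self.len < k { self.len } else { k - 1 };
            while i > pos {
                self.items[i] = self.items[i - 1];
                i -= 1;
            }
            self.items[pos] = Candidate { token: token as u32, logit };
            if self.len < k { self.len += 1; }
        }
    }
    pub fn as_slice(&self) -> &[Candidate] { &self.items[..self.len] }
}

/// Output buffer that refuses to grow past the cap.
pub struct BoundedOutput { buf: Vec<u8> }

impl BoundedOutput {
    pub fn new() -> Self { Self { buf: Vec::with_capacity(OUTPUT_CAP_BYTES) } }
    pub fn clear(&mut self) { self.buf.clear(); }
    pub fn as_bytes(&self) -> &[u8] { &self.buf }
    /// Append a whole piece, or reject it and leave the buffer unchanged.
    pub fn push(&mut self, piece: &[u8]) -> bool {
        if piece.len() > OUTPUT_CAP_BYTES - self.buf.len() { return false; }
        self.buf.extend_from_slice(piece);
        true
    }
}

/// Explicit reasons for leaving the loop.
#[derive(Debug, PartialEq, Eq)]
pub enum Stop { EndOfSequence, MaxTokens, OutputCapReached, NoCandidate }

/// Decode at most `max_new` tokens; returns the number emitted and why it stopped.
pub fn generate<F: FnMut(u32, &mut [f32])>(
    mut forward: F,
    vocab: &[&[u8]],
    logits: &mut [f32],
    topk: &mut TopK,
    out: &mut BoundedOutput,
    mut token: u32,
    eos: u32,
    max_new: usize,
) -> (usize, Stop) {
    out.clear(); // clean request state
    for produced in 0..max_new {
        forward(token, logits); // reuses the same logits buffer every step
        topk.fill(logits, 1); // greedy pick; a sampler would use a larger k
        token = match topk.as_slice().first() {
            Some(best) => best.token,
            None => return (produced, Stop::NoCandidate),
        };
        if token == eos { return (produced, Stop::EndOfSequence); }
        let piece: &[u8] = match vocab.get(token as usize) {
            Some(p) => *p,
            None => return (produced, Stop::NoCandidate),
        };
        if !out.push(piece) { return (produced, Stop::OutputCapReached); }
    }
    (max_new, Stop::MaxTokens)
}
```

Returning a `Stop` value instead of panicking makes each exit path an ordinary value that a test can assert on.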

## Diagnostics

The runtime exposes information the UI can display: model load state, quantization mode, prompt token count, generated token count, KV length, scratch usage, adapter apply count, and assembly checksums. These are part of correctness, not decorative telemetry.

- Local-only diagnostics
- No external analytics
- Escaped JSON output
- Reset and free-model state handling
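
Escaped JSON output without a serialization crate could be produced roughly as shown below. The field names in `Diagnostics` are hypothetical, chosen to mirror the values listed above, and the single `assembly_checksum` field is a simplifying assumption. `push_json_string` escapes quotes, backslashes, and control characters so that a string value cannot break out of the JSON document the UI parses.

```rust
/// Append `s` to `out` as a JSON string literal, escaping as JSON requires.
pub fn push_json_string(out: &mut String, s: &str) {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => {
                // Remaining control characters use the \u00XX form.
                let v = c as u32;
                out.push_str("\\u00");
                out.push(HEX[(v >> 4) as usize] as char);
                out.push(HEX[(v & 0xF) as usize] as char);
            }
            c => out.push(c),
        }
    }
    out.push('"');
}

/// Hypothetical diagnostics snapshot; field names are illustrative.
pub struct Diagnostics<'a> {
    pub model_loaded: bool,
    pub quantization: &'a str,
    pub prompt_tokens: u32,
    pub generated_tokens: u32,
    pub kv_len: u32,
    pub scratch_bytes: u32,
    pub adapter_applies: u32,
    pub assembly_checksum: u32,
}

impl Diagnostics<'_> {
    /// Serialize to a compact JSON object with a fixed key order.
    pub fn to_json(&self) -> String {
        let mut out = String::new();
        out.push_str("{\"model_loaded\":");
        out.push_str(if self.model_loaded { "true" } else { "false" });
        out.push_str(",\"quantization\":");
        push_json_string(&mut out, self.quantization);
        let counters = [
            ("prompt_tokens", self.prompt_tokens),
            ("generated_tokens", self.generated_tokens),
            ("kv_len", self.kv_len),
            ("scratch_bytes", self.scratch_bytes),
            ("adapter_applies", self.adapter_applies),
            ("assembly_checksum", self.assembly_checksum),
        ];
        for (key, value) in counters {
            // Keys are fixed identifiers, so only values need formatting.
            out.push_str(",\"");
            out.push_str(key);
            out.push_str("\":");
            out.push_str(&value.to_string());
        }
        out.push('}');
        out
    }
}
```

A fixed key order keeps the output byte-for-byte comparable between runs, which is convenient when diagnostics are treated as part of correctness checks.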
