Rust/WASM Runtime

The runtime is the verified execution surface: memory transfer, model loading, tokenizer dispatch, quantized tensor access, generation, diagnostics, and explicit error states.

01

No-crate implementation posture

The runtime currently targets Rust/WASM with no third-party crates. The strongest supply-chain control is the smallest possible dependency graph. Unsafe code is confined to the unavoidable WASM memory boundary, and every unsafe access must be paired with validation.

  • No ML framework
  • No tokenizer dependency
  • No general-purpose web framework
  • Isolated unsafe boundary
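
The isolated unsafe boundary can be sketched as one pure validation function plus a single unsafe accessor that refuses to run until validation passes. This is a minimal illustration, not the runtime's actual code: BoundaryError, MAX_TRANSFER_BYTES, check_region, and read_region are hypothetical names, the 1 MiB transfer limit is an assumed value, and treating offset 0 as invalid is an assumed policy.

```rust
// Hypothetical sketch: names and limits are illustrative assumptions,
// not the runtime's actual API.

/// Explicit error states returned instead of trapping on a bad region.
#[derive(Debug, PartialEq)]
pub enum BoundaryError {
    NullPointer,
    TooLarge,
    OutOfBounds,
}

/// Assumed upper bound on a single host-to-runtime transfer.
pub const MAX_TRANSFER_BYTES: usize = 1 << 20;

/// Validate a (ptr, len) pair against the memory size without touching memory.
/// This part is safe code and can be unit tested on any target.
pub fn check_region(ptr: usize, len: usize, memory_size: usize) -> Result<(), BoundaryError> {
    if ptr == 0 {
        return Err(BoundaryError::NullPointer);
    }
    if len > MAX_TRANSFER_BYTES {
        return Err(BoundaryError::TooLarge);
    }
    // checked_add rejects ranges that would wrap around the address space.
    let end = ptr.checked_add(len).ok_or(BoundaryError::OutOfBounds)?;
    if end > memory_size {
        return Err(BoundaryError::OutOfBounds);
    }
    Ok(())
}

/// The only unsafe operation: build a byte slice after validation has passed.
///
/// # Safety
/// `memory_size` must be the true end of readable memory, and the region
/// must not be mutated while the returned slice is alive.
pub unsafe fn read_region<'a>(
    ptr: *const u8,
    len: usize,
    memory_size: usize,
) -> Result<&'a [u8], BoundaryError> {
    check_region(ptr as usize, len, memory_size)?;
    Ok(unsafe { core::slice::from_raw_parts(ptr, len) })
}
```

Keeping the range arithmetic in check_region means the rejection paths can be tested without any unsafe code, and the unsafe block shrinks to a single call.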

02

Generation loop

Generation uses preallocated runtime structures: model-owned scratch arenas, reusable logits, fixed-buffer sampling, bounded decoding, and request state that is reset cleanly between requests. This keeps the hot path free of per-token allocation and makes every failure state testable.

  • ForwardScratch
  • Reusable logits
  • Fixed top-k candidate cap
  • 64 KiB output cap
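
A bounded loop of this shape can be sketched as follows. It is an illustration under stated assumptions, not the runtime's implementation: RequestState, TopK, StopReason, and generate are hypothetical names (RequestState is a stand-in and does not describe ForwardScratch), TOP_K_CAP = 8 is an assumed value, the model is passed in as a closure, and the sketch picks the best candidate greedily rather than sampling.

```rust
// Hypothetical sketch: all names and TOP_K_CAP are illustrative assumptions.
pub const TOP_K_CAP: usize = 8; // assumed value; the source only says the cap is fixed
pub const OUTPUT_CAP_BYTES: usize = 64 * 1024; // the 64 KiB output cap

#[derive(Debug, PartialEq)]
pub enum StopReason { EndOfSequence, MaxTokens, OutputCapReached, NoCandidate }

/// Fixed-size candidate buffer, sorted by descending logit. Never allocates.
pub struct TopK { items: [(u32, f32); TOP_K_CAP], len: usize }

impl TopK {
    pub fn new() -> Self { TopK { items: [(0, f32::NEG_INFINITY); TOP_K_CAP], len: 0 } }
    pub fn as_slice(&self) -> &[(u32, f32)] { &self.items[..self.len] }

    /// Keep the k best logits (k is clamped to TOP_K_CAP); NaN logits are skipped.
    pub fn fill(&mut self, logits: &[f32], k: usize) {
        let k = k.min(TOP_K_CAP);
        self.len = 0;
        for (id, &logit) in logits.iter().enumerate() {
            if logit.is_nan() { continue; }
            // Find the insertion slot; ties keep the earlier token first.
            let mut pos = self.len;
            while pos > 0 && self.items[pos - 1].1 < logit { pos -= 1; }
            if pos >= k { continue; }
            // Shift weaker candidates down, dropping the last one when full.
            let mut i = if self.len < k { self.len } else { k - 1 };
            while i > pos { self.items[i] = self.items[i - 1]; i -= 1; }
            self.items[pos] = (id as u32, logit);
            if self.len < k { self.len += 1; }
        }
    }
}

/// Buffers allocated once and reused by every request.
pub struct RequestState {
    pub logits: Vec<f32>,
    pub top_k: TopK,
    pub output: Vec<u8>,
    pub generated: usize,
}

impl RequestState {
    pub fn new(vocab: usize) -> Self {
        RequestState {
            logits: vec![0.0; vocab],
            top_k: TopK::new(),
            output: Vec::with_capacity(OUTPUT_CAP_BYTES),
            generated: 0,
        }
    }
    /// Clear per-request state while keeping the allocations.
    pub fn reset(&mut self) { self.output.clear(); self.generated = 0; }
}

/// Greedy bounded decoding: every exit path is a named StopReason.
pub fn generate<F, P>(
    req: &mut RequestState, mut forward: F, piece_of: P,
    mut token: u32, eos: u32, max_tokens: usize, k: usize,
) -> StopReason
where
    F: FnMut(u32, &mut [f32]),
    P: Fn(u32) -> &'static [u8],
{
    req.reset();
    while req.generated < max_tokens {
        forward(token, &mut req.logits[..]);
        req.top_k.fill(&req.logits, k);
        let next = match req.top_k.as_slice().first() {
            Some(&(id, _)) => id,
            None => return StopReason::NoCandidate,
        };
        if next == eos { return StopReason::EndOfSequence; }
        let piece = piece_of(next);
        if req.output.len() + piece.len() > OUTPUT_CAP_BYTES {
            return StopReason::OutputCapReached;
        }
        req.output.extend_from_slice(piece);
        req.generated += 1;
        token = next;
    }
    StopReason::MaxTokens
}
```

Because the output vector is reserved at the cap up front and the candidate buffer is a fixed array, the loop body performs no allocation, and each way of stopping can be asserted directly in a test.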

03

Diagnostics

The runtime exposes state that the UI can display: model load state, quantization mode, prompt token count, generated token count, KV length, scratch usage, adapter apply count, and assembly checksums. These values are part of correctness, not decorative telemetry.

  • Local-only diagnostics
  • No external analytics
  • Escaped JSON output
  • Reset and free-model state handling
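
Escaped JSON output without a serializer crate can be sketched with the standard library alone. The example below is a minimal illustration: the Diagnostics struct, its field names, and the "q4_0" value used in testing are hypothetical and cover only a subset of the values listed above. The escaping rules follow the JSON string grammar (quote, backslash, and control characters below U+0020).

```rust
use std::fmt::Write;

// Hypothetical sketch: struct and field names are illustrative assumptions.

/// Append `s` to `out` as a JSON string literal with mandatory escapes applied.
pub fn push_json_string(out: &mut String, s: &str) {
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => {
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out.push('"');
}

/// A small subset of the diagnostics the UI might display.
pub struct Diagnostics<'a> {
    pub model_loaded: bool,
    pub quantization: &'a str,
    pub prompt_tokens: u32,
    pub generated_tokens: u32,
    pub kv_len: u32,
}

impl Diagnostics<'_> {
    /// Serialize by hand; only the string field needs escaping.
    pub fn to_json(&self) -> String {
        let mut out = String::new();
        out.push_str("{\"model_loaded\":");
        out.push_str(if self.model_loaded { "true" } else { "false" });
        out.push_str(",\"quantization\":");
        push_json_string(&mut out, self.quantization);
        let _ = write!(
            out,
            ",\"prompt_tokens\":{},\"generated_tokens\":{},\"kv_len\":{}}}",
            self.prompt_tokens, self.generated_tokens, self.kv_len
        );
        out
    }
}
```

Writing into a String cannot fail, which is why the results of write! are discarded. If the UI inserts this JSON into HTML, the UI still needs its own HTML escaping; JSON escaping alone does not cover that.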

Plain PHP deployment notes

The site requires no WordPress bootstrap, theme system, database, Composer package, npm build, CDN script, Bootstrap class dependency, or jQuery dependency.

Routes are handled by index.php and server rewrites. Core content lives in data/pages.php. HTML and Markdown share the same page data.

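Upload the package to the subdomain root and point Apache or Nginx at the folder. On Apache, keep .htaccess enabled so clean routes work. Nginx does not read .htaccess, so the equivalent rewrite to index.php must be configured in the server block.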