01
Why not generic model files
Generic model formats are flexible, but that flexibility usually requires broader parsers, larger metadata surfaces, and more runtime interpretation. TinyRustLM uses a narrower artifact contract so the browser runtime can validate and load only what it knows how to execute.
- Header validation
- Tensor directory validation
- Tokenizer checksum
- Layout checksum
- Finite payload checks
02
Structural invariants
The format records explicit model dimensions and tensor entries. Header fields must be non-zero, every tensor entry must point inside the file, each declared shape must match its payload element count, and the checksums must match the artifact's contents before generation is allowed.
- 108-byte header target
- 64-byte tensor entries
- Non-zero checksum
- No NaN or Infinity payloads
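The invariants above can be sketched as a single validation pass. This is a minimal illustration, not the TinyRustLM implementation: the `Header` and `TensorEntry` fields, the `ArtifactError` variants, and the assumption that payloads are little-endian `f32` values are all hypothetical, and the checksum computation itself is left out because the source does not specify it.

```rust
// Hypothetical types: field names and layout are illustrative assumptions,
// not the actual 108-byte header or 64-byte tensor entry definitions.
#[derive(Debug, PartialEq)]
pub enum ArtifactError {
    ZeroHeaderField(&'static str),
    ZeroChecksum,
    TensorOutOfBounds { index: usize },
    ShapeMismatch { index: usize },
    NonFinitePayload { index: usize },
}

pub struct Header {
    pub vocab_size: u32,
    pub hidden_dim: u32,
    pub layer_count: u32,
    pub checksum: u64,
}

pub struct TensorEntry {
    pub offset: u64,     // byte offset of the payload inside the file
    pub byte_len: u64,   // payload length in bytes
    pub shape: Vec<u64>, // declared dimensions
}

/// Checks the structural invariants before any tensor is used.
/// Assumes f32 little-endian payloads (an assumption of this sketch).
pub fn validate(
    header: &Header,
    tensors: &[TensorEntry],
    file: &[u8],
) -> Result<(), ArtifactError> {
    // Header fields must be non-zero.
    for (name, value) in [
        ("vocab_size", header.vocab_size),
        ("hidden_dim", header.hidden_dim),
        ("layer_count", header.layer_count),
    ] {
        if value == 0 {
            return Err(ArtifactError::ZeroHeaderField(name));
        }
    }
    if header.checksum == 0 {
        return Err(ArtifactError::ZeroChecksum);
    }

    for (index, t) in tensors.iter().enumerate() {
        // The entry must point inside the file; checked_add guards overflow.
        let end = t
            .offset
            .checked_add(t.byte_len)
            .ok_or(ArtifactError::TensorOutOfBounds { index })?;
        if end > file.len() as u64 {
            return Err(ArtifactError::TensorOutOfBounds { index });
        }

        // The declared shape must match the payload element count.
        let count = t
            .shape
            .iter()
            .try_fold(1u64, |acc, &d| acc.checked_mul(d))
            .ok_or(ArtifactError::ShapeMismatch { index })?;
        if count.checked_mul(4) != Some(t.byte_len) {
            return Err(ArtifactError::ShapeMismatch { index });
        }

        // Every payload value must be finite (no NaN, no Infinity).
        let payload = &file[t.offset as usize..end as usize];
        for chunk in payload.chunks_exact(4) {
            let v = f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
            if !v.is_finite() {
                return Err(ArtifactError::NonFinitePayload { index });
            }
        }
    }
    Ok(())
}
```

Returning the first failure as a typed error keeps the loader's behavior simple: either every check passes and generation may proceed, or the artifact is rejected with a reason that names the offending tensor.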
03
Tokenizer inclusion
The runtime supports byte-level fallback and compiled BPE metadata. Tokenizer data belongs inside the model artifact or in a verified sidecar file, so generation does not depend on Python-only tokenizer files at runtime.
- Byte fallback
- BPE metadata
- Bounded decode
- Tokenizer drift checks
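One way to picture byte fallback combined with bounded decode is the sketch below. It assumes a hypothetical ID layout in which token IDs 0 to 255 stand for raw bytes and higher IDs index a table of merged BPE pieces; the `Vocab` type, the `decode_bounded` function, and the `max_bytes` limit are illustrative names and not the runtime's actual API.

```rust
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    UnknownToken(u32),
    OutputTooLong,
}

/// Hypothetical vocabulary: `pieces[i]` holds the bytes for token ID 256 + i.
pub struct Vocab {
    pub pieces: Vec<Vec<u8>>,
}

/// Decodes token IDs to text, refusing to grow past `max_bytes`.
pub fn decode_bounded(
    vocab: &Vocab,
    ids: &[u32],
    max_bytes: usize,
) -> Result<String, DecodeError> {
    let mut out: Vec<u8> = Vec::new();
    for &id in ids {
        // Byte fallback: IDs below 256 map directly to a single raw byte.
        let byte = [id as u8];
        let piece: &[u8] = if id < 256 {
            &byte
        } else {
            vocab
                .pieces
                .get((id - 256) as usize)
                .ok_or(DecodeError::UnknownToken(id))?
                .as_slice()
        };
        // Bounded decode: stop before the output exceeds the limit.
        if out.len() + piece.len() > max_bytes {
            return Err(DecodeError::OutputTooLong);
        }
        out.extend_from_slice(piece);
    }
    // Byte-level pieces may split UTF-8 sequences, so decode lossily.
    Ok(String::from_utf8_lossy(&out).into_owned())
}
```

With a vocabulary whose pieces are `he` and `llo`, the IDs `[256, 257, 33]` decode to `hello!`, where the final `33` is the byte fallback for `!`. Because any byte has an ID, no input is ever unrepresentable, and the size limit keeps a malformed or hostile token stream from growing the output without bound.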