EmergentBit


LLMs, Rust & Functional Programming

A ground-up guide to building intelligent systems with Rust — no Python, no magic, no hand-waving. Just types, tensors, and transformers.

Status: Work in Progress

Writing in public. Early chapters released as blog posts. Subscribe to follow along.


About This Book

Most ML literature assumes Python. Most systems literature ignores ML. This book lives at the intersection: building real language model infrastructure using Rust's type system, ownership model, and performance characteristics.

The approach is bottom-up. We start with the functional programming patterns that make Rust code composable and correct — monads, algebraic types, parser combinators — and build toward a fully functional inference engine.
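As a small taste of the monadic style mentioned above, here is a minimal sketch of chaining two fallible steps with `Result::and_then`. The names `ConfigError`, `parse_len`, `check_range`, and `context_len`, and the 8192 upper bound, are hypothetical and chosen only for illustration; they are not taken from the book's chapters.

```rust
use std::num::ParseIntError;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ConfigError {
    NotANumber(ParseIntError),
    OutOfRange(u32),
}

/// Parse a context length from text, failing on non-numeric input.
fn parse_len(raw: &str) -> Result<u32, ConfigError> {
    raw.trim().parse::<u32>().map_err(ConfigError::NotANumber)
}

/// Accept only lengths from 1 to 8192 (an arbitrary illustrative bound).
fn check_range(len: u32) -> Result<u32, ConfigError> {
    if (1..=8192).contains(&len) {
        Ok(len)
    } else {
        Err(ConfigError::OutOfRange(len))
    }
}

/// Chain the two steps; the first error short-circuits the rest.
fn context_len(raw: &str) -> Result<u32, ConfigError> {
    parse_len(raw).and_then(check_range)
}
```

Calling `context_len(" 2048 ")` yields `Ok(2048)`, while `context_len("0")` yields `Err(ConfigError::OutOfRange(0))`. Neither step needs to know about the other; the composition lives entirely in `and_then`.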

By the end, you'll have implemented attention from scratch, built a KV cache, written a sampler, and assembled a production-ready inference server, all in mostly safe Rust.

No prior ML experience required. Familiarity with Rust fundamentals is assumed. A curiosity about how language models actually work is essential.

What You'll Learn

Write composable, correct Rust using functional patterns
Implement a transformer from scratch — no frameworks
Memory-map large models with zero-copy tensor views
Build a production inference server with predictable latency
Design RAG pipelines using monadic composition
Understand attention, positional encoding, and the KV cache
Use Rust's type system to make illegal states unrepresentable
Evaluate LLM systems rigorously on real task distributions
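To illustrate what "making illegal states unrepresentable" can look like, here is a minimal sketch that models the lifecycle of a generation request as an enum. The `Generation` type, its variants, and its methods are hypothetical names invented for this example. The point is that output tokens only exist in the states where they make sense, so a queued request with output cannot be constructed.

```rust
/// Hypothetical lifecycle of one generation request.
#[derive(Debug, PartialEq)]
enum Generation {
    Queued { prompt: String },
    Running { prompt: String, tokens: Vec<u32> },
    Finished { tokens: Vec<u32> },
}

impl Generation {
    /// Move a queued request into the running state; other states are unchanged.
    fn start(self) -> Generation {
        match self {
            Generation::Queued { prompt } => Generation::Running { prompt, tokens: Vec::new() },
            other => other,
        }
    }

    /// Append a token while running; other states are unchanged.
    fn push(self, token: u32) -> Generation {
        match self {
            Generation::Running { prompt, mut tokens } => {
                tokens.push(token);
                Generation::Running { prompt, tokens }
            }
            other => other,
        }
    }

    /// Finish a running request, dropping the prompt and keeping the tokens.
    fn finish(self) -> Generation {
        match self {
            Generation::Running { tokens, .. } => Generation::Finished { tokens },
            other => other,
        }
    }

    /// Final output is available only once the request has finished.
    fn output(&self) -> Option<&[u32]> {
        match self {
            Generation::Finished { tokens } => Some(tokens.as_slice()),
            _ => None,
        }
    }
}
```

A stricter design could encode each state as its own type so that invalid transitions fail to compile rather than being ignored at runtime; the enum version is shown here because it is the shortest form of the idea.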

Table of Contents

Part I — Foundations

  1. Why Rust for AI Systems (Upcoming)
  2. Types as Contracts: Rust's Ownership Model (Upcoming)
  3. Functional Patterns in Rust: From Iterators to Monads (Upcoming)
  4. Algebraic Data Types and Pattern Matching (Upcoming)

Part II — Language Models from the Ground Up

  1. Tokenization: Bytes, Characters, and BPE (Upcoming)
  2. Attention Is All You Need, Implemented in Rust (Upcoming)
  3. The Transformer Architecture End-to-End (Upcoming)
  4. KV Cache, Positional Encodings, and Efficiency (Upcoming)

Part III — Inference & Production

  1. Memory-Mapped Models and Zero-Copy Loading (Upcoming)
  2. SIMD and Hardware-Accelerated Inference (Upcoming)
  3. Sampling Strategies: Greedy, Top-k, Top-p, and Beam (Upcoming)
  4. Building a Production Inference Server in Rust (Upcoming)

Part IV — Agents and Composition

  1. RAG Pipelines with Functional Composition (Upcoming)
  2. Tool Use and Structured Outputs (Upcoming)
  3. Multi-Agent Coordination without a Central Orchestrator (Upcoming)
  4. Evaluating LLM Systems Rigorously (Upcoming)

Stay in the loop

Get notified when chapters drop

Early chapters will be released as blog posts. Subscribe to the newsletter to get them first.
