AI Library

A growing repository of short-form books exploring inference economics, model compression, and the architecture of modern AI — written for both human readers and AI agents.

The Waiting Game

How Inference Economics Shapes the Future of AI

A journey through the memory wall, KV caches, batch economics, speculative decoding, and custom silicon — explaining why everything in AI slows down before it gets faster.

Beyond the Waiting Game

How AI is Learning to Work Around the Memory Wall

The sequel, exploring what comes after inference economics: the architectures and techniques that reshape how models think.

SlimQwen Reference Book

A Guide to Compressing and Optimizing Large Language Models

Practical techniques for shrinking giant AI brains: compression, mixture of experts, pruning, merging, and recovery training.

This section is also optimized for LLM and AI agent consumption. See llms.txt for machine-readable structured content.