Performance

17 articles

Speculative Decoding · New · An inference speedup where a small draft model proposes several tokens …

Added 29 Jun · Upd 29 Jun · 5 min

What is Big O Notation? · New · A plain-English guide to Big O notation, the language for describing how … · Basics

Added 23 Jun · Upd 23 Jun · 8 min

AI-Optimized Cache Invalidation · AI predicts optimal cache TTLs and invalidation timing based on access … · Ideas

Added 28 Mar · Upd 30 May · 2 min

AI-Recommended Database Indexes · AI analyzes query patterns and execution plans to recommend optimal … · Ideas

Added 28 Mar · Upd 30 May · 2 min

Building gRPC Microservices for ML Inference · How to build gRPC-based microservices for ML inference: proto … · Guides

Added 28 Mar · Upd 30 May · 3 min

CDN · Content Delivery Network · Glossary

Added 28 Mar · Upd 30 May · 2 min

CPU Scheduling · Operating system algorithms that determine which process or thread runs … · Glossary

Added 28 Mar · Upd 30 May · 3 min

gRPC · What gRPC is, how Protocol Buffers and streaming RPCs work, and why gRPC … · Glossary

Added 28 Mar · Upd 30 May · 2 min

Hardware Constraints for AI Systems · CPU vs GPU, VRAM limits, memory bandwidth, and how hardware choices … · Glossary

Added 26 Mar · Upd 30 May · 4 min

KPI Framework for AI · Measuring AI Impact · Frameworks

Added 28 Mar · Upd 30 May · 4 min

Model Distillation Patterns for Production AI · Using large model outputs to train smaller, cheaper, faster models for … · Patterns

Added 28 Mar · Upd 30 May · 3 min

Performance Engineering for AI Systems · A comprehensive guide to latency optimization, GPU memory management, … · Guides

Added 28 Mar · Upd 30 May · 3 min

Prompt Caching · Server-side caching of attention key/value tensors for repeated prompt … · Glossary

Added 8 May · Upd 30 May · 5 min

Redis · What Redis is, how it provides in-memory data storage, and common use … · Glossary

Added 28 Mar · Upd 30 May · 3 min

Semantic Caching for AI Applications · Caching AI model responses based on semantic similarity rather than … · Patterns

Added 28 Mar · Upd 30 May · 3 min

Token Optimization Patterns for LLM Applications · Strategies for reducing token usage without sacrificing output quality. … · Patterns

Added 28 Mar · Upd 30 May · 3 min

Vector Search Optimization Patterns · Improving vector search quality and performance. Index tuning, hybrid … · Patterns

Added 28 Mar · Upd 30 May · 4 min
