Architecture
Recent articles
Showing 24 of 69
From Zero to Production: The Complete Path
A structured learning path and architectural progression for shipping a real AI-powered product: from demo to …
Event Storming - Collaborative Domain Exploration
A complete guide to Event Storming, Alberto Brandolini's technique for exploring complex business domains …
Mixture of Experts (MoE)
A neural network architecture in which only a small subset of parameters is activated for each input, enabling …
LLM Routing
Architectures that direct each request to one of several available language models based on cost, capability, …
AI Systems Are Software Systems
Why production AI requires the same engineering discipline as any distributed system, and how this wiki covers …
Twelve-Factor App
What the twelve-factor methodology is, how it guides cloud-native application design, and which factors matter …
Software Architecture for AI Systems
Architecture decisions, ADRs, and trade-offs for AI systems covering serving patterns, training …
Single Agent vs Multi-Agent Architectures
When to use a single AI agent versus a multi-agent system, covering complexity, reliability, cost, and …
REST vs GraphQL for AI Application APIs
Comparing REST and GraphQL API designs for AI applications, covering streaming support, query patterns, …
Response Streaming Patterns for AI Applications
Implementing streaming responses from LLMs for improved perceived latency. Server-sent events, chunked …
Real-Time vs Batch AI Processing - Choosing the Right Pattern
Decision framework for choosing between real-time and batch AI processing. Latency requirements, cost …
Prompt Chaining - Breaking Complex Tasks into Steps
How to design and implement prompt chains for complex AI tasks, covering chain architecture, error handling, …
Ports and Adapters
What the ports and adapters pattern is, how it structures application boundaries, and its relationship to …
Orchestrator-Worker Pattern
An orchestrator LLM decomposes complex tasks and delegates subtasks to specialized worker models or agents, …
Multi-Tenant AI Architecture Patterns
Serving multiple customers from shared AI infrastructure while maintaining data isolation, fair resource …
Multi-Region Data Sovereignty Pattern
Architecture pattern for deploying AI systems across multiple regions while respecting data sovereignty …
Multi-Model Routing Patterns
Strategies for routing requests to different AI models based on task complexity, cost constraints, and latency …
Model Tier Routing - Matching Request Complexity to Model Cost
Route AI requests to different model tiers based on complexity, cost sensitivity, and quality requirements. …
Model Ensemble Patterns for AI Applications
Combining multiple models for improved accuracy, reliability, and coverage. Voting, cascading, and …
Microservices vs Monolith for AI Applications
Comparing microservice and monolithic architectures for AI applications, covering deployment patterns, team …
Memory Patterns for Conversational AI - Short-Term and Long-Term
Architectural patterns for giving AI systems memory across conversations, from sliding context windows to …
LLM Gateway Architecture
How to design a centralized LLM access layer that handles routing, rate limiting, cost tracking, caching, and …
Inverse Conway Maneuver for AI - Designing Teams to Shape Systems
Using Conway's Law strategically to design AI team structures that produce the desired system architecture, …
Hexagonal Architecture
What hexagonal architecture is, how ports and adapters decouple business logic from infrastructure, and …
69 articles in this section.
Open source projects