AI Systems Are Software Systems
Why production AI requires the same engineering discipline as any distributed system, and how this wiki covers the full stack of AI …
A comprehensive guide to GitHub Actions security vulnerabilities, common exploit patterns, and how to audit and harden your CI/CD pipelines …
A step-by-step guide to creating professional demo and explainer videos entirely in code using Remotion, React, and AI-assisted development. …
The principle of defining infrastructure, configuration, documentation, policy, video, and design as version-controlled code artifacts - and …
The AWS ML Lens extends the Well-Architected Framework to cover ML lifecycle phases, ML pipeline automation, model security, inference …
Semantic caching, Anthropic prompt caching, response caching, and embedding caching for AI applications. Cost savings analysis and …
CPU vs GPU, VRAM limits, memory bandwidth, and how hardware choices determine what AI models you can run and at what cost.
What hybrid cloud is, why it matters for AI workloads with data gravity and compliance constraints, and AWS hybrid options including FSx for …
A practical guide to the three languages used across a modern AI stack: Python for agents and models, TypeScript for frontends and video …
Practical prompt engineering patterns for production AI systems: system prompts, few-shot examples, chain-of-thought, structured output, …
Exponential backoff with jitter, retry budgets, and idempotency patterns for production AI systems. Why AI services require different retry …
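The retry pattern named above can be sketched in a few lines of Python. This is a minimal full-jitter backoff sketch, not the page's implementation; the helper names, the base/cap defaults, and the bare `Exception` catch are illustrative (production code would catch only retryable errors such as throttling):

```python
import random
import time

def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 30.0):
    """Full jitter: each delay is drawn from U(0, min(cap, base * 2**attempt))."""
    for attempt in range(max_retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

def call_with_retry(fn, max_retries: int = 5, sleep=time.sleep):
    """Invoke fn, sleeping a jittered exponential delay between failed attempts."""
    last_exc = None
    for delay in backoff_delays(max_retries):
        try:
            return fn()
        except Exception as exc:  # illustrative: retry only throttling/timeouts in practice
            last_exc = exc
            sleep(delay)
    raise last_exc
```

The jitter spreads retries across time so that many clients throttled at once do not retry in lockstep and re-trigger the rate limit.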
How sorting and search algorithms underpin AI pipeline design: complexity trade-offs, partial sorting for top-k selection, tiered analysis …
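The partial-sorting trade-off mentioned above is easy to illustrate: selecting the k best candidates with a heap costs O(n log k) instead of O(n log n) for a full sort. A minimal sketch (the `(item, score)` pair shape is an assumption):

```python
import heapq

def top_k(scored_items, k):
    """Partial sort: return the k highest-scoring (item, score) pairs
    without fully sorting the input."""
    return heapq.nlargest(k, scored_items, key=lambda pair: pair[1])
```

For pipelines that score thousands of candidates but only forward a handful to an expensive stage, this avoids paying for ordering results that are discarded anyway.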
What the Well-Architected Framework is, its origins at AWS, how Azure and GCP adopted it, its six pillars, and why it matters especially for …
Apply cheap analysis first, score results, then apply expensive analysis only to candidates that pass a threshold. Reduces AI API costs by …
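The cheap-first pattern described above can be sketched as a single filter-then-escalate loop; the function names and the 0.7 default threshold here are illustrative, not from the page:

```python
def tiered_analyze(items, cheap_score, expensive_analyze, threshold=0.7):
    """Score every item with the cheap pass; run the expensive pass only
    on candidates whose cheap score meets the threshold."""
    results = []
    for item in items:
        score = cheap_score(item)
        if score >= threshold:
            results.append((item, score, expensive_analyze(item)))
    return results
```

In an AI pipeline, `cheap_score` might be a keyword or heuristic check and `expensive_analyze` an LLM call, so the threshold directly controls API spend.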
How AI system architecture evolves from monolithic single-model deployments through microservices to collaborative multi-agent systems, with …
How the four cloud deployment models apply to AI workloads: when to use managed models, platform endpoints, GPU instances, or serverless …
Using Amazon CloudWatch for AI workloads: custom metrics for LLM cost and token usage, alarms for model quality, log insights for inference …
Using Amazon EventBridge to connect AWS AI services, trigger pipelines from S3 events, and build loosely coupled multi-step workflows.
Using Amazon OpenSearch Service for vector search, full-text search, and log analytics in AI-powered applications.
Using AWS Elemental MediaConvert for transcoding, format conversion, and video processing in AI media pipelines.
What blue-green deployment is, how it works, why it matters for zero-downtime AI model updates, and how it compares to canary and rolling …
Zero-downtime model updates using blue-green deployment: how it works, AWS implementation with Lambda aliases and SageMaker variants, and …
What canary deployment is, how gradual traffic shifting works, which metrics to watch, and how to configure automatic rollback triggers for …
Gradual traffic shifting to new model versions: how to implement canary deployments with Lambda weighted aliases and SageMaker production …
Building reliable CI/CD pipelines for AI projects: model artifact management, automated evaluation gates, GitHub Actions workflows, and …
What the circuit breaker pattern is, why AI services need it for handling model timeouts and rate limits, and how to implement it with AWS …
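The circuit breaker pattern named above can be sketched in plain Python before mapping it onto any AWS service. This is a minimal three-state (closed / open / half-open) sketch; the class shape, thresholds, and injectable clock are assumptions for testability:

```python
import time

class CircuitBreaker:
    """Opens after `failure_threshold` consecutive failures, fails fast while
    open, and allows one trial call (half-open) after `reset_timeout` seconds."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, fn):
        if self.state == "open":
            raise RuntimeError("circuit open: failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast while the circuit is open is the point: a model endpoint that is timing out gets breathing room instead of a pile-up of queued requests.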
Handling model failures gracefully in production AI systems: fallback strategies, degraded mode operation, retry with backoff, and …
What event sourcing is, why it matters for AI audit trails and pipeline replay, its relationship to CQRS, and when to apply it in AI …
How to run an Event Storming workshop specifically for discovering AI automation opportunities: domain events, commands, policies, and …
What feature flags are, how they enable safe AI model rollouts, A/B testing, and instant rollback - and the tools available for implementing …
Using feature flags to safely roll out AI model changes: A/B testing models, canary deployments, gradual traffic shifting, and instant …
Applying the Why-Who-How-What Impact Mapping framework to AI projects: grounding AI initiatives in measurable business outcomes and avoiding …
Why IaC matters for AI reproducibility, multi-environment consistency, and cost tracking. Terraform and CDK patterns for Bedrock agents, …
What the Model Context Protocol is, how it enables AI agents to use tools through a standard interface, and server/client architecture.
What drift is, the three types (data, concept, prediction), how to detect them using SageMaker Model Monitor, and when to trigger model …
Why model versioning matters and how to implement it: S3 for artifacts, Git for configuration, SageMaker Model Registry, Bedrock model …
What observability means, the three pillars of logs, metrics, and traces, and why AI systems need specialized observability for token costs, …
Applying the three pillars of observability to AI workloads: CloudWatch for metrics and alarms, Langfuse for LLM tracing, OpenTelemetry for …
Applying Open Practice Library practices to AI: Event Storming for AI use case discovery, Impact Mapping for AI value, User Story Mapping …
What property-based testing is, why it is ideal for AI systems that cannot be tested with exact-output assertions, and the tools available …
Using Pydantic AI to build AI agents with validated inputs and outputs, Bedrock backend support, and Python type annotations.
What Strands Agents is, how it differs from CrewAI and LangGraph, and when to use it for AWS-hosted agent applications.
A practical testing strategy for AI systems: property-based testing, integration testing with mocked models, evaluation frameworks, and …
How AWS shared responsibility applies to AI and ML workloads: data, model, and infrastructure responsibilities across Bedrock and SageMaker.
Model selection by task, caching strategies, batch vs real-time processing, and tiered inference with Haiku, Sonnet, and Opus.
Chatbot-based citizen inquiries, form pre-filling, status tracking, and multilingual support for government agencies.
How AI assists recruitment teams with resume screening, candidate matching, and interview scheduling - with guidance on bias mitigation and …
What AI guardrails are, the types of controls they enforce, how to implement them in enterprise applications, and Amazon Bedrock Guardrails …
A five-dimension self-assessment to understand where your organization stands before committing to an AI program.
How to achieve production-quality multi-speaker transcription with speaker diarization, using AWS Transcribe and Bedrock post-processing.
Automated subtitle generation, audio descriptions, sign language overlay detection, and WCAG compliance checking for broadcast and media …
A practical comparison of Amazon Bedrock and Azure OpenAI Service for enterprise AI deployments, covering model selection, pricing, …
Amazon Cognito User Pools and Identity Pools: JWT token structure and expiry, MFA options, SAML/OIDC federation, Lambda triggers, rate …
Sentiment analysis, entity extraction, topic modeling, and language detection with Amazon Comprehend. When to use Comprehend vs Bedrock for …
What Rekognition does, which features work well in enterprise applications, accuracy considerations, pricing, and common integration …
When to use SageMaker for custom ML versus Bedrock for managed foundation models - a practical comparison for enterprise AI teams.
A reference guide to Amazon Textract: OCR capabilities, table and form extraction, query-based extraction, and integration patterns for …
Amazon Transcribe capabilities, accuracy characteristics, pricing, and the integration patterns that work well for enterprise transcription …
Auto-tagging video and audio content, scene classification, topic extraction, and SEO metadata generation for media libraries.
A service-by-service map of AWS AI and ML services to their Azure AI equivalents, covering language models, speech, vision, and MLOps.
A service-by-service map of AWS AI and ML services to their Google Cloud equivalents, covering language models, speech, vision, and MLOps.
Using AWS Amplify to deploy front-end applications, host static sites, and connect to AWS AI backends.
Serverless inference, event-driven processing, and integration patterns with Bedrock, SageMaker, and Step Functions. Cost optimization for …
How Step Functions orchestrates multi-step AI workflows, handles retries and errors, and integrates with other AWS services - with practical …
When to use state machines vs direct invocation for AI workflows. Error handling, retry patterns, cost comparison, and visibility …
Practical guidance for building customer-facing AI chatbots that deliver real value - architecture, knowledge base design, escalation …
How the discipline of preparing conference talks produces better AI prototypes, clarifies system design, and accelerates learning. Covers …
Summarization, sliding window, retrieval-augmented, and hierarchical context patterns for handling conversations and documents that exceed …
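Of the context patterns listed above, the sliding window is the simplest to sketch: keep the most recent messages that fit the token budget and drop the rest. The whitespace token count below is a stand-in; real systems would use the model's tokenizer:

```python
def sliding_window(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    """Keep the newest messages whose combined token count fits max_tokens,
    preserving their original order."""
    window, total = [], 0
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break
        window.append(msg)
        total += cost
    return list(reversed(window))
```

The other patterns on that page trade this simplicity for recall: summarization compresses the dropped turns, and retrieval-augmented variants fetch them back on demand.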
What CrewAI is, how it models multi-agent systems as crews with roles and tasks, integration with LLM backends, and when to use it versus …
Architecture differences, AWS integration, and decision criteria for choosing between CrewAI and Strands Agents for multi-agent AI systems.
SageMaker custom training vs Bedrock foundation models. Data requirements, cost, accuracy trade-offs, and maintenance burden.
Practical patterns for building reliable data pipelines that feed AI and ML systems - ingestion, transformation, feature engineering, and …
How to prepare data for AI projects: assessing what you have, cleaning and normalizing it, building evaluation datasets, and setting up …
What embeddings are, how they enable semantic search, which embedding models to use, and how to choose vector database infrastructure.
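The semantic-search mechanism behind embeddings reduces to one operation: cosine similarity between vectors. A minimal sketch with hand-made two-dimensional vectors (real embedding models produce hundreds or thousands of dimensions, and a vector database replaces the brute-force scan):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest(query_vec, corpus):
    """Brute-force semantic search: rank corpus keys by similarity to the query."""
    return sorted(corpus, key=lambda k: cosine_similarity(query_vec, corpus[k]),
                  reverse=True)
```

This brute-force ranking is what approximate-nearest-neighbor indexes in vector databases speed up at scale.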
What event-driven architecture is, how S3 triggers, EventBridge, and Step Functions patterns enable scalable AI pipelines.
The three main approaches to customizing LLM behavior for specific use cases - when each is appropriate and how they compare.
A structured three-workshop methodology that takes an organization from AI curiosity to a validated, buildable prototype with stakeholder …
Preparation, agenda design, stakeholder management, use case brainstorming techniques, prioritization exercises, and gap management between …
A practical architecture for extracting structured data from invoices, contracts, and forms - combining OCR, classification, and LLM-based …
Using Langfuse to trace LLM calls, evaluate outputs, and monitor AI application quality in production.
How LangGraph models AI agent workflows as stateful graphs, enabling cyclic execution, human-in-the-loop, and complex multi-step agent …
Using LlamaIndex for retrieval-augmented generation, data connectors, and agent workflows, with Bedrock and OpenSearch integration.
A practical introduction to multi-agent AI architectures: when to use them, how they work, and which frameworks are production-ready.
Definition, architecture patterns, and frameworks for multi-agent AI systems - and the signals that indicate a single-agent approach is no …
Proven prompt patterns for enterprise AI applications: structured output, chain-of-thought, few-shot examples, guardrails, and system prompt …
A practical framework for deciding between retrieval augmented generation and fine-tuning to customize LLM behavior for enterprise …
Using Remotion to generate videos programmatically from React components, with Lambda rendering for scalable AI-driven video production.
When to use Remotion (React-based programmatic video) vs FFmpeg (command-line video processing) for AI video pipelines.
Using Terraform to provision and manage AWS infrastructure for AI projects: modular design, state management, and multi-environment …
When to use Terraform vs AWS CDK for AI project infrastructure: pros, cons, and decision criteria for each tool.
A structured WSJF-inspired scoring methodology to cut through workshop noise and identify the AI use cases worth building first.
What vector databases are, how they enable semantic search, popular options including Pinecone, Weaviate, and pgvector, and when to use …
The difference between prompting and grounding. Five stages from zero context to production-ready assets. The Personal Inference Pack …
The cloud architecture review methodology used by AWS, Azure, and Google Cloud to evaluate workloads against proven best practices across …