GPU

8 articles

All articles

vLLM - High-Performance LLM Serving Engine
vLLM is an open-source library for high-throughput, low-latency serving of large language models using …

open-source llm

Performance Engineering for AI Systems
A comprehensive guide to latency optimization, GPU memory management, throughput engineering, and model …

performance latency

GPU vs TPU for AI Training and Inference
Comparing GPUs and TPUs for AI model training and inference, covering performance, cost, ecosystem, and …

GPU Pooling
Shared GPU infrastructure with intelligent scheduling: maximizing GPU utilization across teams, managing …

gpu infrastructure

Deep Learning
What deep learning is, how it differs from traditional machine learning, and when deep learning is the right …

deep-learning neural-network

Capacity Planning for AI Inference
How to right-size GPU and TPU clusters, configure autoscaling for inference workloads, manage GPU memory, and …

capacity-planning gpu

AI Hardware
Comparing GPUs, TPUs, and custom ASICs from NVIDIA, Google, Groq, and Cerebras for training and inference …

Hardware Constraints for AI Systems
CPU vs GPU, VRAM limits, memory bandwidth, and how hardware choices determine what AI models you can run and …

cs-fundamentals intermediate
