Model-Architecture

2 articles
Mixture of Experts - Routing Queries to Specialist Sub-Networks
How Mixture of Experts architecture enables large-scale AI models by activating only a subset of parameters …

Inference-Time Scaling - Optimizing Reasoning at Inference Rather Than Training
How inference-time compute scaling enables AI models to improve performance by thinking longer on hard …