Red Hat OpenShift AI
A hybrid cloud MLOps platform for building, training, serving, and monitoring AI and ML models on Red Hat OpenShift, from data center to edge.

Red Hat OpenShift AI (formerly Red Hat OpenShift Data Science) is a platform for building, training, serving, and monitoring AI and ML models on top of Red Hat OpenShift. It packages the tools a data science team needs into one Kubernetes-based environment, so you avoid stitching notebooks, pipelines, and model serving together yourself. Its main draw is portability: you run the same MLOps workflow in a public cloud, in your own data center, at the edge, or in a disconnected environment.
The problem it solves is fragmentation. Most teams assemble Jupyter, a pipeline engine, a serving runtime, and GPU scheduling from separate projects, then maintain that glue forever. OpenShift AI integrates those components as a supported product and keeps them consistent wherever OpenShift runs.
Where it sits in the stack
OpenShift AI is a layer on top of Red Hat OpenShift, which is itself a layer on top of Kubernetes. The lower layers handle containers, GPUs, and cluster operations. OpenShift AI adds the AI and ML tooling above them.
How it fits and how to use it
OpenShift AI does not ship as a command-line install you drop onto a laptop. You add it to an existing OpenShift cluster as an operator, then work through its dashboard and Kubernetes-native resources. You can run it on OpenShift you manage yourself (self-managed) or on a managed OpenShift service from a cloud provider. Red Hat lists AWS, Azure, Google Cloud, and IBM among the environments it runs in, alongside hardware partners including NVIDIA, AMD, Intel, Dell, and Lenovo.
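As a rough illustration of the operator-based install described above, the sketch below builds the objects that Operator Lifecycle Manager typically needs (a Namespace, an OperatorGroup, and a Subscription) and prints them as JSON. The namespace, package name, channel, and catalog source are assumptions, not values taken from this article. Check each one against the Red Hat OpenShift AI installation documentation for your release before applying anything.

```python
import json

# Assumed values, not taken from the article: verify each one against the
# Red Hat OpenShift AI installation documentation for your release.
OPERATOR_NAMESPACE = "redhat-ods-operator"
PACKAGE_NAME = "rhods-operator"
CHANNEL = "stable"
CATALOG_SOURCE = "redhat-operators"
CATALOG_NAMESPACE = "openshift-marketplace"


def build_operator_manifests(
    namespace: str = OPERATOR_NAMESPACE,
    package: str = PACKAGE_NAME,
    channel: str = CHANNEL,
) -> dict:
    """Return a Kubernetes List with a Namespace, OperatorGroup, and Subscription."""
    namespace_obj = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": namespace},
    }
    operator_group = {
        "apiVersion": "operators.coreos.com/v1",
        "kind": "OperatorGroup",
        "metadata": {"name": package, "namespace": namespace},
        # Empty spec: assumes the operator is installed to watch all namespaces.
        "spec": {},
    }
    subscription = {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": package, "namespace": namespace},
        "spec": {
            "name": package,
            "channel": channel,
            "source": CATALOG_SOURCE,
            "sourceNamespace": CATALOG_NAMESPACE,
            "installPlanApproval": "Automatic",
        },
    }
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [namespace_obj, operator_group, subscription],
    }


if __name__ == "__main__":
    # JSON is accepted by `oc apply -f`, so no YAML library is needed.
    print(json.dumps(build_operator_manifests(), indent=2))
```

Saved under a hypothetical name such as `build_manifests.py`, the output could be redirected to a file and applied with `oc apply -f`. Generating JSON from a small function keeps the channel and namespace in one place, which helps when the same install is repeated across several clusters.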
The platform maps onto the standard MLOps lifecycle. Each stage uses an open-source project underneath, so skills and artifacts transfer between clusters.
Model serving and inference
Serving is where the open-source foundation matters most. OpenShift AI uses KServe to orchestrate serving workloads and to autoscale model servers based on load, and it ships vLLM runtime templates for efficient large-model inference on GPUs. KServe supports both a serverless mode and a raw deployment mode. With autoscaling in place, multiple workloads can share GPU resources and scale down when idle. This matters because idle GPUs are expensive: autoscaling means you pay for them mainly while they serve traffic.
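To make the serving setup above more concrete, here is a minimal sketch that builds a KServe `InferenceService` manifest with replica bounds and a deployment-mode annotation. The service name, namespace, runtime name, storage URI, and the `vLLM` model format label are hypothetical placeholders. The actual runtime template names and supported formats depend on what is installed in your cluster.

```python
import json


def build_inference_service(
    name: str,
    namespace: str,
    runtime: str,
    storage_uri: str,
    *,
    raw_deployment: bool = False,
    min_replicas: int = 0,
    max_replicas: int = 2,
    gpus: int = 1,
) -> dict:
    """Return a KServe InferenceService manifest as a plain dict."""
    if min_replicas < 0 or max_replicas < 1 or min_replicas > max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas and max_replicas >= 1")

    # KServe selects the mode through this annotation on the resource.
    mode = "RawDeployment" if raw_deployment else "Serverless"

    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"serving.kserve.io/deploymentMode": mode},
        },
        "spec": {
            "predictor": {
                # Replica bounds the autoscaler works within.
                "minReplicas": min_replicas,
                "maxReplicas": max_replicas,
                "model": {
                    # Assumed format label; must match the serving runtime.
                    "modelFormat": {"name": "vLLM"},
                    "runtime": runtime,
                    "storageUri": storage_uri,
                    "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
                },
            }
        },
    }


if __name__ == "__main__":
    # All four arguments below are hypothetical placeholders.
    service = build_inference_service(
        "demo-llm", "demo-project", "vllm-runtime", "s3://example-bucket/models/demo"
    )
    print(json.dumps(service, indent=2))
```

A `min_replicas` of 0 expresses the intent to release the GPU when no traffic arrives. Whether a given cluster actually scales to zero depends on the deployment mode and the autoscaler configured there, so treat that default as an assumption to confirm.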
The hybrid and portability angle
The reason to choose OpenShift AI over a single-cloud service is the same reason to choose OpenShift itself: you run one platform everywhere. Red Hat positions it as a way to develop, train, and deploy models in a common environment whether on site, in the cloud, or at the edge, including disconnected environments with no internet access. If regulation, data residency, or latency keeps some workloads out of a public cloud, that portability is the point. For the wider decision, see the hybrid and multi-cloud AI guide.
How it compares
The main alternatives are managed MLOps platforms tied to a single cloud. OpenShift AI trades some of their turnkey convenience for portability across environments.
| | OpenShift AI | Amazon SageMaker | Azure ML | Vertex AI |
|---|---|---|---|---|
| Runs where | Any OpenShift cluster | AWS | Azure | Google Cloud |
| Foundation | Kubernetes, open source | AWS managed services | Azure managed services | Google managed services |
| On premises and edge | Yes | Limited | Limited | Limited |
| Serving runtime | KServe, vLLM | SageMaker endpoints | Managed endpoints | Managed endpoints |
| Best for | Hybrid and regulated estates | AWS-native teams | Azure-native teams | Google-native teams |
For a broader view of running AI across providers, see Amazon Bedrock and Azure OpenAI.
When not to use it
OpenShift AI is a good fit for organisations that already run OpenShift or need workloads in more than one environment. It is a poor fit in several cases.
- You have no Kubernetes or OpenShift footprint. Standing up OpenShift purely to get MLOps is heavy. A managed cloud platform is faster to start.
- You are all-in on one cloud. If everything lives in AWS, Azure, or Google Cloud and will stay there, that cloud’s native platform removes more operational work.
- You want a fully hosted, zero-operations service. OpenShift AI still runs on a cluster your team or a provider operates. It is not a serverless API you call and forget.
- Your models are tiny and infrequent. If you serve one small model occasionally, a container or a hosted endpoint is simpler than a full MLOps platform.
Further reading
- Red Hat OpenShift: the application platform OpenShift AI is built on.
- What is MLOps?: the lifecycle this platform is designed to support.
- What is inference?: what happens when a served model answers a request.
- Hybrid and multi-cloud AI: how to decide where AI workloads should run.
- How AI models are evaluated: judging model quality before and after you serve it.
- Red Hat OpenShift AI product page: the official product overview.
- Red Hat OpenShift AI documentation: install, configure, and operate the self-managed edition.
Sources
- Red Hat OpenShift AI product page: https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai
- Red Hat OpenShift AI product page (products path): https://www.redhat.com/en/products/ai/openshift-ai
- Red Hat OpenShift AI for developers: https://developers.redhat.com/products/red-hat-openshift-ai
- Red Hat OpenShift AI self-managed documentation: https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/
- KServe project: https://github.com/kserve/kserve
- Autoscaling vLLM with OpenShift AI (Red Hat Developer): https://developers.redhat.com/articles/2025/10/02/autoscaling-vllm-openshift-ai