Self-Hosting RagaAI Catalyst on Kubernetes
This guide walks you through deploying the Catalyst platform to an existing Kubernetes cluster using Helm. Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), and Amazon Elastic Kubernetes Service (EKS) are supported.
Contact us at [email protected] for more information.
Supported Kubernetes Distributions
Catalyst has been successfully tested on the following Kubernetes distributions:
Azure Kubernetes Service (AKS)
Google Kubernetes Engine (GKE)
Amazon Elastic Kubernetes Service (EKS)
Prerequisites
Ensure the following tools and resources are ready:
A working Kubernetes cluster (AKS, GKE, or EKS) accessible via kubectl, meeting these minimum requirements:
Kubernetes version 1.28 or higher
At least 3 nodes, each with:
8 vCPUs
16 GiB RAM
Recommended: Use a cluster autoscaler to dynamically scale nodes based on resource usage
Recommended: Install the metrics server to enable autoscaling
Catalyst uses Elasticsearch, Kibana, and Redis (for caching), all of which require persistent storage.
Verify storage class availability by running:
kubectl get storageclass
Example output:
NAME                PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default (default)   disk.csi.azure.com   Delete          WaitForFirstConsumer   true                   120d
Helm
Install Helm version 3.13 or higher.
See the official Helm documentation for instructions
Docker Personal Access Token (PAT)
Obtain this from your RagaAI representative.
Contact [email protected] for details
Ingress
Nginx Ingress is recommended for managing external traffic to Catalyst services
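The version requirements above can be checked before installing anything. The following is a minimal sketch, not official RagaAI tooling: version_ge and preflight are hypothetical helper names, sort -V assumes GNU coreutils or a compatible sort, and the kubelet version of the first node is used as an approximation of the cluster version.

```shell
#!/bin/sh
# Hypothetical preflight helpers; names and approach are illustrative assumptions.

# version_ge A B: succeeds when dotted version A is greater than or equal to B.
# Relies on `sort -V` (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# preflight: compares the installed versions against the minimums in this guide.
# Call it only where kubectl and helm are already configured for the cluster.
preflight() {
  k8s_version="$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}')"
  helm_version="$(helm version --template '{{.Version}}')"
  # Strip the leading "v" (for example, v1.29.4 becomes 1.29.4).
  version_ge "${k8s_version#v}" 1.28 || {
    echo "Kubernetes ${k8s_version} is older than 1.28" >&2
    return 1
  }
  version_ge "${helm_version#v}" 3.13 || {
    echo "Helm ${helm_version} is older than 3.13" >&2
    return 1
  }
  echo "kubectl and helm meet the minimum versions"
}
```

Running preflight after sourcing the file prints a single confirmation line, or an error naming the tool that is too old.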
Cloud Setup
Note: Connections between your Kubernetes cluster and cloud resources are secured using cloud-native identity mechanisms:
AWS: IAM Roles for Service Accounts (IRSA)
Azure: Federated Identity Credential
GCP: Workload Identity
AWS
S3 Bucket Setup:
Create an S3 Bucket for object storage.
Set up IRSA (IAM Roles for Service Accounts)
Role Name: raga-role
Permissions: Access to the S3 bucket
Trust relationship: EKS OIDC provider
Service account: system:serviceaccount:raga:raga-role
Configure CORS settings:
Allowed Methods: GET, PUT
Allowed Origins: *
Allowed Headers: *
Exposed Headers: none
Max Age: 3000 seconds
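The IRSA trust relationship and the CORS settings above could be written as JSON documents for the AWS CLI. This is a sketch under assumptions: the helper names are hypothetical, and the account ID and OIDC provider passed to render_trust_policy are placeholders you would replace with your own values. The Exposed Headers setting is "none", so the CORS document simply omits that field.

```shell
#!/bin/sh
# render_trust_policy ACCOUNT_ID OIDC_PROVIDER
# Prints an IAM trust policy that lets the raga/raga-role service account
# assume the role through the cluster's OIDC provider.
render_trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Federated": "arn:aws:iam::$1:oidc-provider/$2" },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "$2:sub": "system:serviceaccount:raga:raga-role",
          "$2:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF
}

# render_s3_cors: prints the CORS rules listed above in the format
# accepted by `aws s3api put-bucket-cors --cors-configuration`.
render_s3_cors() {
  cat <<'EOF'
{
  "CORSRules": [
    {
      "AllowedMethods": ["GET", "PUT"],
      "AllowedOrigins": ["*"],
      "AllowedHeaders": ["*"],
      "MaxAgeSeconds": 3000
    }
  ]
}
EOF
}

# Possible usage (placeholders in angle brackets are yours to fill in):
#   render_trust_policy <account-id> <oidc-provider> > trust-policy.json
#   aws iam create-role --role-name raga-role \
#     --assume-role-policy-document file://trust-policy.json
#   render_s3_cors > cors.json
#   aws s3api put-bucket-cors --bucket <bucket-name> \
#     --cors-configuration file://cors.json
```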
Database Setup:
Database Version: MySQL 8.0 or later
Network Configuration: Deploy the database within the same VPC as your EKS cluster using a private endpoint
Storage: At least 50GB of SSD storage with automatic storage scaling enabled
Connectivity: Ensure the EKS nodes can reach the MySQL instance.
Azure
Azure Blob Storage Setup:
Create an Azure Blob Storage Account.
Create an Azure Storage Container within the Blob Storage Account.
Configure CORS:
Allowed Methods: GET, PUT
Allowed Origins: *
Allowed Headers: *
Set up Federated Identity Credential (Azure AD Workload Identity) for secure access from Kubernetes:
Assign the following roles to the Service Principal for the storage account:
Storage Blob Data Contributor
Kubernetes Service Account: raga-role
Enable Azure Workload Identity in your AKS cluster.
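The CORS rule and role assignment above might be applied with the Azure CLI along these lines. Treat this as a sketch: configure_blob_cors and grant_blob_access are hypothetical wrapper names, the storage account name, assignee ID, and scope are values you supply, and the az commands assume you are already logged in with sufficient rights on the storage account.

```shell
#!/bin/sh
# configure_blob_cors STORAGE_ACCOUNT
# Adds a CORS rule for the Blob service (--services b) with the settings above.
configure_blob_cors() {
  az storage cors add \
    --account-name "$1" \
    --services b \
    --methods GET PUT \
    --origins '*' \
    --allowed-headers '*'
}

# grant_blob_access ASSIGNEE_ID STORAGE_ACCOUNT_RESOURCE_ID
# Assigns the Storage Blob Data Contributor role, scoped to the storage account.
grant_blob_access() {
  az role assignment create \
    --assignee "$1" \
    --role "Storage Blob Data Contributor" \
    --scope "$2"
}

# Possible usage:
#   configure_blob_cors <storage-account-name>
#   grant_blob_access <client-or-object-id> <storage-account-resource-id>
```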
Database (MySQL) Requirements:
Database Version: MySQL 8.0 or later
Network Configuration: Deploy within the same VNet as your AKS cluster using a private endpoint
Storage: At least 50GB SSD storage with automatic storage increase enabled
Ensure the AKS Nodes can access the Azure Database for MySQL Server
GCP
Google Cloud Storage Setup:
Create a GCS Bucket for object storage.
Configure CORS with the following settings:
Allowed Methods: GET, PUT
Allowed Origins: * (all origins)
Allowed Headers: * (all headers)
Set up Workload Identity to access this bucket from GKE:
Create a Google Service Account and grant it the following roles:
roles/storage.admin
roles/storage.objectAdmin
roles/iam.serviceAccountTokenCreator
GKE Namespace: raga
GKE Service Account: raga-role
Bind the GKE service account to the Google service account using Workload Identity.
Enable Workload Identity and the GCE Persistent Disk CSI Driver in your GKE cluster.
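The GCS CORS settings and the Workload Identity binding above can be sketched with a few small helpers. The function names are hypothetical, the project ID and Google service account email are values you supply, and the CORS document leaves out a maximum age because this guide does not specify one for GCP.

```shell
#!/bin/sh
# render_gcs_cors: prints a CORS document in the JSON format read by
# `gcloud storage buckets update --cors-file`.
render_gcs_cors() {
  cat <<'EOF'
[
  {
    "origin": ["*"],
    "method": ["GET", "PUT"],
    "responseHeader": ["*"]
  }
]
EOF
}

# workload_identity_member PROJECT_ID
# Prints the IAM member string for the raga/raga-role Kubernetes service account.
workload_identity_member() {
  printf 'serviceAccount:%s.svc.id.goog[raga/raga-role]\n' "$1"
}

# bind_workload_identity PROJECT_ID GSA_EMAIL
# Lets the Kubernetes service account impersonate the Google service account.
bind_workload_identity() {
  gcloud iam service-accounts add-iam-policy-binding "$2" \
    --role roles/iam.workloadIdentityUser \
    --member "$(workload_identity_member "$1")"
}

# Possible usage:
#   render_gcs_cors > cors.json
#   gcloud storage buckets update gs://<bucket-name> --cors-file=cors.json
#   bind_workload_identity <project-id> <gsa-email>
```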
Database (MySQL) Requirements:
Database Version: MySQL 8.0 or later
Network Configuration: Deploy within the same VPC as your GKE cluster using a private IP
Storage: At least 50GB SSD storage with automatic storage increase enabled
Set SSL_mode = "ALLOW_UNENCRYPTED_AND_ENCRYPTED"
Ensure the GKE Nodes can access the CloudSQL Server.
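Each provider section above asks you to confirm that cluster nodes can reach the MySQL server. One way to check this from inside the cluster is a short-lived pod that attempts a TCP connection. This is a sketch, not a RagaAI-provided check: check_mysql_reachable and the pod name are hypothetical, port 3306 is the MySQL default and is assumed here, and the nc flags assume the stock busybox image.

```shell
#!/bin/sh
# check_mysql_reachable HOST [PORT]
# Starts a temporary busybox pod in the current namespace and tries a TCP
# connection to the database; the pod is removed when the command finishes.
check_mysql_reachable() {
  kubectl run mysql-connectivity-check \
    --rm -i --restart=Never \
    --image=busybox \
    -- nc -zv -w 5 "$1" "${2:-3306}"
}

# Possible usage:
#   check_mysql_reachable <mysql-host>
```

A successful connection only shows that the network path is open. It does not validate credentials or TLS settings.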
Configuration
Docker Hub Access
To deploy Catalyst, you must configure access to private Docker Hub repositories hosted by RagaAI.
Obtain Docker PAT:
Contact your RagaAI representative at [email protected] to obtain a Docker Hub Personal Access Token (PAT).
Log in to Docker Hub:
Use the provided PAT to authenticate with Docker Hub:
docker login -u ragaai -p <docker-pat>
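Passing the token with -p leaves it in shell history and the process list. A possible alternative is to pipe it through standard input using Docker's --password-stdin option. The sketch below assumes the token is stored in an environment variable named DOCKER_PAT, which is a name chosen here for illustration; docker_login_with_pat is likewise a hypothetical helper.

```shell
#!/bin/sh
# docker_login_with_pat: reads the token from the DOCKER_PAT variable
# (an assumed name) and passes it on standard input, not the command line.
docker_login_with_pat() {
  if [ -z "${DOCKER_PAT:-}" ]; then
    echo "DOCKER_PAT is not set" >&2
    return 1
  fi
  printf '%s' "$DOCKER_PAT" | docker login -u ragaai --password-stdin
}

# Possible usage:
#   export DOCKER_PAT=<docker-pat>
#   docker_login_with_pat
```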
Firewall Rules
Inbound Ports:
Port 80 (HTTP): Required at the Load Balancer for accessing APIs and UI
Outbound Ports:
Port 443 (HTTPS): Required if connecting to public LLMs (e.g., OpenAI, Anthropic). If not needed, deploy local models within the network
SMTP (Optional): Required for email alerts
Deploying to Kubernetes
Networking & Traffic Management
Ensure the Nginx Ingress Controller pods are up and running to manage external traffic. Verify by running:
kubectl get pods -n ingress-nginx
Refer to the Nginx Ingress installation instructions for details on setup and troubleshooting.
Deploy the Catalyst initialization Helm chart:
helm install raga-init oci://registry-1.docker.io/ragaai/raga-init \
  --version 0.1.0 \
  --set dockerpat=<docker-pat>
Successful output example:
NAME: raga-init
LAST DEPLOYED: Thu Jun 26 15:34:00 2025
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Verify:
kubectl get ns raga
kubectl get secret regcred -n raga
Deploy the Catalyst Helm chart:
helm install raga-catalyst oci://registry-1.docker.io/ragaai/raga-catalyst \
  --version 0.1.0 \
  -n raga \
  --set releaseTag=<release-tag> \
  --set storageClass=<storage-class> \
  --set endpoint=<http://loadbalancer-endpoint> \
  --set mysql.host=<mysql-host> \
  --set mysql.user=<mysql-user> \
  --set mysql.password=<mysql-password>
Replace <mysql-host>, <mysql-user>, and <mysql-password> with your MySQL instance details. These parameters are required for connecting Catalyst to your external MySQL database.
- EKSclusterName
- AzureBlobStorageName
- GcpServiceAccountName
- ClusterAutoscalerRoleARN
- AzureBlobStorageContainerName
- GcsBucketname
- AWSRoleARN
- azWorkloadIdentity.tenantId
- S3BucketName
- azWorkloadIdentity.clientId
- AWSRegion
The parameters listed above are provider-specific. Set only those relevant to your cloud provider during Helm installation.
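To avoid mixing parameters from different providers, the extra --set flags could be generated by a small helper. This is a sketch only: provider_params and build_set_flags are hypothetical names, and the grouping of parameters by provider is inferred from the parameter names, not confirmed by the chart. Check with your RagaAI representative which parameters your deployment actually requires.

```shell
#!/bin/sh
# provider_params PROVIDER: prints the parameter names assumed to belong to
# a provider. The grouping is an inference from the names (assumption).
provider_params() {
  case "$1" in
    aws)   echo "EKSclusterName AWSRegion AWSRoleARN S3BucketName ClusterAutoscalerRoleARN" ;;
    azure) echo "AzureBlobStorageName AzureBlobStorageContainerName azWorkloadIdentity.tenantId azWorkloadIdentity.clientId" ;;
    gcp)   echo "GcsBucketname GcpServiceAccountName" ;;
    *)     return 1 ;;
  esac
}

# build_set_flags PROVIDER KEY=VALUE...
# Prints "--set" and "KEY=VALUE" on separate lines, and fails without output
# if any key falls outside the chosen provider's group.
build_set_flags() {
  provider="$1"
  shift
  allowed="$(provider_params "$provider")" || {
    echo "unknown provider: $provider" >&2
    return 1
  }
  for kv in "$@"; do
    key="${kv%%=*}"
    case " $allowed " in
      *" $key "*) ;;
      *) echo "parameter $key is not in the $provider group" >&2; return 1 ;;
    esac
  done
  for kv in "$@"; do
    printf '%s\n' "--set" "$kv"
  done
}

# Possible usage (relies on word splitting, so values must not contain spaces):
#   helm install raga-catalyst oci://registry-1.docker.io/ragaai/raga-catalyst \
#     --version 0.1.0 -n raga \
#     $(build_set_flags gcp GcsBucketname=<bucket> GcpServiceAccountName=<gsa>)
```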
Successful output example:
NAME: raga-catalyst
LAST DEPLOYED: Thu Jun 26 15:35:00 2025
NAMESPACE: raga
STATUS: deployed
REVISION: 1
TEST SUITE: None
It may take a few minutes to create Kubernetes resources and initialize services
Check pods:
kubectl get pods -n raga
Example output:
litellm-76bd8cdd67-4brtw                          1/1   Running   0   22h
llm-data-loader-5d4858fcc5-n6qj8                  1/1   Running   0   22h
llm-platform-api-79d44b7b6d-hsfdz                 1/1   Running   0   61m
llm-platform-esservice-74b4cf4876-gl7bf           1/1   Running   0   22h
llm-platform-operators-869959f965-gdpjc           1/1   Running   0   22h
llm-platform-raga-catalyst-sdk-66f5bcc494-fwxfs   1/1   Running   0   22h
llm-platform-status-updater-7bd749b98f-tn55b      1/1   Running   0   22h
llm-platform-ui-784c69c459-wqq26                  1/1   Running   0   7h20m
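Since startup can take a few minutes, you may prefer to block until the pods are ready instead of re-running the command above. A minimal sketch using kubectl wait follows. wait_for_catalyst is a hypothetical wrapper, the 600-second default is an arbitrary choice, and the approach assumes every pod in the namespace is long-running (a completed one-off pod would never report Ready).

```shell
#!/bin/sh
# wait_for_catalyst [TIMEOUT]
# Blocks until every pod in the raga namespace reports Ready,
# or fails once the timeout expires.
wait_for_catalyst() {
  kubectl wait pods --all \
    --namespace raga \
    --for=condition=Ready \
    --timeout="${1:-600s}"
}

# Possible usage:
#   wait_for_catalyst        # default timeout of 600s
#   wait_for_catalyst 15m    # custom timeout
```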
Validate your Deployment
Run:
kubectl get services -n raga
Example output:
litellm                          ClusterIP   10.103.99.23    <none>   80/TCP         22h
llm-data-loader                  ClusterIP   10.96.198.238   <none>   80/TCP         22h
llm-platform-api                 ClusterIP   10.99.5.206     <none>   80/TCP         64m
llm-platform-api-nodeport        NodePort    10.109.130.90   <none>   80:31200/TCP   64m
llm-platform-esservice           ClusterIP   10.96.80.164    <none>   80/TCP         22h
llm-platform-operators           ClusterIP   10.98.17.72     <none>   80/TCP         22h
llm-platform-raga-catalyst-sdk   ClusterIP   10.111.108.37   <none>   80/TCP         22h
llm-platform-status-updater      ClusterIP   10.107.89.124   <none>   80/TCP         22h
llm-platform-ui                  ClusterIP   10.105.47.0     <none>   80/TCP         7h23m
Access the platform using the external IP of the raga-catalyst-frontend service:
curl <external-ip>/api/healthcheck
Expected output:
{"status":"healthy"}
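Right after installation the endpoint may not respond yet, so the health check can be retried in a loop. The sketch below is one way to do that. wait_until_healthy is a hypothetical helper, the default of 30 attempts spaced 10 seconds apart is arbitrary, and the match assumes the response body has exactly the form shown above, with no extra whitespace.

```shell
#!/bin/sh
# wait_until_healthy BASE_URL [ATTEMPTS] [DELAY_SECONDS]
# Polls BASE_URL/api/healthcheck until the body reports a healthy status.
wait_until_healthy() {
  attempts="${2:-30}"
  delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Treat connection errors and HTTP errors as "not ready yet".
    body="$(curl -fsS --max-time 5 "$1/api/healthcheck" 2>/dev/null)" || body=""
    case "$body" in
      *'"status":"healthy"'*)
        echo "healthy after $i attempt(s)"
        return 0
        ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Possible usage:
#   wait_until_healthy http://<external-ip>
```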
Visit the external IP in your browser to confirm the Catalyst UI is operational
Example:
http://<external-ip>
The UI should be visible and functional
Final Notes
Ensure proper IAM permissions for your cloud provider's storage, Kubernetes service, and Helm deployments
Monitor cluster health using ELK and kubectl logs
Check Helm release status:
helm list -n raga
helm status raga-catalyst -n raga
For issues, contact the RagaAI team at [email protected]