RAY
Distributed AI compute engine for scaling ML workloads
Apache-2.0
ABOUT
Scaling machine learning workloads from a single machine to a cluster typically requires rewriting code around complex distributed systems tooling such as MPI, Kubernetes, or Spark. Ray eliminates this barrier by providing a simple, universal API for distributed computing that works the same on a laptop and a 1000-node cluster. Its ecosystem of libraries (Ray Train, Ray Tune, Ray RLlib, Ray Serve) lets teams scale training, tuning, and serving without changing their PyTorch, TensorFlow, or XGBoost code, dramatically reducing the engineering effort needed for production ML infrastructure.
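To make the "same code from laptop to cluster" claim concrete, here is a minimal sketch using Ray's core task API. The function and values are illustrative placeholders, but ray.init, @ray.remote, and ray.get are Ray's stable core primitives:

import ray

ray.init()  # connects to a running cluster if one exists, else starts a local one

@ray.remote
def square(x):
    # An ordinary Python function, executed as a distributed task
    return x * x

# .remote() schedules the task and returns a future immediately;
# ray.get() blocks until the results are available.
futures = [square.remote(i) for i in range(8)]
print(ray.get(futures))  # [0, 1, 4, 9, 16, 25, 36, 49]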
INSTALL
pip install ray
INTEGRATION GUIDE
1. Scaling PyTorch or TensorFlow training across multiple GPUs or nodes with minimal code changes (see the Ray Train sketch below)
2. Running large-scale hyperparameter searches with Ray Tune on a distributed cluster (see the Ray Tune sketch below)
3. Deploying and serving ML models in production with Ray Serve alongside existing applications (see the Ray Serve sketch below)
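For item 1, a minimal sketch of multi-GPU PyTorch training with Ray Train. The model and training loop are toy placeholders, and the TorchTrainer/ScalingConfig names follow the Ray 2.x API, which may differ in other versions:

import torch
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer, prepare_model

def train_func():
    # prepare_model wraps the model for distributed data parallelism across workers
    model = prepare_model(torch.nn.Linear(10, 1))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(10):  # toy training loop on random data
        loss = model(torch.randn(32, 10)).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

trainer = TorchTrainer(
    train_func,
    scaling_config=ScalingConfig(num_workers=4, use_gpu=True),  # 4 GPU workers
)
trainer.fit()

The key point is that train_func is ordinary PyTorch; only the prepare_model call and the ScalingConfig change when moving from one GPU to many.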
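For item 2, a hedged sketch of a distributed hyperparameter search with Ray Tune. The objective function and search space are illustrative; the Tuner API follows Ray 2.x:

from ray import tune

def objective(config):
    # Placeholder objective; returning a dict reports it as the trial's final result
    return {"score": (config["lr"] - 0.01) ** 2}

tuner = tune.Tuner(
    objective,
    param_space={"lr": tune.loguniform(1e-4, 1e-1)},
    tune_config=tune.TuneConfig(num_samples=20, metric="score", mode="min"),
)
results = tuner.fit()
print(results.get_best_result().config)  # best hyperparameters found

On a cluster, the 20 trials are scheduled across available nodes automatically; the search code itself does not change.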
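And for item 3, a sketch of serving a model with Ray Serve. The deployment class and its "model" are stand-ins; @serve.deployment, bind, and serve.run follow the Ray 2.x API:

from ray import serve
from starlette.requests import Request

@serve.deployment
class Predictor:
    def __init__(self):
        self.coef = 2.0  # stand-in for loading a real model

    async def __call__(self, request: Request) -> dict:
        payload = await request.json()
        return {"prediction": self.coef * payload["x"]}

serve.run(Predictor.bind())  # serves HTTP at http://127.0.0.1:8000/ by default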
TAGS
python, distributed-computing, ml-engineering, cluster, open-source