Install

Requirements

Hardware

  • At least two nodes with a combined total of 16 cores and 32 GB of memory.
  • Additional resources for model serving depend on your actual workload scale. For example, running ten 7B-parameter LLM inference instances concurrently requires at least 10 GPUs, plus corresponding CPU, memory, disk storage, and object storage.
  • 200 GB of free disk space on each worker node.
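The disk-space requirement above can be checked on each worker node with a short script. This is a minimal sketch: the mount point `/` is an assumption, so point it at the filesystem your container runtime and model storage actually use.

```shell
# Check that this worker node has at least 200 GB of free disk space.
# MOUNT_POINT="/" is an assumption -- adjust it to the filesystem your
# container runtime and model storage actually use.
REQUIRED_GB=200
MOUNT_POINT="/"

# GNU df: report sizes in 1G blocks, print only the "avail" column,
# then strip everything except the digits.
avail_gb=$(df -BG --output=avail "$MOUNT_POINT" | tail -1 | tr -dc '0-9')

if [ "$avail_gb" -ge "$REQUIRED_GB" ]; then
  echo "OK: ${avail_gb}G free on $MOUNT_POINT"
else
  echo "WARN: only ${avail_gb}G free on $MOUNT_POINT (need ${REQUIRED_GB}G)"
fi
```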

Software

  • CUDA Toolkit Version: 12.6 or higher.
INFO

If your GPU does not support CUDA 12.6, you can still use an older version of the CUDA Toolkit. However, after deploying Alauda AI you must add a custom inference runtime adapted to that older CUDA version, because the built-in vLLM inference runtime only supports CUDA 12.6 or later. Refer to Extend LLM Inference Runtimes.
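To find out which path applies to you, you can check the CUDA version reported by the GPU driver against the 12.6 minimum. A minimal sketch, assuming `nvidia-smi` is on PATH on the GPU node:

```shell
# Return success if version "$1" (e.g. "12.4") is at least 12.6.
cuda_at_least_12_6() {
  major=${1%%.*}
  minor=${1#*.}; minor=${minor%%.*}
  [ "$major" -gt 12 ] || { [ "$major" -eq 12 ] && [ "$minor" -ge 6 ]; }
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # nvidia-smi prints the driver-supported CUDA version in its header.
  cuda_ver=$(nvidia-smi | grep -oE 'CUDA Version: *[0-9]+\.[0-9]+' \
             | grep -oE '[0-9]+\.[0-9]+')
  if cuda_at_least_12_6 "$cuda_ver"; then
    echo "CUDA $cuda_ver: built-in vLLM runtime is usable"
  else
    echo "CUDA $cuda_ver: add a custom runtime for older CUDA versions"
  fi
else
  echo "nvidia-smi not found; run this check on a GPU node"
fi
```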

Installing

Installing Alauda AI involves the following high-level tasks:

  1. Confirm and configure your cluster to meet all requirements. Refer to Pre-installation Configuration.
  2. Install Alauda AI Essentials. Refer to Install Alauda AI Essentials.
  3. Install Alauda AI. Refer to Install Alauda AI.

Once these tasks are complete, the core capabilities of Alauda AI are deployed. To quickly try out the product, refer to the Quick Start.
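After the installation steps complete, you may want to confirm that the Alauda AI workloads are running. A hedged sketch, where the namespace is a placeholder you must replace with the one your installation actually used:

```shell
# Placeholder -- replace with the namespace used by your installation.
AI_NAMESPACE="<your-alauda-ai-namespace>"

if command -v kubectl >/dev/null 2>&1; then
  # All pods should eventually reach Running or Completed.
  kubectl get pods -n "$AI_NAMESPACE"

  # Optionally block until all deployments report Available (or time out).
  kubectl wait --for=condition=Available deployment --all \
    -n "$AI_NAMESPACE" --timeout=300s
else
  echo "kubectl not found; run this from a machine with cluster access"
fi
```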