Serverless GPUs in Private Preview: L4, L40S, V100, and more
Today, we're excited to share that GPU Instances designed for AI inference workloads are available in private preview. These GPUs provide up to 48GB of VRAM, 733 TFLOPS of compute, and 900GB/s of memory bandwidth to support large models, including LLMs and text-to-image models.
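As a rough guide to what 48GB of VRAM accommodates, here is a back-of-envelope sketch (not an official sizing tool) estimating how many model parameters fit in a given memory budget, assuming weights dominate memory. In practice, KV caches, activations, and framework overhead reduce the usable budget.

```python
def max_params_billions(vram_gb: float, bytes_per_param: int = 2) -> float:
    """Approximate parameter count (in billions) that fits in vram_gb.

    bytes_per_param defaults to 2 (fp16/bf16 weights); use 1 for int8,
    4 for fp32. Ignores KV cache, activations, and runtime overhead.
    """
    return vram_gb * 1e9 / bytes_per_param / 1e9

# A 48GB GPU holds roughly 24B fp16 parameters before overhead,
# or roughly 48B parameters with int8 quantization.
print(max_params_billions(48))                     # 24.0
print(max_params_billions(48, bytes_per_param=1))  # 48.0
```

Real deployments leave headroom for the KV cache and batch activations, so a comfortable fit is typically a model somewhat smaller than this ceiling.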