What types of GPU resources does Modal offer?
Modal provides access to a wide range of NVIDIA GPUs, including the B200, H200, H100, RTX PRO 6000, A100 (80GB and 40GB), L40S, A10, L4, and T4. GPU capacity is elastic: you can scale compute up or down to match each workload, without quotas or reservations.
How does Modal's pricing model work for compute resources?
Modal uses usage-based pricing: you pay only for actual compute time, metered by the CPU cycle or GPU second, and you are not charged for idle resources. Because the platform is serverless, costs scale automatically with usage, which makes it efficient for spiky or unpredictable workloads.
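To make the difference between usage-based and always-on billing concrete, here is a minimal Python sketch of the arithmetic. The per-second rate, the request volume, and the function name `usage_cost` are all hypothetical values chosen for illustration; they are not Modal's actual prices or API.

```python
def usage_cost(active_seconds: float, rate_per_second: float) -> float:
    """Cost when billing covers only the seconds a workload actually runs."""
    if active_seconds < 0 or rate_per_second < 0:
        raise ValueError("seconds and rate must be non-negative")
    return active_seconds * rate_per_second


# Hypothetical GPU rate in USD per second (illustrative only, not a real price).
GPU_RATE_PER_SECOND = 0.001

# Assumed spiky workload: 200 requests per day, 3 GPU-seconds each.
active_seconds_per_day = 200 * 3
seconds_in_a_day = 24 * 60 * 60

# Usage-based billing charges only for the active seconds.
pay_per_use = usage_cost(active_seconds_per_day, GPU_RATE_PER_SECOND)

# An always-on GPU is billed for the whole day, idle time included.
always_on = usage_cost(seconds_in_a_day, GPU_RATE_PER_SECOND)

print(f"Usage-based: ${pay_per_use:.2f} per day")
print(f"Always-on:   ${always_on:.2f} per day")
```

Under these assumed numbers the GPU is busy for only 600 of the 86,400 seconds in a day, so the idle time dominates the always-on bill. The gap narrows as utilization rises.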
Can I use Modal for both inference and training of AI models?
Yes, Modal supports both inference and training workloads. For inference, you can deploy and scale LLMs, audio models, and image and video generation models. For training, you can fine-tune open-source models on single-node or multi-node clusters that are available instantly.
What is included in Modal's free Starter plan?
The Starter plan offers $30 per month in free compute credits, 3 workspace seats, up to 100 containers, and a GPU concurrency limit of 10. It also includes limited crons and web endpoints, real-time metrics, logs, and region selection, making it suitable for small teams and independent developers.
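As a rough way to judge how far the monthly credit goes, the sketch below converts a credit balance into GPU-hours. The $30 figure comes from the plan description above; the hourly rates and the function name `gpu_hours_from_credits` are hypothetical placeholders, so substitute the current published rates before relying on the result.

```python
def gpu_hours_from_credits(credits_usd: float, rate_per_hour: float) -> float:
    """Number of GPU-hours a credit balance covers at a given hourly rate."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return credits_usd / rate_per_hour


STARTER_CREDITS_USD = 30.0  # monthly free credits stated for the Starter plan

# Hypothetical hourly rates in USD (illustrative only, not real prices).
ASSUMED_RATES = {"smaller GPU": 0.60, "larger GPU": 4.00}

for gpu, rate in ASSUMED_RATES.items():
    hours = gpu_hours_from_credits(STARTER_CREDITS_USD, rate)
    print(f"{gpu}: about {hours:.1f} GPU-hours per month")
```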
Does Modal offer any programs for startups or academics?
Yes. Early-stage startups can apply for credit grants of up to $25,000 in free compute credits. Graduate students, labs, and researchers can apply for academic grants of up to $10,000 in free compute credits to support their work.