Benchmark: Stable Diffusion Image Generation

Key Result: 112 Images/Second
Achieve high-throughput performance for generative AI services, creative tools, and content platforms.
The Challenge
Generative AI applications, especially those offering real-time image generation, are computationally demanding. Services built on models like Stable Diffusion must handle many concurrent user requests at low latency to stay responsive. Scaling these services on traditional cloud instances can be prohibitively expensive and operationally complex.
The DGX Spark Solution
The NVIDIA DGX Spark is engineered for high-throughput inference. Leveraging its NVIDIA Blackwell architecture and optimized software like TensorRT, it can process large batches of generation requests in parallel. This makes it a cost-effective option for deploying generative AI models at the edge, or for use as a dedicated, at-your-desk development and testing platform before scaling to the cloud.
Technical Deep Dive
The benchmark was conducted using Stable Diffusion v1.5. To maximize performance, we used PyTorch with xFormers for memory-efficient attention and AITemplate to compile the model into a highly optimized engine. The test measured sustained throughput for generating 512x512 pixel images with 50 inference steps at a batch size of 16, simulating a real-world API workload.
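As a rough illustration of what "sustained throughput" means here, the sketch below times a generation loop over a fixed window after a short warm-up. It is not the harness used for this benchmark: `measure_throughput` and `fake_generate_batch` are hypothetical names, and the stand-in generator only sleeps so that the example runs without a GPU. In a real test you would replace it with a call into your compiled Stable Diffusion pipeline.

```python
import time
from typing import Callable


def measure_throughput(
    generate_batch: Callable[[int], int],
    batch_size: int,
    duration_s: float,
    warmup_batches: int = 2,
) -> float:
    """Return sustained images/second over a timed window.

    generate_batch is assumed to block until the batch is finished and
    to return the number of images it produced.
    """
    # Warm-up runs are excluded so one-time costs (compilation, memory
    # allocation) do not distort the sustained figure.
    for _ in range(warmup_batches):
        generate_batch(batch_size)

    images = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        images += generate_batch(batch_size)
    elapsed = time.perf_counter() - start
    return images / elapsed


def fake_generate_batch(batch_size: int) -> int:
    """Hypothetical stand-in for a real pipeline call (assumption)."""
    time.sleep(0.01)  # pretend the batch took 10 ms
    return batch_size


if __name__ == "__main__":
    rate = measure_throughput(fake_generate_batch, batch_size=16, duration_s=1.0)
    print(f"{rate:.1f} images/sec")
```

Timing a window of back-to-back batches, rather than a single batch, is what distinguishes a sustained figure from a peak one.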
Quantifiable Results
Under these conditions, a single DGX Spark unit achieved a sustained generation speed of 112 images per second. This level of performance enables the development of robust, responsive applications that can serve thousands of users without the high costs of large-scale cloud infrastructure.
Benchmark Settings
- Model: Stable Diffusion v1.5
- Software: PyTorch with xFormers, AITemplate
- Resolution: 512x512 pixels
- Inference Steps: 50
- Batch Size: 16
- Performance: 112 images/sec
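To put these settings in perspective, the short calculation below derives what the headline figure implies per batch. It is back-of-the-envelope arithmetic only, assuming batches are processed one after another with no overlap; the function names are illustrative.

```python
def batches_per_second(images_per_sec: float, batch_size: int) -> float:
    """Batches completed each second at the given throughput."""
    return images_per_sec / batch_size


def batch_latency_ms(images_per_sec: float, batch_size: int) -> float:
    """Time per batch in milliseconds, assuming back-to-back batches."""
    return batch_size / images_per_sec * 1000.0


def denoising_steps_per_second(images_per_sec: float, steps: int) -> float:
    """Per-image denoising steps executed each second."""
    return images_per_sec * steps


if __name__ == "__main__":
    # Values taken from the benchmark settings listed above.
    throughput, batch, steps = 112, 16, 50
    print(batches_per_second(throughput, batch))            # 7.0
    print(f"{batch_latency_ms(throughput, batch):.1f} ms")  # 142.9 ms
    print(denoising_steps_per_second(throughput, steps))    # 5600
```

Under that assumption, 112 images per second at a batch size of 16 works out to 7 batches per second, or roughly 143 ms for each 50-step batch.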
Power Your Generative AI Vision
Start building the next generation of creative tools with the DGX Spark.