CUDA Stencil Benchmark

High-performance CUDA kernel generation and benchmarking framework

View the Project on GitHub jasonlarkin/cuda-stencil-benchmark

Project Goal

Overview

CUDA Stencil Benchmark is a framework that uses LLM-guided code generation to systematically generate, validate, and optimize CUDA kernels for 3D finite-difference stencil computations.
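To make "3D finite-difference stencil" concrete, here is a minimal CPU sketch in plain Python of the kind of computation such kernels perform. It assumes a second-order 7-point Laplacian on a row-major grid; the project does not state which stencils it targets, so the stencil choice and the names `idx` and `laplacian_7pt` are illustrative assumptions, not part of the framework.

```python
def idx(i, j, k, ny, nz):
    """Flatten (i, j, k) into a row-major offset for an nx * ny * nz grid."""
    return (i * ny + j) * nz + k


def laplacian_7pt(u, nx, ny, nz, h=1.0):
    """Second-order 7-point Laplacian of u on interior points.

    Hypothetical reference routine: u is a flat list of length nx * ny * nz.
    Boundary outputs are left at 0.0 because they lack a full neighborhood.
    """
    out = [0.0] * (nx * ny * nz)
    inv_h2 = 1.0 / (h * h)
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                c = idx(i, j, k, ny, nz)
                # Sum of the six face neighbors minus six times the center.
                out[c] = inv_h2 * (
                    u[idx(i - 1, j, k, ny, nz)] + u[idx(i + 1, j, k, ny, nz)]
                    + u[idx(i, j - 1, k, ny, nz)] + u[idx(i, j + 1, k, ny, nz)]
                    + u[idx(i, j, k - 1, ny, nz)] + u[idx(i, j, k + 1, ny, nz)]
                    - 6.0 * u[c]
                )
    return out
```

A CUDA version would typically assign one thread per output point in place of the triple loop; a slow but obviously correct routine like this one is the sort of CPU reference a generated kernel can be compared against.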

Primary Objectives

  1. LLM-Guided Kernel Generation: Use language models to generate optimized CUDA kernels from task specifications and reference implementations.

  2. Correctness-First Validation: Ensure numerical parity between generated CUDA kernels and CPU reference implementations before performance optimization.

  3. Systematic Performance Analysis: Characterize kernel performance using roofline methodology to determine whether each kernel is memory-bound or compute-bound.

  4. Iterative Optimization Loop: Establish a feedback-driven workflow where kernel generation, correctness testing, and performance analysis inform subsequent optimization attempts.
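Objectives 2 through 4 can be sketched as a single evaluation step: a parity check against the reference gates a roofline classification, and the result feeds the next generation attempt. This is a minimal illustration in plain Python, not the framework's actual API. The function names, the tolerance defaults, and the result dictionary are all assumptions made for the example.

```python
def passes_parity(candidate, reference, rtol=1e-5, atol=1e-8):
    """True when every element satisfies |c - r| <= atol + rtol * |r|.

    A NaN in either input fails the comparison, so a kernel that
    produces NaNs is rejected rather than silently accepted.
    """
    if len(candidate) != len(reference):
        return False
    return all(
        abs(c - r) <= atol + rtol * abs(r)
        for c, r in zip(candidate, reference)
    )


def roofline(flops, bytes_moved, peak_gflops, peak_gbps):
    """Return (arithmetic intensity, attainable GFLOP/s, regime)."""
    intensity = flops / bytes_moved      # FLOP per byte
    ridge = peak_gflops / peak_gbps      # intensity where the two roofs meet
    attainable = min(peak_gflops, intensity * peak_gbps)
    regime = "memory-bound" if intensity < ridge else "compute-bound"
    return intensity, attainable, regime


def evaluate(candidate, reference, flops, bytes_moved, peak_gflops, peak_gbps):
    """Correctness gate first; performance analysis only for passing kernels."""
    if not passes_parity(candidate, reference):
        return {"correct": False}
    intensity, attainable, regime = roofline(
        flops, bytes_moved, peak_gflops, peak_gbps
    )
    return {
        "correct": True,
        "intensity": intensity,
        "attainable_gflops": attainable,
        "regime": regime,
    }
```

As a usage example with made-up numbers: a stencil that performs 8 FLOPs per point and moves 8 bytes per point has an arithmetic intensity of 1 FLOP/byte. On a hypothetical device with a 10000 GFLOP/s compute peak and 500 GB/s of memory bandwidth, `roofline(8.0, 8.0, 10000.0, 500.0)` classifies it as memory-bound with an attainable ceiling of 500 GFLOP/s, which suggests that optimization effort should target data movement rather than arithmetic.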

Target Applications

Success Criteria

Non-Goals