GPU Computing
The use of graphics processing units for general-purpose parallel computation, now the dominant hardware platform for training and running AI models.
What is GPU Computing?
GPU computing harnesses the massively parallel architecture of graphics processing units—originally designed for rendering pixels—to accelerate computational workloads like AI model training, scientific simulation, and data processing. A single modern GPU contains thousands of cores that can execute operations simultaneously, making it orders of magnitude faster than CPUs for parallelisable tasks.
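The throughput gap can be illustrated with a simple back-of-envelope model. The core counts, per-core rates, and clock speeds below are rough, hypothetical figures chosen for illustration, not vendor specifications:

```python
# Illustrative peak-throughput comparison for a simple core model.
# All hardware figures are hypothetical round numbers, not spec-sheet values.

def peak_flops(cores: int, flops_per_core_per_cycle: int, clock_hz: float) -> float:
    """Peak floating-point operations per second under a naive core model."""
    return cores * flops_per_core_per_cycle * clock_hz

# A CPU: few cores, each wide (SIMD) and fast.
cpu = peak_flops(cores=16, flops_per_core_per_cycle=32, clock_hz=3.5e9)

# A GPU: thousands of cores, each narrow and slower-clocked.
gpu = peak_flops(cores=16_896, flops_per_core_per_cycle=2, clock_hz=1.8e9)

print(f"CPU ~{cpu / 1e12:.1f} TFLOPS, GPU ~{gpu / 1e12:.1f} TFLOPS, "
      f"ratio ~{gpu / cpu:.0f}x")
```

The point is structural, not the exact numbers: the GPU wins on aggregate throughput only when the workload keeps thousands of cores busy at once.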
Why GPUs Dominate AI
Training a large language model involves trillions of matrix multiplications, exactly the kind of dense, regular arithmetic that GPUs execute efficiently. NVIDIA's CUDA ecosystem has become the de facto standard, and its A100 and H100 GPUs power the vast majority of AI training runs worldwide.
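The scale of that arithmetic can be sketched with the widely used approximation of roughly 6 FLOPs per parameter per training token for dense transformers. The model size, token count, cluster size, and sustained throughput below are hypothetical illustrations, not measurements:

```python
# Back-of-envelope training compute using the common ~6 * params * tokens
# approximation for dense transformer training. All figures are illustrative.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs (forward + backward passes)."""
    return 6 * n_params * n_tokens

flops = training_flops(n_params=70e9, n_tokens=2e12)  # e.g. a 70B model, 2T tokens

# Wall-clock estimate on a hypothetical cluster: 1,000 GPUs each sustaining
# 400 TFLOPS (sustained throughput is well below peak in practice).
seconds = flops / (1_000 * 400e12)
print(f"~{flops:.1e} FLOPs, ~{seconds / 86_400:.0f} days")
```

Even under these generous assumptions the run takes weeks, which is why large-scale training is a cluster problem rather than a single-GPU problem.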
- Training — Large-scale model training across GPU clusters (days to weeks)
- Inference — Serving model predictions in production (milliseconds per request)
- Fine-tuning — Adapting models on smaller GPU setups (hours to days)
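A first-order sizing question across all three workloads is whether the model's weights even fit in a given GPU's memory. A minimal sketch, using bytes-per-parameter for common precisions (real usage also includes activations, KV cache, and, during training, optimiser state, which can dominate):

```python
# Sketch: weight memory footprint by precision. Illustrative only; total
# memory use is higher once activations, KV cache, and optimiser state
# are included.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Gigabytes needed just to hold the model weights at a given precision."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in BYTES_PER_PARAM:
    gb = weight_memory_gb(7e9, dtype)  # hypothetical 7B-parameter model
    print(f"7B params @ {dtype}: {gb:.0f} GB of weights")
```

Halving the precision halves the weight footprint, which is often the difference between needing a data-centre GPU and fitting on a much cheaper card.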
The GPU Landscape
The market is evolving rapidly. NVIDIA dominates with H100/H200 and the upcoming B200 Blackwell series. AMD is gaining ground with MI300X. Cloud providers (AWS, Azure, GCP) offer GPU instances on demand, while dedicated GPU cloud providers (Lambda, CoreWeave, Together AI) compete on price and availability.
The Blue Note Logic Perspective
Our GPU Infrastructure product and managed service helps clients navigate this landscape, from right-sizing inference hardware (you rarely need an H100 for inference) to managing multi-GPU training clusters. The most common mistake we see is over-provisioning GPUs for inference workloads that, with proper model quantisation, could run on much cheaper hardware. We benchmark and optimise before scaling.
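To make the quantisation idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantisation: map each weight into the range [-127, 127] with a single scale factor, then dequantise. This is illustrative only; production schemes typically use per-channel scales, calibration data, and more sophisticated rounding:

```python
# Minimal sketch of symmetric int8 weight quantisation with a single
# per-tensor scale. Illustrative, not a production quantisation scheme.

def quantise(weights: list[float]) -> tuple[list[int], float]:
    """Map floats into int8 codes in [-127, 127] using one shared scale."""
    scale = max(abs(w) for w in weights) / 127
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantise(codes: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]

w = [0.12, -0.5, 0.33, -0.07]          # hypothetical weight values
codes, scale = quantise(w)
restored = dequantise(codes, scale)
err = max(abs(a - b) for a, b in zip(w, restored))
print(f"int8 codes: {codes}, max round-trip error: {err:.4f}")
```

Each weight now occupies one byte instead of four (versus fp32), at the cost of a small, bounded rounding error, which is the trade that lets many inference workloads move to cheaper hardware.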