Anamorphic bloom creates the horizontal light streaks seen in films such as Blade Runner and Star Trek. Unlike standard bloom, which spreads evenly in all directions, anamorphic bloom stretches strongly along the horizontal axis, mimicking the optical artifacts of anamorphic cinema lenses.
The Multi-Pass Pipeline
Real-time anamorphic bloom uses a multi-pass approach: threshold extraction, progressive downsampling with an asymmetric blur, and upsampling with accumulation. Each downsample level runs at half the resolution of the previous one, which keeps the wide blur affordable.
Step 1: Threshold Extraction
First, we extract pixels above a luminance threshold. The key is a soft knee: a smooth transition around the threshold that avoids a hard cutoff and the visible banding it causes.
// Calculate luminance
float lum = dot(color, vec3(0.2126, 0.7152, 0.0722));

// Soft knee - smooth transition around threshold
float kneeWidth = uSoftKnee * 0.5 + 0.1;
float lowerBound = uThreshold - kneeWidth;
float upperBound = uThreshold + kneeWidth;

// S-curve for gradual falloff
float contribution = smoothstep(lowerBound, upperBound, lum);

// Cubic smoothing for even softer edges
contribution = contribution * contribution * (3.0 - 2.0 * contribution);

// Preserve HDR intensity for very bright areas
float excess = max(0.0, lum - lowerBound);
vec3 bloom = color * contribution * (0.5 + excess);
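To get a feel for the knee, the scalar that multiplies the color can be evaluated on the CPU. The following is a minimal Python sketch of the same arithmetic, not part of the shader; the threshold of 1.0 and soft knee of 0.5 are example values chosen for illustration.

```python
def smoothstep(edge0: float, edge1: float, x: float) -> float:
    """GLSL-style smoothstep: clamp to [0, 1], then apply 3t^2 - 2t^3."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def bloom_scale(lum: float, threshold: float, soft_knee: float) -> float:
    """Scalar applied to the color in the threshold pass."""
    knee_width = soft_knee * 0.5 + 0.1
    lower = threshold - knee_width
    upper = threshold + knee_width
    c = smoothstep(lower, upper, lum)
    c = c * c * (3.0 - 2.0 * c)       # second smoothing pass
    excess = max(0.0, lum - lower)    # grows with HDR brightness
    return c * (0.5 + excess)


# Example settings (assumed): threshold 1.0, soft knee 0.5 -> knee spans 0.65..1.35
for lum in (0.5, 0.8, 1.0, 1.2, 2.0):
    print(f"lum={lum:.1f} -> scale={bloom_scale(lum, 1.0, 0.5):.3f}")
```

Below the knee the scale is exactly zero, and above it the scale keeps growing with luminance, so very bright HDR pixels contribute more than their raw color.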
Step 2: Asymmetric Gaussian Blur
The magic of anamorphic bloom is in the blur kernel. We apply different scales to horizontal and vertical passes - stretching horizontally while compressing vertically. This creates the characteristic "streak" look.
// Determine pass direction
bool isHorizontal = uDirection.x > 0.5;

// Anamorphic scaling: stretch H, compress V
float anamorphicScale;
if (isHorizontal) {
    // Horizontal pass - stretch for streaks
    anamorphicScale = 1.0 + uAnamorphic * 3.0; // Up to 4x stretch
} else {
    // Vertical pass - compress to keep thin
    anamorphicScale = max(0.2, 1.0 - uAnamorphic * 0.8);
}

// Apply to blur kernel spread
float baseSpread = 0.5 * uBloomRadius * anamorphicScale;
vec2 step = uDirection * texelSize * baseSpread;
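The resulting horizontal-to-vertical ratio is easier to see with numbers. This small Python sketch mirrors the two branches above; it assumes uAnamorphic lies in the 0 to 1 range, which is what the "up to 4x" comment implies.

```python
def anamorphic_scale(anamorphic: float, horizontal: bool) -> float:
    """Blur spread multiplier for one pass, mirroring the shader branches."""
    if horizontal:
        return 1.0 + anamorphic * 3.0          # stretch: 1x .. 4x
    return max(0.2, 1.0 - anamorphic * 0.8)    # compress: 1x .. 0.2x


for a in (0.0, 0.5, 1.0):
    h = anamorphic_scale(a, True)
    v = anamorphic_scale(a, False)
    print(f"uAnamorphic={a:.1f}: H x{h:.2f}, V x{v:.2f}, ratio {h / v:.1f}:1")
```

At zero the blur is symmetric, and at full strength the horizontal spread is about twenty times the vertical one, which is what produces the thin streak.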
The 25-Tap Gaussian Kernel
We use a 25-tap (12 + center + 12) kernel. The weights approximate a Gaussian with a sigma of about 4 taps, and the shader divides by their sum, so they do not need to be pre-normalized. The gradual falloff avoids visible banding.
// Gaussian weights (sigma ~4)
const int TAPS = 12;
float weights[13];
weights[0] = 1.0; // Center
weights[1] = 0.96;
weights[2] = 0.88;
weights[3] = 0.77;
weights[4] = 0.64;
weights[5] = 0.51;
weights[6] = 0.38;
weights[7] = 0.27;
weights[8] = 0.18;
weights[9] = 0.11;
weights[10] = 0.06;
weights[11] = 0.03;
weights[12] = 0.01;

// Accumulators (step is the per-tap offset computed in the blur setup)
vec3 result = vec3(0.0);
float totalWeight = 0.0;

// Center sample
result += texture2D(uTexture, uv).rgb * weights[0];
totalWeight += weights[0];

// Symmetric taps - both directions from center
for (int i = 1; i <= TAPS; i++) {
    float w = weights[i];
    vec2 offset = step * float(i);
    result += texture2D(uTexture, uv + offset).rgb * w;
    result += texture2D(uTexture, uv - offset).rgb * w;
    totalWeight += w * 2.0;
}

// Normalize so the kernel preserves overall brightness
result /= totalWeight;
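To check how closely the table tracks an analytic Gaussian, the weights can be compared against exp(-i² / (2σ²)) on the CPU. This is a rough Python sketch; the sigma of 4.0 is taken from the shader comment, and the comparison itself is an illustration rather than how the table was necessarily derived.

```python
import math

# Weights copied from the shader table (index 0 is the center tap)
weights = [1.0, 0.96, 0.88, 0.77, 0.64, 0.51, 0.38,
           0.27, 0.18, 0.11, 0.06, 0.03, 0.01]


def gaussian_weights(sigma: float, taps: int = 12) -> list:
    """Unnormalized Gaussian, exp(-i^2 / (2 sigma^2)), for i = 0..taps."""
    return [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(taps + 1)]


ideal = gaussian_weights(4.0)
max_diff = max(abs(w - g) for w, g in zip(weights, ideal))

# The shader divides by this sum, so the table does not need to sum to 1
total_weight = weights[0] + 2.0 * sum(weights[1:])

for i, (w, g) in enumerate(zip(weights, ideal)):
    print(f"tap {i:2d}: table={w:.2f}  gaussian={g:.3f}")
print(f"largest difference: {max_diff:.3f}, total weight: {total_weight:.2f}")
```

The table sits slightly above the analytic curve in the mid-range taps, which gives a marginally wider blur than a strict sigma of 4 would.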
Step 3: Progressive Downsampling
To achieve a wide blur radius without excessive samples, we downsample progressively. Each level halves the resolution, so the same kernel covers twice as many full-resolution pixels. At the fourth level (1/16 resolution), the effective blur radius is 16x that of the same kernel at full resolution.
// 13-tap pattern: suppresses fireflies while preserving energy (weights sum to 1.0)
vec3 a = texture2D(uTexture, uv + texelSize * vec2(-1.0, -1.0)).rgb;
vec3 b = texture2D(uTexture, uv + texelSize * vec2( 0.0, -1.0)).rgb;
vec3 c = texture2D(uTexture, uv + texelSize * vec2( 1.0, -1.0)).rgb;
vec3 d = texture2D(uTexture, uv + texelSize * vec2(-0.5, -0.5)).rgb;
vec3 e = texture2D(uTexture, uv + texelSize * vec2( 0.5, -0.5)).rgb;
vec3 f = texture2D(uTexture, uv + texelSize * vec2(-1.0,  0.0)).rgb;
vec3 g = texture2D(uTexture, uv).rgb; // Center
vec3 h = texture2D(uTexture, uv + texelSize * vec2( 1.0,  0.0)).rgb;
vec3 i = texture2D(uTexture, uv + texelSize * vec2(-0.5,  0.5)).rgb;
vec3 j = texture2D(uTexture, uv + texelSize * vec2( 0.5,  0.5)).rgb;
vec3 k = texture2D(uTexture, uv + texelSize * vec2(-1.0,  1.0)).rgb;
vec3 l = texture2D(uTexture, uv + texelSize * vec2( 0.0,  1.0)).rgb;
vec3 m = texture2D(uTexture, uv + texelSize * vec2( 1.0,  1.0)).rgb;

// Weighted average of five overlapping 2x2 boxes:
// the inner box carries 0.5 in total, each of the four corner boxes 0.125
vec3 result = g * 0.125;              // center, shared by all four corner boxes
result += (d + e + i + j) * 0.125;    // inner box
result += (b + f + h + l) * 0.0625;   // edge samples, each shared by two corner boxes
result += (a + c + k + m) * 0.03125;  // corner samples, one box each
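The resolution halving described at the start of this step can be tabulated for a concrete frame size. This is a small Python sketch; the 1920x1080 input and the round-down rule for odd sizes are assumptions for illustration.

```python
def pyramid(width: int, height: int, levels: int = 4) -> list:
    """Return (level, width, height, footprint) for each downsample level.

    footprint is how many full-resolution pixels one texel spans per axis.
    """
    sizes = []
    w, h = width, height
    for level in range(1, levels + 1):
        w, h = max(1, w // 2), max(1, h // 2)   # assumed: round down
        sizes.append((level, w, h, 2 ** level))
    return sizes


levels = pyramid(1920, 1080)
for level, w, h, footprint in levels:
    print(f"level {level}: {w}x{h}, one texel spans {footprint} full-res pixels")

# Total texels across all levels, relative to the full-resolution frame
cost = sum(w * h for _, w, h, _ in levels) / (1920 * 1080)
print(f"pyramid cost: {cost:.2f}x of one full-resolution pass")
```

All four levels together touch roughly a third as many texels as a single full-resolution pass, which is why the wide blur stays cheap.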
Step 4: Upsample & Accumulate
Finally, we upsample back through the pyramid, additively blending each level. Hardware bilinear filtering handles the upscale, while we accumulate bloom from all mip levels for a natural, multi-scale glow.
// Sample upsampled lower mip (bilinear does the upscale)
vec3 upsampled = texture2D(uTexture, uv).rgb;

// Sample existing content at this level
vec3 existing = texture2D(uHigherMip, uv).rgb;

// Additive blend: accumulate bloom across mips
vec3 result = upsampled + existing;
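The accumulation order matters: we start at the coarsest level and work back up, adding each finer level as we go. Here is a 1D Python sketch of that loop, with a hand-written linear resample standing in for the GPU's bilinear filter; the toy pyramid values are made up for illustration.

```python
def upsample_linear(src: list, dst_len: int) -> list:
    """Linearly resample a 1D row to dst_len texels (texel centers, clamped edges)."""
    out = []
    for x in range(dst_len):
        # Map the destination texel center into source texel space
        pos = (x + 0.5) * len(src) / dst_len - 0.5
        pos = min(max(pos, 0.0), len(src) - 1.0)
        i0 = int(pos)
        i1 = min(i0 + 1, len(src) - 1)
        t = pos - i0
        out.append(src[i0] * (1.0 - t) + src[i1] * t)
    return out


def accumulate(levels: list) -> list:
    """levels[0] is the finest mip, levels[-1] the coarsest."""
    result = levels[-1]
    for finer in reversed(levels[:-1]):
        up = upsample_linear(result, len(finer))
        result = [u + e for u, e in zip(up, finer)]   # additive blend
    return result


# Toy 1D pyramid: one bright texel at each scale
toy_levels = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],   # 8 texels
    [0.0, 1.0, 0.0, 0.0],                       # 4 texels
    [1.0, 0.0],                                 # 2 texels
]
bloom = accumulate(toy_levels)
print([round(v, 3) for v in bloom])
```

Because the blend is purely additive, the result gets brighter with every level, so in practice the final bloom intensity usually needs to be scaled at composite time.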
Performance Considerations
- Separable blur: 2N samples per pixel instead of N² for the same Gaussian result (50 instead of 625 for the 25-tap kernel)
- Progressive mips: 4 levels at 1/2, 1/4, 1/8, 1/16 resolution
- Linear sampling: Place taps between texel pairs so GPU bilinear filtering fetches two texels per sample, roughly halving the tap count
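- Early-out: Skip the whole bloom chain when bloom is disabled or its intensity is zero. Skipping individual pixels below the threshold is not safe, because dark pixels still receive light blurred in from bright neighbors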
- Mobile: Reduce to 2-3 mip levels, use 9-tap kernel instead of 25
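The linear-sampling item in the list above can be made concrete. The Python sketch below merges adjacent side taps into single bilinear fetches using the standard weighted-offset rule; it assumes taps are exactly one texel apart, so with a scaled step like the one used in the blur pass the merge is only approximate.

```python
# Weights from the 25-tap kernel (index 0 is the center tap)
weights = [1.0, 0.96, 0.88, 0.77, 0.64, 0.51, 0.38,
           0.27, 0.18, 0.11, 0.06, 0.03, 0.01]


def merge_taps(weights: list) -> list:
    """Pair side taps (1,2), (3,4), ... into (offset, weight) bilinear fetches."""
    merged = []
    for i in range(1, len(weights) - 1, 2):
        w1, w2 = weights[i], weights[i + 1]
        w = w1 + w2
        # Sampling at this fractional offset blends the two texels in ratio w1:w2
        offset = (i * w1 + (i + 1) * w2) / w
        merged.append((offset, w))
    return merged


merged = merge_taps(weights)
fetches_before = 2 * (len(weights) - 1) + 1   # 12 per side + center
fetches_after = 2 * len(merged) + 1           # merged pairs per side + center

for offset, w in merged:
    print(f"offset={offset:.3f} texels, weight={w:.2f}")
print(f"texture fetches per pass: {fetches_before} -> {fetches_after}")
```

The merged weights still sum to the same total, so the normalization step in the blur loop works unchanged.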
The Complete Pipeline
The final composite adds the accumulated bloom to the original scene. A slight warm tint (1.1, 1.0, 0.9) can make the bloom feel more natural and filmic, simulating the warmth of actual lens flares.
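As a rough sketch of that composite, the per-pixel arithmetic looks like this in Python. The tint comes from the text above, while the intensity parameter is a hypothetical control added here for illustration.

```python
def composite(scene: tuple, bloom: tuple,
              intensity: float = 1.0,
              tint: tuple = (1.1, 1.0, 0.9)) -> tuple:
    """Add tinted bloom to the scene color, channel by channel (linear HDR)."""
    return tuple(s + b * t * intensity for s, b, t in zip(scene, bloom, tint))


# A mid-gray scene pixel receiving neutral bloom picks up a slight warm cast
pixel = composite((0.2, 0.2, 0.2), (0.5, 0.5, 0.5))
print(tuple(round(c, 3) for c in pixel))
```

The result is still in linear HDR, so tone mapping and gamma correction would normally follow this step.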