Spent several days working out a cool optimization for a denoising diffusion model (addressing the noise continuously, modeling its full distribution at every step of t instead of working with quantized cumulative sums) and have now discovered why people don’t do it this way.
Turns out the derivative of sqrt(x) does some colorful things around x=0. I prefer not to pry, but it seems not to want to be bothered around there. I’m going to leave it be.
This hacky lightweight 4× super-resolver has learned that I want things to look sharp but not too sharp, so it sort of scatters edges around the image like spice in a sauce.