
Huge speed-up

Original project: Hyperion
Previous project post: Missing model parts fixed
Next project post: First surface colours

There are some sections of Hyperion code that get run repeatedly. Many of these functions are required, and there isn't really much you can do to make them faster. If you have to do some vector math, it has to be done. I'd been wondering for some time if it might be possible to make some of the code run faster, or just make it more efficient. So I dug into the structure of the data a little, and found some ways I thought it could be improved.

I was looking mostly at the ray collision testing when I realised that some of the heavy calculations being done were possibly unnecessary. I set up a test by adding in some booleans (yes/no flags) to check whether certain calculations needed to be made. It took me a couple of hours to set this up correctly, because I had to change the way I accessed the data a little. This did impact the render time to begin with and made the engine a little slower, but I had another trick up my sleeve.
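As a rough illustration of the idea, here's a minimal Python sketch of a boolean flag that lets a ray collision loop skip heavy maths when a cheap check shows it isn't needed. This is not Hyperion's code. The `Sphere` class, the `skip_unneeded` flag name and the "is the object behind the ray?" check are all assumptions made up for the example, since this post doesn't go into which calculations were actually skipped.

```python
import math
from dataclasses import dataclass


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


@dataclass
class Sphere:  # hypothetical scene object, for illustration only
    center: tuple
    radius: float
    heavy_calls: int = 0  # counts how often the expensive path runs

    def intersect(self, origin, direction):
        """Full ray/sphere test (the 'heavy' maths). direction must be unit length."""
        self.heavy_calls += 1
        oc = sub(origin, self.center)
        b = dot(oc, direction)
        c = dot(oc, oc) - self.radius ** 2
        disc = b * b - c
        if disc < 0:
            return None  # ray misses the sphere entirely
        root = math.sqrt(disc)
        for t in (-b - root, -b + root):
            if t >= 0:
                return t  # nearest hit in front of the ray origin
        return None


def closest_hit(origin, direction, spheres, skip_unneeded=True):
    """Return the nearest hit distance, or None. The flag toggles the cheap reject."""
    best = None
    for s in spheres:
        if skip_unneeded:
            # Cheap reject: the whole sphere lies behind the ray origin,
            # so the heavy test cannot produce a hit.
            if dot(sub(s.center, origin), direction) < -s.radius:
                continue
        t = s.intersect(origin, direction)
        if t is not None and (best is None or t < best):
            best = t
    return best
```

With the flag on or off the function returns the same hit, which is the point: the flag only changes how much work gets done, so both paths can be timed against each other on the same scene.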

To test my theory, I ran a series of 5 renders on a sample scene running the old calculations, and 5 running the newer calculations.
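For anyone wanting to run a similar comparison, a repeated-run timing test can be sketched like this in Python. The `render` argument is a placeholder for whatever kicks off a render, and the helper names are my own for this example. Hyperion's actual timing code isn't shown here.

```python
import time
from statistics import mean


def average_runtime(render, runs=5):
    """Call `render` `runs` times and return (mean seconds, list of per-run seconds)."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        render()
        timings.append(time.perf_counter() - start)
    return mean(timings), timings


def format_time(seconds):
    """Format a duration the way the render times are quoted in this post."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)} minutes and {rest:.3f} seconds"
```

Keeping the per-run timings as well as the average makes it easy to see how much the runs vary from one to the next.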

The first image below was my benchmark render, running on the old data calculations. Here's what Hyperion spat out:

Hyperion rendering the spectral refraction test scene with the updated structure, but using the old calculation methods. This test was repeated 5 times and, surprisingly, the times varied between the low and mid 6-minute range. I'm not sure why the times varied so much. It may have been because of micro tweaks I'd made in between builds. The average time came out to 6 minutes and 26.912 seconds.

Once I had a base time to work with, I could test the code change by adjusting the boolean flags I had set up so that it ran with the new calculation methods. Here's what Hyperion gave me:

Hyperion rendering the spectral refraction test scene with the updated data structure. This test was repeated 5 times and gave an average time of 2 minutes and 11.3834 seconds.

The two images above are almost identical, except for the noise, which is different each time. This was a significant speed-up: the render finished in roughly a third of the time, or about 2.9 times faster. And because of the way I designed the code, I can turn the more efficient calculations on and off as I please.
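The size of the speed-up can be worked out directly from the two average times quoted above. A quick Python check, with variable names of my own choosing:

```python
old_avg = 6 * 60 + 26.912    # 386.912 seconds, old calculation methods
new_avg = 2 * 60 + 11.3834   # 131.3834 seconds, new calculation methods

speedup = old_avg / new_avg          # how many times faster the new code is
time_saved = 1 - new_avg / old_avg   # fraction of render time removed

print(f"{speedup:.2f}x faster, {time_saved:.0%} less render time")
# prints "2.94x faster, 66% less render time"
```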

I'm not sure at this point why you would render under the old calculations, but maybe there's something I haven't thought of that may be useful in that context. So I'll leave the engine with the option to set the flag either way.

To complete the testing, I thought it'd be good to see a quality sample comparison. The rendered image on the left is before the change, and the one on the right is with the more efficient code turned on. The image on the right was rendered with more samples, so that it took approximately the same time as the renders before the change. This was done to compare render quality on a time-for-time basis.


Left: 6 minutes, 26.912 seconds, 10 passes, old calculation methods.
Right: 6 minutes, 30.755 seconds, 29 passes, new calculation methods.
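The same comparison can be put in per-pass terms using the figures above. The noise estimate at the end is an assumption on my part: it only holds if each pass adds independent random samples, as in a typical Monte Carlo renderer, where noise falls with the square root of the sample count.

```python
import math


def seconds_per_pass(minutes, seconds, passes):
    """Average time spent on each pass of a render."""
    return (minutes * 60 + seconds) / passes


old = seconds_per_pass(6, 26.912, 10)   # about 38.7 s per pass
new = seconds_per_pass(6, 30.755, 29)   # about 13.5 s per pass
throughput_gain = old / new             # about 2.87x more passes per unit time

# Assumption: noise scales with 1 / sqrt(passes), as in Monte Carlo sampling.
noise_ratio = math.sqrt(10 / 29)        # expected noise of right image vs left

print(f"{old:.1f} s/pass vs {new:.1f} s/pass, {throughput_gain:.2f}x the throughput")
```

Under that assumption the right-hand image would carry roughly 59% of the noise of the left-hand one for nearly the same render time.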