RetroRGB Interview on Vint
I recently had an interview with Bob from RetroRGB about Vint, 24 FPS video, and display temporals.
Check it out here:
https://retrorgb.com/interview-with-vint-creator-william-sokol-erhard.html
The world of cinematography has long held on to the axiom that 24 FPS with a 180° shutter is cinema and any deviation therefrom is sacrilege, because of everything from the 'soap opera' effect to props looking fake to inadequate levels of motion blur.
The few challenges to that self-evident truth, like The Hobbit, are endlessly criticized. The half-baked attempts to mitigate the problems of the low framerate standard, like interpolation, are rebuked by big-name filmmakers who advocate for a 'filmmaker mode' so you get to see each and every one of those 24 frames and the illustrious chop and flicker between each one.
The force of progress, however, is insuperable and resisting the benefits of higher framerates can only go so far. Filmmakers are able to use high framerate recordings and downsample them in post-production to create a result that is indistinguishable from a low framerate recording but allows them to modify various parameters after filming is complete. This process has gone so far that industry tools not only allow for simple downsampling but also the artificial introduction of literal judder or variance in frame times.
I will not define shutter angle and exposure length in this piece but their definitions are crucial to understanding these concepts.
Obviously, when exposure time is infinitely short, downsampling framerate is as simple as discarding frames. Let's say, however, that you are using a 180° shutter and are filming at 144FPS. That would result in a 1/288th of a second exposure time. Let's also say you wish to downsample to 24FPS with a 180° shutter. That is no longer simple, nor even possible, to do without causing unnatural artifacts. As the following image shows (if teal indicates time when a frame is exposed), there simply isn't information available in the original recording to reproduce the lower framerate video.
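The arithmetic above can be checked with a small sketch. It assumes each exposure begins at the start of its frame period (a simplification of mine, not something stated in the post), and the function names are hypothetical. It computes how much of a 24FPS 180° exposure window is actually covered by light gathered in the 144FPS source.

```python
from fractions import Fraction


def exposure_time(fps, shutter_angle_deg):
    """Exposure time of one frame in seconds: (angle / 360) / fps."""
    return Fraction(shutter_angle_deg, 360) / fps


def exposed_intervals(fps, shutter_angle_deg, n_frames):
    """(start, end) of each exposure, assuming it opens at the start of the frame period."""
    period = Fraction(1, fps)
    exposure = exposure_time(fps, shutter_angle_deg)
    return [(i * period, i * period + exposure) for i in range(n_frames)]


def covered_time(intervals, window_start, window_end):
    """Total exposed time that falls inside the window."""
    total = Fraction(0)
    for start, end in intervals:
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            total += overlap
    return total


target = exposure_time(24, 180)                # the 24FPS 180° exposure we want
source_180 = exposed_intervals(144, 180, 6)    # 144FPS, 180° shutter
source_360 = exposed_intervals(144, 360, 6)    # 144FPS, 360° shutter

print("144FPS 180° exposure:", exposure_time(144, 180))
print("target exposure:", target)
print("covered by 180° source:", covered_time(source_180, 0, target))
print("covered by 360° source:", covered_time(source_360, 0, target))
```

Under these assumptions the 180° source only records half of the target window, with gaps between exposures, while the 360° source covers it completely.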
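This problem, however, disappears if we use a 360° shutter for the higher framerate source footage. Each 1/144th second frame is then exposed for its entire frame period, so three consecutive source frames exactly cover the 1/48th second exposure of one 24FPS 180° frame, and the other three source frames in that 1/24th second period are discarded.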
The video linked here shows a professional example of this process: https://vimeo.com/105838602#t=1m51s
The logic of such a process is that by simply combining the images of multiple frames shot with a 360° shutter you are able to create an effective exposure time multiple times longer than the one the footage was originally shot with. So with a 1/144th second exposure frame from a 144FPS video, one can combine it with both its neighbors to create a 1/48th second exposure theoretically indistinguishable from a 1/48th second exposure created by a native 24FPS 180° shutter recording.
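As a rough illustration of that claim, here is a toy simulation on a one-dimensional "sensor". Everything in it is an assumption for the sake of the example: one tick is 1/1440th of a second, the subject is a single point of light moving one pixel per tick, pixel values are linear light, and frames are combined by summing (a real pipeline would normalize brightness afterwards). The function names are hypothetical.

```python
TICKS_PER_SOURCE_FRAME = 10   # 1/144 s at 1440 ticks per second, 360° shutter
TICKS_PER_OUTPUT_FRAME = 60   # 1/24 s
SOURCE_PER_OUTPUT = 6         # 144 / 24 source frames per output frame
FRAMES_TO_COMBINE = 3         # 3 * 1/144 s = 1/48 s, i.e. 180° at 24FPS
WIDTH = 120                   # sensor width in pixels


def expose(start_tick, n_ticks):
    """Integrate a point of light moving one pixel per tick while the shutter is open."""
    frame = [0.0] * WIDTH
    for tick in range(start_tick, start_tick + n_ticks):
        frame[tick % WIDTH] += 1.0
    return frame


def combine(frames):
    """Sum linear-light exposures pixel by pixel."""
    return [sum(pixels) for pixels in zip(*frames)]


def downsample(source):
    """Keep the first 3 of every 6 source frames and merge them into one output frame."""
    output = []
    for first in range(0, len(source) - FRAMES_TO_COMBINE + 1, SOURCE_PER_OUTPUT):
        output.append(combine(source[first:first + FRAMES_TO_COMBINE]))
    return output


# 12 frames of 144FPS footage shot with a 360° shutter
source = [expose(i * TICKS_PER_SOURCE_FRAME, TICKS_PER_SOURCE_FRAME) for i in range(12)]
synthetic = downsample(source)

# The same motion shot natively at 24FPS with a 180° shutter (30 ticks open, 30 closed)
native = [expose(k * TICKS_PER_OUTPUT_FRAME, 30) for k in range(2)]

print("frames match:", synthetic == native)
```

In this idealized model the two results are identical. Real footage adds sensor noise, readout time, and nonlinear encoding, which this sketch ignores.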
The natural corollary of that capability is that, with a simple rolling average, one is capable of creating a video with an effective shutter angle greater than 360°, directly contrary to traditional cinematographic understanding. Even camera manufacturer RED claims unequivocally that:
"The larger the angle, the slower the shutter speed, all the way up to the limit of 360°, where the shutter speed could become as slow as the frame rate."
They are far from alone in their misconception. The ability to create effective shutter angles greater than 360° allows filmmakers to create nearly arbitrarily high framerate videos without reducing the level of motion blur whatsoever compared to something like 24FPS with a 180° shutter.
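A minimal sketch of the rolling average idea follows, assuming the source was shot with a 360° shutter and that frames hold linear-light pixel values. The helper names are hypothetical, and the "footage" is just a dot moving one pixel per source frame.

```python
from fractions import Fraction


def effective_shutter_angle(source_fps, frames_averaged, output_fps):
    """Shutter angle implied by merging `frames_averaged` 360° source frames per output frame."""
    exposure = Fraction(frames_averaged, source_fps)   # seconds of light per output frame
    return 360 * exposure * output_fps


def rolling_average(frames, window):
    """Average each frame with the (window - 1) frames before it, keeping the source framerate."""
    output = []
    for i in range(window - 1, len(frames)):
        group = frames[i - window + 1:i + 1]
        output.append([sum(pixels) / window for pixels in zip(*group)])
    return output


# Eight 144FPS frames of a dot that moves one pixel per frame
dot_frames = [[1.0 if x == i else 0.0 for x in range(8)] for i in range(8)]
blurred = rolling_average(dot_frames, 3)

print("angle at 24FPS output:", effective_shutter_angle(144, 3, 24))
print("angle at 144FPS output:", effective_shutter_angle(144, 3, 144))
print("first blurred frame:", blurred[0])
```

With a window of three, every 144FPS output frame carries a three-pixel smear, the same 1/48th second of motion blur a 24FPS 180° frame would have, while the output still updates 144 times per second.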
Ultimately, the only honest arguments for using 24FPS in modern filmmaking boil down to cheaping out or clinging to an objectively reduced level of fidelity, not unlike SDTV, to prevent the viewer from being able to see the content clearly and pretending that it's a creative choice.
Vision comes with a few components that give you a sense of depth. Parallax is the effect where (over time) moving a perspective shows near objects moving faster than far objects giving the effect of depth. Stereoscopic vision is similar but you have two perspectives in different places which place near objects further apart than far objects and your mind creates a sense of depth from that. Finally there is focal distance. Focal distance can give you an effect of scale and distance from a single perspective. Much like 'tilt-shift' makes things look small, messing with the focus can emphasize the relative and even the absolute distance and size of objects.
The Index and Rift have a set focal distance of about 2 meters while the Vive has a focal distance of 0.75m. That means that if you look at an object at that exact distance, it will be perfectly in focus in addition to having correct stereo and parallax effects. When you look at an object at a different distance, the dissonance between the focal distance and the other effects causes discomfort and 'blurriness'.
Focus on your finger a few cm from your eye with a background a few meters away. Move your finger further and further until you notice the background come into focus. You'll notice that there's a bit of a logarithmic scale in terms of how much the focal effect presents itself based on distance. The difference in focus between 2 meters and infinity is a lot less significant than the difference between 5cm and 25cm.
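One common way to put numbers on this is the diopter, which is simply the reciprocal of the focus distance in meters. The sketch below uses that standard definition to compare the two ranges mentioned above, and then the headset focal distances quoted earlier against an object held at 25cm. The function names and the 25cm test distance are my own choices for illustration.

```python
import math


def diopters(distance_m):
    """Focus demand in diopters: 1 / distance in meters (infinity gives 0)."""
    return 1.0 / distance_m


def focus_change(near_m, far_m):
    """How much the eye must refocus, in diopters, between two distances."""
    return diopters(near_m) - diopters(far_m)


print("5cm to 25cm:", focus_change(0.05, 0.25), "D")
print("2m to infinity:", focus_change(2.0, math.inf), "D")

# Mismatch between where the optics focus and where a near object appears to be
object_m = 0.25  # hypothetical distance of a hand held up to the face
for name, focal_m in (("Index/Rift (about 2m)", 2.0), ("Vive (0.75m)", 0.75)):
    mismatch = focus_change(object_m, focal_m)
    print(name, "mismatch at 25cm:", round(mismatch, 2), "D")
```

Going from 2 meters to infinity costs half a diopter, while going from 5cm to 25cm costs roughly sixteen, which matches what the finger experiment feels like.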
Effectively, current headsets are well tuned to keep objects in the same virtual room as you in focus until you get too close. Objects in the far distance aren't easy to make out because of low resolution, but if resolution weren't the limit, you would see that they are also slightly blurry because they're out of focus. Near objects like your hands that you bring to your face will go out of focus a lot more easily, and they're also large relative to the display so resolution isn't an issue. They will be stereoscopically accurate but completely out of focus. If you close one eye and try to focus on a still object (like a book with text) near the camera, you should be able to focus on it clearly and even read it easily, but the scale will appear to be incorrect and very large.
Varifocal lenses can either move the lenses like a camera to change the focal distance based on your eye movements and where in the virtual world you are looking or they can toggle lenses on and off to match the focal distance you should be seeing as closely as possible. The new issue that arises is that now all objects in the scene are going to be in focus all the time regardless of where you are focused. While better, this is still quite inaccurate. The solution can be digital by artificially blurring all objects as accurately as possible but there are also ways to have multiple displays with different focal distances or even change the focal distance differently in different parts of the screen.
Eye tracking is a prerequisite for such tech, so foveated rendering will appear first in consumer headsets and then we'll probably have to wait another generation to get this tech. I've used the Oculus Half Dome prototype varifocal headset and while it's nice to be able to focus correctly, the digital blurring and eye tracking leave a lot to be desired. Honestly, I think improving refresh rate and FOV along with adding effective foveated rendering will provide far larger improvements than varifocal lenses.
[Image: Simple high raycount per pixel]

[Image: Lower rays/pixel and no Gaussian blur but high temporal accumulation]

[Image: With light Gaussian blur, lower rays/pixel, and heavy temporal accumulation]