I think most people will use 4 models on a daily basis:
hzwer wrote: v4.4 model is specially tuned for animation
https://github.com/megvii-research/ECCV … 1225279362
2 models automatically profile-switched by SVP for animation:
for real-time HD animation interpolation - the fastest version of the v4.4 model:
rife-v4.4 (ensemble=False / fast=True)
for real-time SD animation interpolation and HD animation transcoding - the highest quality version of the v4.4 model:
rife-v4.4 (ensemble=True / fast=False)
2 models automatically profile-switched by SVP for live action:
for real-time HD live action interpolation - the fastest version of the v4.5 or v4.6 model:
rife-v4.5 or v4.6 (ensemble=False)
for real-time SD live action interpolation and HD live action transcoding - the highest quality version of the v4.5 or v4.6 model:
rife-v4.5 or v4.6 (ensemble=True)
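The four profiles above can be sketched as a small selection function. This is only an illustration of the idea, not how SVP actually switches profiles: the names RifeProfile, pick_profile and HD_MIN_HEIGHT are hypothetical, the 720-line SD/HD boundary is my assumption, and "rife-v4.6" is an arbitrary placeholder because I don't know yet whether v4.5 or v4.6 is better. Anything that is not real-time HD (including SD transcoding, which the list does not mention) is assumed to get the quality variant.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: treat 720 lines and above as "HD"; the boundary is not defined in the post.
HD_MIN_HEIGHT = 720


@dataclass(frozen=True)
class RifeProfile:
    model: str
    ensemble: bool
    fast: Optional[bool]  # only listed for v4.4 in the post; None means "not specified"


def pick_profile(content: str, height: int, realtime: bool) -> RifeProfile:
    """Pick one of the four profiles from content type, frame height and use case."""
    if content not in ("animation", "live_action"):
        raise ValueError(f"unknown content type: {content!r}")

    # Speed variant only for real-time HD; everything else gets the quality variant.
    want_fast = realtime and height >= HD_MIN_HEIGHT

    if content == "animation":
        return RifeProfile("rife-v4.4", ensemble=not want_fast, fast=want_fast)

    # Placeholder choice between v4.5 and v4.6 (hypothetical, see note above).
    return RifeProfile("rife-v4.6", ensemble=not want_fast, fast=None)


if __name__ == "__main__":
    print(pick_profile("animation", 1080, realtime=True))
    print(pick_profile("live_action", 576, realtime=True))
```

Keeping the decision in one function means a future "animation vs live action" detector would only need to supply the content argument.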
I don't know which of the v4.5 and v4.6 models is better. However, there must be some reason why they no longer use contextnet and unet: https://github.com/styler00dollar/Vapou … 1271982617 I also suspect that one of the v4.5 or v4.6 models may be the successor to v4.4 for animation.
We would have to ask hzwer for more details. I plan to do this and ask him to evaluate the models using one of the perceptual quality metrics:
FloLPIPS https://arxiv.org/abs/2207.08119 https://github.com/danielism97/flolpips
VFIPS https://arxiv.org/abs/2210.01879 https://github.com/hqqxyy/vfips
LPIPS https://arxiv.org/abs/1801.03924 https://github.com/richzhang/PerceptualSimilarity
Before I do that, I would like to complete my "Rankings of Video Frame Interpolation Models Based on PSNR, SSIM, LPIPS, FloLPIPS Metrics": https://github.com/AIVFI/Rankings-of-Vi … PS-Metrics with rankings based on LPIPS. I will then ask some questions about the RIFE models based on it.
If anyone reading this post would like to test the RIFE models themselves with FloLPIPS or VFIPS, that would of course be appreciated. Both were developed just this year as perceptual quality metrics specifically for video footage, as opposed to LPIPS, which is more suitable for single images.
In any case, we are only at the beginning of the development of AI interpolation methods and the best is yet to come.
It would be good if SVP made it possible to manually select any RIFE model through the GUI and to set profiles for automatic model selection depending on video resolution - faster for HD and UHD and higher quality for SD. Maybe one day it will also be possible to automatically recognise whether we are dealing with animation or live action and select the matching RIFE model automatically - that would be the real deal.
I think this is what should distinguish SVP with RIFE from using VapourSynth-RIFE-ncnn-Vulkan directly with mpv: ease and speed of use thanks to the GUI, automatic model selection via profiles, and the wide range of settings and features that SVP is famous for.