Enif wrote:

In cuda only mode it is still a lot slower than vulkan.

That's why we are all so excited about the TensorRT version, which is about 50% faster than the ncnn version.

ncnn vs. TensorRT (NVIDIA RTX 3050, 1080p, FP16, model 4.6):
30.71 fps vs. 45.91 fps

https://github.com/HolyWu/vs-rife/discu … nt-4117604
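The "50% faster" figure follows directly from the two frame rates quoted above. A minimal sketch of the arithmetic (the function name is just for illustration, it is not part of vs-rife or SVP):

```python
def speedup_percent(baseline_fps: float, new_fps: float) -> float:
    """Return how much faster new_fps is than baseline_fps, in percent."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return (new_fps / baseline_fps - 1.0) * 100.0


# Figures quoted above: ncnn vs. TensorRT on an RTX 3050, 1080p, FP16, model 4.6.
print(f"{speedup_percent(30.71, 45.91):.1f}% faster")
```

The same helper can be used for any other pair of results in this thread, as long as both were measured under comparable conditions.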

kellykline wrote:

So can we do RIFE with 4K now?

4K 8-bit: yes, especially with TensorRT. However, we are still waiting for someone to test how it performs with 4K 10-bit HDR.

Over a month ago, I provided a short demo for testing. To increase the chance of someone testing HDR, I even provided a way to do it with a weaker graphics card:


LG 4K HDR Demo - New York.ts
File size: 448 MiB
Duration: 1 min 13 sec
Overall bit rate: 51.4 Mbps
HDR format: SMPTE ST 2086, HDR10 compatible
Width: 3 840 pixels
Height: 2 160 pixels
Frame rate: 25.000 FPS
Color space: YUV
Chroma subsampling: 4:2:0
Bit depth: 10 bits
Direct link: https://drive.google.com/file/d/1dfR5TT … _bGfEXUvJ/
Source: http://hdr4k.blogspot.com/

The file of course needs to be re-encoded to 1080p 10-bit HDR first, for example as described in this thread:
https://www.reddit.com/r/ffmpeg/comment … intaining/

Chainik, I think it would also be useful in SVP to have a similar function in the GUI to test RIFE interpolation speed without the need for encoding:

"*" indicates benchmarks which were done with vspipe file.py -p . instead of piping into ffmpeg and rendering to avoid cpu bottleneck.

https://github.com/styler00dollar/VSGAN … benchmarks
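For anyone who wants to repeat such an encoder-free benchmark outside the GUI, here is a rough sketch that wraps the exact command from the quote (vspipe file.py -p .) and times it. The helper names are made up for this example, and the script path is whatever VapourSynth script you want to measure:

```python
import shutil
import subprocess
import time
from typing import List, Optional


def vspipe_benchmark_cmd(script_path: str) -> List[str]:
    """Build the benchmark command quoted above: vspipe <script> -p ."""
    return ["vspipe", script_path, "-p", "."]


def run_benchmark(script_path: str) -> Optional[float]:
    """Run the benchmark; return elapsed seconds, or None if vspipe is not installed."""
    if shutil.which("vspipe") is None:
        return None
    start = time.perf_counter()
    subprocess.run(vspipe_benchmark_cmd(script_path), check=True)
    return time.perf_counter() - start


# Show the command without running anything.
print(" ".join(vspipe_benchmark_cmd("file.py")))
```

Since no encoder is involved, the result should reflect the interpolation speed itself rather than the CPU encoding speed.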

As you can see from the test results there, this makes a big difference when it comes to performance on fast graphics cards:

for example:

Rife 4.6 ensemble True - the highest quality!!! but slower
NVIDIA GeForce RTX 4090, TensorRT8.5

320 / 401.6* fps (720p)
160 / 207* fps (1080p)

...and such a result would be more revealing of a configuration's potential for real-time video interpolation with RIFE.


and just as a reminder:

Rife 4.6 ensemble False - the fastest
NVIDIA GeForce RTX 4090, TensorRT8.5

541* fps (720p)
288* fps (1080p)

It can help you:
https://github.com/styler00dollar/VSGAN … docker#vfr

Lots of new options, lots of tweaking.

Most important for the performance boost is the use of FP16 mode on GPUs with Tensor Cores. To enable FP16 mode it is now essential to convert the clip's colour format to RGBH.

Remove fp16 parameter and now it's controlled by the format of the clip. RGBH format uses FP16 mode and RGBS format uses FP32 mode.

https://github.com/HolyWu/vs-rife/releases/tag/v3.0.0
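To keep the new rule straight, here is a tiny sketch of the mapping described in the release note: RGBH means FP16, RGBS means FP32. It only deals with the format names as strings; the helper names are hypothetical and the actual conversion of the clip in a VapourSynth script is not shown here:

```python
def clip_format_for_precision(use_fp16: bool) -> str:
    """Name of the clip format to convert to before calling RIFE."""
    return "RGBH" if use_fp16 else "RGBS"


def precision_for_clip_format(format_name: str) -> str:
    """Precision mode that vs-rife v3.0.0 derives from the clip format."""
    modes = {"RGBH": "FP16", "RGBS": "FP32"}
    if format_name not in modes:
        raise ValueError(f"unsupported clip format: {format_name}")
    return modes[format_name]


print(clip_format_for_precision(True), "->", precision_for_clip_format("RGBH"))
```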


Apart from that, we'll probably need lots of VRAM and an ultra-fast SSD or RAM Disk:

:param trt_max_workspace_size: Maximum workspace size for TensorRT engine.

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L48

 trt_max_workspace_size: int = 1 << 30,

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L26
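The default of 1 << 30 is easier to read once converted. A small sketch, assuming the value is a size in bytes as is usual for TensorRT workspace settings (the helper name is invented for this example):

```python
def workspace_size_bytes(gib: int) -> int:
    """Convert a size in GiB to bytes using a bit shift, like the default 1 << 30."""
    if gib < 0:
        raise ValueError("gib must not be negative")
    return gib << 30


default_size = 1 << 30  # the default quoted above
print(default_size, "bytes =", default_size // 1024 ** 3, "GiB")
print("4 GiB would be", workspace_size_bytes(4), "bytes")
```

Whether a larger workspace actually helps RIFE is something that would have to be tested.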

:param trt_cache_path:          Path for TensorRT engine and timing cache files.
                                    Engine will be cached when it's built for the first time.
                                    Note each engine is created for specific settings.

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L49

 trt_cache_path: str = dir_name,

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L27

We will probably also need to test this:

 :param num_streams: Number of CUDA streams to enqueue the kernels.

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L42

In other words, there can be many bottlenecks and performance gains can vary not only depending on the graphics card, but also on the speed of the SSD, RAM and CPU.

Xenocyde wrote:

Then I transcoded the latest episode from Star Girl in around 2 h and 15 minutes before the end of the episode the video stuttered and the audio desynced a bit. Not sure what caused this.

This can be caused by a variable frame rate: https://en.wikipedia.org/wiki/Variable_frame_rate
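One way to check a file for this is to look at the spacing between frame timestamps. A minimal sketch, assuming the presentation timestamps (in seconds) have already been extracted with some other tool; the function name and the tolerance are arbitrary choices for this example:

```python
from typing import Sequence


def is_variable_frame_rate(timestamps: Sequence[float], tolerance: float = 1e-3) -> bool:
    """Return True if the gaps between consecutive timestamps differ by more than tolerance."""
    durations = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not durations:
        return False  # zero or one frame: nothing to compare
    return max(durations) - min(durations) > tolerance


constant = [i / 25 for i in range(10)]   # steady 25 fps
variable = [0.0, 0.04, 0.08, 0.16]       # one frame lasts twice as long
print(is_variable_frame_rate(constant), is_variable_frame_rate(variable))
```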

The +40~50% figure is a quote from the vs-rife developer. This is a new filter published last night, so there are no independent tests yet.

However, we have tests of the TensorRT-optimised version of RIFE in VSGAN-tensorrt-docker, which seem to confirm this: 288 fps / 164 fps - 1 = about +76%

288 new fps (1080p) for 4090+13900k (TensorRT8.5+vs_threads=4+fp16) (rife46) (num_streams=10) (benchmark was done with vspipe file.py -p . instead of piping into ffmpeg and rendering to avoid cpu bottleneck)

164 new fps (1080p) for 4090+5950x (ncnn+2 threads+4 vs threads+ffmpeg (ultrafast)) (rife4.6)

Source: https://github.com/styler00dollar/VSGAN-tensorrt-docker

Amazing news!!! Literally 2 hours ago!!!

vs-rife is again much faster than RIFE-ncnn-Vulkan!!!!!!!!


HolyWu wrote:

With the usage of TensorRT, it should run at least 40~50% faster than previous version or RIFE-ncnn-Vulkan implementation using FP16 mode on GPUs with Tensor Cores.

https://github.com/HolyWu/vs-rife/releases/tag/v3.0.0

In fact, using the TensorRT-optimised version of RIFE in VSGAN-tensorrt-docker you can achieve 288 new interpolated frames per second for an NVIDIA GeForce RTX 4090 graphics card and 1080p files: https://github.com/styler00dollar/VSGAN-tensorrt-docker

First was RIFE-ncnn-Vulkan.
Next was vs-rife and was about 3 times faster.
Later on, RIFE-ncnn-Vulkan regained its leading position....
only to lose it again today to vs-rife.

For UHD (yes, for me) this theoretically allows even 4K x4!!! 4K has four times as many pixels as 1080p, so 288/4=72 new fps; 72+24=96 fps in total, which is 24 fps x4. In practice it will probably be x3, but this will improve the quality of the interpolation considerably, because as we know, RIFE gives the highest quality with the x3, x5 and x7 multipliers.
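The estimate can be written down as a short calculation. This is only a sketch of the reasoning in this post: it assumes that interpolation throughput drops in proportion to the pixel count (4K has 4 times the pixels of 1080p), which is an assumption and not a measured result:

```python
def estimated_multiplier(new_fps_1080p: float, source_fps: float, pixel_ratio: float = 4.0) -> float:
    """Estimate the highest frame-rate multiplier at a higher resolution.

    new_fps_1080p: interpolated frames per second measured at 1080p
    source_fps:    frame rate of the source video
    pixel_ratio:   pixel count of the target resolution relative to 1080p
    """
    new_fps_target = new_fps_1080p / pixel_ratio   # 288 / 4 = 72
    total_fps = new_fps_target + source_fps        # 72 + 24 = 96
    return total_fps / source_fps                  # 96 / 24 = 4


print(estimated_multiplier(288, 24))
```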

And in two years RTX 5090 and 4K x5 with RIFE!!!

It's good to know that all is well, as I too intend to use Windows 11 and model 4.6 in the future.

Most important: for real-time testing, it is best to use mpv player and for 4K HDR files only mpv player: https://www.svp-team.com/wiki/SVP:4K_an … DR_display

Pezede wrote:

Just tried it with the MPC-HC default video renderer, footage is very choppy with frameskipping in *2.

Please try 3 variations of the test for the above 4K HDR file:

1. Normal playback without interpolation on a monitor or TV set up with HDR. Only pay attention to the colours in this test.

2A. Real-time interpolation with RIFE. Even if footage is very choppy with frameskipping it is interesting to see if colours have changed compared to test number 1. Try using Triple Buffering (https://tweakguides.pcgamingwiki.com/images/GGDSG_20.jpg) in the graphics card settings to improve smoothness: https://tweakguides.pcgamingwiki.com/Graphics_10.html

2B. Same as 2A, but if it continues to stutter, reduce the frame size before interpolation, e.g. with "Decrease to HD" (https://www.svp-team.com/wiki/Manual:Resizing). The purpose is to test whether the colours are retained when interpolating in real time using RIFE.

3. Re-encoding with RIFE. I'm interested in the fps achieved while transcoding the 4K file, and in whether the resulting file, which will already have a fixed 50 fps, retains the colours from test 1 or 2 during normal playback.

Try not to use madVR, as it actually takes up a lot of GPU resources sometimes. If madVR and RIFE are fighting for GPU resources this could be a bottleneck.

Very strange...

Could you please check if it is possible to interpolate x2 in real time with RIFE:

LG 4K HDR Demo - New York.ts
File size: 448 MiB
Duration: 1 min 13 sec
Overall bit rate: 51.4 Mbps
HDR format: SMPTE ST 2086, HDR10 compatible
Width: 3 840 pixels
Height: 2 160 pixels
Frame rate: 25.000 FPS
Color space: YUV
Chroma subsampling: 4:2:0
Bit depth: 10 bits
Direct link: https://drive.google.com/file/d/1dfR5TT … _bGfEXUvJ/
Source: http://hdr4k.blogspot.com/

Have you tried to do a real-time RIFE interpolation test for 1080p:
x3
x4
x5

?

Pezede wrote:

I have tried the ultrafast encoding preset, performance did not change but I will be using it from now on as it is the fastest.

"performance did not change" and "is the fastest" how is this possible at the same time?

Have you checked threads 5, 6, 7 and 3?

Try 4.6 ensemble False, there should be more fps.

Now a question for everyone: is it possible to set the number of GPU threads higher than 4?

GREAT!!!

Try using the fastest H264/AVC encoding preset and model 4.6


Pezede wrote:

transcoding doesn't take too long with a lossless NVENC H265 preset

Lossless encoding is best for enjoying videos later, but for testing it's best to use the fastest encoding that doesn't overtax the CPU and SSD. See the RIFE test results at the bottom of the page: https://github.com/styler00dollar/VSGAN-tensorrt-docker The way you encode has an impact on fps.

Pezede wrote:

Hey, I will stay around and provide as much feedback as necessary, as I'd be quite happy to have as much performance as possible :-)

Thanks a lot!

Pezede wrote:

I'm using the rife version (ncnn unless I'm wrong) that comes embedded with SVP, I'm currently trying to set up other models and see if that changes anything.

Make sure you are using the latest version of SVP: 4.6.0.220 2022-10-10: https://www.svp-team.com/news/

It only has the NCNN RIFE version: https://www.svp-team.com/forum/viewtopi … 052#p81052

Pezede wrote:

(Transcoding speed is around 45fps for 1920*1080@23.976 --> 1920*1080@119.88)

Pezede wrote:

https://i.imgur.com/1QMZ4fL.png

45/5*4=36
75,8/2*1=37,9
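The two lines above convert the total output fps into new interpolated frames per second: with a multiplier of x5, 4 out of every 5 output frames are new, and with x2, 1 out of every 2. A minimal sketch of that conversion (the function name is just for illustration):

```python
def new_frames_per_second(output_fps: float, multiplier: int) -> float:
    """Number of newly interpolated frames per second for a given output rate.

    Out of every `multiplier` output frames, one is an original frame
    and the remaining `multiplier - 1` are generated by RIFE.
    """
    if multiplier < 1:
        raise ValueError("multiplier must be at least 1")
    return output_fps * (multiplier - 1) / multiplier


print(new_frames_per_second(45, 5))    # x5 transcoding result
print(new_frames_per_second(75.8, 2))  # x2 result from the screenshot
```

This makes results with different multipliers comparable, which is why the two tests agree so closely.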

The results are repeatable, but clearly something is wrong!

Even an NVIDIA GeForce RTX 3070 Ti graphics card performs better - 45 new interpolated frames per second:
https://www.svp-team.com/forum/viewtopi … 588#p80588

On the other hand, NVIDIA GeForce RTX 4090 in combination with RIFE is a real tornado!

Below are the benchmark results for RIFE interpolation (fastmode True, ensemble False):

38 new fps (1080p) for 3090+5950x (CUDA+ffmpeg+FrameEval+vs_threads=20) (rife4.0)

63 new fps (1080p) for 3090+5950x (C++ NCNN+vs_threads=20+ncnn_threads=8) (rife4.0)

164 new fps (1080p) for 4090+5950x (ncnn+2 threads+4 vs threads+ffmpeg (ultrafast)) (rife4.6)

Source: https://github.com/styler00dollar/VSGAN-tensorrt-docker

Pezede, I hope you will stay with us for some time on this forum. I have a lot of questions for you! However, the most important thing now is to find the reason for such a large disparity in test results.

Hence the first and most important question:

Are you using the NCNN version of RIFE or the CUDA version?

I think most people will use 4 models on a daily basis:

hzwer wrote:

v4.4 model is specially tuned for animation

https://github.com/megvii-research/ECCV … 1225279362

2 models automatically profile-switched by SVP for animation:

for real-time HD animation interpolation - the fastest version of the v4.4 model:
rife-v4.4 (ensemble=False / fast=True)

for real-time SD animation interpolation and HD animation transcoding - highest quality v4.4 model:
rife-v4.4 (ensemble=True / fast=False)

2 models automatically profile-switched by SVP for live action:

for real-time HD live action interpolation - fastest version of the model v4.5 or v4.6:
rife-v4.5 or v4.6 (ensemble=False)

for real-time SD live action interpolation and HD live action transcoding - highest quality version of the model v4.5 or v4.6:
rife-v4.5 or v4.6 (ensemble=True)
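The four profiles above could be expressed as a simple selection rule. This is only a sketch of the idea and not how SVP actually works: the names are invented, "HD" is assumed to mean a height of 720 or more, the live action model defaults to v4.6 even though the choice between v4.5 and v4.6 is still open, and SD transcoding is assumed to use the highest quality settings as well:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RifeProfile:
    model: str
    ensemble: bool
    fast: Optional[bool]  # only listed for v4.4 above, so None for v4.5/v4.6


def select_profile(content: str, height: int, realtime: bool,
                   live_action_model: str = "rife-v4.6") -> RifeProfile:
    """Pick the fastest settings for real-time HD, highest quality otherwise."""
    if content not in ("animation", "live_action"):
        raise ValueError("content must be 'animation' or 'live_action'")
    fastest = realtime and height >= 720  # assumption: HD starts at 720 lines
    if content == "animation":
        return RifeProfile("rife-v4.4", ensemble=not fastest, fast=fastest)
    return RifeProfile(live_action_model, ensemble=not fastest, fast=None)


print(select_profile("animation", 1080, realtime=True))
print(select_profile("live_action", 1080, realtime=False))
```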

I don't know which of the v4.5 or v4.6 models would be better. However, there must have been some reason that they don't use contextnet and unet anymore: https://github.com/styler00dollar/Vapou … 1271982617 I also suspect that perhaps one of the v4.5 or v4.6 models is the successor to v4.4 for animation.

We would have to ask hzwer for more details. I plan to do this and ask him to evaluate the models using one of the perceptual quality metrics:
FloLPIPS https://arxiv.org/abs/2207.08119 https://github.com/danielism97/flolpips
VFIPS https://arxiv.org/abs/2210.01879 https://github.com/hqqxyy/vfips
LPIPS https://arxiv.org/abs/1801.03924 https://github.com/richzhang/PerceptualSimilarity

Before I do that, I would like to complete my "Rankings of Video Frame Interpolation Models Based on PSNR, SSIM, LPIPS, FloLPIPS Metrics": https://github.com/AIVFI/Rankings-of-Vi … PS-Metrics with rankings based on LPIPS. I will later ask some questions about the RIFE models based on it.

If anyone reading this post would like to test the RIFE models themselves with FloLPIPS or VFIPS, that would of course be appreciated. Both were developed just this year as perceptual quality metrics specifically for video footage, as opposed to LPIPS, which is more suitable for single images.

In any case, we are only at the beginning of the development of AI interpolation methods and the best is yet to come.

It would be good if SVP made it possible to manually select any RIFE model through the GUI and to set profiles for automatic model selection depending on video resolution - faster for HD and UHD and higher quality for SD. Maybe one day it will also be possible to automatically recognise whether we are dealing with animation or with live action and automatically select the RIFE model to match - that would be the real deal.

I think this is what should distinguish SVP with RIFE compared to using only VapourSynth-RIFE-ncnn-Vulkan directly with mpv: ease and speed of use thanks to the GUI, automation of model selections via profiles and the wide selection of different settings and features that SVP is famous for.

For some reason I can't download from this source. Hence I have provided an alternative source for the same demo in this post: https://www.svp-team.com/forum/viewtopi … 008#p81008


dlr5668 wrote:

https://i.imgur.com/nht2Q49.png

18 fps or 1/3 realtime for 4K HDR x2 fps 3070ti

Thanks a lot for the precise data!

174 TFLOPS Peak FP16 using the Sparsity feature - NVIDIA GeForce RTX 3070 Ti

1321 TFLOPS Peak FP8 using the Sparsity feature - NVIDIA GeForce RTX 4090
so
660 TFLOPS Peak FP16 using the Sparsity feature - NVIDIA GeForce RTX 4090
so
3,79x
so
69 fps !!!
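The estimate above scales the measured frame rate by the ratio of the peak FP16 figures of the two cards. A small sketch of that calculation, with the caveat that real RIFE performance does not have to scale linearly with peak TFLOPS (that is the assumption of this post, and the function name is invented):

```python
def scaled_fps(measured_fps: float, measured_tflops: float, target_tflops: float) -> float:
    """Scale a measured frame rate by the ratio of peak TFLOPS of two GPUs."""
    if measured_tflops <= 0:
        raise ValueError("measured_tflops must be positive")
    return measured_fps * target_tflops / measured_tflops


# 18 fps measured on the RTX 3070 Ti (174 TFLOPS), RTX 4090 taken as 660 TFLOPS.
ratio = 660 / 174
print(f"ratio: {ratio:.2f}x, estimate: {scaled_fps(18, 174, 660):.1f} fps")
```

With the rounded inputs quoted here the result is roughly 68 fps, which is close to the figure above.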

I'll have to go down to 250W TDP for passive cooling anyway, so there's a good chance of getting 50 fps.

Thanks for the test!


dlr5668 wrote:

https://i.imgur.com/6kKfAQS.png

and it lost HDR data

As I understand it this is the result after reencoding.
This is what I expected after reading this part of the tutorial:

At the time (mid '2021) there's no known way to transcode HDR source while keeping HDR data.

https://www.svp-team.com/wiki/SVP:4K_and_HDR

I don't know if this is still valid. That's why in that post https://www.svp-team.com/forum/viewtopi … 008#p81008 I suggested downscaling to 1080p, checking without interpolation that everything is OK (10-bit, HDR) and only from that 1080p file trying RIFE interpolation in real time.

I think there will be a lot of people who will accept a resolution reduction from 4K to 1080p if it helps to keep 10-bit HDR, 4:4:4 chroma subsampling (1080p) and real-time RIFE interpolation. I am very curious to see if this is possible....

This is great news!

Until now, I thought there would be some problems based on statements from n00mkrad, who uses RIFE in his Flowframes:

n00mkrad wrote:

HDR is currently not supported as the neural networks only work with 8bpp content.

I can't say if there will ever be HDR support.

https://github.com/n00mkrad/flowframes/ … -803571016

Chainik wrote:

no idea about quality, but 4.1-4.4 performance is the same as 4.0
4.5-4.6 are 10% _slower_

Thanks! Looks like there were some significant changes done to these two models: https://github.com/styler00dollar/Vapou … n/issues/3
In any case, 10% is not much and it will all depend on which model gives the highest quality.


Chainik wrote:

buy a video card at last! big_smile

I already have a video card and have had it for several years. Unfortunately my ATI Radeon HD 3450 graphics card does not support Vulkan.

NVIDIA GeForce RTX 4090 graphics cards are expected to start shipping on October 12, 2022, but I will still have to hold off on my purchase until custom spreaders for passive cooling become available. Once they are, I will choose a particular graphics card model based on the availability of the spreaders and other users' opinions about coil whine. In other words, it will still be a few months before I can start shopping. If I'm going to build a new HTPC for the next decade or so, it needs to be done right. I'll miss my energy-efficient ATI Radeon HD 3450 and its 25W TDP.

Chainik, since it will still be a long time before I can share my test results on this forum, maybe you can tell me the big secret that nobody wants to talk about on this forum and nowhere else either:

Is real-time x2 RIFE interpolation of 1080p or 720p 10-bit HDR video files possible without losing 10-bit colour depth and HDR?

I have given myself away with my secret graphics card.

Good news!

Today AviSynthPlus-RIFE has been updated with all the newest RIFE models: https://github.com/Asd-g/AviSynthPlus-RIFE

These models were added earlier by styler00dollar to his version of VapourSynth-RIFE-ncnn-Vulkan:
https://github.com/styler00dollar/Vapou … cnn-Vulkan

Now both the AviSynth and the VapourSynth filters include all current RIFE models: v4.2, v4.3, v4.4, v4.5 and v4.6.

About the v4.0 and v4.1 models, their author hzwer wrote:

hzwer wrote:

We hide some models because they received some serious bug feedback or could be totally replaced by new models.

https://github.com/hzwer/Practical-RIFE

I would like to once again give a big thank you to everyone who shared the v4.0 test results on this forum.

Now, I would very much encourage everyone to share their observations on the new v4.2, v4.3, v4.4, v4.5 and v4.6 models compared to the old v4.0 model. Is there an improvement in quality and speed of interpolation?

What is worth noting is that styler00dollar has also added the latest models in their higher quality version, which of course also involves a longer inference time for a given model. What difference does this make?

styler00dollar has provided examples of RIFE AI inference times for the v4.0 model in various quality variants

fastmode False, ensemble True
fastmode True, ensemble False
fastmode True, ensemble True

here: https://github.com/styler00dollar/VSGAN-tensorrt-docker

You will also find there inference times for other AI-based interpolation methods. In addition to RIFE, there are also: Sepconv, FILM, eisai and... IFRNet, which I already mentioned here: https://www.svp-team.com/forum/viewtopic.php?id=6666 and... M2M, which I already mentioned here: https://www.svp-team.com/forum/viewtopi … 345#p80345

I am glad that we finally have a choice between the fastest version and the highest quality version of a given RIFE model. It will be possible to boost the quality even more for files with lower resolutions or if we have graphics cards with high AI inference performance.