351 (edited by UHD 06-11-2022 19:18:44)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Amazing news!!! Literally 2 hours ago!!!

vs-rife is again much faster than RIFE-ncnn-Vulkan!!!!!!!!


HolyWu wrote:

With the usage of TensorRT, it should run at least 40~50% faster than previous version or RIFE-ncnn-Vulkan implementation using FP16 mode on GPUs with Tensor Cores.

https://github.com/HolyWu/vs-rife/releases/tag/v3.0.0

In fact, with the TensorRT-optimised version of RIFE in VSGAN-tensorrt-docker you can reach 288 newly interpolated frames per second on an NVIDIA GeForce RTX 4090 with 1080p files: https://github.com/styler00dollar/VSGAN-tensorrt-docker

First there was RIFE-ncnn-Vulkan.
Then came vs-rife, which was about 3 times faster.
Later on, RIFE-ncnn-Vulkan regained its leading position...
only to lose it again today to vs-rife :D

For UHD (yes, for me ;) ) this theoretically allows even 4K x4!!! (288/4=72 new frames per second at 4K, since 4K has four times the pixels of 1080p; 72+24=96 fps, i.e. 4x of 24 fps). In practice it will probably be x3, but that will improve interpolation quality considerably, because as we know RIFE gives its highest quality with the x3, x5 and x7 multipliers.
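
The estimate in the parentheses can be spelled out as a rough sketch. It assumes, as the estimate itself does, that RIFE throughput scales inversely with pixel count and that the source is 24 fps; the function name is made up for illustration.

```python
def estimated_multiplier(new_fps_1080p: float, source_fps: float,
                         pixel_ratio: float = 4.0) -> float:
    """Estimate the highest real-time interpolation multiplier.

    Assumption: throughput (new frames per second) scales inversely with
    the number of pixels, so 4K (4x the pixels of 1080p) runs at 1/4 speed.
    """
    new_fps = new_fps_1080p / pixel_ratio   # 288 / 4 = 72 new frames per second
    total_fps = new_fps + source_fps        # 72 + 24 = 96 frames shown per second
    return total_fps / source_fps           # 96 / 24 = 4.0


print(estimated_multiplier(288, 24))  # 4.0
```

Real 4K numbers may well be lower, since memory bandwidth and the CPU side do not necessarily scale the same way.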

And in two years RTX 5090 and 4K x5 with RIFE!!!

352 (edited by Xenocyde 07-11-2022 13:23:51)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

- Moved to a different topic -

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

No way it can be 50% faster. Let's wait for Chainik's install script, since SVP lost compatibility with CUDA RIFE.

354

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

The +40~50% figure comes from the vs-rife developer's own quote. This is a new filter published last night, so there are no independent tests yet.

However, we have tests of the TensorRT-optimised version of RIFE in VSGAN-tensorrt-docker, which seem to confirm this: 288 fps / 164 fps - 1 ≈ +76%

288 new fps (1080p) for 4090+13900k (TensorRT8.5+vs_threads=4+fp16) (rife46) (num_streams=10) (benchmark was done with vspipe file.py -p . instead of piping into ffmpeg and rendering to avoid cpu bottleneck)

164 new fps (1080p) for 4090+5950x (ncnn+2 threads+4 vs threads+ffmpeg (ultrafast)) (rife4.6)

Source: https://github.com/styler00dollar/VSGAN-tensorrt-docker
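
For anyone who wants to redo the comparison with their own numbers, here is a minimal sketch of the percentage calculation. Keep in mind that the two runs above used different CPUs and that the ncnn run included ffmpeg encoding, so the result is only a rough indication.

```python
def speedup_percent(new_fps: float, old_fps: float) -> float:
    """Relative speed gain of new_fps over old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# TensorRT (288 fps) vs. ncnn (164 fps), both 1080p on an RTX 4090
print(f"{speedup_percent(288, 164):+.1f}%")  # +75.6%
```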

355 (edited by UHD 07-11-2022 15:06:05)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Lots of new options, lots of tweaking.

Most important for the performance boost is the use of FP16 mode on GPUs with Tensor Cores. To enable FP16 mode it is now essential to convert the clip's colour format to RGBH, as the release notes say:

Remove fp16 parameter and now it's controlled by the format of the clip. RGBH format uses FP16 mode and RGBS format uses FP32 mode.

https://github.com/HolyWu/vs-rife/releases/tag/v3.0.0
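
A minimal sketch of how a script could pick the clip format, assuming VapourSynth is installed and the source is BT.709 YUV. Only the RGBH/RGBS mapping comes from the release note; the helper names are made up, and the call into vs-rife itself is left out on purpose.

```python
# Mapping described in the v3.0.0 release note: the clip format selects the precision.
PRECISION_TO_FORMAT = {"fp16": "RGBH", "fp32": "RGBS"}


def clip_format_name(precision: str) -> str:
    """Return the VapourSynth format name that selects the given precision."""
    return PRECISION_TO_FORMAT[precision.lower()]


def to_rife_input(clip, precision: str = "fp16", matrix_in: str = "709"):
    """Convert a YUV clip to the RGB format that selects FP16 or FP32.

    Assumption: the source is BT.709; pass another matrix name otherwise.
    """
    import vapoursynth as vs  # imported lazily so the helper above works without it

    fmt = getattr(vs, clip_format_name(precision))  # vs.RGBH or vs.RGBS
    return clip.resize.Bicubic(format=fmt, matrix_in_s=matrix_in)
```

The converted clip would then be handed to the RIFE filter; check the vs-rife README for the exact call in your version.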


Apart from that, we'll probably need lots of VRAM and an ultra-fast SSD or RAM Disk:

:param trt_max_workspace_size: Maximum workspace size for TensorRT engine.

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L48

 trt_max_workspace_size: int = 1 << 30,

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L26

:param trt_cache_path:          Path for TensorRT engine and timing cache files.
                                    Engine will be cached when it's built for the first time.
                                    Note each engine is created for specific settings.

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L49

 trt_cache_path: str = dir_name,

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L27

We will probably also need to test this:

 :param num_streams: Number of CUDA streams to enqueue the kernels.

https://github.com/HolyWu/vs-rife/blob/ … t__.py#L42

In other words, there can be many bottlenecks and performance gains can vary not only depending on the graphics card, but also on the speed of the SSD, RAM and CPU.
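
To keep track of what gets tested, the three settings quoted above could be collected in one place. This is only a sketch: the parameter names and the workspace default are from the quoted source, while the num_streams default and the cache directory name are my assumptions.

```python
from dataclasses import dataclass, asdict


@dataclass
class TrtSettings:
    # Names and the workspace default are taken from the vs-rife source quoted above.
    trt_max_workspace_size: int = 1 << 30  # bytes; 1 << 30 is 1 GiB
    num_streams: int = 1                   # assumed default, not stated in this thread
    trt_cache_path: str = "trt_cache"      # hypothetical; the source shows dir_name as default

    def workspace_gib(self) -> float:
        """Workspace limit expressed in GiB."""
        return self.trt_max_workspace_size / (1 << 30)


# Candidate configurations to benchmark one after another.
for settings in (TrtSettings(num_streams=n) for n in (1, 2, 4)):
    print(asdict(settings), f"workspace = {settings.workspace_gib():.0f} GiB")
```

Note that the default workspace limit is 1 GiB, so the VRAM requirement may turn out to be more modest than feared.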

356 (edited by UHD 07-11-2022 16:23:17)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Chainik, I think it would also be useful in SVP to have a similar function in the GUI to test RIFE interpolation speed without the need for encoding:

"*" indicates benchmarks which were done with vspipe file.py -p . instead of piping into ffmpeg and rendering to avoid cpu bottleneck.

https://github.com/styler00dollar/VSGAN … benchmarks

As you can see from the test results there, this makes a big difference when it comes to performance on fast graphics cards:

for example:

Rife 4.6 ensemble True - the highest quality!!! but slower
NVIDIA GeForce RTX 4090, TensorRT8.5

320 / 401.6* fps (720p)
160 / 207* fps (1080p)

...and such a result would be more revealing of a configuration's potential for real-time video interpolation with RIFE.


and just as a reminder:

Rife 4.6 ensemble False - the fastest
NVIDIA GeForce RTX 4090, TensorRT8.5

541* fps (720p)
288* fps (1080p)
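
Until there is something like this in the GUI, the same test can be started from a small Python wrapper. A sketch only: the command line is copied from the benchmark note quoted above, and `file.py` stands for your own VapourSynth script.

```python
import shlex
import subprocess


def vspipe_benchmark_cmd(script: str) -> list[str]:
    """Build the benchmark command from the quoted note.

    "-p" prints progress, and "." as the output argument is taken verbatim
    from that note (it is meant to discard the frames). vspipe builds differ,
    so check `vspipe --help` for the placeholder your version expects.
    """
    return ["vspipe", script, "-p", "."]


def run_benchmark(script: str) -> int:
    """Run the benchmark and return vspipe's exit code (needs vspipe on PATH)."""
    return subprocess.run(vspipe_benchmark_cmd(script), check=False).returncode


if __name__ == "__main__":
    print(shlex.join(vspipe_benchmark_cmd("file.py")))
```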

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

I think it is possible to integrate the TensorRT version of RIFE into SVP: you can write the code in C++ and build a DLL for SVP to load. The additional DLL dependencies (CUDA, cuDNN and TensorRT, about 2 GB uncompressed) can be placed directly in a folder, so customers don't have to download and install them from NVIDIA. Using TensorRT with FP16 enabled can indeed boost FPS greatly (INT8 quantization is even faster, but it degrades output quality, while FP16 has only a minor effect). Generating the .engine file takes some time, but once it is generated there is no need to generate it again, so I don't think that is a problem.
The future of real-time video interpolation definitely belongs to AI (maybe everything in computer vision does).
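
The build-once idea can be sketched generically. This is not TensorRT code: `build` is a hypothetical callback standing in for the real engine builder, and the settings dictionary and file naming are assumptions for illustration. The point is that the cache key must depend on the settings, because an engine is only valid for the settings it was built with.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def engine_path(cache_dir: Path, settings: dict) -> Path:
    """Derive a cache file name from the settings the engine depends on."""
    key = json.dumps(settings, sort_keys=True).encode()
    return cache_dir / f"rife_{hashlib.sha256(key).hexdigest()[:16]}.engine"


def load_or_build(cache_dir: Path, settings: dict,
                  build: Callable[[dict], bytes]) -> bytes:
    """Return the cached engine if present, otherwise build and store it."""
    path = engine_path(cache_dir, settings)
    if path.exists():
        return path.read_bytes()      # fast path: no rebuild needed
    data = build(settings)            # slow path: happens once per settings
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data
```

Changing the resolution, model or precision gives a new key, so the slow build only repeats when it has to.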

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Now I am using a project named mpv_lazy on GitHub (https://github.com/hooke007/MPV_lazy, in Chinese), which enables RIFE v4.6 using TensorRT. My video card is a 3070 laptop GPU and it works totally fine for real-time FPS doubling at 1080p, with much better performance than SVP4. I hope SVP4 can support the TensorRT RIFE as soon as possible.

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Thanks a lot! This is the perfect solution now. I'm using a 2080 Ti, and I can get full 1080p interpolated with RIFE 4.6+FSRCNNX+Krig running flawlessly. I cannot believe it! GPU memory usage is ~6 GB and utilization ~80%. No need to get a 4090 for now, thank god!

VeniVediVeci wrote:

Now I am using a project named mpv_lazy on GitHub (https://github.com/hooke007/MPV_lazy, in Chinese), which enables RIFE v4.6 using TensorRT. My video card is a 3070 laptop GPU and it works totally fine for real-time FPS doubling at 1080p, with much better performance than SVP4. I hope SVP4 can support the TensorRT RIFE as soon as possible.

360 (edited by Pezede 16-11-2022 23:47:05)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

weiw26 wrote:

Thanks a lot! This is the perfect solution now. I'm using a 2080 Ti, and I can get full 1080p interpolated with RIFE 4.6+FSRCNNX+Krig running flawlessly. I cannot believe it! GPU memory usage is ~6 GB and utilization ~80%. No need to get a 4090 for now, thank god!

VeniVediVeci wrote:

Now I am using a project named mpv_lazy on GitHub (https://github.com/hooke007/MPV_lazy, in Chinese), which enables RIFE v4.6 using TensorRT. My video card is a 3070 laptop GPU and it works totally fine for real-time FPS doubling at 1080p, with much better performance than SVP4. I hope SVP4 can support the TensorRT RIFE as soon as possible.

Can confirm that this project gives me frame doubling at ~15% GPU usage with mpv. However, I haven't found how to go beyond doubling in that build, since everything's in Chinese.

This is extremely promising though!

Edit: I'm not entirely sure how relevant this is, but I've tried speeding up the footage and I can go up to 4.5x playback speed before stutters happen (compute use goes to 65%).

https://i.imgur.com/Fxt24Ek.png

361 (edited by dlr5668 17-11-2022 13:45:21)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Pezede wrote:

Can confirm that this project gives me frame doubling at ~15% GPU usage with mpv. However, I haven't found how to go beyond doubling in that build, since everything's in Chinese.

This is extremely promising though!

Edit: I'm not entirely sure how relevant this is, but I've tried speeding up the footage and I can go up to 4.5x playback speed before stutters happen (compute use goes to 65%).

https://i.imgur.com/Fxt24Ek.png

Open mpv-lazy\portable_config\vs\rife_cuda.vpy, find the line FPS_num = 2 and increase the value. That's assuming you run the CUDA model (800 MB archive).

It works a bit faster than current SVP for me as well.
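
If you switch the multiplier often, the edit can be scripted. A small sketch, assuming the script really contains a plain `FPS_num = <integer>` line as described above; the helper name and the sample text are made up.

```python
import re


def set_fps_num(script_text: str, multiplier: int) -> str:
    """Replace the integer on the first 'FPS_num = N' line with multiplier."""
    pattern = re.compile(r"^([ \t]*FPS_num[ \t]*=[ \t]*)\d+", re.MULTILINE)
    return pattern.sub(lambda m: f"{m.group(1)}{multiplier}", script_text, count=1)


# Hypothetical excerpt, not the real file contents.
sample = "# excerpt of rife_cuda.vpy\nFPS_num = 2\n"
print(set_fps_num(sample, 4))
```

To apply it to the real file, read its text with `pathlib.Path.read_text`, pass it through `set_fps_num` and write it back (keep a backup first).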

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

The benchmark in https://github.com/HolyWu/vs-rife/discu … nt-4117604 showed that vs-rife's TensorRT path is roughly 20~30% faster than vsmlrt's. Besides, vs-rife supports rational frame-rate factors and ensemble mode.
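
To illustrate what a rational factor means in practice, here is a short sketch with Python's fractions module. The frame rates are just an example; how the numerator and denominator are passed to the filter depends on its parameters, which are not named in this thread.

```python
from fractions import Fraction


def interpolation_factor(source_fps: Fraction, target_fps: Fraction) -> Fraction:
    """Exact ratio between target and source frame rate."""
    return target_fps / source_fps


# 23.976 fps film shown on a 59.94 Hz display
factor = interpolation_factor(Fraction(24000, 1001), Fraction(60000, 1001))
print(factor.numerator, factor.denominator)  # 5 2
```

A factor of 5/2 cannot be expressed with an integer multiplier, which is why rational support matters for displays that are not an exact multiple of the source rate.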

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

dlr5668 wrote:

mpv-lazy\portable_config\vs\rife_cuda.vpy
Find FPS_num = 2 line and increase. Thats assuming you run cuda model (800 mb archive)

Works a bit faster that current SVP for me as well

Hey, thanks! I can achieve x4 after that edit!

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Is it possible to have low-latency real-time upscaling with RIFE?

I'm currently playing around with it on a directshow input stream but there's a couple of seconds of latency whereas the typical SVP interpolation is near instant.

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

So can we do RIFE with 4K now?

366 (edited by UHD 28-11-2022 18:15:50)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

kellykline wrote:

So can we do RIFE with 4K now?

4K 8-bit: yes, especially with TensorRT. However, we are still waiting for someone to test how it is with 4K 10-bit HDR.

Over a month ago, I provided a short demo for testing. To increase the chance of someone testing HDR, I even provided a way to do it with a weaker graphics card:


LG 4K HDR Demo - New York.ts
File size: 448 MiB
Duration: 1 min 13 sec
Overall bit rate: 51.4 Mbps
HDR format: SMPTE ST 2086, HDR10 compatible
Width: 3 840 pixels
Height: 2 160 pixels
Frame rate: 25.000 FPS
Color space: YUV
Chroma subsampling: 4:2:0
Bit depth: 10 bits
Direct link: https://drive.google.com/file/d/1dfR5TT … _bGfEXUvJ/
Source: http://hdr4k.blogspot.com/

The file of course needs to be re-encoded to 1080p 10-bit HDR, for example as described in this thread:
https://www.reddit.com/r/ffmpeg/comment … intaining/

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Well, I tested with a 4K HDR H.265 file: RIFE AI at 2x on a 2080 Ti + 5950X does not work, it stutters.

368

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

In CUDA-only mode it is still a lot slower than Vulkan.

369

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Enif wrote:

In CUDA-only mode it is still a lot slower than Vulkan.

That's why we are all so excited about the TensorRT version, which is about 50% faster than the ncnn version :D

ncnn vs. TensorRT (NVIDIA RTX 3050, 1080p, FP16, model 4.6):
30.71 fps vs. 45.91 fps

https://github.com/HolyWu/vs-rife/discu … nt-4117604

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

UHD wrote:
kellykline wrote:

So can we do RIFE with 4K now?

4K 8-bit: yes, especially with TensorRT. However, we are still waiting for someone to test how it is with 4K 10-bit HDR.

https://cdn.discordapp.com/attachments/290709370600423424/1054371537752440923/image.png
1440p x 2.


With TensorRT, I can achieve x5 at 720p on a 3070 Ti.
https://media.discordapp.net/attachments/290709370600423424/1054364653762920478/image.png

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

VeniVediVeci wrote:

Now I am using a project named mpv_lazy on GitHub (https://github.com/hooke007/MPV_lazy, in Chinese), which enables RIFE v4.6 using TensorRT. My video card is a 3070 laptop GPU and it works totally fine for real-time FPS doubling at 1080p, with much better performance than SVP4. I hope SVP4 can support the TensorRT RIFE as soon as possible.

Is there any extra setup needed? rife_cuda doesn't get enabled in my case.

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

VeniVediVeci wrote:

Now I am using a project named mpv_lazy on GitHub (https://github.com/hooke007/MPV_lazy, in Chinese), which enables RIFE v4.6 using TensorRT. My video card is a 3070 laptop GPU and it works totally fine for real-time FPS doubling at 1080p, with much better performance than SVP4. I hope SVP4 can support the TensorRT RIFE as soon as possible.

Any guide on how to install this? I downloaded the files, but it doesn't seem to install when I run install.bat. It displays a warning message in Chinese, so I have no idea what to do.

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Guys, just a reminder that this is SVP's forum.

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

I recently updated to the 22H2 build and my mpv RIFE started lagging with >50% GPU usage. I solved that with these tweaks:

vo = gpu
gpu-context = d3d11
d3d11-exclusive-fs = yes
hwdec = auto-copy

I also finished the auto-scale script (https://github.com/vadash/mpv-lazy-en/b … e_cuda.vpy). Check the README for how to calculate the constant. Now I can watch 4K HDR downscaled to 2432x1376 at 80-85% load.
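
For readers who just want the idea, here is one way such auto-scaling could work. This is a guess at the approach, not the linked script's actual logic: it treats the constant as a pixel budget and snaps both dimensions to multiples of 32 (2432 and 1376 both happen to be multiples of 32). All names are made up.

```python
import math


def fit_resolution(width: int, height: int, max_pixels: int,
                   align: int = 32) -> tuple[int, int]:
    """Scale (width, height) down to roughly max_pixels, keeping aspect ratio.

    Assumption: dimensions are snapped to the nearest multiple of `align`,
    so the result can land slightly above or below the budget.
    """
    if width * height <= max_pixels:
        return width, height          # already within budget, leave untouched
    scale = math.sqrt(max_pixels / (width * height))

    def snap(value: float) -> int:
        return max(align, round(value / align) * align)

    return snap(width * scale), snap(height * scale)


print(fit_resolution(3840, 2160, 2432 * 1376))  # (2432, 1376)
```

The budget itself has to be found by testing on your own GPU: raise it until playback starts to stutter, then back off a little.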

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

What GPU is needed for 4k (8-bit)?

Do we need an RTX 4090 or is a “slow” 3080 Ti good enough?