Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

aloola wrote:
DragonicPrime wrote:

Tried to do both. Same result. Not sure what's wrong

Try using AIDA64 to benchmark your RAM first.
https://cdn.discordapp.com/attachments/290709370600423424/1081482725715873792/image.png

Almost a 1 to 1 copy of yours it seems
https://i.imgur.com/gOVpK8D.png

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

UHD wrote:

Who wants to interpolate 4K HDR videos x5 in real time with RIFE?
Perhaps a more appropriate question is who can afford it?

After reading all the posts in this thread I have come to the conclusion that it is possible, and that the limitation is not the NVIDIA GeForce RTX 4090 graphics card but RAM bandwidth.

For someone with unlimited financial resources this is the solution:

ASUS Pro WS W790E-SAGE SE motherboard
Intel Xeon W9-3495X processor
G.SKILL DDR5-6400 CL32-39-39-102 octa-channel R-DIMM memory modules

https://www.gskill.com/img/pr/2023.02.23-zeta-r5-rdimm-ddr5-announce/06-zeta-r5-rdimm-spec-table-eng.png
Source: https://www.gskill.com/community/150223 … erformance

The result?

303.76 GB/s read, 227.37 GB/s write, and 257.82 GB/s copy speed in the AIDA64 memory bandwidth benchmark, as seen in the screenshot below:

https://www.gskill.com/img/pr/2023.02.23-zeta-r5-rdimm-ddr5-announce/04-zeta-r5-rdimm-ddr5-6400-c32-16gbx8-bandwidth.png
Source: https://www.gskill.com/community/150223 … erformance

Of course, the Intel Xeon W9-3495X is completely out of my reach...

The cheapest unlocked Intel Xeon would be the W5-2455X, at a suggested price of $1039: https://www.anandtech.com/show/18741/in … -5-0-lanes That should be enough: if the current dual-channel DDR5-6000 allows x3 interpolation, then quad-channel DDR5-6400 should be enough for x5 real-time interpolation (a rough check below).
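The back-of-the-envelope arithmetic behind that guess, for anyone who wants to sanity-check it (plain Python; the assumption that RIFE's RAM traffic scales roughly with the interpolation factor is mine, not measured):

# Bandwidth headroom of quad-channel DDR5-6400 vs. dual-channel DDR5-6000,
# compared against the extra work x5 needs over x3.
dual_6000 = 2 * 6000   # relative bandwidth, dual-channel DDR5-6000
quad_6400 = 4 * 6400   # relative bandwidth, quad-channel DDR5-6400
print(f"bandwidth headroom:        {quad_6400 / dual_6000:.2f}x")  # ~2.13x
print(f"needed (output frames):    {5 / 3:.2f}x")                  # ~1.67x
print(f"needed (generated frames): {4 / 2:.2f}x")                  # 2.00x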

I am looking for someone who has an NVIDIA GeForce RTX 4090 graphics card, at least DDR5-6000 memory and some spare time to test how RIFE real-time interpolation scales at different RAM speeds.

Alternatively, someone who would like to build an HTPC based on quad-channel or octa-channel DDR5-6400 R-DIMM memory. Octa-channel is for the 4K 240Hz screens that will soon go on sale big_smile

I am using a 12900K at a 5.2 GHz all-core overclock, and after reading your text I overclocked my RAM from the 6000 MHz 40-40-40 XMP 3.0 profile to 6400 36-38-38. I was using RIFE TensorRT with downscaling to 3040x1710 for 60 fps interpolation; after overclocking the RAM, with my RTX 4090 and HAGS off, I can interpolate 3840x2160 at 60 fps with no frame drops! Thanks for the heads-up.

678 (edited by UHD 04-03-2023 15:41:53)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

onurco wrote:

I am using a 12900K at a 5.2 GHz all-core overclock, and after reading your text I overclocked my RAM from the 6000 MHz 40-40-40 XMP 3.0 profile to 6400 36-38-38. I was using RIFE TensorRT with downscaling to 3040x1710 for 60 fps interpolation; after overclocking the RAM, with my RTX 4090 and HAGS off, I can interpolate 3840x2160 at 60 fps with no frame drops! Thanks for the heads-up.

Thanks for sharing the test results.

Did I understand correctly?

6000 MHz 40-40-40, HAGS off:
Fail: 3840x2160
OK: 3840x2160 with downscaling to 3040x1710

6400 MHz 36-38-38, HAGS off:
OK: 3840x2160 without downscaling

I mean, was the only change overclocking the RAM? Was HAGS off with 6000 MHz not enough on its own?

Second question.

Are you writing about interpolation:
24 fps → 60 fps
25 fps → 60 fps
30 fps → 60 fps
60 fps → 120 fps

?

679 (edited by onurco 04-03-2023 19:17:18)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

At 6000 MHz, HAGS was on. I also tested with HAGS off in the past; at that time the only difference was that 3200x1800 was possible at 24 fps to 60 fps.
With 6400 MHz 36-38-38 and HAGS off, 3840x2160 at 24 fps to 60 fps is possible; I didn't test further because I am on a 60 Hz 4K projector. I also checked just now that with HAGS on it still works at 3840x2160, 24 to 60 fps, without any frame drops.

I use MPC-HC with MPC Video Renderer. I used to use madVR, but with the new RTX Video Super Resolution I switched to MPC Video Renderer. For 4K content it makes no difference, but for 1080p and lower content I see a much bigger improvement than with madVR upscaling.

https://github.com/emoose/VideoRenderer … ag/rtx-1.1
This is the GitHub release link for the MPC-HC and MPC-BE players.

I specifically use this build because, strangely, the latest release didn't work on my system.
https://github.com/emoose/VideoRenderer … f8c3e2.zip

As for RAM, I am using a Kingston Fury 6000 MHz 2x16 GB kit.
https://www.amazon.com/Kingston-2x16GB- … B09N5M5PH3
When I bought it, there was a chip shortage and I paid around 500 USD; damn, it got much cheaper now big_smile

My RAM kit uses SK Hynix ICs. My motherboard has excellent price/performance:
https://www.amazon.com/MSI-Z690-ProSeri … B09KKYS967

In this link you can see that SK Hynix is the best IC for DDR5 overclocking. When I bought the kit, I had no idea which ICs it had, but now I am glad to see that it uses SK Hynix big_smile
https://www.overclockers.com/ddr5-overclocking-guide/

I also read that all DDR5 RAM has built-in error correction, so overclocking DDR5 is much safer now in terms of system stability.
On-Die ECC
We typically only see the Error Correction Code (ECC) implemented in the server and workstation world. Traditional performance DDR4 UDIMM does not come with ECC capabilities. The CPU is responsible for handling the error correction. Due to the frequency limitations, we’ve only seen ECC on lower-spec kits. DDR5 changes everything in this regard because ECC comes standard on every DDR5 module produced. Therefore, the system is then unburdened as it no longer needs to do the error correction.

680 (edited by UHD 04-03-2023 20:26:46)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Thanks for clarifying the test details and sharing the additional information; I think many people reading this forum will benefit from it.
Regarding ECC, below is a short quote that I think explains the matter best:

DDR5's on-die ECC doesn't correct for DDR channel errors and enterprise will continue to use the typical side-band ECC alongside DDR5's additional on-die ECC.

https://www.quora.com/What-is-the-diffe … d-real-ECC

Back to the tests. Here's a summary; if I've misunderstood something, correct me.

6000 MHz 40-40-40, HAGS on:
Fail: 3840x2160
OK: 3840x2160 with downscaling to 3040x1710

6000 MHz 40-40-40, HAGS off:
Fail: 3840x2160
OK: 3840x2160 with downscaling to 3200x1800

6400 MHz 36-38-38, HAGS off:
OK: 3840x2160 without downscaling

6400 MHz 36-38-38, HAGS on:
OK: 3840x2160 without downscaling

681

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

30.5% increase in performance with HAGS OFF during encoding:
https://www.svp-team.com/forum/viewtopi … 819#p81819

On the other hand, with real-time playback we have:

10.8% increase in performance with HAGS OFF:
(3200 × 1800) / (3040 × 1710) ≈ 1.108

over 59.6% performance increase with only 6.7% more RAM bandwidth (6400 vs. 6000 MHz):
(3840 × 2160) / (3040 × 1710) ≈ 1.596

I know that in a more precisely run test the performance increase is unlikely to exceed the increase in RAM bandwidth, but I think this sufficiently confirms that, on a top-of-the-line PC by today's standards, the bottleneck for 4K HDR files is RAM bandwidth.
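For anyone who wants to reproduce those figures, here is the same arithmetic as a quick script (plain Python, just restating the resolutions and RAM speeds quoted above):

# Pixel-throughput ratios behind the percentages above.
base     = 3040 * 1710   # baseline downscale (HAGS on, DDR5-6000)
hags_off = 3200 * 1800   # achievable with HAGS off at DDR5-6000
native   = 3840 * 2160   # full UHD frame (DDR5-6400)
print(f"HAGS off gain:       {hags_off / base - 1:.1%}")   # ~10.8%
print(f"native vs. baseline: {native / base - 1:.1%}")     # ~59.6%
print(f"RAM bandwidth gain:  {6400 / 6000 - 1:.1%}")       # ~6.7%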

682

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Traditional performance DDR4 UDIMM does not come with ECC capabilities. The CPU is responsible for handling the error correction. Due to the frequency limitations, we’ve only seen ECC on lower-spec kits.

https://www.overclockers.com/ddr5-overclocking-guide/


For the record, ECC DDR4 UDIMMs do exist, for example:

Mushkin Redline ECC Black DIMM Kit 32GB, DDR4-3600, CL16-19-19-39, ECC (MRC4E360GKKP16GX2)

Mushkin Redline ECC White DIMM Kit 32GB, DDR4-3600, CL16-19-19-39, ECC (MRD4E360GKKP16GX2)

683 (edited by Mardon85 07-03-2023 19:55:00)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

This is a decent and more consistent way of checking latency; it uses an official Intel tool (Memory Latency Checker) as the backend:


https://github.com/FarisR99/IMLCGui

6400 MHz dual-rank (64 GB) 32-38-38-85

https://i.imgur.com/wQsUhOv.jpg

684 (edited by Mardon85 07-03-2023 22:08:54)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

OK, I've figured it out.
HAGS made zero difference, BTW.

The key is the NVIDIA power management mode in the control panel. Set it for the mpv player (I've not got MPC working yet), not globally:

https://i.imgur.com/Uh2nwx5.png

As soon as you set this to maximum performance, playback is smooth at x3; however, look at the power usage vs. x2. Perhaps I can play around to find the lowest possible settings that still run x3 when I have some more time:

x3 (Maximum Performance)
https://i.imgur.com/HVREPrR.png

x2 (Normal Power usage)
https://i.imgur.com/BY6uCqX.png

My SVP settings are as follows:

https://i.imgur.com/SBPaBcz.png

x4 does not work.

So I guess you pay your money and take your choice.

Edit: I've played around with this now and got the power usage down somewhat. I can't go any lower on the core than 2015 MHz or else the voltage jumps back up to 1.05 V for some reason. I've also downclocked the memory.

https://i.imgur.com/KwLcrTq.png

685 (edited by UHD 08-03-2023 00:05:27)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Thanks Mardon85 for the further tests and I'm very glad you found a way to interpolate x3 in real time. I think the setting you showed might also help someone.

One more thing can increase efficiency:

scripts/vsmlrt.py: added support for rife v2 implementation
(experimental) rife v2 models can be downloaded on https://github.com/AmusementClub/vs-mlr … nal-models ("rife_v2_v{version}.7z"). It leverages onnx's shape tensor to reduce memory transaction from cpu to gpu by 36.4%. It also handles padding internally so explicit padding is not required.

This update came out a few days ago: https://github.com/AmusementClub/vs-mlr … c649dfb212

I guess all you need to do is replace the vsmlrt.py file and download the rife v2 models. And maybe remove explicit padding, I am not a programmer.

Share your impressions of how this modification, and what Mardon85 suggests, affect performance.

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Unable to download RIFE TensorRT when reinstalling SVP 4; it seems there is something wrong with the server.

https://i.imgur.com/ZuDP4ig.jpeg

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

UHD wrote:

Thanks Mardon85 for the further tests and I'm very glad you found a way to interpolate x3 in real time. I think the setting you showed might also help someone.

One more thing can increase efficiency:

scripts/vsmlrt.py: added support for rife v2 implementation
(experimental) rife v2 models can be downloaded on https://github.com/AmusementClub/vs-mlr … nal-models ("rife_v2_v{version}.7z"). It leverages onnx's shape tensor to reduce memory transaction from cpu to gpu by 36.4%. It also handles padding internally so explicit padding is not required.

This update came out a few days ago: https://github.com/AmusementClub/vs-mlr … c649dfb212

I guess all you need to do is replace the vsmlrt.py file and download the rife v2 models. And maybe remove explicit padding, I am not a programmer.

Share your impressions of how this modification, and what Mardon85 suggests, affect performance.

I'll give this a go and report back.

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

zerosoul9901

fixed

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Mardon85 wrote:

My SVP settings are as follows:

https://i.imgur.com/SBPaBcz.png

What is this user-defined option with TensorRT?

690 (edited by Mardon85 08-03-2023 18:52:59)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Erm, if this isn't supposed to be there, it's possibly left over from tweaks we had to make before it was officially implemented?

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

UHD wrote:

Thanks Mardon85 for the further tests and I'm very glad you found a way to interpolate x3 in real time. I think the setting you showed might also help someone.

One more thing can increase efficiency:

scripts/vsmlrt.py: added support for rife v2 implementation
(experimental) rife v2 models can be downloaded on https://github.com/AmusementClub/vs-mlr … nal-models ("rife_v2_v{version}.7z"). It leverages onnx's shape tensor to reduce memory transaction from cpu to gpu by 36.4%. It also handles padding internally so explicit padding is not required.

This update came out a few days ago: https://github.com/AmusementClub/vs-mlr … c649dfb212

I guess all you need to do is replace the vsmlrt.py file and download the rife v2 models. And maybe remove explicit padding, I am not a programmer.

Share your impressions of how this modification, and what Mardon85 suggests, affect performance.

From my tests, using ensemble models improves frame interpolation quality, while v2 models significantly reduce playback seek time, improve performance, and should also reduce RAM bandwidth requirements.

692

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

zerosoul9901 wrote:

From my tests, using ensemble models improves frame interpolation quality, while v2 models significantly reduce playback seek time, improve performance, and should also reduce RAM bandwidth requirements.

Thanks, that's very good news smile

693 (edited by Mardon85 09-03-2023 09:38:58)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

UHD wrote:
zerosoul9901 wrote:

From my tests, using ensemble models improves frame interpolation quality, while v2 models significantly reduce playback seek time, improve performance, and should also reduce RAM bandwidth requirements.

Thanks, that's very good news smile

Please can you elaborate on "And maybe remove explicit padding, I am not a programmer", so I can get this installed and tested?

Cheers

Update: I tried the files above (it's probably user error) but I couldn't get them to work. Hopefully the SVP Team will issue an official update with them included?

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Isn't ensemble twice as slow?

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

> while v2 models significantly reduce playback seek time, improve performance

dunno, I can't see any improvements
not "significantly", not even slightly

----
unpack "rife_v2" into SVP 4\rife\models (i.e. there must be SVP 4\rife\models\rife_v2\rife_v4.6.onnx)
replace SVP 4\rife\vsmlrt.py

switch back to "v1" models - rename "rife_v2" folder to any other name, like "_rife_v2"
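If in doubt whether everything landed where it should, a quick check along these lines will confirm it (a minimal sketch; the default install path below is an assumption, adjust SVP_DIR to your system):

# Verify the layout described above: the replaced vsmlrt.py and the unpacked rife_v2 model.
from pathlib import Path

SVP_DIR = Path(r"C:\Program Files (x86)\SVP 4")   # assumption: default install location

expected = [
    SVP_DIR / "rife" / "vsmlrt.py",
    SVP_DIR / "rife" / "models" / "rife_v2" / "rife_v4.6.onnx",
]
for path in expected:
    print(f"{'OK     ' if path.exists() else 'MISSING'} {path}")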

Post's attachments

vsmlrt.py 53.63 kb, 566 downloads since 2023-03-09 

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Chainik wrote:

> while v2 models significantly reduce playback seek time, improve performance

dunno, I can't see any improvements
not "significantly", not even slightly

----
unpack "rife_v2" into SVP 4\rife\models (i.e. there must be SVP 4\rife\models\rife_v2\rife_v4.6.onnx)
replace SVP 4\rife\vsmlrt.py

switch back to "v1" models - rename "rife_v2" folder to any other name, like "_rife_v2"

v2 takes 7.5/8 and crashes at seek, v1 stays at 3.5/8 as it should

697 (edited by UHD 09-03-2023 13:24:21)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Mardon85 wrote:
UHD wrote:
zerosoul9901 wrote:

From my tests, using ensemble models improves frame interpolation quality, while v2 models significantly reduce playback seek time, improve performance, and should also reduce RAM bandwidth requirements.

Thanks, that's very good news smile

Please can you elaborate on "And maybe remove explicit padding, I am not a programmer", so I can get this installed and tested?

Cheers

Update: I tried the files above (it's probably user error) but I couldn't get them to work. Hopefully the SVP Team will issue an official update with them included?



IPaddingLayer is deprecated in TensorRT 8.2 and will be removed in TensorRT 10.0. Use ISliceLayer to pad the tensor, which supports new non-constant, reflects padding mode and clamp, and supports padding output with dynamic shape.

https://docs.nvidia.com/deeplearning/te … dding.html

Chainik will probably know better what this is all about.

698 (edited by Criptaike 09-03-2023 16:47:50)

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Hello, so I've wanted to try the RIFE engine. The Vulkan version works fine, but I really want the TensorRT version. When I select RIFE after opening a movie, a cmd window pops up and MPC-HC just crashes; it doesn't really report an error or anything. I tried to search for the problem on Google and on this forum, with no luck. Here is a GIF of all that happening -> https://imgur.com/a/VHVCvBZ

Any ideas on what's happening? The Vulkan one works.

Thanks

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

dlr5668 wrote:
Chainik wrote:

> while v2 models significantly reduce playback seek time, improve performance

dunno, I can't see any improvements
not "significantly", not even slightly

----
unpack "rife_v2" into SVP 4\rife\models (i.e. there must be SVP 4\rife\models\rife_v2\rife_v4.6.onnx)
replace SVP 4\rife\vsmlrt.py

switch back to "v1" models - rename "rife_v2" folder to any other name, like "_rife_v2"

v2 takes 7.5/8 and crashes at seek, v1 stays at 3.5/8 as it should

Don't use dynamic shapes: they don't seem to work well with v2 models and require more VRAM compared to v1; use static shapes instead. Also use the force_fp16 flag instead of fp16, which reduces memory usage and engine build time. You can also add the _implementation=2 flag to easily switch to the v2 models.
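For reference, this is roughly what those settings look like in a VapourSynth script that calls vsmlrt's RIFE wrapper directly (a minimal sketch, not the script SVP generates; the source path is a placeholder, and the parameter names follow the vsmlrt changelog and this post, so verify them against your local vsmlrt.py):

# Rough sketch of the settings above; NOT an official SVP script.
import vapoursynth as vs
from vsmlrt import RIFE, Backend

core = vs.core

clip = core.lsmas.LWLibavSource(r"D:\video\sample.mkv")   # placeholder source path
# RIFE expects an RGB clip; the v2 implementation pads internally,
# so no explicit padding step is added here.
clip = core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s="709")

clip = RIFE(
    clip,
    multi=2,                  # x2 interpolation; raise for x3
    model=46,                 # RIFE v4.6
    backend=Backend.TRT(
        force_fp16=True,      # as suggested above, instead of plain fp16
        static_shape=True,    # static shapes instead of dynamic
    ),
    _implementation=2,        # select the experimental v2 models
)

clip.set_output()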

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!!

Criptaike
> Any ideas on what's happening?

don't touch anything
wait