UHD wrote:
Pezede wrote:

I've tried enabling and disabling and I'm not seeing any color differences.

For what it's worth, even at 4K@25*3 I'm not seeing CUDA use above 50% either...

Thank you!

Pezede, I think you have the perfect setup to test what really matters in the success of your interpolation.

I think with 4K HDR x3 in real time with RIFE, there is no point in playing around with encoding tests anymore.

What everyone is probably interested in is what the minimum configuration allows for:
1. 4K HDR x3 in real time with RIFE
2. 4K HDR x2 in real time with RIFE

My test proposal:

LG 4K HDR Demo - New York.ts
Direct link: https://drive.google.com/file/d/1dfR5TT … _bGfEXUvJ/
Source: http://hdr4k.blogspot.com/

1. Reduce only the TDP of GPU in steps of 10%

100%  4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
90% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
80% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
70% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
60% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
50% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time

50% interests me personally, since I would be able to passively cool a graphics card at around 250W TDP

2. Reduce only the TDP of CPU in steps of 10%

100%  4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
90% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
80% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
70% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
60% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
50% 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time

3. Reduce only the MHz of RAM in steps of 400 MHz

6000 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
5600 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
5200 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
4800 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
4400 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
4000 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time
3600 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time

4800 MHz is of particular interest to me because this is currently the maximum with true ECC
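Written out as a checklist, the full sweep above looks like this. A minimal sketch only: the `sweep` helper and the label format are made up for illustration, the step sizes and ranges are the ones listed above.

```python
from itertools import chain

def sweep(name, start, stop, step, unit):
    """Return one label per setting, counting down from start to stop (inclusive)."""
    return [f"{name} {value}{unit}" for value in range(start, stop - 1, -step)]

gpu = sweep("GPU TDP", 100, 50, 10, "%")       # 100% down to 50% in 10% steps
cpu = sweep("CPU TDP", 100, 50, 10, "%")       # same steps for the CPU
ram = sweep("RAM", 6000, 3600, 400, " MHz")    # 6000 MHz down to 3600 MHz in 400 MHz steps

plan = list(chain(gpu, cpu, ram))

for label in plan:
    # Each configuration gets the same question: x3 first, x2 as the fallback.
    print(f"{label}: 4K HDR x3 in real time? If not, 4K HDR x2 in real time?")

print(f"{len(plan)} configurations in total")
```

That is 6 GPU steps, 6 CPU steps and 7 RAM steps, each changed on its own while the other two stay at stock.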

I know this is probably too many tests, so maybe just 4 simple tests to start with, which would already explain a lot of the above:

1.  GPU 50% TDP 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time

2.  CPU 50% TDP (or 105W TDP or 65W TDP) 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time

3. RAM 4800 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time

4. RAM 3600 MHz 4K HDR x3 in real time pass or fail and if fail 4K HDR x2 in real time

Of course, it doesn't have to be today or tomorrow. It doesn't have to be anything at all. But I think you could help a lot of people choose the optimal configuration for RIFE interpolation :-)


Hey, I'm afraid I don't really have the time to do those tests. It doesn't help that tinkering with RAM on AM5 is far from a pleasant experience (I've tried)... :-(

UHD wrote:
Pezede wrote:

movie *3 :

https://i.imgur.com/cSYcCPx.jpeg

Thanks a lot! 4K HDR x3 in real time with RIFE!!! Unbelievable!!!

And I so wanted to save on RAM and CPU :-(

Do you see any colour difference on the HDR screen watching this demo without interpolation and with RIFE interpolation?

You're welcome! :-)

I've tried enabling and disabling and I'm not seeing any color differences.

For what it's worth, even at 4K@25*3 I'm not seeing CUDA use above 50% either...

UHD wrote:
DragonicPrime wrote:
UHD wrote:

Could you check if the video below can now be interpolated in real time with RIFE?
Second question: is it possible to preserve the 10 bit colour depth and HDR when interpolating in real time with RIFE the video below?
In other words, could you compare the colours of the video below played back without interpolation and with RIFE interpolation.

LG 4K HDR Demo - New York.ts
File size: 448 MiB
Duration: 1 min 13 sec
Overall bit rate: 51.4 Mbps
HDR format: SMPTE ST 2086, HDR10 compatible
Width: 3 840 pixels
Height: 2 160 pixels
Frame rate: 25.000 FPS
Color space: YUV
Chroma subsampling: 4:2:0
Bit depth: 10 bits

Direct link: https://drive.google.com/file/d/1dfR5TT … _bGfEXUvJ/
Source: http://hdr4k.blogspot.com/

4K HDR still doesn't work in real time. I just updated my previous message as well. 4K SDR seems to work in real time with no problems, though. With 4K HDR, I only get around 35-40fps.


Pezede, would you please check the above demo in real time 4K HDR and fps, we would have an interesting comparison.

I just ran it on my LG C1:

movie *2 :

https://i.imgur.com/m6NLcak.jpeg

movie *3 :

https://i.imgur.com/cSYcCPx.jpeg


I start seeing dropped frames past 80fps, playback seems perfect up to that point.

80fps :

https://i.imgur.com/KlLKxWX.jpeg

Movie * 4 is unwatchable.
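A quick sanity check of those numbers, as a minimal sketch. The 80 fps ceiling is only the rough point where dropped frames started in this test, not a hardware limit, and the function names are made up for illustration.

```python
SOURCE_FPS = 25.0    # frame rate of the LG demo clip
CEILING_FPS = 80.0   # rough point where dropped frames started (an observation, not a spec)

def output_fps(factor, source_fps=SOURCE_FPS):
    """Frame rate after interpolating by `factor`."""
    return source_fps * factor

def fits_real_time(factor, ceiling_fps=CEILING_FPS, source_fps=SOURCE_FPS):
    """True if the interpolated rate stays at or below the observed ceiling."""
    return output_fps(factor, source_fps) <= ceiling_fps

for factor in (2, 3, 4):
    fps = output_fps(factor)
    budget_ms = 1000.0 / fps   # time available to produce each output frame
    print(f"x{factor}: {fps:.0f} fps, {budget_ms:.1f} ms per frame, "
          f"fits: {fits_real_time(factor)}")
```

x2 and x3 need 50 and 75 fps, which stay under the ceiling, while x4 would need 100 fps. That lines up with movie * 4 being unwatchable.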

I also tried with 4 threads, no difference.

I've done a reinstall of SVP and I'm now getting ~280fps on 1080p transcoding with the new guide, which seems almost miraculous...

I get the console window and there are lock files in the rife folder, so it seems to be used.

Hardware is a 4090 paired with a 7950X and DDR5 6000 RAM.

Chainik wrote:

That one - https://github.com/AmusementClub/vs-mlrt - is MUUUUCH better
-------------
Let's try this:

0. nothing to do with Python :-D
1. download --> https://www.svp-team.com/files/temp/rife-trt-0701-1.7z <--, unpack into SVP 4\rife (you should already have this folder), so vstrt.dll must be in the root, i.e. SVP 4\rife\vstrt.dll
2. replace generate.js, base.py in SVP 4\script; restart SVP to be sure
3. menu -> Applications settings -> Additional options -> All settings, go to 'User defined options'
Title: TensorRT; Script name: rife_trt; Other values by default - "FRC profile", "ON or OFF"
Click 'Add option'
Go to the RIFE video profile, see the added TensorRT on/off switch.
Make a copy of the RIFE video profile, one with TensorRT ON and another with OFF.

Pros:
- no Pytorch!
- works in real time
- even faster (?) than vsrife [could be just because of some TensorRT options]
Cons:
- even slower (?) first-time initialization [could be just because of some TensorRT options]
- a nasty command-line window will pop-up for every new video resolution
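
Steps 1 and 2 are easy to get wrong (for example unpacking into a nested folder), so here is a small check of the file layout. It is only a sketch: the three relative paths come from the steps above, while `missing_files` and the install location in the usage line are assumptions, so adjust the path to your own setup.

```python
from pathlib import Path

# Files the steps above expect, relative to the SVP 4 install folder.
EXPECTED = ("rife/vstrt.dll", "script/generate.js", "script/base.py")

def missing_files(svp_root):
    """Return the expected files that are not present under svp_root."""
    root = Path(svp_root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]

# Hypothetical install location, change it to wherever SVP 4 lives on your machine.
missing = missing_files(r"C:\Program Files (x86)\SVP 4")
print("all files in place" if not missing else f"missing: {missing}")
```

This only checks that the files exist, not that generate.js and base.py are the replaced versions.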


I'm getting 150fps transcoding with 1080p content with this (4090), this is really good! I'm using 4 gpu threads.

Real time is still slow and choppy but I haven't taken the time to check that everything's properly set-up, I will do it tomorrow.

dlr5668 wrote:
Pezede wrote:
weiw26 wrote:

Thanks a lot! This is the perfect solution now. I'm using a 2080TI, and I can get full 1080p interpolated with RIFE 4.6+FSRCNNX+Krig running flawlessly. I cannot believe it! GPU memory usage is ~6GB and utilization ~80%. No need to get 4090 for now thank god!

Can confirm that this project gives me frame doubling at around ~15% GPU use with MPV. However, I haven't found how to go beyond doubling in that build, since everything's in Chinese.

This is extremely promising though!

Edit : I'm not entirely sure how relevant that is but I've tried speeding up the footage and I can do up to 4.5* footage speed before stutters happen (compute use goes to 65%).

https://i.imgur.com/Fxt24Ek.png

mpv-lazy\portable_config\vs\rife_cuda.vpy
Find the FPS_num = 2 line and increase the value. That's assuming you run the CUDA model (800 MB archive)

Works a bit faster than current SVP for me as well

Hey thanks, I can achieve *4 after that edit !
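
For anyone else looking for it, the edit is a one-line change. Only the changed line is shown here; the rest of rife_cuda.vpy is left alone, and the comment reflects my assumption that FPS_num acts as the frame rate multiplier, which matches the *4 result.

```python
# mpv-lazy\portable_config\vs\rife_cuda.vpy (excerpt, only the changed line)
FPS_num = 4  # was 2; assumed to be the multiplier applied to the source frame rate
```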

weiw26 wrote:

Thanks a lot! This is the perfect solution now. I'm using a 2080TI, and I can get full 1080p interpolated with RIFE 4.6+FSRCNNX+Krig running flawlessly. I cannot believe it! GPU memory usage is ~6GB and utilization ~80%. No need to get 4090 for now thank god!

VeniVediVeci wrote:

Now I am using a project named mpv_lazy on GitHub (https://github.com/hooke007/MPV_lazy, in Chinese), and it enables RIFE v4.6 using TensorRT. My video card is a 3070 laptop GPU and it works totally fine for 1080p real-time FPS doubling, with much better performance than SVP4. Hope SVP4 can support TensorRT RIFE as soon as possible.

Can confirm that this project gives me frame doubling at around ~15% GPU use with MPV. However, I haven't found how to go beyond doubling in that build, since everything's in Chinese.

This is extremely promising though!

Edit : I'm not entirely sure how relevant that is but I've tried speeding up the footage and I can do up to 4.5* footage speed before stutters happen (compute use goes to 65%).

https://i.imgur.com/Fxt24Ek.png

Actually I just noticed that compute use is basically non-existent in realtime:

https://i.imgur.com/7l7JD1N.png

It goes to and stays at 100% while transcoding, but maxes out at 40-50% and often goes to 0-18% in real-time.

UHD wrote:

Very strange...

Could you please check if it is possible to interpolate x2 in real time with RIFE:

LG 4K HDR Demo - New York.ts
File size: 448 MiB
Duration: 1 min 13 sec
Overall bit rate: 51.4 Mbps
HDR format: SMPTE ST 2086, HDR10 compatible
Width: 3 840 pixels
Height: 2 160 pixels
Frame rate: 25.000 FPS
Color space: YUV
Chroma subsampling: 4:2:0
Bit depth: 10 bits
Direct link: https://drive.google.com/file/d/1dfR5TT … _bGfEXUvJ/
Source: http://hdr4k.blogspot.com/

Just tried it with the MPC-HC default video renderer, footage is very choppy with frameskipping in *2.

UHD wrote:

Have you tried to do a real-time RIFE interpolation test for 1080p:
x3
x4
x5

?

Realtime doesn't seem to work well beyond x2 on that same anime source, there are desyncs and choppy frames on x2.5, x3, x4,...

Using the madVR renderer with no fancy settings applied.

UHD wrote:
Pezede wrote:

I have tried the ultrafast encoding preset, performance did not change but I will be using it from now on as it is the fastest.

"performance did not change" and "is the fastest" how is this possible at the same time?

What I mean is that the fps I was getting didn't change between the lossless and ultrafast encoding settings, probably because encoding wasn't the bottleneck on my system, even though the ultrafast preset can probably output more frames outside of interpolation scenarios.

For example when I transcode with ultrafast (non Rife interpolation) I get 400+ fps.

I'm not entirely sure how all the elements of the chain work though so that's just me theorizing.
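
The theory, as a minimal sketch: if the stages run one after another, the slowest one sets the overall speed, so a faster encoder changes nothing while RIFE is the limit. The 400 fps figure is the ultrafast number above and ~105 fps is the RIFE 4.6 transcoding figure reported elsewhere in this thread. The 200 fps for the lossless preset is purely a placeholder, since it was never measured on its own.

```python
def pipeline_fps(stage_fps):
    """Throughput of a serial pipeline is limited by its slowest stage."""
    return min(stage_fps.values())

def bottleneck(stage_fps):
    """Name of the stage that limits the pipeline."""
    return min(stage_fps, key=stage_fps.get)

# "encode" for lossless is a hypothetical value, only "rife" and ultrafast are from the thread.
lossless = {"rife": 105, "encode": 200}
ultrafast = {"rife": 105, "encode": 400}

for name, stages in (("lossless", lossless), ("ultrafast", ultrafast)):
    print(f"{name}: {pipeline_fps(stages)} fps, limited by {bottleneck(stages)}")
```

Both presets end up at the RIFE rate, which is what "performance did not change" looks like when the encoder has headroom.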

UHD wrote:

Have you checked threads 5, 6, 7 and 3?

Try 4.6 ensemble False, there should be more fps.

Performance seems to max out at 3 threads and stay constant beyond that (I have tried all the settings).


On 4.6, ensemble True gives me around half the performance (55fps), while ensemble False gives me the results I previously reported (slightly lower than 4.4 performance).

UHD wrote:

Now a question for everyone: is it possible to set the number of GPU threads higher than 4?

I did try setting it higher than 4 in here:

https://i.imgur.com/Od6Onlx.png

https://i.imgur.com/OmIXqC4.png

Performance is unchanged from 4 to 8 threads, and it is not possible to go higher than 8.

I have tried the ultrafast encoding preset, performance did not change but I will be using it from now on as it is the fastest.

Rife 4.6 seems a tiny bit slower on my end, averaging 105fps with it but it has occasional drops to 95-100.

I have reinstalled SVP (4.6.0.220 2022-10-10) and added the RIFE AI engine component again (Can confirm that this is the ncnn 4.4), performance is as follows now with the same file and encoder settings:

With 1 gpu thread:

https://i.imgur.com/iX3LJka.png

With 2 gpu threads:

https://i.imgur.com/5ofWuhj.png

With 4 gpu threads:

https://i.imgur.com/e7RmuWD.png


It seems that reinstalling did the trick, as this is significantly better than before :-)

UHD wrote:
Pezede wrote:

(Transcoding speed is around 45fps for 1920*1080@23.976 --> 1920*1080@119.88)

Pezede wrote:

https://i.imgur.com/1QMZ4fL.png

45 / 5 * 4 = 36
75.8 / 2 * 1 = 37.9

The results are repeatable, but clearly something is wrong!
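
The arithmetic above, spelled out as a small sketch (the function name is made up for illustration): with interpolation by a whole factor, one of every `factor` output frames is an original and the rest are new, so only that share of the transcoding speed counts as interpolated frames.

```python
def new_frames_per_second(output_fps, factor):
    """Interpolated frames per second: of every `factor` output frames, factor - 1 are new."""
    return output_fps * (factor - 1) / factor

# x5 (23.976 -> 119.88) at 45 fps transcoding speed
print(new_frames_per_second(45, 5))

# x2 at 75.8 fps transcoding speed
print(new_frames_per_second(75.8, 2))
```

Both runs come out at roughly 36-38 new frames per second, which is why the results look repeatable even though the raw fps numbers differ.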

Even NVIDIA GeForce RTX 3070 Ti graphics card performs better - 45 new interpolated frames per second:
https://www.svp-team.com/forum/viewtopi … 588#p80588

On the other hand, NVIDIA GeForce RTX 4090 in combination with RIFE is a real tornado!

Below the benchmark tests for RIFE interpolation (fastmode True, ensemble False):

38 new fps (1080p) for 3090+5950x (CUDA+ffmpeg+FrameEval+vs_threads=20) (rife4.0)

63 new fps (1080p) for 3090+5950x (C++ NCNN+vs_threads=20+ncnn_threads=8) (rife4.0)

164 new fps (1080p) for 4090+5950x (ncnn+2 threads+4 vs threads+ffmpeg (ultrafast)) (rife4.6)

Source: https://github.com/styler00dollar/VSGAN-tensorrt-docker

Pezede, I hope you will stay with us for some time on this forum. I have a lot of questions for you! However, the most important thing now is to find the reason for such a large disparity in test results.

Hence the first and most important question:

Are you using the NCNN version of RIFE or the CUDA version?


Hey, I will stay around and provide as much feedback as necessary, as I'd be quite happy to have as much performance as possible :-)

I'm using the rife version (ncnn unless I'm wrong) that comes embedded with SVP, I'm currently trying to set up other models and see if that changes anything.

Chainik wrote:

I mean, switch the video profile to movie x2 and start transcoding. Will it only give 72 fps?

Yes that's the case:

https://i.imgur.com/1QMZ4fL.png

Chainik wrote:

so with a frame doubling from 24 to 48 it will only give like 72 fps?

Oh sorry I meant to say that going from 2 threads to 1 halves performance, the 45 fps performance figure I initially gave was already using 2 threads (default parameters)

So 2 or 4 threads: ~45fps

1 thread: ~20-25fps

Although maybe the seemingly low performance is due to one of the compute engines (Compute_0) not being used at all?
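
The thread numbers above in speedup terms, as a rough sketch. The 22.5 fps value is just the midpoint of the ~20-25 fps range, and `scaling` is a made-up helper.

```python
# Approximate transcoding speed per GPU thread count, from the figures above.
measured_fps = {1: 22.5, 2: 45.0, 4: 45.0}  # 22.5 = midpoint of ~20-25 fps

def scaling(fps_by_threads):
    """Return {threads: (speedup vs 1 thread, efficiency per thread)}."""
    base = fps_by_threads[1]
    return {n: (fps / base, fps / base / n) for n, fps in fps_by_threads.items()}

for threads, (speedup, efficiency) in scaling(measured_fps).items():
    print(f"{threads} thread(s): speedup {speedup:.1f}x, efficiency {efficiency:.0%}")
```

Going from 1 to 2 threads scales almost perfectly, and from 2 to 4 nothing is gained, so something other than the thread count is the limit past 2 threads.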

Chainik wrote:

> Transcoding speed is around 45fps for 1920*1080@23.976 --> 1920*1080@119.88

sounds too slow... which means it can only do 36 interpolated frames per second
how high is the GPU/Compute_0 load? what if you increase the thread count?

Compute_0 is at 0% use, Compute_1 is at 100% use.

Changing threads from 1 to 2 doubles performance (Compute_1 goes from ~55 to 100% use), 2 to 4 doesn't do anything.

Having acquired a 4090, I've been using RIFE interpolation for the past few days and I'm blown away by how good it is. It's the greatest interpolation method I've experienced so far!

Performance isn't quite there yet for 1080p 120fps realtime (from an anime source), but transcoding doesn't take too long with a lossless NVENC H265 preset (Transcoding speed is around 45fps for 1920*1080@23.976 --> 1920*1080@119.88)