12-11-2021 10:47:08

UHD wrote:

lwk7454, I would be very grateful if you could find some time to run these tests.

Hi, I finally got some time this weekend and tested all the versions except 3.5.
I used Flowframes because it makes switching versions very easy and installs the different versions automatically.

Results:

FP32 v2.4
FPS 58.22
Cuda ~70%

FP32 v3.1
FPS 59.43
Cuda ~50%

FP32 v2.3
FPS 58.39
Cuda ~70%

FP32 v1.8
FPS 55.98
Cuda 87% Very stable

FP16 v2.4
FPS 57.06
Cuda ~50%

FP16 v3.1
FPS 60.03
Cuda ~40%

FP16 v2.3
FPS 57.20
Cuda ~55%

FP16 v1.8
FPS 58.87
Cuda ~70%

Interestingly, there were no visible differences in transcoding speed; only Cuda usage differed. That could suggest there was some kind of bottleneck (bandwidth?) limiting the output.

Also note that nearly all Cuda usage graphs were unstable and fluctuated over a wide range, with only one exception: FP32 v1.8.
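
To put numbers on that observation, here is a minimal sketch that takes the eight results listed above and compares how far FPS and Cuda usage each vary across the runs. The dictionary layout and the `relative_spread` helper are my own naming for illustration, and the Cuda figures are the approximate averages quoted above, so treat the output as a rough indication only.

```python
# Results copied from the post above:
# (precision, RIFE version) -> (FPS, approximate Cuda usage in %)
results = {
    ("FP32", "2.4"): (58.22, 70),
    ("FP32", "3.1"): (59.43, 50),
    ("FP32", "2.3"): (58.39, 70),
    ("FP32", "1.8"): (55.98, 87),
    ("FP16", "2.4"): (57.06, 50),
    ("FP16", "3.1"): (60.03, 40),
    ("FP16", "2.3"): (57.20, 55),
    ("FP16", "1.8"): (58.87, 70),
}


def relative_spread(values):
    """Return (max - min) / min for a sequence of positive numbers."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / lowest


# Spread of throughput versus spread of GPU load across all eight runs.
fps_spread = relative_spread([fps for fps, _ in results.values()])
cuda_spread = relative_spread([cuda for _, cuda in results.values()])

print(f"FPS spread:  {fps_spread:.1%}")
print(f"Cuda spread: {cuda_spread:.1%}")
```

The FPS spread comes out in the single digits while the Cuda spread is more than double the lowest reading, which is consistent with something other than the Cuda node limiting throughput.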

06-11-2021 04:35:58

UHD wrote:

Lwk7454 I have a request: could you please download Flowframes - there is a free version or donation option available at this address: https://nmkd.itch.io/flowframes and run comparison tests for the 720p file mentioned above?
All parameters would of course be identical to tests with SVP with RIFE filter for VapourSynth (PyTorch version) ie:
RIFE CUDA/PyTorch version
re-encoding with x2 interpolation
RIFE model: 3.8
scale=1.0
Variable test parameters:
FP16 - "Fast Mode" checked in "AI Specific Settings"
FP32 - "Fast Mode" unchecked in "AI Specific Settings"
Take a look here: https://www.youtube.com/watch?v=vApZg5EO2j4&t=33s
RIFE CUDA Fast Mode: Utilizes Half-Precision (fp16) to speed things up and reduce VRAM usage, but can be unstable
https://github.com/n00mkrad/flowframes
Test results:
re-encoding speed [FPS]
GPU utilisation [%]

So I've tried the free version of Flowframes. Its RIFE PyTorch implementation doesn't seem any better than SVP's, and GPU usage was a bit unstable, so I'm writing down approximate average usage here. Results:

720p, FP16
FPS: 61.32
Cuda: ~45%

720p, FP32
FPS: 63.28
Cuda: ~45%

1080p, FP16
FPS: 26.84
Cuda: ~50%

1080p, FP32
FPS: 27.02
Cuda: ~50%

I've tried the Vulkan implementation too, but it only has RIFE 3.1 instead of 3.8:
720p, FP16
FPS: 36.79
Compute_1: 98%
Cuda: 0%

720p, FP32
FPS: 38.85
Compute_1: 98%
Cuda: 0%

Interesting to see Flowframes RIFE with Vulkan has much better performance than SVP.

04-11-2021 05:38:59

UHD wrote:

I'm back.

Welcome back!

Here are the test results:

Parameters:
Test-Time Augmentation: Enabled [sets RIFE filter for VapourSynth (PyTorch)]
re-encoding with x2 interpolation
RIFE model: 3.8
scale=1.0
Encoder: NVIDIA NVENC H.264

720p, FP16
FPS: 63.5
Cuda: 56%

720p, FP32
FPS: 69.8
Cuda: 58%

1080p, FP16
FPS: 26.9
Cuda: 62%

1080p, FP32
FPS: 28.1
Cuda: 66%

I've also tried TTA Disabled for comparison, just 1 test:
720p, FP32
FPS: 25.1
Compute_1: 100%
Cuda: 15%
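
One thing that stands out in the TTA-enabled numbers is that FP32 was slightly faster than FP16 at both resolutions. A minimal sketch to quantify that, using only the four TTA-enabled FPS values above (the `fp32_advantage` helper and the dictionary layout are hypothetical names of mine, not anything from SVP or vs-rife):

```python
# TTA-enabled results copied from the post above: resolution -> {precision: FPS}
fps = {
    "720p": {"FP16": 63.5, "FP32": 69.8},
    "1080p": {"FP16": 26.9, "FP32": 28.1},
}


def fp32_advantage(by_precision):
    """Return how much faster FP32 ran than FP16, as a fraction of the FP16 rate."""
    return by_precision["FP32"] / by_precision["FP16"] - 1.0


advantage = {res: fp32_advantage(vals) for res, vals in fps.items()}

for res, adv in advantage.items():
    print(f"{res}: FP32 is {adv:.1%} faster than FP16")
```

These are single runs, so the differences may well be within run-to-run noise.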

31-10-2021 09:00:47

UHD wrote:

About virtual nodes:

Thanks for the explanation, I see how it works.
My Windows 11 had hardware scheduling enabled by default, so all nodes were merged into 3D. I managed to turn it off so I could see the Graphics_1, Compute_0, Compute_1 and CUDA virtual nodes.

I've tested all combinations of 720p, 1080p, FP16 and FP32:

FP16 720p: Cuda ~40%, SVP index 1.0
FP16 1080p: Cuda jumps between 35% - 51%, SVP index N/A

FP32 did not work at all, images were all frozen and all nodes were at 0%.

It turns out only the Cuda node was used.

30-10-2021 04:23:46

UHD wrote:

Thank you very much for the testing I requested and thank you very much for your willingness to continue to help on this thread.
I have read your post and the test results very carefully. I will come back to certain aspects of these results, but for now I am bothered by a few points.

No problem

I have only fixed the scale at 1.0 since your request, i.e. after my first post. Here's the changed line 23 in base.py:

smooth = RIFE(input_m,model_ver=3.8,fp16=not rife_precision,scale=1.0,device_index=rife_gpu)

Although, as I said before, GPU threads was one of the parameters that did not affect the results, I tested it again. Here are the results:

Video: 720p (1280x720), 25FPS, 53 s 680 ms, 4:2:0 YUV, 8 bits

Test A (GPU threads = 2)
GPU: ~47%
VRAM: 4.57 GB
SVP index: 1.00

Test B (GPU threads = 6)
GPU: ~49%
VRAM: 4.59 GB
SVP index: 1.00

I'm not sure which virtual nodes you are referring to. I'm always using the same GPU device. Could you suggest a tool that monitors those virtual nodes separately?
I know it's a bit odd, but it seems this parameter doesn't matter...

29-10-2021 22:49:49

egandt wrote:

Still unable to do anything even using the 240p sample

Hi Eric, could you share what parameters you used? It looks like exactly the same behavior I get when I use FP32 with PyTorch. Please try adjusting the parameters; you may find a working set.

29-10-2021 07:53:08

UHD wrote:

I have a request to you: could you please stay with us for a while on this thread and help us test real-time playback with RIFE?

Sure, I'm happy to test this out and follow this thread.

For the troubleshooting it was really just the script error, nothing else.

I'm not sure how to configure Vulkan vs. d3d11 in VLC; I don't have the config file mentioned in previous posts (SVP 4\mpv64\mpv conf). I suppose it's TTA on or off.
Otherwise I've tried all the parameters you mentioned. Some notes:
I edited base.py to fix scale at 1.0 regardless of GPU threads.
The following parameters did not affect the results at all. I've tried all combinations with both 720p and 240p videos:
GPU threads
Screen refresh rate. I tried 50Hz and 60Hz
VLC windowed and fullscreen mode
Black bar detection. Switching to this tab causes a 2s freeze, but clicking any of the options in this tab did not have any effect on the video.

Other values:
RIFE model = 3.8
Scale = 1.0
Math precision = FP16

The results are as follows:

With the sample 720p file:
TTA on: GPU stabilizes at around 46%, playback is smooth at 50Hz. Switching to this profile from automatic causes a 3s freeze. Resuming from pause causes a 1s frame drop + freeze. Other actions are fluid without any freeze.
TTA off: Lags every second, about half of the frames are dropped, GPU jumps between 1% and 100% in a very consistent manner.

Sample 240p file:
TTA on: GPU stabilizes at around 8%, playback is smooth at 30Hz.
TTA off: GPU stabilizes at around 13%, playback is smooth at 30Hz.

Interestingly, when TTA is on, FP32 never worked: the video freezes while playback continues with audio, and the GPU stays idle.

I hope the data above gives you a clear idea of how it performed.

28-10-2021 02:32:44

Hi guys, glad to find there are people working on real-time playback with RIFE. I've tried this method myself, and I hope I'm adding some valuable data to your development. My specs first:

CPU: Ryzen 7 5800X
GPU: NVIDIA GeForce RTX 3090 24GB
Memory: DDR4 4x16GB (64GB)
System: Windows 11

After some troubleshooting I got my VLC player to work with VS RIFE.
Straight to the conclusion: it works perfectly in 720p, but is laggy and complicated in 1080p.

The configuration that gave the best result:
Math precision: FP16
GPU threads: 6

The main takeaway here is that my GPU was poorly utilized.
In 720p, GPU utilization was about 28%. With scale=1.0 (i.e. GPU threads = 2) it went up to 48%.
In 1080p, GPU utilization was very unstable: when output is smooth it stays at around 60%, but it quickly drops to single digits and frames are dropped. The result is a very jagged GPU graph and laggy playback. I feel there is a lot of room for optimization, or is my system not configured correctly? Anyway, in terms of raw performance I believe smooth playback in 1080p is achievable.

Another thing to note is that it ran very stably. I could do everything with the playback, like pausing and skipping forward/backward. Switching the profile caused only a short freeze of 1s.

All the video files I tried were 24fps anime videos. To my surprise, the output image was not satisfactory: artifacts were terrible on text, and the interpolation did not seem much better than the default SVP method.

Regarding the script errors:

UHD wrote:

egandt wrote:
Well I have not gotten very far as you can see I get some script failure using PyTorch based RIFE
Chainik, can we please ask you to take a look at the script error https://www.svp-team.com/forum/viewtopi … 458#p79458 with your expert eye?

I got the same error. I looked at the log and figured it was reading an empty file in the vs-rife library, hence the EOF error. Please bear with me, as I don't know much about the technologies used here, but it seems PyTorch is reading the .pkl file below, perhaps for AI inference?
SVP 4\mpv64\Lib\site-packages\vsrife\model38\flownet.pkl

I found that the vs-rife project provides these files separately from the source code, so I downloaded them and put them into the Lib directory, and then everything worked:
https://github.com/HolyWu/vs-rife/releases/tag/model38
Just download the source code zip and overwrite the corresponding folders and files.
I bet there is a better way to install them with pip, but that's not my area of expertise. I guess Chainik could provide the command.
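
If you want to check whether you are hitting the same empty-file problem before downloading anything, a minimal sketch like the one below lists zero-byte .pkl files under a folder. The function name `find_empty_model_files` is my own, and the path in the example is just the relative location mentioned above; you would need to prepend your own SVP install directory. It only checks file size, which is an assumption about what "empty" meant in my case; it does not validate the model contents.

```python
from pathlib import Path


def find_empty_model_files(root, pattern="*.pkl"):
    """Return files under `root` matching `pattern` whose size is zero bytes."""
    root = Path(root)
    if not root.is_dir():
        # Nothing to scan if the folder does not exist.
        return []
    return sorted(
        p for p in root.rglob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Hypothetical location: adjust to where SVP is installed on your machine.
    vsrife_dir = Path("SVP 4") / "mpv64" / "Lib" / "site-packages" / "vsrife"
    for path in find_empty_model_files(vsrife_dir):
        print(f"empty model file: {path}")
```

Any file it prints is a candidate for being overwritten with the copy from the release zip linked above.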

Posts found: 8

1 Reply by lwk7454 12-11-2021 10:47:08

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)

2 Reply by lwk7454 06-11-2021 04:35:58

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)

3 Reply by lwk7454 04-11-2021 05:38:59

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)

4 Reply by lwk7454 31-10-2021 09:00:47

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)

5 Reply by lwk7454 30-10-2021 04:23:46

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)

6 Reply by lwk7454 29-10-2021 22:49:49

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)

7 Reply by lwk7454 29-10-2021 07:53:08

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)

8 Reply by lwk7454 28-10-2021 02:32:44

Re: New RIFE filter - 3x faster AI interpolation possible in SVP!!! (2,383 replies, posted in Using SVP)