1

(426 replies, posted in Using SVP)

I've been using GPU passthrough for over 4 years now without any issues. You may be confusing this with iGPU passthrough (which was unusable in the past).

I figured out that the problem was with nvidia-390xx; using 340 instead, everything works as expected. GPU load is at 98% and CPU load at 10%.

Edit:
It seems that with opencl-nvidia-390xx the card stays in its lowest power mode, and there are some other issues too (for example, it can still show a high GPU load even in the lowest power mode).
Reading around the internet, I see that other people are having issues with 390 and "older" cards as well.
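
For anyone who wants to check whether their card is stuck the same way, here is a rough Python sketch built around `nvidia-smi --query-gpu=pstate,utilization.gpu`. The P8 cut-off, the 50% load threshold and the helper names (`parse_line`, `looks_stuck`, `query_gpu`) are my own assumptions for illustration, not something taken from SVP or the driver docs, and older cards may report `[Not Supported]` for some fields.

```python
import subprocess

# nvidia-smi query: performance state (P0 = maximum performance) and GPU load.
QUERY = [
    "nvidia-smi",
    "--query-gpu=pstate,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_line(line):
    """Split one CSV line such as 'P8, 98' into (pstate, load in percent or None)."""
    pstate, util = [field.strip() for field in line.split(",")]
    try:
        util_pct = int(util)
    except ValueError:
        # Field not reported as a number on this card/driver.
        util_pct = None
    return pstate, util_pct


def looks_stuck(pstate, util_pct, busy_threshold=50):
    """Heuristic (assumption): busy GPU that still sits in a low-power state (P8 or lower)."""
    if util_pct is None or not pstate.startswith("P") or not pstate[1:].isdigit():
        return False
    return int(pstate[1:]) >= 8 and util_pct >= busy_threshold


def query_gpu():
    """Run nvidia-smi and return one (pstate, load) tuple per GPU."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_line(line) for line in out.splitlines() if line.strip()]
```

Run `query_gpu()` while a video is playing with SVP enabled and pass each result to `looks_stuck()`; a busy card that never leaves the high P-numbers would match what I described above.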

Edit2:
It's just another bug in NVIDIA's proprietary driver. At least there's some work being done on OpenCL support in the nouveau driver this summer, so maybe we can get rid of this nvidia driver shit later this year.
I'll stick with Linus: "NVidia, fuck you!"

Regards

2

Yes, this system is virtualized, but performance is at 95% everywhere else. What about the low GPU utilization?

3

According to https://www.svp-team.com/wiki/GPU_Compatibility a GT430 should be capable, but GPU acceleration is way too slow. I noticed that if I disable GPU acceleration, CPU usage is around 40-50% and GPU utilization is at 30-33%. If I enable GPU acceleration, CPU usage drops to 10%, GPU utilization is below 10%, and video playback is badly desynced and stuttering.

(Video Engine Utilization is always at 0%)

Logs:

qt5ct: using qt5ct plugin
12:53:25.575 [i]: Main: starting up SVP 4 Linux [4.2.0.137]... 
12:53:25.576 [i]: Main: args: none
12:53:25.576 [i]: Main: working dir is ***
12:53:25.576 [i]: Main: data dir set to ***
12:53:25.577 [i]: Settings: loading reg.cfg OK
12:53:25.577 [i]: Settings: loading main.cfg OK
12:53:25.577 [i]: Settings: loading ui.cfg OK
12:53:25.577 [i]: Settings: loading frc.cfg OK
12:53:25.578 [i]: Settings: loading profiles.cfg OK
12:53:25.578 [i]: Settings: loading custom.cfg OK
12:53:25.578 [i]: Settings: loading lights.cfg OK
12:53:25.578 [i]: Main: using Qt 5.11.1 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 8.1.1 20180531)
12:53:25.579 [i]: Main: device scale is 1, user defined scale is 1
12:53:25.579 [i]: Main: system locale is [de]
12:53:25.579 [i]: Main: preferred language is [de-de]
12:53:25.581 [i]: Main: setting language file to de.qm...
12:53:25.583 [i]: Main: module 'plugins/libsvpflow1_vs64.so': 4.2.0.133
12:53:25.583 [i]: Main: module 'plugins/libsvpflow2_vs64.so': 4.2.0.145
12:53:25.585 [i]: Main: VLC filter (64 bit): 0.9.0.114
12:53:25.585 [i]: Main: running OpenCL info...
qt5ct: 12:53:25.590 [i]: D-Bus system tray: no
12:53:25.592 [i]: Main: collecting system information...
12:53:25.597 [i]: OS: Linux 4.17.3-1-ARCH #1 SMP PREEMPT Tue Jun 26 04:42:36 UTC 2018 x86_64
12:53:25.622 [i]: Desktop environment: /usr/bin/lightdm / gnome
12:53:25.626 [i]: CPU: Common KVM processor [base frequency 3000 MHz, 16 threads]
12:53:25.628 [i]: Video: reading OpenCL info...
12:53:25.656 [i]: Video: 1 GPU OpenCL device(s) on NVIDIA CUDA [OpenCL 1.2 CUDA 9.1.84] (NVIDIA Corporation)
12:53:25.656 [i]: Video 1: device name 'GeForce GT 430' (NVIDIA Corporation, ver.390.67) [gpuID=11]: OK
12:53:25.658 [i]: Memory:  3943  MB total,  2698 MB free
12:53:25.658 [i]: System: finding network settings...
12:53:25.687 [i]: Power: AC is ON [1]
12:53:25.698 [i]: Screens: updating information, 2 screen(s) found
12:53:25.698 [i]: Screens: screen 0 (VGA-0) - 1920x1080 @60.000 Hz, x1.0 [68 DPI]
12:53:25.698 [i]: Screens: screen 1 (HDMI-0) - 1920x1080 @60.000 Hz, x1.0 [30 DPI]
12:53:25.698 [i]: Screens: primary screen is 0
12:53:25.714 [i]: Main: preparing video profiles...
12:53:25.752 [i]: Main: preparing performance graphs...
12:53:25.788 [i]: Main: preparing mpv...
12:53:25.809 [i]: Main: preparing remote control...
12:53:25.809 [i]: RemoteControl: started
12:53:25.810 [i]: Main: preparing main menu...
12:53:25.827 [i]: Main: loading extensions...
12:53:25.831 [i]: Extensions: found svplight 2.0.0.116 ...
12:53:25.832 [i]: Settings: loading leds.cfg OK
12:53:25.892 [i]: Main: initialization completed in 301 ms
12:53:25.964 [i]: Updates: checking now...
12:53:28.754 [i]: Performance: quick estimation = 674 (previous value was 667)

Performance test:

12:53:48.890 [i]: Performance: motion vectors estimation = 5687
12:53:59.400 [i]: Performance: CPU-based frame rendering = 8474
12:54:10.332 [i]: Performance: GPU-based frame rendering [gpuID=11] = 22
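
To put a number on the gap between the two rendering scores, a small Python sketch like the following can pull them out of the log lines above. The regular expression and the `parse_scores` helper are only my own illustration of the log format as it appears here, not anything SVP provides.

```python
import re

# The three lines from the performance test above.
LOG = """\
12:53:48.890 [i]: Performance: motion vectors estimation = 5687
12:53:59.400 [i]: Performance: CPU-based frame rendering = 8474
12:54:10.332 [i]: Performance: GPU-based frame rendering [gpuID=11] = 22
"""

# Matches "Performance: <label> = <score>", dropping an optional "[gpuID=..]" tag.
PATTERN = re.compile(r"Performance: (.+?)(?: \[gpuID=\d+\])? = (\d+)\s*$")


def parse_scores(log_text):
    """Return a dict mapping each performance label to its integer score."""
    scores = {}
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            scores[match.group(1)] = int(match.group(2))
    return scores


scores = parse_scores(LOG)
cpu = scores["CPU-based frame rendering"]
gpu = scores["GPU-based frame rendering"]
ratio = cpu / gpu
print(f"CPU/GPU rendering score ratio: {ratio:.0f}x")
```

With the values from this run, the CPU-based score is several hundred times the GPU-based one, which fits the stuttering I see as soon as GPU acceleration is enabled.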

Disabled GPU acceleration:

 (+) Video --vid=1 (*) (h264 1280x720 25.000fps)
 (+) Audio --aid=1 --alang=ger (*) (dts 6ch 48000Hz)
     Audio --aid=2 --alang=eng (ac3 6ch 48000Hz)
13:04:51.297 [i]: VideoPlayer: mpv connected, waiting for the video info...
AO: [pulse] 48000Hz 5.1(side) 6ch float
VO: [gpu] 1280x720 yuv420p
13:04:53.757 [i]: VideoPlayer: mpv 0.28.0-624-g861c10268d
13:04:54.386 [i]: Media: video 1280x720 [PAR 1.000] at 25.000 fps [constant]
13:04:54.386 [i]: Media: codec type is AVC, YUV/4:2:0/8 bits
13:04:54.711 [i]: Playback: starting up...
13:04:54.738 [i]: Playback [2e5b9ea1]: resulting video frame 1920x1080 [1280x720 -> scaled -> 1920x1080]
13:04:54.738 [i]: Playback [2e5b9ea1]: 1 acceptible profiles, best is 'Automatisch' [0]
13:04:54.743 [i]: Playback [2e5b9ea1]: enabled while video is playing
13:04:54.744 [i]: Profile: using auto values [1]
13:04:54.785 [i]: Playback [2e5b9ea1]: playing at 60 [25 *12/5] 
13:04:55.010 [W]: Lights: attempt to turn on while LED hardware isn't connected
VO: [gpu] 1920x1080 yuv420p

Enabled GPU acceleration:

(+) Video --vid=1 (*) (h264 1280x720 25.000fps)
 (+) Audio --aid=1 --alang=ger (*) (dts 6ch 48000Hz)
     Audio --aid=2 --alang=eng (ac3 6ch 48000Hz)
13:07:17.664 [i]: VideoPlayer: mpv connected, waiting for the video info...
AO: [pulse] 48000Hz 5.1(side) 6ch float
VO: [gpu] 1280x720 yuv420p
13:07:18.932 [i]: VideoPlayer: mpv 0.28.0-624-g861c10268d
13:07:19.273 [i]: Media: video 1280x720 [PAR 1.000] at 25.000 fps [constant]
13:07:19.273 [i]: Media: codec type is AVC, YUV/4:2:0/8 bits
13:07:19.298 [i]: Playback: starting up...
13:07:19.306 [i]: Playback [2e5b9ea1]: resulting video frame 1920x1080 [1280x720 -> scaled -> 1920x1080]
13:07:19.306 [i]: Playback [2e5b9ea1]: 1 acceptible profiles, best is 'Automatisch' [0]
13:07:19.307 [i]: Playback [2e5b9ea1]: enabled while video is playing
13:07:19.308 [i]: Profile: using auto values [1]
13:07:19.322 [i]: Playback [2e5b9ea1]: playing at 60 [25 *12/5] 
13:07:19.509 [W]: Lights: attempt to turn on while LED hardware isn't connected
VO: [gpu] 1920x1080 yuv420p

Audio/Video desynchronisation detected! Possible reasons include too slow
hardware, temporary CPU spikes, broken drivers, and broken files. Audio
position will not match to the video (see A-V status field).

opencl.log:

Number of platforms                               1
  Platform Name                                   NVIDIA CUDA
  Platform Vendor                                 NVIDIA Corporation
  Platform Version                                OpenCL 1.2 CUDA 9.1.84
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
  Platform Extensions function suffix             NV

  Platform Name                                   NVIDIA CUDA
Number of devices                                 1
  Device Name                                     GeForce GT 430
  Device Vendor                                   NVIDIA Corporation
  Device Vendor ID                                0x10de
  Device Version                                  OpenCL 1.1 CUDA
  Driver Version                                  390.67
  Device OpenCL C Version                         OpenCL C 1.1 
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Device Topology (NV)                            PCI-E, 01:00.0
  Max compute units                               2
  Max clock frequency                             1400MHz
  Compute Capability (NV)                         2.1
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x64
  Max work group size                             1024
  Preferred work group size multiple              32
  Warp size (NV)                                  32
  Preferred / native vector sizes                 
    char                                                 1 / 1       
    short                                                1 / 1       
    int                                                  1 / 1       
    long                                                 1 / 1       
    half                                                 0 / 0        (n/a)
    float                                                1 / 1       
    double                                               1 / 1        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    64, Little-Endian
  Global memory size                              1011351552 (964.5MiB)
  Error Correction support                        No
  Max memory allocation                           252837888 (241.1MiB)
  Unified memory for Host and Device              No
  Integrated memory (NV)                          No
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       4096 bits (512 bytes)
  Global Memory cache type                        Read/Write
  Global Memory cache size                        32768
  Global Memory cache line                        128 bytes
  Image support                                   Yes
    Max number of samplers per kernel             16
    Max 2D image size                             16384x16384 pixels
    Max 3D image size                             2048x2048x2048 pixels
    Max number of read image args                 128
    Max number of write image args                8
  Local memory type                               Local
  Local memory size                               49152 (48KiB)
  Registers per block (NV)                        32768
  Max constant buffer size                        65536 (64KiB)
  Max number of constant args                     9
  Max size of kernel argument                     4352 (4.25KiB)
  Queue properties                                
    Out-of-order execution                        Yes
    Profiling                                     Yes
  Profiling timer resolution                      1000ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
    Kernel execution timeout (NV)                 Yes
  Concurrent copy and kernel execution (NV)       Yes
    Number of async copy engines                  1
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  No platform
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No platform
  clCreateContext(NULL, ...) [default]            No platform
  clCreateContext(NULL, ...) [other]              Success [NV]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No platform

Is something wrong with my system, or is the information about the GT430 just out of date and the card too slow for SVP 4.x?
I'm absolutely confused about the CPU/GPU utilization, though. Why is GPU utilization higher if I disable acceleration, and why is it only around 10% if acceleration is enabled (no wonder it lags/stutters/desyncs like hell)?

Best regards