
Performance

Martin Pulec edited this page Nov 6, 2019 · 16 revisions

UltraGrid Performance

This page contains the results of UltraGrid performance tests.

End-to-End Latency

The following tables compare the end-to-end latency of individual cards. The results are given as end-to-end frame delay, i.e. the number of frames sent before the receiver outputs the original frame.
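Since the delays are expressed in frames, it can help to convert them to wall-clock time. The following is a minimal sketch, assuming that "29.97 fps" denotes the NTSC rate of 30000/1001 frames per second; the function name `frames_to_ms` is hypothetical and not part of UltraGrid.

```python
# Assumption: "29.97 fps" is the NTSC frame rate 30000/1001.
FRAME_RATE = 30000 / 1001


def frames_to_ms(frames: float, frame_rate: float = FRAME_RATE) -> float:
    """Convert a delay given in frames to milliseconds."""
    return frames * 1000.0 / frame_rate


# Example: convert a few frame delays taken from the tables on this page.
for delay in (2.5, 3.75, 5):
    print(f"{delay} frames = {frames_to_ms(delay):.1f} ms")
```

Under this assumption one frame lasts about 33.4 ms, so a delay of 3.75 frames corresponds to about 125 ms.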

Linux

Setup:

  • Testing machine: hd2
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • Test signal: 1080i HD video at 29.97 fps

| Card | Send (frames) | Receive (frames) |
|---|---|---|
| DVS Centaurus II | 5 | 3 |
| BlackMagic DeckLink HD Extreme | 3.75 | 3.75 |
| BlackMagic DeckLink 4K Extreme | 3.5 | 3 |
| BlackMagic DeckLink Quad | 4.5 | 4 |
| BlackMagic DeckLink Intensity PRO | 4.5 | 4 |
| BlackMagic DeckLink Intensity | 4.5 | 4.5 |
| Deltacast 3G | 4.5 | 3 |
| OpenGL | - | 2.5 |

MacOS X

Setup:

  • Testing machine: hd7
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • Test signal: 1080i HD video at 29.97 fps

| Card | Send (frames) | Receive (frames) |
|---|---|---|
| AJA Kona 3G | 4 | 3.5 |
| DeckLink HD Pro (Quicktime) | 4.5 | 5.5 |
| DeckLink HD Pro (native API) | 4.5 | 4 |
| OpenGL (with VSync) | - | 2.25 |
| OpenGL (without VSync) | - | 1.75 |

Windows

Setup:

  • Testing machine: hd7
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • Test signal: 1080i HD video at 29.97 fps

| Card | Send (frames) | Receive (frames) |
|---|---|---|
| BlackMagic DeckLink HD Extreme | 4.5 | 3.5 |
| BlackMagic DeckLink Quad | 4.5 | 4 |
| BlackMagic DeckLink Intensity PRO | 4.5 | 3.5 |
| BlackMagic DeckLink Intensity | 4.5 | 4 |
| Deltacast 3G | 4 | 3 |

Compression

Performance

The table below shows the encoding performance of individual compression modules. For the measurements, hd7 acted as sender and hd2 (running Ubuntu) as receiver, with a DeckLink HD Extreme as the grabbing card on both sides. We played a 4K video at increasing frame rates to find the highest rate at which playback was still fluent.
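The ramp-up procedure described above can be sketched as a simple search loop. This is only an illustration: the callback `is_fluent` is hypothetical and stands in for whatever check (visual inspection, dropped-frame counters) decides that playback at a given frame rate is still fluent; the start, limit, and step values are assumptions as well.

```python
from typing import Callable, Optional


def max_fluent_fps(is_fluent: Callable[[int], bool],
                   start: int = 1,
                   limit: int = 300,
                   step: int = 1) -> Optional[int]:
    """Return the highest tested frame rate for which is_fluent() holds.

    The rate is raised step by step; the search stops at the first rate
    that is no longer fluent or when the limit is exceeded.
    Returns None if even the starting rate is not fluent.
    """
    best = None
    fps = start
    while fps <= limit:
        if not is_fluent(fps):  # hypothetical check supplied by the caller
            break
        best = fps
        fps += step
    return best


# Stand-in check for illustration only: pretend the encoder keeps up to 43 fps.
print(max_fluent_fps(lambda fps: fps <= 43))
```

A linear ramp is used here because it mirrors the description; a binary search would need fewer runs if fluency can be assumed to degrade monotonically with the frame rate.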

| Encoder (setting) | Resolution | FPS | Content | HW | Version |
|---|---|---|---|---|---|
| libx264 | 2160p | 43 | NZ | H1 | |
| libx264 | 2160p | 32 | NZ | H2 | |
| libvpx | 2160p | 28 | NZ | H1 | |
| mjpeg | 2160p | 65 | NZ | H1 | |
| gpujpeg (90) | 2160p | 150 | NZ2500 | H3 | 1.3 |
| gpujpeg (90) | 2160p | 137 | NZ2500 | H4 | 1.3 |
| gpujpeg (90) | 2160p | 157 | NZ2500 | H5 | 1.3 |
| gpujpeg (90) | 2160p | 201 | NZ2500 | H6 | 1.3 |
| gpujpeg (90:8) | 2160p | 207 | NZ2500 | H6 | 1.3 |
| gpujpeg (90:16) | 2160p | 200 | NZ2500 | H6 | 1.3 |
| gpujpeg (90) | 2160p | 178+146 | NZ2500 | H7 | 1.3 |
| gpujpeg (90) | 2160p | 92 | NZ2500 | H8 | 1.3 |
| gpujpeg (90) | 2160p | 96+96 | NZ2500 | H9 | 1.3 |
| gpujpeg (90) | 2160p | 130 | NZ2500 | H10 | 1.3 |

Legend:

H1
Intel i7-4960X
H2
Intel i7-980X
H3
kypowall0 (i7-4770S, NV GTX 980, 4x8G DDR3@1600)
H4
hd12 (i7-4960X, 32 GB 1866 MHz DDR3, NV GTX 960)
H5
i7-5960X, DDR4@2166, NV GTX 960
H6
i7-5960X, DDR4@2166, GeForce GTX Titan Black
H7
i7-5960X, DDR4@2166, GeForce GTX Titan Black + NV 960
H8
bunny (2x Xeon E5-2660 v2, [email protected], 2x GeForce GTX Titan); utilization of both cards was around 50%, the same as in the previous case
H9
hdd1 (i7-4930K, 780Ti, 2x4GB DDR3@1333)
1.3
v1.3-140-g08dba83

Latency

We also measured the latency added by the compression modules. The tests used 1080p video at 30 fps, with compression on hd7 (running Ubuntu) and decompression on hd2.

| Module | End-to-end latency | Added latency |
|---|---|---|
| uncompressed | 3.75 | - |
| cuda_dxt | 3.75 | +0 |
| RTDXT:DXT1 | 6 | +2 |
| RTDXT:DXT5 | 5.5 | +1.75 |
| JPEG:90:0 | 4 | +0.25 |
| JPEG:97:0 | 4 | +0.25 |
| H.264 | 5 | +1.25 |
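The added latency is the difference between each module's end-to-end latency and the uncompressed baseline. A minimal sketch of that arithmetic, using the values from the table above (the dictionary and function names are illustrative only; the unit is presumably frames, as in the End-to-End Latency section, although this section does not state it):

```python
# End-to-end latency values copied from the table above.
LATENCY = {
    "uncompressed": 3.75,
    "cuda_dxt": 3.75,
    "RTDXT:DXT1": 6.0,
    "RTDXT:DXT5": 5.5,
    "JPEG:90:0": 4.0,
    "JPEG:97:0": 4.0,
    "H.264": 5.0,
}


def added_latency(table: dict, baseline: str = "uncompressed") -> dict:
    """Return each module's latency minus the baseline module's latency."""
    base = table[baseline]
    return {name: value - base
            for name, value in table.items()
            if name != baseline}


for name, delta in added_latency(LATENCY).items():
    print(f"{name}: +{delta:g}")
```

Computed this way, RTDXT:DXT1 comes out at +2.25, whereas the table lists +2; the source does not say which of the two figures was rounded.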

Bandwidth

The table below shows the measured bandwidth, including overhead, with 9000-byte Ethernet frames. The uncompressed signal was 8-bit YUV 4:2:2.

| Module | 1080i@30 | 2k@30 | 4k (4096 × 2160)@25fps |
|---|---|---|---|
| uncompressed | 980 Mbps | 1504 Mbps | 3489 Mbps |
| DXT1 | 245 Mbps | 376 Mbps | 870 Mbps |
| DXT5 YCoCg | 489 Mbps | 752 Mbps | - |
| JPEG:90 | 80 Mbps | 85 Mbps | 160 Mbps |
| H.264 | 22 Mbps | 22 Mbps | 60 Mbps |
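To compare the modules, the measured figures can be turned into compression ratios relative to the uncompressed stream. The sketch below only reuses the numbers from the table above; the names `BANDWIDTH_MBPS` and `compression_ratios` are illustrative, and the ratios include the network overhead contained in the measurements.

```python
# Measured bandwidth in Mbps, copied from the table above.
# DXT5 YCoCg has no 4k measurement, so that entry is left out.
BANDWIDTH_MBPS = {
    "uncompressed": {"1080i@30": 980, "2k@30": 1504, "4k@25": 3489},
    "DXT1": {"1080i@30": 245, "2k@30": 376, "4k@25": 870},
    "DXT5 YCoCg": {"1080i@30": 489, "2k@30": 752},
    "JPEG:90": {"1080i@30": 80, "2k@30": 85, "4k@25": 160},
    "H.264": {"1080i@30": 22, "2k@30": 22, "4k@25": 60},
}


def compression_ratios(table: dict, baseline: str = "uncompressed") -> dict:
    """Return baseline bandwidth divided by module bandwidth, per format."""
    base = table[baseline]
    return {
        module: {fmt: round(base[fmt] / mbps, 1) for fmt, mbps in row.items()}
        for module, row in table.items()
        if module != baseline
    }


for module, row in compression_ratios(BANDWIDTH_MBPS).items():
    print(module, row)
```

The DXT figures come out at about 4:1 for DXT1 and 2:1 for DXT5, which matches their fixed rates of 4 and 8 bits per pixel against 16 bits per pixel for 8-bit YUV 4:2:2. The JPEG and H.264 ratios vary with resolution because their output size depends on the content.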