Performance

Martin Pulec edited this page Sep 26, 2023 · 16 revisions

UltraGrid Performance

This page contains the results of UltraGrid performance tests.

End-to-End Latency

The following tables compare the performance of individual cards. Results are given as end-to-end frame delay, i.e. the number of frames sent before the receiver outputs the original frame.
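To relate a frame delay to wall-clock time, the value can be multiplied by the duration of one frame. The sketch below is illustrative only: the helper name `frames_to_ms` is hypothetical, and it assumes that one frame lasts exactly 1/fps seconds at the 29.97 fps used in the setups on this page.

```python
def frames_to_ms(frames: float, fps: float = 29.97) -> float:
    """Convert a delay expressed in frames to milliseconds.

    Assumes every frame lasts exactly 1/fps seconds.
    """
    return frames * 1000.0 / fps


# Example delays taken from the tables on this page.
for delay in (2.5, 3.75, 5):
    print(f"{delay} frames = {frames_to_ms(delay):.1f} ms")
```

With these assumptions, a delay of 3.75 frames corresponds to roughly 125 ms.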

Linux

Setup:

  • Testing machine: hd2
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • For the test, we used 1080i HD video at 29.97 fps.

| Card | Send | Receive |
|---|---|---|
| DVS Centaurus II | 5 | 3 |
| BlackMagic DeckLink HD Extreme | 3.75 | 3.75 |
| BlackMagic DeckLink 4K Extreme | 3.5 | 3 |
| BlackMagic Decklink Quad | 4.5 | 4 |
| BlackMagic Decklink Intensity PRO | 4.5 | 4 |
| BlackMagic Decklink Intensity | 4.5 | 4.5 |
| Deltacast 3G | 4.5 | 3 |
| OpenGL | - | 2.5 |

macOS

Setup:

  • Testing machine: hd7
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • For the test, we used 1080i HD video at 29.97 fps.

| Card | Send | Receive |
|---|---|---|
| AJA Kona 3G | 4 | 3.5 |
| DeckLink HD Pro (Quicktime) | 4.5 | 5.5 |
| DeckLink HD Pro (native API) | 4.5 | 4 |
| OpenGL (with VSync) | - | 2.25 |
| OpenGL (without VSync) | - | 1.75 |

Windows

Setup:

  • Testing machine: hd7
  • Reference machine: hd4 with BlackMagic Decklink HD Extreme
  • For the test, we used 1080i HD video at 29.97 fps.

| Card | Send | Receive |
|---|---|---|
| BlackMagic DeckLink HD Extreme | 4.5 | 3.5 |
| BlackMagic Decklink Quad | 4.5 | 4 |
| BlackMagic Decklink Intensity PRO | 4.5 | 3.5 |
| BlackMagic Decklink Intensity | 4.5 | 4 |
| Deltacast 3G | 4 | 3 |

Compression

Performance

This section shows the encoding performance of individual compression modules. For the measurements, machine hd2 running Ubuntu served as the receiver and hd7 as the sender, with Decklink HD Extreme cards on both sides. We fed in 4K video at an increasing frame rate to find the highest rate that still gave smooth playback.

| Encoder (setting) | Resolution | FPS | Content | HW | Version |
|---|---|---|---|---|---|
| cineform | 2160p | 65 | NZ2500 | i7-4960X | 1.6 |
| gpujpeg (90) | 2160p | 157 | NZ2500 | 5960X+9 | 1.3 |
| libx264 | 2160p | 43 | NZ | i7-4960X | |
| libvpx | 2160p | 28 | NZ | i7-4960X | |
| mjpeg | 2160p | 65 | NZ | i7-4960X | |
| cineform - R10k | 2160p | 42 | NZ2500 | i7-4960X | 1.6 |
| cineform - R12L | 2160p | 25 | NZ2500 | i7-4960X | 1.6 |
| gpujpeg (90) | 2160p | 150 | NZ2500 | kypo | 1.3 |
| gpujpeg (90) | 2160p | 137 | NZ2500 | hd12 | 1.3 |
| gpujpeg (90) | 2160p | 201 | NZ2500 | 5960X+B | 1.3 |
| gpujpeg (90:8) | 2160p | 207 | NZ2500 | 5960X+B | 1.3 |
| gpujpeg (90:16) | 2160p | 200 | NZ2500 | 5960X+B | 1.3 |
| gpujpeg (90) | 2160p | 178+146 | NZ2500 | 5960X+B9 | 1.3 |
| gpujpeg (90) | 2160p | 92 | NZ2500 | bunny | 1.3 |
| gpujpeg (90) | 2160p | 96+96 | NZ2500 | bunnyX2 | 1.3 |
| gpujpeg (90) | 2160p | 130 | NZ2500 | hdd1 | 1.3 |
| libx264 | 2160p | 32 | NZ | i7-980X | |
| libx264 | 1080p | 4 | NZ | VIA Nano | 1.5d |
| mjpeg | 1080p | 10 | NZ | VIA Nano | 1.5d |

Legend:

  • kypo
    i7-4770S, NV GTX 980, 4x8G DDR3@1600 (kypowall0)
  • hd12
    i7-4960X, 32 GB 1866 MHz DDR3, NV GTX 960
  • 5960X+9
    i7-5960X, DDR4@2166, NV GTX 960
  • 5960X+B
    i7-5960X, DDR4@2166, GeForce GTX Titan Black
  • 5960X+B9
    i7-5960X, DDR4@2166, GeForce GTX Titan Black + GTX 960
  • bunny
    2x Xeon E5-2660 v2, [email protected], GeForce GTX Titan
  • bunnyX2
    2x Xeon E5-2660 v2, [email protected], 2 x GeForce GTX Titan
  • hdd1
    i7-4930K, 780Ti, 2x4GB DDR3@1333
  • VIA Nano
    Via Nano U2250, 1 GB RAM
  • 1.3
    v1.3-140-g08dba83
  • 1.5d
    1.5 (rev 3fa1a0d7)
  • NZ
    New Zealand UYVY
  • NZ2500
    New Zealand frame 2500 UYVY
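One way to read the FPS column is as an average time budget per frame: an encoder that sustains N frames per second must, on average, finish one frame every 1/N seconds. The sketch below illustrates that arithmetic with a few values from the table. The helper name `frame_budget_ms` is hypothetical, and the result is only an average. With pipelined or parallel encoding, the latency of a single frame may be longer than this budget.

```python
def frame_budget_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a sustained frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# A few 2160p results from the table above (encoder, hardware label, FPS).
results = [
    ("cineform", "i7-4960X", 65),
    ("gpujpeg (90)", "5960X+B", 201),
    ("libx264", "i7-4960X", 43),
    ("libvpx", "i7-4960X", 28),
]

for encoder, hw, fps in results:
    print(f"{encoder} on {hw}: {fps} fps, about {frame_budget_ms(fps):.2f} ms per frame")
```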

Latency

We also measured the latency added by the compression modules. The tests used 1080p video at 30 fps, with compression on hd7 running Ubuntu and decompression on hd2.

| Module | End-to-end latency (frames) |
|---|---|
| uncompressed | 3.75 |
| cuda_dxt | 3.75 (+0) |
| RTDXT:DXT1 | 6 (+2) |
| RTDXT:DXT5 | 5.5 (+1.75) |
| JPEG:90:0 | 4 (+0.25) |
| JPEG:97:0 | 4 (+0.25) |
| H.264 | 5 (+1.25) |

Bandwidth

The table below shows the measured bandwidth, including overhead, with 9000-byte Ethernet frames. The uncompressed signal was 8-bit YUV 4:2:2.

module 1080i@30 2k@30 4k (4096 × 2160)@25fps
uncompressed 980 Mbps 1504 Mbps 3489 Mbps
DXT1 245 Mbps 376 Mbps 870 Mbps
DXT5 YCoCg 489 Mbps 752 Mbps
JPEG:90 80 Mbps 85 Mbps 160 Mbps
H.264 22 Mbps 22 Mbps 60 Mbps
cineform:quality=4 (default) 580 Mbps
cineform:quality=1 300 Mbps
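The measured figures can be turned into approximate compression ratios by dividing the uncompressed bandwidth by the compressed one. The sketch below does this for the 1080i@30 column only. The helper name `compression_ratio` is hypothetical, and because the measurements include overhead, the ratios are estimates rather than exact codec ratios.

```python
def compression_ratio(uncompressed_mbps: float, compressed_mbps: float) -> float:
    """Ratio of uncompressed to compressed bandwidth (higher means stronger compression)."""
    return uncompressed_mbps / compressed_mbps


# 1080i@30 values from the table above, in Mbps.
UNCOMPRESSED_1080I = 980
measured_1080i = {
    "DXT1": 245,
    "DXT5 YCoCg": 489,
    "JPEG:90": 80,
    "H.264": 22,
}

for module, mbps in measured_1080i.items():
    ratio = compression_ratio(UNCOMPRESSED_1080I, mbps)
    print(f"{module}: about {ratio:.1f}:1")
```

For example, DXT1 at 245 Mbps against 980 Mbps uncompressed gives a ratio of 4:1.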