Commit 6f24482

Add grape-graphx performance.md
1 parent 17819c8 commit 6f24482

3 files changed, +80 -1 lines changed

@@ -0,0 +1,74 @@
# Performance

We test GraphScope for GraphX in end-to-end scenarios to measure how much it speeds up graph computing on Spark GraphX. The tests cover:
- Graph loading: reading graph data from the file system and building the in-memory graph
- RDD Op: transforming the graph using RDD-defined operators
- Pregel computing: running graph algorithms based on GraphX Pregel, such as SSSP, PageRank, and CC
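
To make the "Pregel computing" stage concrete, here is a minimal pure-Python sketch of a vertex-centric SSSP loop in the Pregel style (send messages along edges, merge them per destination, then update vertex values). It is an illustration only: `pregel_sssp` and `toy_edges` are hypothetical names, the graph is a toy example, and nothing here is the GraphX or GraphScope API.

```python
import math


def pregel_sssp(edges, source):
    """Single-source shortest paths via Pregel-style supersteps.

    edges: list of (src, dst, weight) tuples of a directed graph
           (weights are assumed to be non-negative).
    Returns a dict mapping each vertex to its distance from `source`.
    """
    vertices = {v for s, d, _ in edges for v in (s, d)} | {source}
    dist = {v: math.inf for v in vertices}
    dist[source] = 0.0
    active = {source}  # vertices whose value changed in the previous superstep

    while active:
        # Send phase: each active vertex offers a candidate distance
        # to its out-neighbors.
        inbox = {}
        for s, d, w in edges:
            if s in active:
                cand = dist[s] + w
                # Merge phase: keep only the smallest message per destination.
                if cand < inbox.get(d, math.inf):
                    inbox[d] = cand

        # Vertex program: adopt a message only if it improves the distance.
        active = set()
        for v, cand in inbox.items():
            if cand < dist[v]:
                dist[v] = cand
                active.add(v)
    return dist


# Hypothetical toy graph, not one of the benchmark datasets.
toy_edges = [(1, 2, 1.0), (2, 3, 2.0), (1, 3, 5.0), (3, 4, 1.0)]
distances = pregel_sssp(toy_edges, source=1)
print(distances)
```

The loop stops once no vertex receives an improving message, which is the usual Pregel termination condition.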

## Settings

| Dataset | Number of vertices | Number of edges | Avg. degree |
|:--------------: |:---------------: |:-------------: |:-----------: |
| datagen-9_0-fb | 12,857,672 | 1,049,527,226 | 81.6 |
| com-friendster | 65,608,366 | 1,806,067,135 | 27.5 |
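
The average-degree column appears to be the number of edges divided by the number of vertices, rounded to one decimal place. The short check below reproduces it under that assumption; the `datasets` dictionary is just a hypothetical container for the figures in the table.

```python
# (vertices, edges) per dataset, copied from the table above.
datasets = {
    "datagen-9_0-fb": (12_857_672, 1_049_527_226),
    "com-friendster": (65_608_366, 1_806_067_135),
}

# Assumption: avg degree = edges / vertices, rounded to one decimal place.
avg_degree = {
    name: round(edges / vertices, 1)
    for name, (vertices, edges) in datasets.items()
}
print(avg_degree)
```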

The following tests are run on a 4-node cluster, each node with 48 cores and 96 CPUs.

## End-to-End time

Because both systems use ORC-format files as input, the time for loading the data and converting it to ```RDD[(Long, Long)]``` is the same for GraphScope and GraphX.

### On com-friendster
#### 256 partitions

| Algorithm | GS Graph Loading | GraphX Graph Loading | GS Query Time | GraphX Query Time | GS E2E Time | GraphX E2E Time | Performance Gain Query | Performance Gain E2E |
|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------: |
| PageRank | 108s | 106s | 152s | 1129s | 260s | 1235s | 7.4x | 4.8x |
| SSSP | 108s | 106s | 31s | 164s | 139s | 270s | 5.3x | 1.9x |
| CC | 108s | 106s | 58s | 228s | 166s | 334s | 3.9x | 2x |
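
The derived columns in the table above appear to follow two rules: E2E time is graph loading time plus query time, and each performance gain is the GraphX time divided by the GraphScope time. The sketch below recomputes them for the 256-partition com-friendster rows under that assumption; `derive` is a hypothetical helper, not part of either system.

```python
def derive(gs_load, gx_load, gs_query, gx_query):
    """Recompute E2E times and gains from the measured times (seconds)."""
    gs_e2e = gs_load + gs_query  # assumed: E2E = loading + query
    gx_e2e = gx_load + gx_query
    return {
        "gs_e2e": gs_e2e,
        "gx_e2e": gx_e2e,
        # assumed: gain = GraphX time / GraphScope time
        "query_gain": round(gx_query / gs_query, 1),
        "e2e_gain": round(gx_e2e / gs_e2e, 1),
    }


# Measured times copied from the 256-partition com-friendster table.
rows = {
    "PageRank": derive(108, 106, 152, 1129),
    "SSSP": derive(108, 106, 31, 164),
    "CC": derive(108, 106, 58, 228),
}
for name, row in rows.items():
    print(name, row)
```

The same helper can be applied to the other tables to check that the reported E2E times and gains are consistent with the measured loading and query times.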

#### 320 partitions

| Algorithm | GS Graph Loading | GraphX Graph Loading | GS Query Time | GraphX Query Time | GS E2E Time | GraphX E2E Time | Performance Gain Query | Performance Gain E2E |
|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------: |
| PageRank | 100s | 100s | 168s | 1089s | 268s | 1189s | 6.5x | 4.4x |
| SSSP | 100s | 100s | 31s | 156s | 131s | 256s | 5x | 2x |
| CC | 100s | 100s | 62s | 219s | 162s | 319s | 3.5x | 2x |

#### 384 partitions

| Algorithm | GS Graph Loading | GraphX Graph Loading | GS Query Time | GraphX Query Time | GS E2E Time | GraphX E2E Time | Performance Gain Query | Performance Gain E2E |
|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------: |
| PageRank | 99s | 98s | 154s | 1028s | 253s | 1126s | 6.7x | 4.5x |
| SSSP | 99s | 98s | 33s | 163s | 132s | 261s | 5x | 2x |
| CC | 99s | 98s | 60s | 223s | 159s | 321s | 3.7x | 2x |

### On datagen-9_0-fb

#### 256 partitions

| Algorithm | GS Graph Loading | GraphX Graph Loading | GS Query Time | GraphX Query Time | GS E2E Time | GraphX E2E Time | Performance Gain Query | Performance Gain E2E |
|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------: |
| PageRank | 70s | 84s | 90s | 430s | 160s | 514s | 4.8x | 3.2x |
| SSSP | 70s | 84s | 14s | 45s | 84s | 129s | 3x | 1.5x |
| CC | 70s | 84s | 36s | 74s | 106s | 158s | 2x | 1.5x |

#### 320 partitions

| Algorithm | GS Graph Loading | GraphX Graph Loading | GS Query Time | GraphX Query Time | GS E2E Time | GraphX E2E Time | Performance Gain Query | Performance Gain E2E |
|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------: |
| PageRank | 68s | 76s | 87s | 406s | 155s | 482s | 4.7x | 3.1x |
| SSSP | 68s | 76s | 13s | 40s | 81s | 116s | 3x | 1.4x |
| CC | 68s | 76s | 30s | 53s | 98s | 129s | 1.8x | 1.3x |

#### 384 partitions

| Algorithm | GS Graph Loading | GraphX Graph Loading | GS Query Time | GraphX Query Time | GS E2E Time | GraphX E2E Time | Performance Gain Query | Performance Gain E2E |
|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------: |
| PageRank | 68s | 73s | 82s | 395s | 150s | 468s | 4.8x | 3x |
| SSSP | 68s | 73s | 13s | 40s | 81s | 113s | 3x | 1.4x |
| CC | 68s | 73s | 30s | 50s | 98s | 143s | 1.7x | 1.4x |

analytical_engine/java/performance.md

+5
@@ -136,3 +136,8 @@ pr_delta set to 0.85, running for 50 rounds.
| C++ time | 24.15 | 12.46 | 6.59 | 3.59 | 2.11 | 1.56 | 1.53 |
| Java time | 80.77 | 40.94 | 20.87 | 14.55 | 8.14 | 5.13 | 5.15 |
| Java(+LLVM4JNI) time | 49.80 | 24.15 | 10.54 | 6.63 | 3.83 | 2.95 | 3.42 |


## GraphScope-GraphX Integration

We also evaluate the performance of `grape-graphx`, the integration of GraphScope with Spark GraphX. See [grape-graphx performance](grape-graphx/performance.md).

learning_engine/graph-learn

Submodule graph-learn updated 221 files
