Evaluation results (Tests, metrics)
Patrick Hammer edited this page Mar 20, 2020 · 15 revisions
Current state, master:
<<NAR Follow test successful goods=502 bads=2 ratio=0.996032
System tests successful!
Now running Q&A experiments:
Q&A metrics for test ./examples/nal/school.nal
Average answer time = 140.5
Average answer confidence = 0.22967900000000002
Combined loss = 108.2301005
Q&A metrics for test ./examples/nal/symmetry.nal
Average answer time = 5125.0
Average answer confidence = 0.866876
Combined loss = 682.2605000000001
Q&A stress test results for test ./examples/nal/example1.nal
Total questions = 20.0
Correctly answered ones = 16.0
Answer ratio = 0.8
Narsese integration tests successful!
Q&A metrics for test ./examples/english/story3.english
Average answer time = 57.5
Average answer confidence = 0.5894055
Combined loss = 23.60918375
Q&A metrics for test ./examples/english/story2.english
Average answer time = 534.5
Average answer confidence = 0.6593835
Combined loss = 182.05951925
Q&A metrics for test ./examples/english/story1.english
Average answer time = 27.5
Average answer confidence = 0.597363
Combined loss = 11.0725175
English integration tests successful!
Q&A metrics global
Average answer time = 738.3333333333334
Average answer confidence = 0.5576153333333334
Combined loss = 326.62734555555556
Q&A answer rate global
Total questions = 47.0
Correctly answered ones = 43.0
Answer ratio = 0.9148936170212766
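The Q&A figures above can be cross-checked with a few lines of arithmetic. Every reported "Combined loss" is consistent with average answer time multiplied by (1 - average answer confidence), and every "Answer ratio" with correct answers divided by total questions. The sketch below only verifies that relationship against the printed numbers. The function names are hypothetical, and the formula is inferred from the output rather than taken from the evaluation script itself.

```python
# Cross-check of the Q&A metrics printed above.
# Assumption: combined loss = avg_time * (1 - avg_confidence); this is
# inferred from the reported numbers, not from the evaluation script.

def combined_loss(avg_time, avg_confidence):
    """Loss that grows with answer time and shrinks with confidence."""
    return avg_time * (1.0 - avg_confidence)

def answer_ratio(correct, total):
    """Fraction of questions answered correctly."""
    return correct / total

# (average answer time, average answer confidence, reported combined loss)
reported = {
    "school.nal": (140.5, 0.22967900000000002, 108.2301005),
    "symmetry.nal": (5125.0, 0.866876, 682.2605000000001),
    "story3.english": (57.5, 0.5894055, 23.60918375),
    "story2.english": (534.5, 0.6593835, 182.05951925),
    "story1.english": (27.5, 0.597363, 11.0725175),
    "global": (738.3333333333334, 0.5576153333333334, 326.62734555555556),
}

for name, (avg_time, avg_conf, loss) in reported.items():
    computed = combined_loss(avg_time, avg_conf)
    status = "ok" if abs(computed - loss) < 1e-6 else "MISMATCH"
    print(f"{name}: computed={computed:.6f} reported={loss:.6f} {status}")

print("example1.nal answer ratio:", answer_ratio(16.0, 20.0))
print("global answer ratio:", answer_ratio(43.0, 47.0))
```

A lower combined loss is better under this reading: it rewards answers that arrive quickly and with high confidence.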
Now running procedure learning examples for 10K iterations each:
Pong metrics: Hits=489 misses=95 ratio=0.837329 time=29547
Pong2 metrics: Hits=326 misses=19 ratio=0.944928 time=15172
Alien metrics: shots=1812 hits=1780 ratio=0.982340 time=16766
Procedure learning metrics done
Note: tests without metrics are not printed when they succeed; they appear in the output only if they fail.
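The procedure learning ratios can be reproduced from the counts on the Pong, Pong2 and Alien lines. The printed values are consistent with hits / (hits + misses) for Pong and Pong2, and hits / shots for Alien. The sketch below checks this; the helper name is hypothetical and the formulas are inferred from the output, not confirmed against the source code.

```python
# Cross-check of the procedure learning ratios printed above.
# Assumption: ratio = successes / attempts, where attempts is
# hits + misses for Pong/Pong2 and shots for Alien (inferred from output).

def success_ratio(successes, attempts):
    """Fraction of attempts that succeeded."""
    return successes / attempts

# (successes, attempts, reported ratio)
procedure_results = {
    "Pong": (489, 489 + 95, 0.837329),
    "Pong2": (326, 326 + 19, 0.944928),
    "Alien": (1780, 1812, 0.982340),
}

for name, (successes, attempts, ratio) in procedure_results.items():
    computed = success_ratio(successes, attempts)
    # Reported ratios are printed with six decimals, so allow rounding error.
    status = "ok" if abs(computed - ratio) < 1e-6 else "MISMATCH"
    print(f"{name}: computed={computed:.6f} reported={ratio:.6f} {status}")
```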