Evaluation results (Tests, metrics)
Patrick Hammer edited this page May 28, 2020 · 15 revisions
Current state, master:
tc@box:~/OpenNARS-for-Applications$ python3 evaluation.py
<<NAR Follow test successful goods=506 bads=6 ratio=0.988281
System tests successful!
Now running Q&A experiments:
Q&A metrics for test ./examples/nal/symmetry.nal
Average answer time = 8125.0
Average answer confidence = 0.87168
Combined loss = 1042.6
Q&A metrics for test ./examples/nal/school.nal
Average answer time = 140.5
Average answer confidence = 0.22967900000000002
Combined loss = 108.2301005
Q&A stress test results for test ./examples/nal/example1.nal
Total questions = 20.0
Correctly answered ones = 16.0
Answer ratio = 0.8
Q&A metrics for test ./examples/nal/asthma.nal
Average answer time = 242.75
Average answer confidence = 0.7416662500000001
Combined loss = 62.71051781249998
Narsese integration tests successful!
Q&A metrics for test ./examples/english/story3.english
Average answer time = 82.5
Average answer confidence = 0.5894055
Combined loss = 33.87404625
Q&A metrics for test ./examples/english/story2.english
Average answer time = 515.0
Average answer confidence = 0.596314
Combined loss = 207.89829
Q&A metrics for test ./examples/english/story1.english
Average answer time = 38.5
Average answer confidence = 0.597363
Combined loss = 15.5015245
English integration tests successful!
Q&A metrics global
Average answer time = 819.1538461538462
Average answer confidence = 0.6049129230769231
Combined loss = 323.6370986272189
Q&A answer rate global
Total questions = 51.0
Correctly answered ones = 47.0
Answer ratio = 0.9215686274509803
Now running procedure learning examples for 10K iterations each:
Pong metrics: Hits=434 misses=134 ratio=0.764085 time=29232
Pong2 metrics: Hits=302 misses=43 ratio=0.875362 time=15281
Alien metrics: shots=1702 hits=1668 ratio=0.980024 time=16508
Cartpole metrics: successes=9478.000000, failures=558.000000, ratio=0.944400
Robot metrics: time=650 moves=157 move_success_ratio=0.241538 eaten=22 time=2018
Procedure learning metrics done
Note: tests that pass but produce no metrics are not printed; if one of them failed, it would appear in the output.
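The printed figures appear to follow two simple relations: each "Combined loss" matches the average answer time multiplied by (1 - average answer confidence), and each ratio matches successes divided by total attempts. This is inferred from the numbers above, not taken from the source of evaluation.py, so treat it as a sanity check rather than a description of the script. The function names `combined_loss` and `fraction` are hypothetical helpers introduced only for this sketch.

```python
import math

def combined_loss(avg_time, avg_confidence):
    # Assumed relation, inferred from the printed output (not from evaluation.py):
    # slower answers and less confident answers both increase the loss.
    return avg_time * (1.0 - avg_confidence)

def fraction(part, total):
    # Share of successful attempts among all attempts.
    return part / total

# (average answer time, average answer confidence, printed combined loss)
qa_rows = {
    "symmetry.nal":   (8125.0, 0.87168, 1042.6),
    "school.nal":     (140.5, 0.22967900000000002, 108.2301005),
    "asthma.nal":     (242.75, 0.7416662500000001, 62.71051781249998),
    "story3.english": (82.5, 0.5894055, 33.87404625),
    "story2.english": (515.0, 0.596314, 207.89829),
    "story1.english": (38.5, 0.597363, 15.5015245),
    "global":         (819.1538461538462, 0.6049129230769231, 323.6370986272189),
}

# (successes, total attempts, printed ratio)
ratio_rows = {
    "NAR Follow":          (506, 506 + 6, 0.988281),
    "example1.nal stress": (16, 20, 0.8),
    "Q&A global":          (47, 51, 0.9215686274509803),
    "Pong":                (434, 434 + 134, 0.764085),
    "Pong2":               (302, 302 + 43, 0.875362),
    "Alien":               (1668, 1702, 0.980024),      # hits / shots
    "Cartpole":            (9478, 9478 + 558, 0.944400),
    "Robot moves":         (157, 650, 0.241538),        # moves / time
}

loss_ok = all(
    math.isclose(combined_loss(t, c), printed, rel_tol=1e-6)
    for t, c, printed in qa_rows.values()
)
ratio_ok = all(
    math.isclose(fraction(p, n), printed, abs_tol=5e-6)
    for p, n, printed in ratio_rows.values()
)
print("combined loss consistent:", loss_ok)
print("ratios consistent:", ratio_ok)
```

Under this reading, a lower combined loss is better: symmetry.nal scores worst because its single slow answer dominates, even though its confidence is the highest of all tests.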