Commit 573bab3

Fix #4659: Restructure doc tree with organized nested directories (#4667)

* Revamp doc tree into nested directories with MyST parser
* Segregate existing content for new doc tree structure
* Add FAQs and submission process
* Update single quotes to double for formatting

1 parent 6164439 commit 573bab3

File tree: 121 files changed, +1733 −124 lines

Note: this is a large commit, so only a subset of the 121 changed files is shown below.
Lines changed: 8 additions & 0 deletions

# Getting Started

```{toctree}
:maxdepth: 2

introduction
installation
setup/index
```

docs/source/intro.md renamed to docs/source/01-getting-started/introduction.md

Lines changed: 1 addition & 1 deletion

@@ -2,7 +2,7 @@

[EvalAI] aims to build a centralized platform to host, participate, and collaborate in Artificial Intelligence (AI) challenges organized around the globe and hopes to help in benchmarking progress in AI.

-<img src="_static/img/teaser.png">
+<img src="../_static/img/teaser.png">

## Features
Lines changed: 3 additions & 0 deletions

# Docker Setup

Guide to set up EvalAI using Docker.
Lines changed: 10 additions & 0 deletions

# Setup

```{toctree}
:maxdepth: 1

docker-setup
manual-setup
linux-setup
windows-setup
macos-setup
```
Lines changed: 3 additions & 0 deletions

# Linux Setup

Guide to set up EvalAI on Linux.
Lines changed: 3 additions & 0 deletions

# MacOS Setup

Guide to set up EvalAI on macOS.
Lines changed: 3 additions & 0 deletions

# Manual Setup

Guide to set up EvalAI manually.
Lines changed: 3 additions & 0 deletions

# Windows Setup

Guide to set up EvalAI on Windows.
Lines changed: 9 additions & 0 deletions

# For Challenge Hosts

```{toctree}
:maxdepth: 2

hosting-guide/index
configuration/index
evaluation/index
templates/index
```
Lines changed: 69 additions & 0 deletions

# Challenge Configuration

The following fields are required (and can be customized) in the [`challenge_config.yaml`](https://github.com/Cloud-CV/EvalAI-Starters/blob/master/challenge_config.yaml):

- **title**: Title of the challenge

- **short_description**: Short description of the challenge (preferably 140 characters max)

- **description**: Long description of the challenge (use a relative path for the HTML file, e.g. `templates/description.html`)

- **evaluation_details**: Evaluation details of the challenge (use a relative path for the HTML file, e.g. `templates/evaluation_details.html`)

- **terms_and_conditions**: Terms and conditions of the challenge (use a relative path for the HTML file, e.g. `templates/terms_and_conditions.html`)

- **image**: Logo of the challenge (use a relative path for the logo in the zip configuration, e.g. `images/logo/challenge_logo.jpg`). **Note**: The image must be in jpg, jpeg, or png format.

- **submission_guidelines**: Submission guidelines of the challenge (use a relative path for the HTML file, e.g. `templates/submission_guidelines.html`)

- **evaluation_script**: Python script that decides how to evaluate submissions in different phases (path of the evaluation script file or folder relative to this YAML file, e.g. `evaluation_script/`)

- **remote_evaluation**: True/False (specifies whether evaluation will happen on a remote machine. Default is `False`)

- **start_date**: Start DateTime of the challenge (format: YYYY-MM-DD HH:MM:SS, e.g. 2017-07-07 10:10:10) in the `UTC` time zone

- **end_date**: End DateTime of the challenge (format: YYYY-MM-DD HH:MM:SS, e.g. 2017-07-07 10:10:10) in the `UTC` time zone

- **published**: True/False (Boolean field that gives the flexibility to publish the challenge once it is approved by the EvalAI admin. Default is `False`)

- **tags**: A list of tags indicating the relevant areas of the challenge

- **domain**: The relevant domain for your challenge, one of: CV, NLP, RL, MM, AL, TAB

- **allowed_email_domains**: A list of email domains allowed to participate in the challenge (e.g. `["domain1.com", "domain2.org", "domain3.in"]`). Only participants with these email domains will be allowed to participate. Leave blank if everyone is allowed to participate.

- **blocked_email_domains**: A list of email domains not allowed to participate in the challenge (e.g. `["domain1.com", "domain2.org", "domain3.in"]`). Participants with these email domains will not be allowed to participate. Leave blank if no one should be blocked.

- **leaderboard**: A leaderboard for a challenge on EvalAI consists of the following subfields:

  - **id**: Unique positive integer field for each leaderboard entry

  - **schema**: The schema field contains the information about the rows of the leaderboard. A schema contains three keys:

    1. `labels`: Labels are the header rows in the leaderboard according to which the challenge ranking is done.

    2. `default_order_by`: This key decides the default sorting of the leaderboard, based on one of the labels defined above.

    3. `metadata`: This field defines additional information about the metrics that are used to evaluate the challenge submissions.

The leaderboard schema for the [sample challenge configuration](https://github.com/Cloud-CV/EvalAI-Starters/blob/master/challenge_config.yaml) looks like this:

```yaml
leaderboard:
  - id: 1
    schema:
      {
        "labels": ["Metric1", "Metric2", "Metric3", "Total"],
        "default_order_by": "Total",
        "metadata": {
          "Metric1": {
            "sort_ascending": True,
            "description": "Metric Description",
          }
        }
      }
```

The above leaderboard schema will look something like this on the leaderboard UI:

![](../../_static/img/leaderboard.png "Random Number Generator Challenge - Leaderboard")
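Putting the top-level fields described above together, a minimal `challenge_config.yaml` excerpt might look roughly like the following. This is an illustrative sketch, not a verbatim copy of the starter configuration: the values are placeholders, and the leaderboard, phase, and split sections discussed elsewhere in these docs would follow below it.

```yaml
# Illustrative excerpt of a challenge_config.yaml -- all values are placeholders
title: "Random Number Generator Challenge"
short_description: "Short description of the challenge (under 140 characters)"
description: templates/description.html
evaluation_details: templates/evaluation_details.html
terms_and_conditions: templates/terms_and_conditions.html
image: images/logo/challenge_logo.jpg
submission_guidelines: templates/submission_guidelines.html
evaluation_script: evaluation_script/
remote_evaluation: False
start_date: 2017-07-07 10:10:10
end_date: 2017-12-07 10:10:10
published: False
tags: ["benchmark", "sample"]
domain: CV
allowed_email_domains: []
blocked_email_domains: []
# leaderboard, challenge_phases, dataset_splits, challenge_phase_splits follow here
```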
Lines changed: 38 additions & 0 deletions

# Dataset Splits

Dataset splits define the subsets of the test set on which submissions will be evaluated. Generally, most challenges have three splits:

1. **train_split**: Allows participants to make a large number of submissions, see how they are doing, and overfit if they wish.
2. **test_split**: Allows only a small number of submissions so that participants cannot game the test set. Use this split to decide the winners of the challenge.
3. **val_split**: Allows participants to make submissions and evaluate on the validation split.

A dataset split has the following subfields:

- **id**: Unique integer identifier for the split

- **name**: Name of the split (it must be unique for every split)

- **codename**: Unique id for each split. Note that the codename of a dataset split is used to map the results returned by the evaluation script to a particular dataset split in EvalAI's database. Please make sure that no two dataset splits have the same codename, and that each dataset split's codename matches the one used in the evaluation script provided by the challenge host.

- **challenge_phase_splits**: A challenge phase split is a relation between a challenge phase and dataset splits for a challenge (a many-to-many relation). This is used to set the privacy of submissions (public/private) to different dataset splits for different challenge phases. A sketch of how these entries look in YAML is shown after this list.

  - **challenge_phase_id**: Id of the `challenge_phase` to map with

  - **leaderboard_id**: Id of the `leaderboard`

  - **dataset_split_id**: Id of the `dataset_split`

  - **visibility**: Sets the visibility of the numbers corresponding to metrics for this `challenge_phase_split`. Select one of the following positive integers based on the visibility level you want (optional, default is `3`):

    | Visibility | Description                                                              |
    | ---------- | ------------------------------------------------------------------------ |
    | 1          | Only visible to challenge host                                            |
    | 2          | Only visible to challenge host and participant who made that submission   |
    | 3          | Visible to everyone on the leaderboard                                    |

  - **leaderboard_decimal_precision**: A positive integer field used for varying the leaderboard decimal precision. Default value is `2`.

  - **is_leaderboard_order_descending**: True/False (a Boolean field that gives the challenge host the flexibility to change the default leaderboard sorting order. It is useful in cases where you have error as a metric and want to sort the leaderboard in increasing order of error. Default is `True`)
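As a rough illustration of how the subfields above fit together, dataset splits and challenge phase splits might be wired up like this. This is a sketch rather than a verbatim excerpt from the starter repository; the ids, names, and codenames are placeholders.

```yaml
# Illustrative sketch only -- ids and codenames are placeholders
dataset_splits:
  - id: 1
    name: Train Split
    codename: train_split
  - id: 2
    name: Test Split
    codename: test_split

challenge_phase_splits:
  - challenge_phase_id: 1
    leaderboard_id: 1
    dataset_split_id: 1
    visibility: 3                      # visible to everyone on the leaderboard
    leaderboard_decimal_precision: 2
    is_leaderboard_order_descending: True
  - challenge_phase_id: 1
    leaderboard_id: 1
    dataset_split_id: 2
    visibility: 1                      # only visible to the challenge host
```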
Lines changed: 10 additions & 0 deletions

# Configuration

```{toctree}
:maxdepth: 2

challenge-config
yaml-reference
phases-setup
dataset-splits
```
Lines changed: 65 additions & 0 deletions

# Phases Setup

There can be multiple [challenge phases](https://evalai.readthedocs.io/en/latest/glossary.html#challenge-phase) in a challenge. A challenge phase in a challenge contains the following subfields (an illustrative YAML sketch follows the list):

- **id**: Unique integer identifier for the challenge phase

- **name**: Name of the challenge phase

- **description**: Long description of the challenge phase (set the relative path of the HTML file, e.g. `templates/challenge_phase_1_description.html`)

- **leaderboard_public**: True/False (a Boolean field that gives challenge hosts the flexibility to make the leaderboard public or private. Default is `False`)

- **is_public**: True/False (a Boolean field that gives challenge hosts the flexibility to hide or show the challenge phase to participants. Default is `False`)

- **is_submission_public**: True/False (a Boolean field that gives challenge hosts the flexibility to make submissions public or private by default. Note that this only works when the `leaderboard_public` property is set to true. Default is `False`)

- **start_date**: Start DateTime of the challenge phase (format: YYYY-MM-DD HH:MM:SS, e.g. 2017-07-07 10:10:10)

- **end_date**: End DateTime of the challenge phase (format: YYYY-MM-DD HH:MM:SS, e.g. 2017-07-07 10:10:10)

- **test_annotation_file**: This file is used for ranking the submissions made by participants. An annotation file can be shared by more than one challenge phase. (Path of the test annotation file relative to this YAML file, e.g. `annotations/test_annotations_devsplit.json`)

- **codename**: Unique id for each challenge phase. Note that the codename of a challenge phase is used to map the results returned by the evaluation script to a particular challenge phase. The codename specified here must match the codename specified in the evaluation script for the mapping to work.

- **max_submissions_per_day**: A positive integer specifying the maximum number of submissions per day to a challenge phase. (Optional, default value is `100000`)

- **max_submissions_per_month**: A positive integer specifying the maximum number of submissions per month to a challenge phase. (Optional, default value is `100000`)

- **max_submissions**: A positive integer specifying the maximum number of total submissions that can be made to the challenge phase. (Optional, default value is `100000`)

- **default_submission_meta_attributes**: These are the default metadata attributes that will be displayed for all submissions; the attributes are `method_name`, `method_description`, `project_url`, and `publication_url`.

  ```yaml
  default_submission_meta_attributes:
    - name: method_name
      is_visible: True
    - name: method_description
      is_visible: True
    - name: project_url
      is_visible: True
    - name: publication_url
      is_visible: True
  ```

- **submission_meta_attributes**: These are the custom metadata attributes that participants can add to their submissions. The custom metadata attribute types are `TextAttribute`, `SingleOptionAttribute`, `MultipleChoiceAttribute`, and `TrueFalseField`.

  ```yaml
  submission_meta_attributes:
    - name: TextAttribute
      description: Sample
      type: text
      required: False
    - name: SingleOptionAttribute
      description: Sample
      type: radio
      options: ["A", "B", "C"]
    - name: MultipleChoiceAttribute
      description: Sample
      type: checkbox
      options: ["alpha", "beta", "gamma"]
    - name: TrueFalseField
      description: Sample
      type: boolean
      required: True
  ```

- **is_restricted_to_select_one_submission**: True/False (indicates whether to restrict a user to select only one submission for the leaderboard. Default is `False`)

- **is_partial_submission_evaluation_enabled**: True/False (indicates whether partial submission evaluation is enabled. Default is `False`)

- **allowed_submission_file_types**: A list of file types that are allowed for submission. (Optional, default is `.json, .zip, .txt, .tsv, .gz, .csv, .h5, .npy`)
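Putting a few of these fields together, a single phase entry might look roughly like the sketch below. This is illustrative only, not a verbatim excerpt from the starter configuration: the `challenge_phases` key name, the dates, ids, and the `dev` codename are placeholder assumptions to be adapted to your challenge.

```yaml
# Illustrative sketch of one challenge phase entry -- values are placeholders
challenge_phases:
  - id: 1
    name: Dev Phase
    description: templates/challenge_phase_1_description.html
    leaderboard_public: True
    is_public: True
    is_submission_public: True
    start_date: 2017-07-07 10:10:10
    end_date: 2017-12-07 10:10:10
    test_annotation_file: annotations/test_annotations_devsplit.json
    codename: dev
    max_submissions_per_day: 5
    max_submissions_per_month: 50
    max_submissions: 100
    allowed_submission_file_types: ".json, .zip"
```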
Lines changed: 1 addition & 0 deletions

# YAML Reference
Lines changed: 112 additions & 0 deletions

# Evaluation Scripts

Each challenge has an evaluation script, which evaluates the submissions of participants and returns the scores that populate the leaderboard. The logic for evaluating and judging a submission is customizable and varies from challenge to challenge, but the overall structure of evaluation scripts is fixed due to architectural reasons.

## Writing an Evaluation Script

Evaluation scripts are required to have an `evaluate()` function. This is the main function, which is used by the workers to evaluate the submission messages.

The syntax of the evaluate function is:

```python
def evaluate(test_annotation_file, user_annotation_file, phase_codename, **kwargs):
    pass
```

It receives three arguments, namely:

- `test_annotation_file`: The local path to the annotation file for the challenge. This is the file uploaded by the challenge host while creating the challenge.

- `user_annotation_file`: The local path of the file submitted by the user for a particular challenge phase.

- `phase_codename`: The `codename` of the challenge phase from the [challenge configuration YAML](https://github.com/Cloud-CV/EvalAI-Starters/blob/master/challenge_config.yaml). This is passed as an argument so that the script can take actions according to the challenge phase.

After reading the files, some custom actions can be performed. This varies per challenge.

The `evaluate()` method also accepts keyword arguments. By default, we provide the metadata of each submission to your challenge, which you can use to send notifications to your Slack channel or to some other webhook service. The following example shows how to get the submission metadata in your evaluation script and send a Slack notification if the accuracy is more than some value `X` (X being 90 in the example below):

```python
import json

import requests


def evaluate(test_annotation_file, user_annotation_file, phase_codename, **kwargs):
    submission_metadata = kwargs.get("submission_metadata")
    print(submission_metadata)

    # Do stuff here
    # Set `score` to 91 as an example

    score = 91
    if score > 90:
        slack_data = kwargs.get("submission_metadata")
        webhook_url = "Your slack webhook url comes here"
        # To know more about Slack webhooks, check out this link: https://api.slack.com/incoming-webhooks

        response = requests.post(
            webhook_url,
            data=json.dumps({"text": "*Flag raised for submission:* \n \n" + str(slack_data)}),
            headers={"Content-Type": "application/json"},
        )

    # Do more stuff here
```

The above example can be modified and used, for instance, to check whether a participant team is cheating. There are many more ways in which you can use this metadata.

After all the processing is done, `evaluate()` should return an output, which is used to populate the leaderboard. The output should be in the following format:

```python
output = {}
output["result"] = [
    {
        "train_split": {
            "Metric1": 123,
            "Metric2": 123,
            "Metric3": 123,
            "Total": 123,
        }
    },
    {
        "test_split": {
            "Metric1": 123,
            "Metric2": 123,
            "Metric3": 123,
            "Total": 123,
        }
    },
]

return output
```

Let's break down what is happening in the above code snippet:

1. `output` should contain a key named `result`, which is a list containing one entry per dataset split that is available for the challenge phase in consideration (in the function definition of `evaluate()` shown above, the argument `phase_codename` receives the _codename_ of the challenge phase against which the submission was made).
2. Each entry in the list should be a dict that has a key with the corresponding dataset split codename (`train_split` and `test_split` for this example).
3. Each of these dataset split dicts contains various keys (`Metric1`, `Metric2`, `Metric3`, `Total` in this example), which are then displayed as columns in the leaderboard.
## Editing Evaluation Script

Each prediction upload challenge has an evaluation script, which evaluates the submissions of participants and returns the scores that populate the leaderboard. The logic for evaluating a submission is customizable and varies from challenge to challenge.

When setting up a new challenge, hosts often need to test multiple versions of the evaluation script on EvalAI. To do this, the host can update the evaluation script of an existing challenge without uploading a whole new challenge configuration.

To edit the evaluation script for an existing challenge, follow these steps:

### 1. Go to the challenge page

Go to the hosted challenges and select the challenge whose evaluation script you want to update.

<img src="../../_static/img/hosted_challenge_page.png"/>

### 2. Navigate to the Evaluation Criteria tab

Select the Evaluation Criteria tab and click on the 'Upload' button.

<img src="../../_static/img/evaluation_criteria_tab.png"/>

### 3. Update the evaluation script

Upload the latest evaluation script and click on the 'Submit' button to update the evaluation script.

<img src="../../_static/img/edit_evaluation_script.png"/>

**Tada!** You have successfully updated the evaluation script for the challenge. The evaluation workers for the challenge will be restarted automatically to pick up the latest evaluation script. Please wait at least 10 minutes for the workers to restart.
Lines changed: 10 additions & 0 deletions

# Evaluation

```{toctree}
:maxdepth: 2

evaluation-scripts
prediction-upload
remote-evaluation
metrics-leaderboards
```
Lines changed: 1 addition & 0 deletions

# Metric Leaderboards
Lines changed: 1 addition & 0 deletions

# Prediction Upload
