Label Studio uses one of two secure mechanisms to fetch media data from cloud storage: proxy or pre-signed URLs.

Which one you use depends on whether **Use pre-signed URLs** is toggled on or off when you set up your source storage. Proxy storage is enabled when **Use pre-signed URLs** is OFF.

##### Proxy storage
When in proxy mode, the Label Studio backend fetches objects server-side and streams them directly to the browser.
- Access to media files is further restricted based on Label Studio user roles and project access.
- This access is also applied to cached files. Even if the media is cached, a user whose access to the task is revoked can no longer access the file.
- Data stays within the Label Studio network boundary. This is especially useful for on-prem environments that need to maintain a single entry point for their network traffic.
- **Configuration**
    - No CORS settings are needed.
    - No pre-signed permissions are needed.

To allow proxy storage, ensure that your permissions include the following:
{% details <b>AWS S3</b> %}

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}
```

{% enddetails %}
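As a quick sanity check for the AWS S3 policy above, the following sketch reports which of the two listed actions a policy document does not explicitly allow. It is only an illustration under simplifying assumptions: it matches action names exactly and ignores wildcards, `Deny` statements, conditions, and resources. The function name `missing_actions` is hypothetical and not part of Label Studio or any AWS SDK.

```python
import json

# The two actions listed in the policy above.
REQUIRED_ACTIONS = {"s3:GetObject", "s3:ListBucket"}


def missing_actions(policy_json: str) -> set:
    """Return required actions that no Allow statement lists by exact name."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    # IAM permits a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]

    granted = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        if isinstance(actions, str):
            actions = [actions]
        granted.update(actions)
    return REQUIRED_ACTIONS - granted


if __name__ == "__main__":
    example = json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::your-bucket-name/*"],
        }],
    })
    print(missing_actions(example))
```

An empty result only means both action names appear in an `Allow` statement; it does not prove that the policy is attached to the right identity or bucket.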
<br>

{% details <b>Google Cloud Storage</b> %}

- `storage.objects.get` - Read object data and metadata
- `storage.objects.list` - List objects in the bucket (if using prefix)

{% enddetails %}

<br>

{% details <b>Azure Blob Storage</b> %}

Add the **Storage Blob Data Reader** role, which includes:

{% enddetails %}

Very large media files are streamed in sequential 8 MB chunks, each fetched with a separate GET request. This can result in frequent requests to the backend for the next portion of data and uses additional resources.

You can configure this using the following environment variables:

* `RESOLVER_PROXY_MAX_RANGE_SIZE` - Defaults to 8 MB. Defines the largest chunk size returned per request.
* `RESOLVER_PROXY_TIMEOUT` - Defaults to 20 seconds. Defines the maximum time uWSGI workers spend on a single request.
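As a rough illustration of the chunking described above, the following sketch computes the byte ranges a client would request for a file of a given size. It assumes that each chunk maps to one HTTP `Range` request and that 8 MB means 8 * 1024 * 1024 bytes. The function name `byte_ranges` is hypothetical and not part of Label Studio.

```python
from typing import Iterator, Tuple

# Assumption: the 8 MB default is interpreted as 8 * 1024 * 1024 bytes.
DEFAULT_MAX_RANGE_SIZE = 8 * 1024 * 1024


def byte_ranges(
    file_size: int, max_range_size: int = DEFAULT_MAX_RANGE_SIZE
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets, one pair per GET request."""
    if max_range_size <= 0:
        raise ValueError("max_range_size must be positive")
    start = 0
    while start < file_size:
        # The last chunk may be shorter than max_range_size.
        end = min(start + max_range_size, file_size) - 1
        yield start, end
        start = end + 1


if __name__ == "__main__":
    # A hypothetical 20 MB file needs three sequential requests.
    for start, end in byte_ranges(20 * 1024 * 1024):
        print(f"Range: bytes={start}-{end}")
```

Under these assumptions, a larger chunk size means fewer requests per file, but each request has more data to transfer within the time limit, so the two variables are probably best tuned together.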
##### Pre-signed redirect
In this scenario, your browser receives an HTTP 303 redirect to a time-limited S3/GCS/Azure URL. This is the default behavior.

The main benefit of pre-signed URLs is that your media files stay isolated from the Label Studio network as much as possible.
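To make the idea of a time-limited URL concrete, here is a minimal, generic sketch that signs a URL with an expiry timestamp and verifies it later. It is not the signing scheme used by S3, GCS, or Azure, and it is not Label Studio code. The secret, the `expires` and `signature` query parameters, and both function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # hypothetical key; real providers use their own credentials


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the object path and the expiry timestamp."""
    message = f"{path}?expires={expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Append an expiry time and a signature to base_url."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    path = urlsplit(base_url).path
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{base_url}?{query}"


def is_valid(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    now = time.time() if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError, IndexError):
        return False
    if now > expires:
        return False
    # Constant-time comparison avoids leaking signature bytes through timing.
    return hmac.compare_digest(signature, _signature(parts.path, expires))
```

Because the expiry time is part of the signed message, a client cannot extend the lifetime of a link by editing the `expires` parameter.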
#### Treat every bucket object as a source file
Label Studio Source Storages feature an option called "Treat every bucket object as a source file." This option enables two different methods of loading tasks into Label Studio.