docs/source/how_to/authentication_how_to.md
+113 -1 (113 additions & 1 deletion)
@@ -106,5 +106,117 @@ Finally,Some important things to remember:
106
106
107
107
## Authenticating on XXXX site with username/password
108
108
109
-
```{note} This section is still under construction 🚧
109
+
```{note}
110
+
This section is still under construction 🚧
110
111
```
112
+
113
+
114
+
# Proof of Origin Tokens
115
+
116
+
YouTube uses **Proof of Origin Tokens (POT)** as part of its bot detection system to verify that requests originate from valid clients. If a token is missing or invalid, some videos may return errors like "Sign in to confirm you're not a bot."
117
+
118
+
yt-dlp provides [a detailed guide to POTs](https://github.com/yt-dlp/yt-dlp/wiki/PO-Token-Guide).
119
+
120
+
### How Auto Archiver Uses POT
121
+
This feature is enabled for the Generic Archiver via two yt-dlp plugins:
Includes both a Python plugin and a **Node.js server or script** to generate the token.
128
+
129
+
These are installed in our Poetry environment.
130
+
131
+
### Integration Methods
132
+
133
+
**Docker (Recommended)**:
134
+
135
+
When running the Auto Archiver using the Docker image, we use the [Node.js token generation script](https://github.com/Brainicism/bgutil-ytdlp-pot-provider/tree/master/server).
136
+
This is to avoid managing a separate server process, and is handled automatically inside the Docker container when needed.
137
+
138
+
This is already included in the Docker image. If you need to disable it, set the config option `bguils_po_token_method` under the `generic_extractor` section of your `orchestration.yaml` config file to `"disabled"`.
139
+
```yaml
generic_extractor:
  bguils_po_token_method: "disabled"
```
143
+
144
+
**PyPI / Local**:
145
+
146
+
When using the Auto Archiver PyPI package, or running locally, you need additional system dependencies to run the token generation script: either Docker, or Node.js and Yarn.
147
+
148
+
See the [bgutil-ytdlp-pot-provider](https://github.com/Brainicism/bgutil-ytdlp-pot-provider?tab=readme-ov-file#a-http-server-option) documentation for more details.
149
+
150
+
⚠️WARNING⚠️: Enabling the token generation script adds the server scripts to the home directory of the user running the Auto Archiver.
151
+
152
+
- You can set the config option `bguils_po_token_method` under the `generic_extractor` section of your `orchestration.yaml` config file to "script" to enable the token generation script process locally.
153
+
- Alternatively, you can run the bgutil-ytdlp-pot-provider server separately using their Docker image or Node.js server.
154
+
155
+
### Notes
156
+
157
+
- The token generation script is only triggered when needed by yt-dlp, so it should have no effect unless YouTube requests a POT.
158
+
- If you're running the Auto Archiver in Docker, this is set up automatically.
159
+
- If you're running locally, you'll need to run the setup script manually or enable the feature in your config.
160
+
- You can set up both the server and the script, and the plugin will fall back from one to the other if needed. This is recommended for robustness!
| `auto` | Docker: Automatically downloads and uses the token generation script. Local: Does nothing; assumes a separate server is running externally. | ✅ Yes |
169
+
| `script` | Explicitly downloads and uses the token generation script, even locally. | ❌ No |
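The behavior described in the rows above can be summarized as a small decision function. This is a minimal sketch, not the Auto Archiver's actual implementation: the function name `resolve_pot_action`, the `in_docker` flag, and the returned labels are all hypothetical and exist only to illustrate how the documented values of `bguils_po_token_method` differ.

```python
# Choices listed for `bguils_po_token_method` in the generic_extractor manifest.
VALID_METHODS = ("auto", "script", "disabled")


def resolve_pot_action(method: str, in_docker: bool) -> str:
    """Return a hypothetical action label for a config value.

    The labels are illustrative only; they are not real Auto Archiver names.
    """
    if method not in VALID_METHODS:
        raise ValueError(f"unknown bguils_po_token_method: {method!r}")
    if method == "disabled":
        # No token provider is set up at all.
        return "none"
    if method == "script":
        # Explicitly download and use the script, even when running locally.
        return "use_script"
    # "auto": use the script in Docker; locally, rely on an external server.
    return "use_script" if in_docker else "assume_external_server"
```

For example, `resolve_pot_action("auto", in_docker=False)` returns `"assume_external_server"`, which mirrors the documented local behavior of doing nothing and assuming a server is already running.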
If you change the default port of the bgutil-ytdlp-pot-provider server, you can pass the updated values using our `extractor_args` option for the generic extractor.
For more details, see the [bgutil-ytdlp-pot-provider usage documentation](https://github.com/Brainicism/bgutil-ytdlp-pot-provider?tab=readme-ov-file#usage).
199
+
200
+
### Checking the logs
201
+
202
+
To verify that the POT process is working, look for the following lines in your log after adding the config option:
203
+
204
+
```shell
[GetPOT] BgUtilScript: Generating POT via script: /Users/you/bgutil-ytdlp-pot-provider/server/build/generate_once.js
[debug] [GetPOT] BgUtilScript: Executing command to get POT via script: /Users/you/.nvm/versions/node/v20.18.0/bin/node /Users/you/bgutil-ytdlp-pot-provider/server/build/generate_once.js -v ymCMy8OflKM
[debug] [GetPOT] Fetching gvs PO Token for tv client
```
211
+
212
+
If the plugin cannot find the script or reach the server, you'll see output like this:
213
+
```shell
[debug] [GetPOT] Fetching player PO Token for tv client
WARNING: [GetPOT] BgUtilScript: Script path doesn't exist: /Users/you/bgutil-ytdlp-pot-provider/server/build/generate_once.js. Please make sure the script has been transpiled correctly.
WARNING: [GetPOT] BgUtilHTTP: Error reaching GET http://127.0.0.1:4416/ping (caused by TransportError). Please make sure that the server is reachable at http://127.0.0.1:4416.
[debug] [GetPOT] No player PO Token provider available for tv client
```
219
+
220
+
In this case, check that the script has been transpiled correctly and is available at the path specified in the log, or that the server is reachable at the address shown in the warning.
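
The two warnings above point at two things to check: the script file and the HTTP server. As a rough diagnostic sketch, the Python snippet below tests both. The default script location and the `/ping` URL are copied from the example log output; the function names are hypothetical, and your own path or port may differ.

```python
import http.client
import urllib.request
from pathlib import Path

# Defaults taken from the example log lines; adjust them to match your own logs.
DEFAULT_SCRIPT = (
    Path.home() / "bgutil-ytdlp-pot-provider" / "server" / "build" / "generate_once.js"
)
DEFAULT_PING_URL = "http://127.0.0.1:4416/ping"


def script_available(path: Path = DEFAULT_SCRIPT) -> bool:
    """Return True if the transpiled token generation script exists at `path`."""
    return path.is_file()


def server_reachable(url: str = DEFAULT_PING_URL, timeout: float = 3.0) -> bool:
    """Return True if a GET request to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Covers connection errors, timeouts, HTTP errors, and malformed URLs.
        return False


if __name__ == "__main__":
    print(f"script found:     {script_available()}")
    print(f"server reachable: {server_reachable()}")
```

If both checks report `False`, neither provider is available, which matches the "No player PO Token provider available" line in the log.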
src/auto_archiver/modules/generic_extractor/__manifest__.py
+5 -0 (5 additions & 0 deletions)
@@ -74,6 +74,11 @@
74
74
"default": "inf",
75
75
"help": "Use to limit the number of videos to download when a channel or long page is being extracted. 'inf' means no limit.",
76
76
},
77
+
"bguils_po_token_method": {
78
+
"default": "auto",
79
+
"help": "Set up a Proof of Origin Token provider. This process has additional requirements. See [authentication](https://auto-archiver.readthedocs.io/en/latest/how_to/authentication_how_to.html) for more information.",
80
+
"choices": ["auto", "script", "disabled"],
81
+
},
77
82
"extractor_args": {
78
83
"default": {},
79
84
"help": "Additional arguments to pass to the yt-dlp extractor. See https://github.com/yt-dlp/yt-dlp/blob/master/README.md#extractor-arguments.",