This Actor wraps the Monolith Project: it fetches a web page by URL and bundles the entire content into a single HTML file, so you don't need to install and run the tool locally.
Actors are serverless microservices running on the Apify Platform. They are based on the Actor SDK and can be found in the Apify Store. Learn more about Actors in the Apify Whitepaper.
- Go to the Apify Actor page
- Click "Run"
- In the input form, fill in URL(s) to crawl and bundle
- The Actor will run and:
  - save the bundled HTML files in the run's default key-value store (KVS)
  - save one dataset item per URL, containing the original URL, a link to the bundled file in the KVS, and the monolith process exit status
apify call netmilk/monolith --input='{
"urls": ["https://news.ycombinator.com/"]
}'
curl --request POST \
  --url "https://api.apify.com/v2/acts/netmilk~monolith/runs" \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer YOUR_API_TOKEN' \
  --data '{
    "urls": ["https://news.ycombinator.com/"]
  }'
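The same request can be assembled from a script. The following is a minimal sketch using only the Python standard library; the helper name `build_run_request` is hypothetical, and the endpoint path assumes the Apify API's run-Actor route (`/v2/acts/{actorId}/runs`). It builds the request without sending it, so it can be inspected first.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "netmilk~monolith"  # Actor ID as used in the curl example


def build_run_request(urls, token):
    """Build (but do not send) the POST request that starts an Actor run.

    Hypothetical helper for illustration; not part of the Actor or any SDK.
    """
    body = json.dumps({"urls": list(urls)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{ACTOR_ID}/runs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


if __name__ == "__main__":
    req = build_run_request(["https://news.ycombinator.com/"], "YOUR_API_TOKEN")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually start a run, pass `req` to urllib.request.urlopen(req).
```

Serializing the body with `json.dumps` avoids hand-written JSON mistakes such as trailing commas or unbalanced braces.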
The Actor accepts a JSON input with the following structure:
Field | Type | Required | Default | Description |
---|---|---|---|---|
urls | array | Yes | - | List of URLs to monolith |
urls[] | string | Yes | - | URL to monolith |
{
  "urls": ["https://news.ycombinator.com/"]
}
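Before starting a run, the input can be checked locally against the table above. This is only a sketch: `validate_input` is a hypothetical helper, and two of its rules are assumptions not stated by the Actor (that `urls` must be non-empty, and that each entry must be an absolute `http` or `https` URL).

```python
from urllib.parse import urlparse


def validate_input(actor_input):
    """Return a list of problems; an empty list means the input looks valid.

    Assumptions (not confirmed by the Actor): 'urls' must be non-empty and
    every entry must be an absolute http(s) URL.
    """
    urls = actor_input.get("urls") if isinstance(actor_input, dict) else None
    if not isinstance(urls, list) or not urls:
        return ["'urls' must be a non-empty array"]

    problems = []
    for i, url in enumerate(urls):
        if not isinstance(url, str):
            problems.append(f"urls[{i}] must be a string")
            continue
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"urls[{i}] is not an absolute http(s) URL: {url!r}")
    return problems


if __name__ == "__main__":
    print(validate_input({"urls": ["https://news.ycombinator.com/"]}))
    print(validate_input({"urls": ["ftp://example.com/file", 3]}))
```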
For each input URL, the Actor writes one dataset item with three fields:
Field | Type | Required | Description |
---|---|---|---|
url | string | Yes | The original start URL for the monolith process |
kvsUrl | string | Yes | A link to the Apify key-value store object where the monolithic HTML is available for download |
status | number | No | Exit status of the monolith process |
{
"url": "https://news.ycombinator.com/",
"kvsUrl": "https://api.apify.com/v2/key-value-stores/JRFLHRy9DOtdKGpdm/records/https___news.ycombinator.com_",
"status": "0"
}
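Dataset items shaped like the example above can be sorted into successful and failed bundles. The sketch below assumes the usual process convention that exit status `0` means success, which the Actor does not state explicitly. `split_by_status` is a hypothetical helper. It accepts `status` as either a number or a numeric string, since the example shows `"0"`, and treats a missing or non-numeric status as a failure.

```python
def split_by_status(items):
    """Split dataset items into (succeeded, failed) by monolith exit status.

    Assumption: exit status 0 means success. Items with a missing or
    non-numeric status are counted as failed.
    """
    succeeded, failed = [], []
    for item in items:
        try:
            code = int(item.get("status"))
        except (TypeError, ValueError):
            code = None
        (succeeded if code == 0 else failed).append(item)
    return succeeded, failed


if __name__ == "__main__":
    items = [
        {"url": "https://news.ycombinator.com/", "status": "0"},
        {"url": "https://example.com/", "status": 1},
    ]
    ok, bad = split_by_status(items)
    print([i["url"] for i in ok])
    print([i["url"] for i in bad])
```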
- Memory Requirements:
  - Minimum: 4168 MB RAM
- Processing Time:
  - About 30 s per complex page such as bbc.co.uk
For more help, check the Monolith Project documentation or raise an issue on the Actor's detail page on Apify.