We have included 1000 lyrics as toy data in toy-data.
This data is ready to use with this example.
NOTE: The Docker image uses this toy data, so it is required if you want to build the image yourself.
If you want to use the full dataset, you can download it from Kaggle (https://www.kaggle.com/neisse/scrapped-lyrics-from-6-genres). Once your Kaggle API token is set up on your system as described in the Kaggle API documentation (https://www.kaggle.com/docs/api), run:
bash get_data.sh
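Before running the download script, it can help to confirm that the Kaggle token is actually visible on your system. The following is a minimal sketch, not part of this example: the helper name `kaggle_credentials_available` is hypothetical, and it assumes the standard Kaggle API conventions (a `kaggle.json` file in `~/.kaggle`, optionally relocated via `KAGGLE_CONFIG_DIR`, or the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables).

```python
import os
from pathlib import Path


def kaggle_credentials_available(home=None, environ=None):
    """Return True if Kaggle API credentials appear to be configured.

    Hypothetical helper for illustration; `home` and `environ` can be
    overridden to make the check easy to test.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    # Option 1: credentials supplied through environment variables.
    if environ.get("KAGGLE_USERNAME") and environ.get("KAGGLE_KEY"):
        return True

    # Option 2: a kaggle.json token file. KAGGLE_CONFIG_DIR, if set,
    # overrides the default ~/.kaggle directory.
    config_dir = Path(environ.get("KAGGLE_CONFIG_DIR", home / ".kaggle"))
    return (config_dir / "kaggle.json").is_file()


if __name__ == "__main__":
    if kaggle_credentials_available():
        print("Kaggle credentials found; you can run: bash get_data.sh")
    else:
        print("No Kaggle credentials found; see https://www.kaggle.com/docs/api")
```

The check only looks for the presence of credentials; it does not verify that the token is valid.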
pip install -r requirements.txt
| Command | Description |
|---|---|
| python app.py index | To index files/data |
| python app.py search | To run query on the index |
| python app.py dryrun | Sanity check on the topology |
cd static
python -m http.server
Open http://0.0.0.0:8000/ in your browser.
To make things easier, we have built and published a Docker image with 10000 indexed songs (more than the toy example, but still a small part of the full dataset). You can retrieve the Docker image using:
docker pull jinahub/app.example.multireslyricssearch:0.0.2-0.9.20
You can also pull any of its more recent tags.
Then run it and view the results in the browser as explained above:
docker run -p 65481:65481 jinahub/app.example.multireslyricssearch:0.0.2-0.9.20
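Once the container is running, a quick way to confirm that the published port is reachable is to attempt a TCP connection to it. This is a small sketch under stated assumptions: the helper name `port_is_open` is hypothetical, the host `127.0.0.1` assumes the container runs on your local machine, and the port `65481` is taken from the `docker run` command above.

```python
import socket


def port_is_open(host="127.0.0.1", port=65481, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper for illustration; it only checks that something
    is listening, not that the search service is fully ready.
    """
    try:
        # create_connection raises OSError (e.g. connection refused or
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Port 65481 is reachable; the container appears to be up.")
    else:
        print("Port 65481 is not reachable; check that the container is running.")
```

If the check fails right after `docker run`, the service may still be starting, so retrying after a few seconds is reasonable.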
Copyright (c) 2020-2021 Han Xiao. All rights reserved.