| 1 | +# Migrate contacts from Rapidpro to Turn |
| 2 | + |
| 3 | +The scripts in this folder will be used to get all the contact information from Rapidpro and update Turn contacts. |
| 4 | + |
| 5 | +The fetch part is based on the modified date we have on Rapidpro contacts, and the idea is to start backfilling before we go live. |
| 6 | +We can do this in batches and repeat until we do the actual switch over from Rapidpro to Turn. |
| 7 | + |
| 8 | +## SCRIPTS |
| 9 | + |
| 10 | +There is one fetch script, three update script options and a compare script. We can test the different update options with larger batches and see which one works best. |
| 11 | + |
| 12 | +### fetch_rapidpro_contacts.py |
| 13 | + |
| 14 | +This fetches all the contacts from Rapidpro that were modified between the start and end date provided. You can also configure a limit. |
| 15 | + |
| 16 | +The `FIELD_MAPPING` variable should be updated with all the fields we want to move over. |
| 17 | + |
| 18 | +The script will write all the contacts to a file with the start and end date in the name. |
| 19 | + |
| 20 | +It will also output the latest modified-on date in the batch, which can then be used as the start date for the next batch. |
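As a rough illustration of how the latest modified-on date of a batch could be derived, here is a minimal sketch. It assumes each fetched contact is a dictionary with a `modified_on` ISO 8601 timestamp; the helper names and the sample data are hypothetical and not taken from `fetch_rapidpro_contacts.py`.

```python
from datetime import datetime


def parse_timestamp(value):
    # Replace a trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_modified_on(contacts):
    # Assumption: every contact carries a "modified_on" ISO 8601 string.
    return max(parse_timestamp(c["modified_on"]) for c in contacts)


# Hypothetical sample batch, for illustration only.
batch = [
    {"urn": "whatsapp:27820000001", "modified_on": "2025-01-03T10:15:00.000000Z"},
    {"urn": "whatsapp:27820000002", "modified_on": "2025-01-06T23:59:10.500000Z"},
]

next_start = latest_modified_on(batch)
print(next_start.isoformat())
```

The printed value is what would be carried over as the start date of the next fetch.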
| 21 | + |
| 22 | +### update_turn_contacts.py |
| 23 | + |
| 24 | +Updates Turn contacts asynchronously using the Turn contacts API. |
| 25 | + |
| 26 | +This script takes the name of a file generated by the `fetch_rapidpro_contacts.py` script as a parameter and updates all the contacts in the file on Turn. |
| 27 | + |
| 28 | +It is an async script, and `CONCURRENCY` can be adjusted to control the speed and avoid hitting the Turn API rate limits. |
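One common way to cap the number of in-flight requests in an async script is an `asyncio.Semaphore`. The sketch below shows that pattern under stated assumptions: `update_contact` is a hypothetical stand-in for the real HTTP call, the `CONCURRENCY` value is made up, and the actual script may limit concurrency differently.

```python
import asyncio

CONCURRENCY = 5  # assumed value; the real script defines its own


async def update_contact(contact, semaphore, in_flight):
    # Hypothetical stand-in for the request to the Turn contacts API.
    async with semaphore:
        in_flight["now"] += 1
        in_flight["peak"] = max(in_flight["peak"], in_flight["now"])
        await asyncio.sleep(0.01)  # simulates network latency
        in_flight["now"] -= 1
        return {"contact": contact, "status": 200}


async def update_all(contacts):
    # At most CONCURRENCY coroutines can hold the semaphore at once.
    semaphore = asyncio.Semaphore(CONCURRENCY)
    in_flight = {"now": 0, "peak": 0}
    results = await asyncio.gather(
        *(update_contact(c, semaphore, in_flight) for c in contacts)
    )
    return results, in_flight["peak"]


contacts = [f"2782000000{i}" for i in range(20)]
results, peak = asyncio.run(update_all(contacts))
print(len(results), "updates, peak concurrency", peak)
```

Lowering `CONCURRENCY` slows the run down but reduces the chance of being rate limited.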
| 29 | + |
| 30 | +Command to run: |
| 31 | +`python scripts/migrate_to_turn/update_turn_contacts.py contacts-2025-01-01-2025-01-07.csv > update_turn_contacts.json` |
| 32 | + |
| 33 | +The output is sent to a JSON file, which can be used to retry failed requests. |
| 34 | + |
| 35 | +### update_turn_contacts_queue.py |
| 36 | + |
| 37 | +Updates Turn contacts asynchronously using the Turn contacts API, but with a queue and workers. It will sleep if it gets rate limited by Turn. |
| 38 | + |
| 39 | +This script takes the name of a file generated by the `fetch_rapidpro_contacts.py` script as a parameter and updates all the contacts in the file on Turn. |
| 40 | + |
| 41 | +It is an async script, and `WORKER_COUNT` can be configured to change the number of contacts being processed at a time. |
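The queue-and-workers approach can be sketched roughly as follows. This is not the code from `update_turn_contacts_queue.py`: `send_update` is a hypothetical stand-in that returns an HTTP 429 on the first attempt for one contact, and `WORKER_COUNT` and `RETRY_AFTER_SECONDS` are assumed values.

```python
import asyncio

WORKER_COUNT = 3  # assumed value
RETRY_AFTER_SECONDS = 0.01  # hypothetical pause after a rate-limit response


async def send_update(contact, attempts):
    # Hypothetical stand-in for the Turn API call. It rate limits the
    # first attempt for any contact whose number ends in "0".
    attempts[contact] = attempts.get(contact, 0) + 1
    await asyncio.sleep(0)
    if contact.endswith("0") and attempts[contact] == 1:
        return 429
    return 200


async def worker(queue, results, attempts):
    while True:
        contact = await queue.get()
        try:
            status = await send_update(contact, attempts)
            while status == 429:
                # Sleep, then try the same contact again.
                await asyncio.sleep(RETRY_AFTER_SECONDS)
                status = await send_update(contact, attempts)
            results.append({"contact": contact, "status": status})
        finally:
            queue.task_done()


async def run(contacts):
    queue = asyncio.Queue()
    for contact in contacts:
        queue.put_nowait(contact)
    results, attempts = [], {}
    workers = [
        asyncio.create_task(worker(queue, results, attempts))
        for _ in range(WORKER_COUNT)
    ]
    await queue.join()  # wait until every queued contact is processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results, attempts


contacts = [f"2782000000{i}" for i in range(10)]
results, attempts = asyncio.run(run(contacts))
print(len(results), "contacts updated")
```

With this shape, `WORKER_COUNT` sets how many contacts are handled in parallel, and a rate-limited worker pauses without blocking the others.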
| 42 | + |
| 43 | +Command to run: |
| 44 | +`python scripts/migrate_to_turn/update_turn_contacts_queue.py contacts-2025-01-01-2025-01-07.csv > update_turn_contacts.json` |
| 45 | + |
| 46 | +The output is sent to a JSON file, which can be used to retry failed requests. |
| 47 | + |
| 48 | +### update_turn_contacts_bulk.py |
| 49 | + |
| 50 | +Updates Turn contacts using the Turn bulk update contacts API. The API currently only accepts files of up to 1 megabyte in size. |
| 51 | + |
| 52 | +This script takes the CSV provided and sends it directly to the Turn API. It will output the results to a new CSV file. |
| 53 | + |
| 54 | +Command to run: |
| 55 | +`python scripts/migrate_to_turn/update_turn_contacts_bulk.py contacts-2024-01-01-2025-01-07.csv` |
| 56 | + |
| 57 | +### compare_contacts.py |
| 58 | + |
| 59 | +This script can be used to compare specific contacts. |
| 60 | + |
| 61 | +Add the WhatsApp IDs of the contacts you want to compare to the `WA_IDS` list in the script. |
| 62 | + |
| 63 | +The script will get their Rapidpro and Turn contact details and output everything to a `compare.csv` file. |
| 64 | + |
| 65 | +Command to run: |
| 66 | +`python scripts/migrate_to_turn/compare_contacts.py` |
| 67 | + |
| 68 | +## FIELD_MAPPING |
| 69 | + |
| 70 | +This is a dictionary the script uses to figure out where to get the data, how to process it and where it should go. |
| 71 | + |
| 72 | +The key is the field name in Rapidpro, and the value determines the rest: |
| 73 | + |
| 74 | +`turn_name` - The destination field name in Turn |
| 75 | + |
| 76 | +`process` - The function to call to process the value, for example converting a datetime to a format Turn understands. The functions live in the `process_fields.py` file, and that's where any new process field functions should be added. |
| 77 | + |
| 78 | +`type` - Where the value comes from: a `default` Rapidpro contact field or a `custom` one added by Reach. |
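To make the shape of the mapping more concrete, here is a minimal sketch of what an entry and its use might look like. The field names, the `process_datetime` helper and the `map_contact` function are hypothetical, and it assumes `default` fields sit at the top level of a Rapidpro contact while `custom` fields sit under a `fields` key; the real `FIELD_MAPPING` and `process_fields.py` may differ.

```python
from datetime import datetime


def process_datetime(value):
    # Hypothetical processor: reduce an ISO 8601 timestamp to a plain date.
    return datetime.fromisoformat(value).date().isoformat()


# Hypothetical entries; the keys are Rapidpro field names.
FIELD_MAPPING = {
    "name": {"turn_name": "name", "process": None, "type": "default"},
    "opted_in_at": {
        "turn_name": "opted_in_date",
        "process": process_datetime,
        "type": "custom",
    },
}


def map_contact(contact):
    mapped = {}
    for rapidpro_name, config in FIELD_MAPPING.items():
        # Assumption: custom fields are nested under contact["fields"].
        if config["type"] == "default":
            source = contact
        else:
            source = contact.get("fields", {})
        value = source.get(rapidpro_name)
        if value is not None and config["process"]:
            value = config["process"](value)
        mapped[config["turn_name"]] = value
    return mapped


example = {
    "name": "Thandi",
    "fields": {"opted_in_at": "2025-01-03T10:15:00+00:00"},
}
mapped = map_contact(example)
print(mapped)
```

Keeping the processing functions separate from the mapping means a new field usually only needs one new dictionary entry.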
| 79 | + |
| 80 | +## STEPS |
| 81 | + |
| 82 | +1. Make sure the `FIELD_MAPPING` is configured. |
| 83 | +1. Set the required environment variables: `TURN_TOKEN` and `RAPIDPRO_TOKEN`. |
| 84 | +1. Update the start date, end date, limit and field mapping in the `fetch_rapidpro_contacts.py` script. |
| 85 | +1. Run `python scripts/migrate_to_turn/fetch_rapidpro_contacts.py` and take note of the last modified date and the filename. |
| 86 | +1. Run `python scripts/migrate_to_turn/update_turn_contacts.py contacts-2025-01-01-2025-01-07.csv > update_turn_contacts.json` |
| 87 | +1. Use jq to check if there were any errors: `jq .response.status update_turn_contacts.json | sort | uniq -c` |
| 88 | +1. To retry errors, run `cat update_turn_contacts.json | python scripts/migrate_to_rapidpro/retry_requests.py > update_turn_contacts2.json` |
| 89 | +1. Repeat the previous two steps until all contacts have been updated successfully. |
| 90 | +1. Update the start and end date in the `fetch_rapidpro_contacts.py` script, then repeat from step 4. |
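For anyone without jq, the status check in the steps above can be approximated in Python. This sketch assumes the output file holds one JSON object per line with a `response.status` value, which is inferred from the jq command rather than confirmed; the `count_statuses` helper and the sample lines are hypothetical.

```python
import json
from collections import Counter


def count_statuses(lines):
    # Tally response status codes, skipping blank lines.
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        counts[json.loads(line)["response"]["status"]] += 1
    return counts


# Hypothetical sample lines standing in for update_turn_contacts.json.
sample = [
    '{"response": {"status": 200}}',
    '{"response": {"status": 429}}',
    '{"response": {"status": 200}}',
]

counts = count_statuses(sample)
for status, total in sorted(counts.items()):
    print(total, status)
```

To run it against a real output file, pass an open file handle instead of `sample`, for example `count_statuses(open("update_turn_contacts.json"))`.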