Description
As discussed in #828, most file-based database packages (including MontyDB in the already-implemented `MontyStore`) do not have any built-in protection against multiple Python processes (or threads) reading from and writing to the same database at the same time. This makes them fine for serial calculations but poorly suited to high-throughput settings, where the odds of a collision are very high.
Rather than relying on each external package to implement a file-locking system, we should introduce a file-locking mechanism within maggma that can be applied to all file-based data stores. py-filelock and portalocker are both good platform-agnostic options, with the former perhaps being slightly more actively maintained. There are built-in locking features in the MP `monty` package, but in my opinion we are better off using a battle-tested solution, since these are usually light on dependencies anyway (and the lock mechanism used in fireworks often caused headaches...).
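
To make the idea concrete, here is a rough sketch of what guarding a read-modify-write cycle with py-filelock could look like. It is not a proposed maggma API: `locked_update`, the JSON file standing in for the database, and the `.lock` sidecar naming are all assumptions made for illustration. Only `FileLock` and its `timeout` argument come from py-filelock itself.

```python
import json
import tempfile
from pathlib import Path

from filelock import FileLock  # third-party: pip install filelock


def locked_update(db_path, key, value, timeout=10.0):
    """Hypothetical helper: update one key in a JSON file under an exclusive lock.

    The JSON file is only a stand-in for a file-based database.
    """
    db_path = Path(db_path)
    # Assumption: the lock lives in a sidecar file next to the database file.
    lock = FileLock(str(db_path) + ".lock", timeout=timeout)
    # Other processes block here until the lock is released; filelock.Timeout
    # is raised if it cannot be acquired within `timeout` seconds.
    with lock:
        data = json.loads(db_path.read_text()) if db_path.exists() else {}
        data[key] = value
        db_path.write_text(json.dumps(data))
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "store.json"
        locked_update(path, "task_1", "done")
        print(locked_update(path, "task_2", "running"))
```

A store-level version would presumably wrap the store's connect/update/query calls in the same `with lock:` pattern, but how to scope the lock (per operation vs. per connection) is an open design question.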
I'm jotting this down so that I don't forget. I don't have plans to work on this right now, but I will likely need to implement it one day in the future.