This article will be interesting to those who often have to deal with data integration.
Assume that there is a database where users always modify data (update or remove). Perhaps, this database is used by a large application that does not allow modifying the table structure. The task is to load data from this database to another database on a different server from time to time. The simplest way to tackle the problem is to load the new data from a source database to a target database with preliminary cleaning up the target database. You can use this method as long as the time to load data is acceptable and does not exceed preset deadlines. What if it takes several days to load data? In addition, unstable communication channels lead to the situation when data load stops and restarts. If you face these obstacles, I suggest considering one of the ‘data reloading’ algorithms. It means that only data modifications occurred since the latest load are loaded.