This is a common business scenario, but it takes quite a bit of work to implement in Azure Data Factory. The goal is to look at the destination folder, find the file with the latest modified date, and then use that date as the starting point for copying new files from the source folder. I did not come up with this approach myself; unfortunately, I misplaced the link to the original post, so I cannot properly credit the author.
The details are in the video, but at a high level the steps are the following:
- Use a Get Metadata activity to list all files in the destination folder
- Use a For Each activity to iterate over this list and compare each file's modified date with the value stored in a variable
- If the file's modified date is later than the value in the variable, update the variable with that date
- Use the variable in the Copy Activity’s Filter by Last Modified field to filter out all files that have already been copied
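To make the logic of these steps easier to follow outside the Data Factory designer, here is a minimal sketch in plain Python. It is not Data Factory code and calls no Azure API: `FileInfo`, `latest_modified`, and `files_to_copy` are hypothetical names invented for illustration, and the file lists are made-up sample data. The strict "later than" comparison in the filter is also an assumption; check the Copy Activity documentation for how it treats a file whose timestamp equals the boundary.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for one file entry returned by a metadata lookup."""
    name: str
    last_modified: datetime


def latest_modified(files, default):
    """Mimic the For Each loop: keep the latest modified date seen so far.

    `default` plays the role of the variable's initial value.
    """
    watermark = default
    for f in files:
        if f.last_modified > watermark:
            watermark = f.last_modified  # the "update the variable" step
    return watermark


def files_to_copy(source_files, watermark):
    """Mimic the last-modified filter: keep only files newer than the watermark.

    Assumption: strict comparison; the real filter may treat the boundary differently.
    """
    return [f for f in source_files if f.last_modified > watermark]


if __name__ == "__main__":
    utc = timezone.utc
    start = datetime(1900, 1, 1, tzinfo=utc)  # assumed initial variable value

    # Made-up sample data for illustration only.
    destination = [
        FileInfo("a.csv", datetime(2023, 1, 1, tzinfo=utc)),
        FileInfo("b.csv", datetime(2023, 1, 3, tzinfo=utc)),
    ]
    source = [
        FileInfo("b.csv", datetime(2023, 1, 2, tzinfo=utc)),
        FileInfo("c.csv", datetime(2023, 1, 5, tzinfo=utc)),
    ]

    watermark = latest_modified(destination, start)
    for f in files_to_copy(source, watermark):
        print(f.name, f.last_modified.isoformat())
```

One thing the sketch makes visible: if the destination folder is empty, the variable keeps its initial value, so that value should be early enough for the first run to copy everything.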
The two activities in For Each:
And finally writing into the variable: