I have a user with a huge number of files, approximately 3 million, that are all very small: typically under a hundred kilobytes each. They’re all located on a remote server running Windows Server 2019. The user wants to make a regular, nightly copy of this data from the remote system to his local computer, which is running Windows 10 Pro. Now, I know the initial copy will most likely take a few days, even if I compress it before sending it over. It is, after all, a lot of files.
The challenge comes in after the initial copy is made:
How can I quickly scan the folders and all subfolders for any changes (new files, changes to existing files, deleted files) and then replicate those changes on the Win10 computer?
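To make the question concrete, here is a minimal Python sketch of one way to detect the three kinds of change. It assumes a file counts as "changed" when its size or modification time differs, which is the same kind of quick check rsync uses by default. The function names `snapshot` and `diff_snapshots` are made up for illustration, and this is not how any of the tools below work internally.

```python
import os

def snapshot(root):
    """Map each file's path (relative to root) to its (size, mtime_ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            state[os.path.relpath(full, root)] = (info.st_size, info.st_mtime_ns)
    return state

def diff_snapshots(old, new):
    """Return sorted lists of added, changed, and deleted relative paths."""
    added = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    # Assumption: a differing size or mtime means the content changed.
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, deleted
```

With 3 million files, even this metadata-only walk has to touch every file, so the scan itself is likely to be a large part of each nightly run.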
There are two independent variables that I have to figure out: the type of secure connection (the “pipe”) and the method of copying files.
Types Of Secure Connection
- VPN
- SSH
- WireGuard
Method of Copying Files
- Windows Robocopy/Xcopy
- Rsync/SCP
- Compress into one big file, then transfer, then uncompress at the other end
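The third option, compressing into one big file, can be sketched in Python with the standard library's `shutil` module. This is only an illustration under assumptions: the helper names `pack` and `unpack` are hypothetical, zip is chosen arbitrarily as the archive format, and the transfer step between the two calls is left out.

```python
import shutil

def pack(source_dir, archive_base):
    """Zip everything under source_dir; return the path of the .zip created.

    archive_base is the output path without the .zip extension. Keep it
    outside source_dir so the archive does not try to include itself.
    """
    return shutil.make_archive(archive_base, "zip", root_dir=source_dir)

def unpack(archive_path, target_dir):
    """Extract a .zip produced by pack() into target_dir."""
    shutil.unpack_archive(archive_path, target_dir, "zip")
```

The point of this approach is to replace millions of per-file transfers with a single large one. The cost is that the archive has to be rebuilt and resent in full unless only the changed files are packed.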
Note: It is true Windows Server offers a couple of great methods for making a remote duplicate of a folder: DFS Replication and Storage Replica. Unfortunately, it’s my understanding that both the source and the target machine must be running Windows Server for this, so neither will work here.
My plan is to test as many different configurations as possible, and to that end, I’ve worked out steps to follow. I will go through all these steps for each of the three types of secure connection.
1. Establish the secure connection
2. Make exact duplicates of the data in 3 separate locations on the Win 10 machine
3. In the source data, delete a file, make a new file, and change a file
4. Use Rsync to replicate data on the Win 10 machine
5. Repeat Step #4 with Robocopy, then Xcopy, and maybe with SCP, as well
6. Measure how much time it took for each method
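For the timing step, a small Python harness along these lines could run each copy command and record how long it took. It is a rough sketch: `time_command` is a made-up helper, and the Robocopy paths in the comment are placeholders, not the real server or folders.

```python
import subprocess
import sys
import time

def time_command(cmd):
    """Run cmd (a list of arguments) and return (elapsed_seconds, exit_code)."""
    start = time.perf_counter()
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, result.returncode

# Hypothetical example of a command to time (placeholder paths):
#   ["robocopy", r"\\server\share\data", r"C:\copy1", "/MIR"]
# /MIR mirrors the source tree, including deletions.

if __name__ == "__main__":
    # Harmless stand-in command so the sketch runs anywhere.
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"took {elapsed:.3f}s, exit code {code}")
```

One caveat when comparing results: Robocopy does not use 0 as its only success code. Exit codes below 8 mean the run completed without failures, so a nonzero code is not necessarily an error.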
In my next post, I’ll present the results of the VPN for each method of transferring the data. See you soon!
This is a problem I have faced before. It can gum up the best of replication methods as most are not built for tons of tiny files. Looks like you are headed down the right path. Interested in seeing the results of your tests.