I am new to Bash scripting, open to constructive criticism, and want to learn. I'm working on a Bash script to automate moving backup files to an archive server. The plan is to run the script monthly to copy backups from storage server 1 to storage server 2. These backups are generated on the first Sunday of every month, and I need to copy the full backups from storage server 1 to storage server 2 on the following Monday.
I want to loop through my directory of backup files and find the ones that were created most recently and have the file extension I'm looking for. The .vbi extension is an incremental backup and I don't care about copying these. I only want to copy the most recently created .vbk files, which are full backups. There should only ever be 2 files in the parent directory whose names match apart from the timestamp and the random 4-character section.
The last 5 characters before the file extension don't matter for my purposes (I'm not really sure what they represent), and the last 22 characters in the filename before the .vbk are the section that differs between files. To clarify, the filename is ('server name' - 'Server IP' D yyyy-mm-dd T hhmmss _ xxxx). For files with a matching ('server name' - 'Server IP'), I want to compare the time section (D yyyy-mm-dd T hhmmss) and keep the newest one.
I have most of this figured out, but I'm struggling with this one piece. This is an example of what the directory looks like:
-rw-r--r-- 1 root root 0 Jul 1 10:20 'Webserver - 10.10.0.60D2023-07-01T003026_u153.vbk'
-rw-r--r-- 1 root root 0 Jul 8 08:32 'WebServer - 10.10.0.60D2023-07-08T002832_g842.vbk'
-rw-r--r-- 1 root root 0 Jul 8 07:23 'WebServer - 10.10.0.60D2023-07-08T023216_f264.vbi'
-rw-r--r-- 1 root root 0 Jul 1 10:10 'SQLServer - 10.10.0.4D2023-07-01T021049_8fj3.vbk'
-rw-r--r-- 1 root root 0 Jul 8 05:20 'SQLServer - 10.10.0.4D2023-07-08T012046_k860.vbk'
-rw-r--r-- 1 root root 0 Jul 8 11:04 'SQLServer - 10.10.0.4D2023-07-08T042046_9ju7.vbi'
I want to grab the files on line 2 and line 5 because they are the most recently created backups, and have the .vbk extension.
I can get a list of just the .vbk files already by running this.
for i in *.vbk; do
    [ -f "$i" ] || break
    echo "$i"
done
and I get this list
'Webserver - 10.10.0.60D2023-07-01T003026_u153.vbk'
'WebServer - 10.10.0.60D2023-07-08T002832_g842.vbk'
'SQLServer - 10.10.0.4D2023-07-01T021049_8fj3.vbk'
'SQLServer - 10.10.0.4D2023-07-08T012046_k860.vbk'
How can I loop through this list and create a list of only the 2 newest backups, given that the _xxxx at the end of the name appears to be random? In this example I want to grab lines 2 and 4. I can compare the timestamps in the file names, or I can compare the file system times. I believe either will work.
This command will extract the date and time of your two latest files:
find ...: lists all files named *.vbk
sed ...: extracts the portion between the D and the _. This is where you have your date and time information.
sort: sorts numerically. You are lucky: the file naming convention used by whatever produces these files has the date and time ordered so that a simple sort works.
tail: keeps only the last 2 lines
The result of this command is the following:
You can then use a while loop to list files:
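As a rough sketch of the find, sed, sort, tail pipeline described above, followed by a while loop over its output, something like the following could work. The function names `latest_stamps` and `latest_backups` are hypothetical, the exact `sed` expression is an assumption based on the filename layout given in the question, and `-maxdepth` assumes GNU or BSD `find`. The `< <(...)` process substitution requires Bash.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes names of the form
#   '<server> - <ip>D<yyyy-mm-dd>T<hhmmss>_<xxxx>.vbk'

# Print the date-time stamps of the two most recent .vbk files in directory $1.
latest_stamps() {
    find "$1" -maxdepth 1 -type f -name '*.vbk' |
        # Keep only the part between the 'D' and the final '_'.
        sed -E 's/.*D([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{6})_[^_]*\.vbk$/\1/' |
        # Fixed-width yyyy-mm-ddThhmmss stamps sort chronologically as plain text.
        sort |
        tail -n 2
}

# Print the .vbk files in directory $1 whose names carry one of those stamps.
latest_backups() {
    local dir=$1 stamp file
    while IFS= read -r stamp; do
        for file in "$dir"/*"D${stamp}_"*.vbk; do
            [ -f "$file" ] || continue   # skip the literal pattern if nothing matched
            printf '%s\n' "$file"
        done
    done < <(latest_stamps "$dir")
}
```

Calling `latest_backups /path/to/backups` (a placeholder path) prints one file per line, so the output can be fed to a copy command in another `while IFS= read -r` loop. Note that `tail -n 2` keeps the two newest stamps overall, which fits the example with two servers; with more servers, or servers backed up on different days, the files would need to be grouped by the 'server name' - 'Server IP' prefix first.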