
using vim and the command line to back up and remove a lot of directories

Was just working on some server cleanup for a client and thought that this might be a handy tip for anyone out there in a similar situation...

Here's the problem, I have a directory on a server that has lots (i.e. thousands) of subdirectories. I need to copy those up to an S3 backup bucket - organized by year and month - and then delete them.

The tricky bit is they're not organized by month/year right now. It's just a giant mess of subdirectories without much of a coherent naming strategy.

So, here's what I've come up with so far.

First, I generate a text file listing all the directories, filtering through grep for the year I'm interested in:

ls -ltr | grep 2019 > list.txt

This gives me a list of all directory entries that have '2019' somewhere in the line. Since 'ls -l' includes the modification date, that should catch everything modified in 2019 (keeping in mind that ls only prints the year for entries more than about six months old; newer ones show a time instead).

Next, I open that file in vim, copy out just the lines for the month I'm backing up, and use a regex to strip the leading part of each line (the permissions, owner, size, date, and so on):

:% s/.*2019 //g
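For example, a line like this (the names here are hypothetical):

drwxr-xr-x 2 www-data www-data 4096 Apr 12 2019 client-uploads-0412

becomes just:

client-uploads-0412

(Since .* is greedy, the match runs through the last occurrence of '2019 ' on the line, which is fine as long as no directory name itself contains '2019' followed by a space.)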

So now I have a list of just the directory names, one per line.

Then I use another regex to create an aws cli command on each line, which copies the directory up to the bucket. The aws cli command looks something like this

aws s3 cp --recursive [directory] s3://[bucket]/2019/04/[directory]
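So for a hypothetical directory named client-uploads-0412 going into a hypothetical bucket named my-backup-bucket, one finished line would read:

aws s3 cp --recursive client-uploads-0412 s3://my-backup-bucket/2019/04/client-uploads-0412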

The vim regex looks like this, with the captured group \1 standing in for each [directory] placeholder above:

:% s/\(.*\)/[s3 command from above]\/\1/g
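Spelled out with the hypothetical bucket name, and using # as the substitution delimiter so the slashes in the URL don't need escaping, that's:

:%s#\(.*\)#aws s3 cp --recursive \1 s3://my-backup-bucket/2019/04/\1#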

Once that's done, I have a big old text file of aws cli commands. I quit out of vim, make that file executable, and run it on the command line.
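Assuming the file is still named list.txt, that's just:

chmod +x list.txt
./list.txt

(Or skip the chmod and run it with 'bash list.txt'.)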

To remove the directories after the command finishes, I basically use the exact same process - except that the last substitution generates the appropriate rm -rf command - and execute that file.
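Starting again from the list of bare directory names, that substitution might look like:

:%s#\(.*\)#rm -rf \1#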

I'm sure there are better/smoother processes, but this one works pretty well for my needs. One thing I need to remember to do next time I start a long-running command like this is to run it in a screen session in case of connection issues. That's a post for another day!
