CentOS: how to deal with a folder containing a very large number of files

 

So, you’ve left that cron job running for a couple of months and now you have several hundred thousand files in one folder that you would love to delete, but you cannot even list them?

The situation is rather trivial if you can just delete the folder with all its contents, but what do you do if there are other files in the folder that shouldn’t be removed?

First of all, it is good to know how many files you have in the folder. To check how many files are in a specific folder on CentOS, first navigate to the folder

cd /yourdir

And then issue the following command
ls | wc -l

If you expect to have 200K-500K files there, it will take quite a while to complete, so do not hurry to interrupt it with Ctrl-C but rather leave it running and go for a coffee 🙂
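
Plain ls sorts the names before printing them, so skipping the sort can make the count cheaper. The sketch below is only an illustration under assumptions: it builds a throwaway folder with mktemp (the names demo_dir, unwantedfile1, unwantedfile2 and keepme.txt are made up for the demo) and counts its entries two ways. ls -f does not sort but includes the . and .. entries, while find is limited here to regular files in the top level of the folder.

```shell
# Work in a throwaway folder so nothing real is touched (demo names are hypothetical).
demo_dir=$(mktemp -d) || exit 1
touch "$demo_dir/unwantedfile1" "$demo_dir/unwantedfile2" "$demo_dir/keepme.txt"

# ls -f does not sort; it also implies -a, so . and .. are part of the count.
unsorted_count=$(ls -f "$demo_dir" | wc -l)

# find counts regular files only and does not descend into subfolders.
file_count=$(find "$demo_dir" -maxdepth 1 -type f | wc -l)

echo "ls -f entries: $unsorted_count, regular files: $file_count"
rm -rf "$demo_dir"
```

In the real folder you would run only the ls -f or the find line after cd /yourdir.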

Once you’ve got the output, you have an exact idea of the volume of files you are dealing with. It is great if you know the names of the files (or at least the pattern of the names), but that might not be the case.

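Issuing
ls -la
will just flood your console and will take the whole lunch break to complete.
You might want to list just the first 100 or so entries in the folder
ls -la | head -100
will display the first 100 lines of the listing in alphabetical order (the first few lines are the "total" line and the . and .. entries, so you get slightly fewer than 100 files). This command will take a while as well if you have a very large volume of files in the folder, because ls has to read and sort the whole directory before it prints anything.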
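
If alphabetical order does not matter and you only want a sample of the names, find can be piped into head instead. This is a sketch under assumptions, not a benchmark: it fills a throwaway folder (demo_dir and the unwantedfileN names are invented for the demo) and grabs 20 names. find prints names as it reads them and head ends the pipeline after 20 lines, so on a huge folder this usually returns much sooner than a sorted listing.

```shell
# Throwaway folder with 50 hypothetical files.
demo_dir=$(mktemp -d) || exit 1
i=1
while [ "$i" -le 50 ]; do
    touch "$demo_dir/unwantedfile$i"
    i=$((i + 1))
done

# No sorting: take whichever 20 names find reports first.
sample=$(find "$demo_dir" -maxdepth 1 -type f | head -n 20)
sample_count=$(printf '%s\n' "$sample" | wc -l)

echo "$sample"
rm -rf "$demo_dir"
```

The names come back in directory order, which is enough to spot a naming pattern.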

Now once you know the names of the files that need to be deleted it is as easy as
rm -f unwantedfile*
Right? Well, it might be if you do not stumble into
/bin/rm: Argument list too long
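
The error appears because the shell expands unwantedfile* into one argument per matching file, and the system limits how many bytes of arguments a single command can receive. As a rough sketch (demo_dir and the file names are made up, and the real limit also depends on the size of your environment), you can compare the limit reported by getconf with the size of the expanded pattern. printf is a shell builtin, so it can expand the pattern without hitting the limit itself.

```shell
# Throwaway folder with three hypothetical files.
demo_dir=$(mktemp -d) || exit 1
touch "$demo_dir/unwantedfile1" "$demo_dir/unwantedfile2" "$demo_dir/unwantedfile3"

# Upper bound, in bytes, on arguments plus environment for a new process.
arg_max=$(getconf ARG_MAX)

# Bytes the expanded pattern would occupy, one NUL terminator per name.
glob_bytes=$(cd "$demo_dir" && printf '%s\0' unwantedfile* | wc -c)

echo "pattern expands to $glob_bytes bytes, ARG_MAX is $arg_max bytes"
rm -rf "$demo_dir"
```

When the first number approaches the second, rm with that pattern is likely to fail.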


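The shell expands the pattern into one argument per matching file before rm even starts, and the resulting list is too long to pass to a single command. So you now have to either come up with patterns that each cover a smaller portion of the files and repeat the command a couple of hundred times, or use the “find” command to delete everything it finds. Make sure you are still in the folder where your files are located and use the following command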

find . -name 'unwantedfile*' -print0 | xargs -0 rm -f
or
find /path/to/files/ -name "unwantedfile*" -delete

Though the command does take a while to complete, it will clean up all the files that match your pattern in one go. Note that find also descends into subfolders, so matching files there are removed too.
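
Before deleting anything, it may be worth running the same find expression with -print to see what it matches. The sketch below assumes a throwaway folder (demo_dir, unwantedfile1, unwantedfile2 and keepme.txt are hypothetical) and adds -maxdepth 1 and -type f so that only regular files in the top level are touched. find evaluates its expression from left to right, so -delete has to stay at the end, after the tests.

```shell
# Throwaway folder: two files to remove, one to keep.
demo_dir=$(mktemp -d) || exit 1
touch "$demo_dir/unwantedfile1" "$demo_dir/unwantedfile2" "$demo_dir/keepme.txt"

# Dry run: only print what the expression matches.
find "$demo_dir" -maxdepth 1 -type f -name 'unwantedfile*' -print

# Same expression with -delete last, so only the matched files are removed.
find "$demo_dir" -maxdepth 1 -type f -name 'unwantedfile*' -delete

remaining=$(ls "$demo_dir")
echo "left in folder: $remaining"
rm -rf "$demo_dir"
```

On the real folder, replace "$demo_dir" with . or the full path, and only switch -print to -delete once the dry run shows what you expect.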

Make sure you take a look at this blog post http://blog.ergatides.com/2012/01/09/be-careful-when-using-find-with-the-delete-flag-to-clean-old-files/ for a warning about the -delete flag of the “find” command.