Counting “Things” with grep

Niole Nelson
Niole Net
Published in
3 min read · Nov 16, 2021


Photo by Towfiqu barbhuiya on Unsplash

Counting is really useful. It’s usually faster to count things and return a number than to return all-the-things. Sometimes there are so many things that your baby bird brain can’t comprehend them all.

Count lines

Ever had a BIG file of “things” and you didn’t want to open it because it would probably crash your computer? Counting lines is very helpful here.

cat big_file.txt | grep -c "$" # count the ends of lines
cat big_file.txt | grep -c "^" # count the starts of lines

The next example also counts lines, but for some reason I trust it less than the regex examples…

cat big_file.txt | grep -c ""
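One way to build trust in the empty pattern is to run all three on a small throwaway file. This is a minimal sketch: the scratch file and its contents are made up for illustration, and the last line deliberately has no trailing newline so that wc -l can be compared too.

```shell
# Scratch file with three lines; the last one has no trailing newline.
tmp=$(mktemp)
printf 'one\ntwo\nthree' > "$tmp"

ends=$(grep -c '$' "$tmp")     # count the ends of lines
starts=$(grep -c '^' "$tmp")   # count the starts of lines
empty=$(grep -c '' "$tmp")     # the empty pattern matches every line
newlines=$(wc -l < "$tmp")     # wc -l counts newline characters instead

echo "ends=$ends starts=$starts empty=$empty wc=$newlines"
rm -f "$tmp"
```

The three grep counts should agree with each other. wc -l counts newline characters rather than lines, so it comes out one lower on a file like this.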

Count numbers

Have you ever logged how long something took to execute? Have you ever done that for lots of things, trying to narrow down, for example, high-latency bottlenecks in a large codebase? Latency problems can be caused by large groups of little latencies, or by a big latency here and there. It helps to know 1. how many little latencies there are, and 2. whether there are any big latencies.

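Let’s look for latencies in the 10–99 ms range. The -w flag tells grep to match whole words only. Without it, a two-digit pattern also matches lines containing 100 or 1000, because those contain two digits in a row…

cat logs.txt | grep -cwE "[1-9][0-9]"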

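You can also ask questions like, “how many of these numbers are less than 20?”. The following regex matches numbers that “may or may not start with a 1, and are followed by a single digit”. Two details matter: grep -E doesn’t understand \d, so use [0-9], and 1? allows at most one leading 1 (1* would allow any number of them). The -w flag keeps the match to whole numbers.

cat logs.txt | grep -cwE "1?[0-9]"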
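A pattern with no boundaries matches any line that merely contains the digits, so it’s worth checking a pattern against a few known values first. The sketch below assumes a log with one latency per line in a made-up latency_ms=N format, and compares an unanchored pattern with whole-word (-w) patterns.

```shell
# Hypothetical log: one latency (in ms) per line.
log=$(mktemp)
printf 'latency_ms=7\nlatency_ms=15\nlatency_ms=42\nlatency_ms=250\nlatency_ms=1300\n' > "$log"

# Unanchored: counts every line that contains two consecutive digits.
loose=$(grep -cE '[0-9]{2}' "$log")

# -w requires the match to be a whole word, so only 10-99 qualifies.
two_digit=$(grep -cwE '[1-9][0-9]' "$log")

# Optional leading 1, then one digit: 0-19.
under_20=$(grep -cwE '1?[0-9]' "$log")

echo "loose=$loose two_digit=$two_digit under_20=$under_20"
rm -f "$log"
```

On this sample the unanchored pattern also counts 250 and 1300, while the -w patterns count only the values actually in range.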

Compile a summary of the distribution of latencies for later analysis. The -w flag (match whole words only) stops each bucket from also counting the larger numbers.

echo "tiny: $(cat logs.txt | grep -cwE "[0-9]")" >> report.txt
echo "small: $(cat logs.txt | grep -cwE "[1-9][0-9]")" >> report.txt
echo "problematic: $(cat logs.txt | grep -cwE "[1-9][0-9]{2}")" >> report.txt
echo "very problematic: $(cat logs.txt | grep -cwE "[1-9][0-9]{3}")" >> report.txt
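If the list of buckets grows, the repetition can be folded into a small helper. This is only a sketch: the bucket function, the scratch files, and the "took N" log format are all made up for illustration, and it assumes the latency is the only number on each line.

```shell
# Hypothetical log: one latency (in ms) per line.
log=$(mktemp)
report=$(mktemp)
printf 'took 3\ntook 8\ntook 57\ntook 412\ntook 2048\n' > "$log"

# Append "label: count", counting lines with a whole-word match of the pattern.
bucket() {
  echo "$1: $(grep -cwE "$2" "$log")" >> "$report"
}

bucket "tiny" "[0-9]"
bucket "small" "[1-9][0-9]"
bucket "problematic" "[1-9][0-9]{2}"
bucket "very problematic" "[1-9][0-9]{3}"

summary=$(cat "$report")
echo "$summary"
rm -f "$log" "$report"
```

Each call appends one line to the report, so adding another range is a single extra bucket line.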

Count files

Have you ever wanted to know how many files are in a directory? Maybe you’re waiting for a process which writes files to a directory to complete and you want to know how far along the process is. Getting the number of files in that directory would be very helpful. Turns out, file names are actually just words to grep and you can count them as such.

ls /my/dir | grep -c "\w"

Count a specific type of file. Say you have a bunch of files that are named after the date and you need all of the files from the 5th to the 7th of November.

ls /my/dir | grep -c "^2021-11-0[5-7].*\.txt$"
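To see what a date pattern like this really counts, try it on a scratch directory first. The file names below are invented for illustration. One of them ends in "_txt" rather than ".txt", which shows why the dot before the extension needs escaping.

```shell
# Scratch directory with made-up, date-named files.
dir=$(mktemp -d)
touch "$dir/2021-11-04-a.txt" "$dir/2021-11-05-a.txt" "$dir/2021-11-06-b.txt" \
      "$dir/2021-11-07-c.txt" "$dir/2021-11-07-notes_txt" "$dir/2021-11-08-a.txt"

# Every file name is one line of ls output.
all=$(ls "$dir" | grep -c "")

# \. matches a literal dot; a bare . would also accept "notes_txt".
in_range=$(ls "$dir" | grep -c '^2021-11-0[5-7].*\.txt$')

echo "all=$all in_range=$in_range"
rm -rf "$dir"
```

Only the .txt files dated the 5th through the 7th are counted. The files from the 4th and the 8th fall outside the range, and the "_txt" file is rejected by the escaped dot.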

You can get a little crazy by throwing find into the mix…

Have you ever wanted to know how big a file system is? Maybe you’re troubleshooting something that processes files and you’re analyzing a particularly problematic directory that someone has loaded into your app.

cd big_dir && find . | grep -c "$"

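find lists out every entry in the file tree (twenty-five thousand of them in my case), directories and "." itself included. Use find . -type f if you only want regular files. We then count them by piping the output through grep with the “count lines” regex. This exits surprisingly quickly.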
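The difference between counting every entry and counting only regular files is easy to see on a tiny tree. The directory layout below is made up for illustration.

```shell
# Scratch tree: two subdirectories, three regular files.
root=$(mktemp -d)
mkdir -p "$root/a/b"
touch "$root/top.txt" "$root/a/one.txt" "$root/a/b/two.txt"

# Every entry, including "." and the two subdirectories.
entries=$(cd "$root" && find . | grep -c '$')

# Regular files only.
files=$(cd "$root" && find . -type f | grep -c '$')

echo "entries=$entries files=$files"
rm -rf "$root"
```

Here plain find reports the three files plus ".", "./a" and "./a/b", while -type f reports just the files.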
