First, the video, then the discussion:
Sean claims that ‘cat’ is short for ‘concatenate’ … which is what I always thought (I’m sure ‘cat’ is ‘concatenate’ in at least one relatively common computer language other than bash).
If you man cat you do indeed get a statement that says “cat – concatenate files and print on the standard output.”
It has become fashionable over the last few years for shell programmers to eschew cat. It is often the case that using cat is redundant with some other way of doing something which is seen as better for some reason, but that reason is often rather obscure or irrelevant. Like, “you are using six more processing cycles” or whatever … on your 2.3 gigahertz dual quad core computer that was otherwise mainly just sitting there handling mouse input.
So, the coolest thing about cat is that you can use it to take over the world. cat is the world’s simplest text editor. Type cat with a redirect to a file, like Sean demonstrated, and then write a C++ program that hacks the World Bank, and you’re there.
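Here is a minimal sketch of the cat-as-text-editor trick, minus the World Bank part. The file name hello.txt is made up for illustration, and a here-document stands in for typing at the keyboard so the example can be pasted as-is. Interactively you would type your lines and press Ctrl-D to finish.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)

# "cat > file" copies standard input into the file.
# Typed by hand, you would end the input with Ctrl-D;
# here a here-document supplies the input instead.
cat > "$workdir/hello.txt" <<'EOF'
first line
second line
EOF

# Read the file back to check what was written.
written=$(cat "$workdir/hello.txt")
echo "$written"
```

Keep in mind that > truncates the target file first, so this "editor" has no undo.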
cat has a lot of options, such as -n, which numbers the output lines, and -b, which numbers only the non-blank lines.
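A quick way to see the difference between the two options, assuming a GNU or BSD cat (neither flag is required by POSIX). The three-line sample file is invented for the demonstration.

```shell
# Build a three-line sample with an empty line in the middle.
sample=$(mktemp)
printf 'alpha\n\nbeta\n' > "$sample"

# -n numbers every line, including the empty one.
numbered_all=$(cat -n "$sample")
echo "$numbered_all"

# -b numbers only the non-empty lines and leaves the empty one unnumbered.
numbered_nonblank=$(cat -b "$sample")
echo "$numbered_nonblank"
```

With -n the last line should come out as number 3, and with -b as number 2.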
Personally, I like to use cat as a feeding device for some other part of a script. I use cat to produce output from a file of test data, and then I mess with that output until I get the results I’m looking for, and then I work upstream and replace the cat command with some other code (which is supposed to produce the output that the test data mimicked).
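Roughly, that workflow looks like the sketch below. Everything in it is a stand-in: the test data, the sort | uniq -c stage, and especially produce_real_data, which is a made-up placeholder for whatever upstream code eventually replaces cat.

```shell
# Hypothetical test data standing in for the output of some upstream command.
testdata=$(mktemp)
printf 'pear\napple\npear\n' > "$testdata"

# Step 1: develop the downstream part of the pipeline against the canned data.
from_cat=$(cat "$testdata" | sort | uniq -c)

# Step 2: once the downstream part works, swap cat for the real producer.
# produce_real_data is a made-up placeholder for that upstream code.
produce_real_data() {
    printf 'pear\napple\npear\n'
}
from_real=$(produce_real_data | sort | uniq -c)

# Both runs should print the same counts.
echo "$from_cat"
echo "$from_real"
```

The nice part is that the swap touches only the first stage of the pipeline; everything after the first pipe stays exactly as it was tested.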
I know, I know. I’m wasting cycles. But I have extra cycles, I promise.
No mention of ‘tac’?
tac – concatenate and print files in reverse
Definitely a why-the-hell-would-anyone-need-that-command that turns out to be a lifesaver once or twice in a career.
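One case where it earns its keep is reading a log newest-first. A small sketch, assuming GNU tac is installed (it is part of GNU coreutils, not POSIX, so some systems lack it); the log contents are made up.

```shell
# A tiny made-up log, oldest entry first.
log=$(mktemp)
printf 'entry 1\nentry 2\nentry 3\n' > "$log"

# tac prints the lines last to first, so the newest entry comes out on top.
reversed=$(tac "$log")
echo "$reversed"
```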
Somehow, Sean neglected what seems to me to be another obvious use of cat. With the “double greater-than” operator (>>), cat will append text to an existing file. I use this about once a week.
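For anyone who has not tried it, here is a minimal sketch. The file and its contents are invented, and a here-document replaces the typing you would normally do before pressing Ctrl-D.

```shell
# Start with a file that already has one line in it.
notes=$(mktemp)
printf 'existing line\n' > "$notes"

# ">>" appends to the file instead of truncating it the way ">" does.
cat >> "$notes" <<'EOF'
appended line
EOF

# The original line is still there, followed by the new one.
result=$(cat "$notes")
echo "$result"
```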
I have yet to use cat to write a program of any kind, but the World Bank hack does sound like an interesting challenge.
While cat certainly is a very useful tool, you should remember that it’s essentially bound to the drive IO and not the CPU. A useless use of cat (UUOC) pushes the file’s contents through an extra process and a pipe for no benefit, and the performance degradation can’t be overcome with more cores or faster clock speeds.
Consider:
$ cat file | grep foo
versus:
$ grep foo file
This can seem pretty trivial with tiny files, but if the file is large or the storage is slow the effects can be extremely painful.
I ran the following unscientific benchmark using a 7MB text file:
$ time grep foo foo.csv > /dev/null
real 0m0.172s
user 0m0.093s
sys 0m0.046s
$ time cat foo.csv | grep foo > /dev/null
real 0m4.238s
user 0m3.951s
sys 0m0.264s
Imagine if that file’s size was greater than a GB!
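If you want to try the comparison yourself, the sketch below sets up a throwaway file and runs the forms discussed above, plus plain input redirection as a third option. The generated contents are made up, and the sketch only checks that all three forms find the same matches; it does not measure anything.

```shell
# Generate a made-up sample file: 1000 lines containing "foo", 1000 without.
data=$(mktemp)
i=0
while [ "$i" -lt 1000 ]; do
    echo "row $i foo"
    echo "row $i bar"
    i=$((i + 1))
done > "$data"

# The piped form: cat reads the file and grep reads from the pipe.
via_cat=$(cat "$data" | grep -c foo)

# The direct form: grep opens the file itself.
direct=$(grep -c foo "$data")

# Input redirection also avoids the extra cat process.
redirected=$(grep -c foo < "$data")

# All three counts should agree.
echo "$via_cat $direct $redirected"
```

To get timings, put bash's time keyword in front of each form, as in the benchmark above. Expect the numbers to vary a lot with file size, storage speed, and whether the file is already in the disk cache.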