Monday, February 3

Linux Fu: The Linux Shuffle

Computers are known to be precise and — usually — repeatable. That’s why it is so hard to get something that seems random out of them. Yet random things are great for games, encryption, and multimedia. Who wants the same order of a playlist or slide show every time?

It is very hard to get truly random numbers, but for a lot of cases, it isn’t that important. Even better, if you are programming or using a scripting language, there are lots of things you can use to get a degree of randomness that is sufficient for many purposes.

The Root of Random

In your device directory are two quasi-files you might not have noticed before. The /dev/random and /dev/urandom files will output as many random bytes as you might want to read. Why are there two? The kernel grabs noisy data from different places. For example, it might read crypto hardware or measure time intervals between disk accesses. These numbers are not easy to predict and make a good source of difficult-to-guess numbers. However, for a certain number of random bits you need a certain amount of random noise. Traditionally, the /dev/random device file fills with these environmental random bits, and if it needs more random measurements to complete the request, it will block until it gets them. The /dev/urandom file, on the other hand, will provide an “unlimited” number of bytes; it works by periodically re-seeding a pseudo-random number generator with environmental randomness. (Since Linux 5.6, /dev/random only blocks until the kernel’s generator has been initialized, so on recent systems the two behave almost identically.)

If you program in any normal language, it is easy to just open either of these files and read the number of bytes you want. In normal shell scripting, it is easy, too. For example:

head -c 3 /dev/random | od -t x1 -A n

This command will give you three hex bytes. If you prefer, you could change the x1 to u1 to get unsigned decimal numbers, or to any other format od supports.
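
If you want a single number rather than a string of bytes, the same idea stretches a little further. Here is a minimal sketch, assuming a POSIX-style od and reading from /dev/urandom; the function name rand16 is made up for this example:

```shell
# rand16 (hypothetical helper): read 2 bytes from /dev/urandom and print
# them as one unsigned decimal number in the range 0..65535.
rand16() {
    # -A n  : no address column
    # -N 2  : read exactly two bytes
    # -t u2 : format them as one 2-byte unsigned decimal
    od -A n -N 2 -t u2 /dev/urandom | tr -dc '0-9'
}

# Squeeze the value into 1..100. The modulo makes low results very
# slightly more likely, because 65536 is not a multiple of 100.
n=$(( $(rand16) % 100 + 1 ))
echo "$n"
```

The slight bias from the modulo is fine for picking a slide or a song, but it is one reason not to use a trick like this for anything security related.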

Better Shell

Of course, the shell knows you want to do this. Bash keeps $RANDOM updated and you can read from it if you prefer:

for i in {1..5}
do echo $RANDOM
done

This will give you five different random numbers each time you run it. Each one is an integer between 0 and 32767.
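
Most of the time you want a number in a smaller range than the full 0 to 32767 that Bash hands out. A minimal sketch, assuming Bash, with a made-up function name roll:

```shell
# roll (hypothetical helper): print a number from 1 to the given number
# of sides, using Bash's $RANDOM (which ranges from 0 to 32767).
roll() {
    local sides=$1
    # The modulo is slightly biased unless sides divides 32768 evenly.
    echo $(( RANDOM % sides + 1 ))
}

roll 6      # one roll of a six-sided die

# Assigning to RANDOM seeds the generator, so the same seed replays
# the same sequence. Handy when you need to reproduce a test run.
RANDOM=42
first=$RANDOM
RANDOM=42
second=$RANDOM
[ "$first" = "$second" ] && echo "same seed, same number"
```

The seeding trick is also a reminder that $RANDOM is a pseudo-random generator, not a source of secrets.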

Better Still

This is easy, but we can still do better. After all, suppose you have a bunch of sayings in a file, one per line. Even with a random number, you’d need to skip the lines and worry about how many lines are in the file total. There’s a better way: the shuf command.

This command seems simple at first but is actually quite powerful. The bare command reads a file, or standard input, and permutes it based on a random number. There are options to feed it your own source of random numbers if you care.
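
As an illustration of feeding shuf your own random source, GNU shuf accepts a --random-source=FILE option. The sketch below deliberately uses a file full of zero bytes, which is about as non-random as it gets, just to show that the same source yields the same order. The variable name seedfile is an assumption for this example:

```shell
# Build a fixed, deliberately non-random source so the shuffle repeats.
# seedfile is a hypothetical temporary file used only for this demo.
seedfile=$(mktemp)
head -c 4096 /dev/zero > "$seedfile"

# Shuffle the numbers 1..10 twice with the same source.
first=$(shuf -i 1-10 --random-source="$seedfile")
second=$(shuf -i 1-10 --random-source="$seedfile")

if [ "$first" = "$second" ]; then
    echo "same order both times"
fi

rm -f "$seedfile"
```

A repeatable shuffle like this can be useful for testing a script. For real use, leave the option off and let shuf pick its own source.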

Sometimes you don’t want all the items in the file. For example, you might want to pick a single quote from a file, or just the next random song. The -n option limits the output to at most the number of lines you specify. If you want to shuffle numbers instead of lines, you can use the -i option to give a range. For example:

shuf -n 1 -i1-10

This command will give you a single random number between 1 and 10, inclusive. Very easy!

Back to picking a random quote from a file, that’s as easy as:

shuf -n1 input_file.txt

Combined with a list of files, this can pick random files easily, too:

ls *.mp3 | shuf -n 1
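
Piping ls into shuf works for ordinary file names, but shuf can also take its items directly as arguments with the -e option, which skips ls altogether. A small sketch, where pick_one is a made-up helper name and the file names are placeholders:

```shell
# pick_one (hypothetical helper): print one of its arguments at random.
# -e treats each argument as an input line; -- ends option parsing so
# names that start with a dash are not mistaken for options.
pick_one() {
    shuf -n 1 -e -- "$@"
}

# Placeholder names for the demo. In a real music directory you would
# write: pick_one *.mp3
pick_one "first song.mp3" "second song.mp3" "third.mp3"
```

Because the shell expands the glob into separate arguments, names with spaces arrive at shuf intact.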

When to Choose Which

Note that the shuf command is part of the GNU Core Utilities, so some machines won’t have it. On BSD, the jot command is somewhat similar. For a more portable script, it would probably be wise to check that shuf exists, then look for jot, and if you find neither, see whether $RANDOM changes between reads. You could process the raw number with awk. Absent that, you could check for /dev/urandom and /dev/random, which would also require some processing.
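
That fallback chain might look something like the sketch below. The function name rand_between is made up, it assumes non-negative bounds, and it uses shell arithmetic rather than awk to scale the raw numbers:

```shell
# rand_between LOW HIGH (hypothetical helper): print one integer in
# LOW..HIGH inclusive, trying shuf, jot, $RANDOM and /dev/urandom in turn.
rand_between() {
    low=$1
    high=$2
    span=$(( high - low + 1 ))

    if command -v shuf >/dev/null 2>&1; then
        shuf -n 1 -i "$low-$high"
    elif command -v jot >/dev/null 2>&1; then
        # BSD jot: -r for random, 1 repetition, between low and high
        jot -r 1 "$low" "$high"
    elif [ "${RANDOM:-}" != "${RANDOM:-}" ]; then
        # Two reads differ, so this shell really updates $RANDOM
        echo $(( RANDOM % span + low ))
    elif [ -r /dev/urandom ]; then
        # Two raw bytes as an unsigned decimal, then scaled to the range
        raw=$(od -A n -N 2 -t u2 /dev/urandom | tr -dc '0-9')
        echo $(( raw % span + low ))
    else
        echo "rand_between: no random source found" >&2
        return 1
    fi
}

rand_between 1 10
```

The last two branches use a modulo, so they are slightly biased and only cover ranges narrower than the raw value (32768 for $RANDOM, 65536 for the two bytes). That is good enough for a playlist, but not for anything that matters.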

With these tools, you can write delightfully unpredictable scripts. (Of course, some of our scripts are less than delightfully unpredictable, too. But we can’t blame /dev/urandom for that.)
