
Generating consistent "random" numbers

Posted by Steve on Wed 21 Feb 2007 at 13:19

Generating random numbers on a collection of machines can be a useful way of ensuring they don't all access a particular resource at the same time (for example, backup jobs writing to a central NFS server). However, truly random numbers make the schedule unpredictable - a machine-specific delay can be a better solution.

When you have many services each trying to talk to a central resource on a fixed schedule you run the risk of overwhelming your resource - especially if the client machines all have synchronised clocks.

The obvious solution is to stagger the jobs which cause this problem, perhaps with something like this:

#!/bin/bash
#
#  Run our backup.  Called by cron at midnight..
#
#

#
#  Sleep a random amount, less than ten minutes
#
sleep $(($RANDOM % 600))

#
#  Backup ..
#
rsync ...

The problem with this is that you have no way of knowing, on the server side, which machines are going to back up when - you just hope they'll all manage to run at different times. (Such is the way of random numbers - all of your machines could sleep for 2 seconds!)

Instead, the perfect solution, short of manually updating the cron times, would be for each machine to sleep for a consistent amount of time which differs from that of every other machine.

So, for example, machine "mine" would always sleep for 90 seconds, and machine "yours" would always sleep for 200. How can we do this in a simple fashion? By using hostid.

hostid, contained in the coreutils package, will return a fixed hexadecimal identifier for the current host.

This can be transformed easily into an integer for later use.

As an example the machine mine:

skx@mine:~$ hostid
a8c01401

The machine yours returns something different:

skx@yours:~$ hostid
a8c02801

The output of hostid will be some hex digits which we will need to convert into an integer.

Here we'll do that in a two-step process: first upper-casing any alphabetical characters (bc only accepts upper-case hex digits), then invoking bc to convert from hex to decimal.

#!/bin/sh
#
# convert hex digits to uppercase.
number=`hostid | sed -e 's:^0[bBxX]::' | tr '[a-f]' '[A-F]'`

# convert hex to decimal
dec=`echo "ibase=16; $number" | bc`

# sleep a host-specific amount of time, less than 10 minutes.
delay=$(($dec % 600))
echo "Sleeping for $delay seconds"
sleep $delay

# do the backup here..

This will give a consistent delay for each host, and will ensure that all your hosts use a different time - without hard-coding anything!
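
The same calculation can also be done without sed, tr or bc, since POSIX shell arithmetic accepts hexadecimal constants. The following is only a minimal sketch, not a replacement for the script above: host_delay is a hypothetical helper name, and it assumes that hostid prints plain hex digits (optionally prefixed with 0x) and that the shell does its arithmetic in more than 32 bits, as bash does.

```shell
#!/bin/sh
# Sketch: derive a per-host delay using only shell arithmetic.
# host_delay is a hypothetical helper name, not part of the original script.

# Usage: host_delay HEXID [WINDOW]
# Prints HEXID (hex digits, optional 0x prefix) modulo WINDOW seconds.
host_delay() {
    hex=${1#0[xX]}        # drop a leading 0x/0X if present
    window=${2:-600}      # default window: ten minutes
    echo $(( 0x$hex % window ))
}

# The two host IDs shown earlier in the article:
host_delay a8c01401    # prints 321
host_delay a8c02801    # prints 41

# On a real machine you would use something like:
#   sleep "$(host_delay "$(hostid)")"
```

Taking the window as a second argument makes it easy to widen the spread later, for example host_delay "$(hostid)" 3600 for an hour.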

Here is a sample run:

skx@mine:~$ ./rand-sleep.sh
Sleeping for 321 seconds

And again on a different host:

skx@yours:~$ ./rand-sleep.sh
Sleeping for 41 seconds
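
Because the delay is fixed per host, you can work out in advance, on the server side, whether any two machines would end up with the same delay. Here is a rough sketch of such a check, assuming you have collected "hostname hostid" pairs by some means of your own; find_collisions is a hypothetical name, and the third host in the example ("theirs") is made up so that its ID differs from that of "mine" by exactly 600.

```shell
#!/bin/sh
# Sketch: report hosts whose hostid-derived delays coincide.
# Reads "hostname hostid" lines on stdin; find_collisions is a hypothetical name.

# Usage: find_collisions [WINDOW] < list
find_collisions() {
    window=${1:-600}
    while read -r name hex; do
        [ -n "$hex" ] || continue              # skip blank or malformed lines
        echo "$(( 0x$hex % window )) $name"    # emit "delay hostname"
    done | sort -n | awk '
        # After sorting, hosts with equal delays sit on adjacent lines.
        NR > 1 && $1 == prev { print "collision at " $1 "s: " prevname " and " $2 }
        { prev = $1; prevname = $2 }'
}

# "mine" and "yours" are the hosts from the article; "theirs" is invented.
printf '%s\n' 'mine a8c01401' 'yours a8c02801' 'theirs a8c01659' | find_collisions
# prints: collision at 321s: mine and theirs
```

If the check reports nothing, every host in the list has its own delay for that window.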

A great use of hostid is generating "random" MAC addresses, but that is a topic for another day.


Re: Generating consistent "random" numbers
Posted by flyboy (66.92.xx.xx) on Wed 21 Feb 2007 at 21:29
That's cool. But it doesn't really ensure that two hosts won't have the same delay time. Any two or more hosts whose IDs differ by a multiple of 600 would have the same delay, I believe.

Re: Generating consistent "random" numbers
Posted by Steve (62.30.xx.xx) on Wed 21 Feb 2007 at 22:01
Indeed.

Still it is a nice simple solution to scheduling problems. Make the number larger for more spread. I seem to recall that making things "mod prime" would be better - but I'm hazy why that could be..

Steve

Re: Generating consistent "random" numbers
Posted by Thorsten (84.58.xx.xx) on Wed 21 Feb 2007 at 21:40
Hi Steve,

nice article - I think this has a great use with cfengine!
I also like things such as ps aux | od | md5sum | tr -d [a-f] to get big numbers.

best regards
Thorsten

Re: Generating consistent "random" numbers
Posted by Steve (62.30.xx.xx) on Wed 21 Feb 2007 at 22:14
Neat trick - I've used this in the past:

head -c 32 /dev/urandom  | md5sum | tr -d '[ a-f-]'

The main difference there is that I get tr to remove " -" from the end of the number.

/dev/random is a better source to use, but it will stall when entropy isn't available..

Steve

Re: Generating consistent "random" numbers
Posted by Anonymous (139.80.xx.xx) on Thu 1 Mar 2007 at 22:33
Hostid seems to get its data from /etc/hosts. Unfortunately more often than not, hostid finds the loopback address first, 127.0.0.1, and returns the exact same string, 007f0100 for multiple hosts. This makes hostid useless for the general case.

People would be better off saving the output of 'uuidgen' to a file on each host to get a better pseudorandom number.
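
A minimal sketch of that suggestion might look like the following. It assumes uuidgen (from util-linux) is installed; the file path and both function names are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch: keep a per-host ID in a file and derive the delay from it.
# The path and function names below are hypothetical.

# Create the ID file once; later runs reuse it, so the delay stays constant.
ensure_id_file() {
    [ -s "$1" ] || uuidgen > "$1"
}

# Usage: delay_from_id_file FILE [WINDOW]
# Prints a delay in [0, WINDOW) taken from the first 8 hex digits of the UUID.
delay_from_id_file() {
    window=${2:-600}
    hex=$(tr -d '-' < "$1" | cut -c1-8)
    echo $(( 0x$hex % window ))
}

# Example usage (not run here):
#   ensure_id_file /var/lib/backup-host-id
#   sleep "$(delay_from_id_file /var/lib/backup-host-id)"
```

Only the first 8 hex digits are used so that the value stays well within the range of shell arithmetic.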
