Growing random numbers from seeds
In the preceding video I talk about changing the random number seed in the Weka Explorer and getting a different result. Were you mystified? An explanation follows.
Here’s the issue. Many data mining processes depend on some random process – like randomly splitting a dataset into training and test sets. This creates a conflict between getting repeatable results and realistic results. Realistically, the results should be slightly different each time, depending on the exact split. But in practice that would be a nightmare: you want to be able to repeat experiments and get the same results.
Here’s Weka’s solution. It uses a random number generator (a simple little program), but it generates the same sequence of numbers each time, so that you can do the same thing tomorrow with the same result. The sequence is controlled by number called a “seed”. You change it in the Explorer’s Classify panel, under More options. The default value is 1, but you get a different sequence of random numbers by changing the seed to something else – like 2, or 3, or 42, or anything.