Skip main navigation

Growing random numbers from seeds

Ian Witten explains how random number seeds are used in Weka to ensure that experiments can be both repeatable and reliable.
In the preceding video I talk about changing the random number seed in the Weka Explorer and getting a different result. Were you mystified? An explanation follows.

Here’s the issue. Many data mining processes depend on some random process – like randomly splitting a dataset into training and test sets. This creates a conflict between getting repeatable results and realistic results. Realistically, the results should be slightly different each time, depending on the exact split. But in practice that would be a nightmare: you want to be able to repeat experiments and get the same results.

Here’s Weka’s solution. It uses a random number generator (a simple little program), but it generates the same sequence of numbers each time, so that you can do the same thing tomorrow with the same result. The sequence is controlled by number called a “seed”. You change it in the Explorer’s Classify panel, under More options. The default value is 1, but you get a different sequence of random numbers by changing the seed to something else – like 2, or 3, or 42, or anything.

This article is from the free online

Data Mining with Weka

Created by
FutureLearn - Learning For Life

Reach your personal and professional goals

Unlock access to hundreds of expert online courses and degrees from top universities and educators to gain accredited qualifications and professional CV-building certificates.

Join over 18 million learners to launch, switch or build upon your career, all at your own pace, across a wide range of topic areas.

Start Learning now