The Knowledge Flow interface

The Knowledge Flow interface is an alternative to the Explorer, as Ian Witten explains. Data and classification models flow through a diagram!
10.7
Hello again! We’re going to look at the Knowledge Flow Interface. The Knowledge Flow Interface is an alternative to the Explorer, and it lets you lay out filters, classifiers, and evaluators interactively on a 2D canvas. There are various other components like data sources, and visualization components, and so on. We have different kinds of connections between the components, and a feature of the Knowledge Flow Interface is that it can work incrementally on potentially infinite data streams. Let’s go ahead and set up a configuration in the Knowledge Flow Interface. I’ll just start it up here. I’m going to load an ARFF file with a DataSource called an ARFF Loader.
59.5
I’m going to configure that – this is a right-click, Configure – to use the iris dataset, which is here. Then I’m going to need a Class Assigner to assign the class. That’s here – Class Assigner. I can make a connection, and I’m going to make a Dataset connection to the Class Assigner. Then I’m going to get a Cross-Validation Fold Maker, because we’re going to evaluate this with cross-validation. I’m going to connect up the dataset to the CrossValidationFoldMaker. Then I’m going to get a classifier. I’ll use good old J48. Here are all of the classifiers. J48 is up here with the tree classifiers at the end. Let me put that there.
114.7
I’m going to connect both the Training Set and the Test Set from the CrossValidationFoldMaker to J48. I’m going to get a Classifier Performance Evaluator in the Evaluation tab. I’m going to connect the classifier – that is, the batch classifier produced by J48 – to this, and I’m going to connect the output to a Text Viewer. Here’s a Text Viewer; it’s the textual output that I’m going to connect. Then I’m going to start it all up. I’m going to run it. With my right-click here, I’m going to Start Loading. Let’s have a look at this Text Viewer; right-click to show the results. Here we go. These are the results that we’ve got. Well, we’ve seen these results before many times, of course.
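The batch flow just described (dataset, fold maker, classifier, evaluator) can be sketched outside Weka in a few lines of plain Python. This is only an illustration of the idea, not Weka's code: `make_folds`, `NearestCentroid` and `cross_validate` are hypothetical names, a simple nearest-centroid classifier stands in for J48, and the toy rows are made up rather than taken from the iris data.

```python
import random
from collections import Counter, defaultdict


def make_folds(rows, k, seed=1):
    """Yield (train, test) pairs for k stratified folds; the class is each row's last field."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[-1]].append(row)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        group = by_class[label]
        rng.shuffle(group)
        # Deal rows round-robin so each fold gets a similar class mix.
        for row in group:
            folds[position % k].append(row)
            position += 1
    for held_out in range(k):
        train = [row for f in range(k) if f != held_out for row in folds[f]]
        yield train, folds[held_out]


class NearestCentroid:
    """Stand-in classifier (NOT J48): predicts the class whose mean point is closest."""

    def fit(self, train):
        sums, counts = {}, Counter()
        for *features, label in train:
            acc = sums.setdefault(label, [0.0] * len(features))
            for j, value in enumerate(features):
                acc[j] += value
            counts[label] += 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, features):
        def squared_distance(label):
            return sum((a - b) ** 2 for a, b in zip(features, self.centroids[label]))
        return min(self.centroids, key=squared_distance)


def cross_validate(rows, k=10):
    """Train on each training set, test on the matching test set, pool the results."""
    correct = total = 0
    for train, test in make_folds(rows, k):
        model = NearestCentroid().fit(train)
        for *features, label in test:
            correct += model.predict(features) == label
            total += 1
    return correct / total


# Made-up, well-separated two-class data; the last field plays the role of the class.
data = [(i * 0.1, i * 0.1, "a") for i in range(30)] + \
       [(10 + i * 0.1, 10 + i * 0.1, "b") for i in range(30)]
print(cross_validate(data, k=10))
```

The generator mirrors the diagram: each fold sends one training set and one test set downstream, and the evaluator pools predictions over all folds before reporting.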
174
There are a lot of different things back on my slide here. This is what I’ve done.
181
Here’s the configuration I set up. Next, I’m going to add a Model Performance Chart. Let’s find that. That would be under Visualization. Here’s our Model Performance Chart. I’m going to connect the VisualizableError to this. Then I’m going to have a look at the output. Let me just run this again (Start Loading). Now I’m going to look at the output (Show Chart). Here – well, you’ve seen this kind of chart before – I could plot, for example, the predicted class against the actual class. There are a lot of different things you could do.
234.2
Back on the slide here: let’s work with stream data. I’m going to take an ARFF Loader in stream mode, so that it delivers not a whole dataset but a single instance at a time. We’re going to use an updateable classifier, an incremental evaluator, and look at a Strip Chart. We clear all of this over here. Select “Data Source”. Let’s get that ARFF Loader going, and configure it to use the iris data.
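To make "a single instance at a time" concrete, here is a minimal sketch of a streaming reader for ARFF-style text. It is an assumption-laden illustration, not Weka's ARFF Loader: `stream_arff` is a hypothetical name, and the parser handles only simple unquoted numeric and nominal attributes, `%` comments, and `?` for missing values (no quoting, sparse rows, dates or strings). The sample header and rows are invented.

```python
import io

NUMERIC_TYPES = {"numeric", "real", "integer"}


def stream_arff(handle):
    """Yield one instance (a dict) at a time; only the header is kept in memory."""
    attributes = []          # list of (name, is_numeric) pairs read from the header
    in_data = False
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):
            continue         # skip blank lines and comments
        if not in_data:
            keyword = line.lower()
            if keyword.startswith("@attribute"):
                _, name, kind = line.split(None, 2)
                attributes.append((name, kind.lower() in NUMERIC_TYPES))
            elif keyword.startswith("@data"):
                in_data = True
            continue
        values = [v.strip() for v in line.split(",")]
        instance = {}
        for (name, is_numeric), value in zip(attributes, values):
            if value == "?":
                instance[name] = None            # missing value
            else:
                instance[name] = float(value) if is_numeric else value
        yield instance


# Invented sample in ARFF layout, read from memory here instead of a file.
sample = """@relation toy
% a comment line
@attribute length numeric
@attribute width numeric
@attribute class {small,large}
@data
1.0,0.5,small
4.2,?,large
"""

for instance in stream_arff(io.StringIO(sample)):
    print(instance)
```

Because the function is a generator, a downstream step can consume each instance as it arrives, which is what lets a flow like this run over a source that is too large to load as a whole.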
265.5
Then I’m going to take that to a Class Assigner, which is in Evaluation.
273.5
This time I’m going to make an instance connection: I’m just going to send a single instance along here. And I’m not going to make cross validation folds; I’m going to take that straight to an updateable classifier. There’s an updateable version of NaiveBayes. Some classifiers are updateable and some aren’t.
293.2
NaiveBayes Updateable, let’s use that one. I’m going to connect that instance here to the updateable NaiveBayes classifier. Then I’m going to use an Incremental Classifier Evaluator.
311.9
It’s an incremental classifier that I’m going to connect up to this. Now I’m going to take the output from that and put it on a Strip Chart. Here’s a Strip Chart.
330.6
Take the output here to the chart I picked and put it there. Okay. Let’s show the Strip Chart, which is blank at the moment. Then with my ARFF Loader, I will Start Loading. You can see a little bit of output here. I’m going to use a larger dataset. I could configure this, of course, but the simplest thing is to use a larger dataset. Let me use the segment-challenge dataset and start loading again. Now we get this kind of output. This shows you how the class probabilities change for one class and for the other class as we go through. These are effectively learning curves in this situation. We’ve looked at the Knowledge Flow Interface.
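The streaming flow above can be sketched as a loop that scores each incoming instance and then updates the model with it. The code below is a rough illustration under stated assumptions, not Weka's NaiveBayesUpdateable or its incremental evaluator: `IncrementalGaussianNB` and `prequential` are hypothetical names, the test-then-train order is an assumption about how such an evaluator works, the variance smoothing constant is arbitrary, and the stream is made-up data.

```python
import math
from collections import defaultdict


class IncrementalGaussianNB:
    """Naive Bayes with one Gaussian per class and feature, updated one instance at a time."""

    def __init__(self, smoothing=1e-3):
        self.smoothing = smoothing        # added to every variance (arbitrary choice)
        self.count = defaultdict(int)     # instances seen per class
        self.mean = {}                    # per class: running mean of each feature
        self.m2 = {}                      # per class: running sum of squared deviations

    def update(self, features, label):
        if label not in self.mean:
            self.mean[label] = [0.0] * len(features)
            self.m2[label] = [0.0] * len(features)
        self.count[label] += 1
        n = self.count[label]
        for j, x in enumerate(features):  # Welford's running mean/variance update
            delta = x - self.mean[label][j]
            self.mean[label][j] += delta / n
            self.m2[label][j] += delta * (x - self.mean[label][j])

    def predict(self, features):
        if not self.mean:
            return None                   # nothing learned yet
        total = sum(self.count.values())
        best, best_score = None, -math.inf
        for label, n in self.count.items():
            score = math.log(n / total)   # log prior
            for j, x in enumerate(features):
                var = self.m2[label][j] / n + self.smoothing
                score -= 0.5 * math.log(2 * math.pi * var)
                score -= (x - self.mean[label][j]) ** 2 / (2 * var)
            if score > best_score:
                best, best_score = label, score
        return best


def prequential(stream, model):
    """Test on each instance first, then train on it; return the running accuracy."""
    correct = seen = 0
    curve = []
    for features, label in stream:
        correct += model.predict(features) == label
        model.update(features, label)
        seen += 1
        curve.append(correct / seen)
    return curve


# Made-up stream: class "a" near 0 and class "b" near 10, alternating.
stream = [((i % 5 * 0.1 + (10.0 if i % 2 else 0.0),), "b" if i % 2 else "a")
          for i in range(40)]
curve = prequential(stream, IncrementalGaussianNB())
print(curve[:5], curve[-1])
```

Plotting `curve` against the instance count gives a learning-curve style trace: the score starts poorly while the model has seen little data and improves as instances flow through.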
378.8
The panels are broadly similar to the Explorer’s with some exceptions. Evaluation is a separate panel, for example. The facilities are broadly similar, as well, with just a couple of notable exceptions. We can deal incrementally with potentially infinite datasets. That’s what we just did – the configuration we just set up loaded from the file incrementally, so the whole dataset was never held in memory at once. The Explorer, by contrast, loads everything into memory. Also, you can look inside cross-validation at the models for individual folds. Some people really like graphical interfaces like this, and it’s really good to know about the Knowledge Flow Interface.

The Knowledge Flow interface is an alternative to the Explorer. You lay out filters, classifiers, evaluators, and visualizers interactively on a 2D canvas and connect them together with different kinds of connector. Data and classification models flow through the diagram!

Note: the version of the Knowledge Flow interface shown here is slightly older than the current version. However, the features are the same, just rearranged slightly. You now run the Knowledge Flow by clicking the “play” icon at the top left corner, and the interface components are shown down the left-hand side rather than at the top. You can also double-click on components instead of right-clicking.

This article is from the free online course More Data Mining with Weka.

Created by
FutureLearn - Learning For Life
