What is distributed Weka?

Mark Hall from Pentaho introduces a plugin that runs Weka on a cluster of machines. It uses the “map-reduce” framework and operates with both Spark and Hadoop. The plugin comprises two Weka packages: distributedWekaBase, which provides general map-reduce tasks for machine learning that are not tied to any particular map-reduce implementation, and distributedWekaSpark, a wrapper for the base tasks that operates on the Spark platform. (There are also packages for Hadoop.) The aim is to support all of Weka’s classification and regression algorithms without reimplementing them, generating output just like that produced by standard Weka. Clustering, however, had to be rewritten specifically for the distributed framework.
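
To make the map-reduce idea concrete, here is a minimal sketch of the general pattern in plain Python. It is not distributed Weka's API: the function names (map_partition, combine, summarise) and the sample data are hypothetical, chosen only for illustration. Each partition of the data is summarised independently (the map step), and the partial summaries are then merged (the reduce step). Because the two steps are ordinary functions, they are not tied to any particular map-reduce implementation; here they are driven by Python's built-in map and functools.reduce, standing in for a cluster framework.

```python
from functools import reduce


def map_partition(rows):
    """Map step: summarise one partition as (count, sum, sum of squares)."""
    return (len(rows), sum(rows), sum(x * x for x in rows))


def combine(a, b):
    """Reduce step: merge two partial summaries by component-wise addition."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def summarise(partitions):
    """Run the map and reduce steps locally and derive mean and variance."""
    n, total, total_sq = reduce(combine, map(map_partition, partitions))
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return mean, variance


# Hypothetical data: one numeric attribute split across three "machines".
partitions = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
mean, variance = summarise(partitions)
print(mean, variance)
```

The merge is associative and commutative, so the partial summaries can be combined in any order, which is what allows the work to be spread across machines.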

This content is taken from The University of Waikato online course Advanced Data Mining with Weka.
