The role of data mining

How to do data mining? In this article, Dr Ming Yan discusses his recent research.

In general, data mining tasks can be categorized as descriptive or predictive. Descriptive tasks portray the general nature of the target data; predictive tasks generalize from the current data in order to make predictions. Common data mining functions include clustering, classification, correlation analysis, data summarization, deviation detection, and prediction. Of these, clustering, correlation analysis, data summarization, and deviation detection are descriptive tasks, while classification and prediction are predictive tasks.

The main roles of data mining are as follows.

Clustering

Clustering is a process of dividing data objects into subsets, each of which is a cluster. Data objects are clustered or grouped according to the principle of maximizing intra-class similarity and minimizing inter-class similarity. Because no information on class labeling is provided, clustering is a form of unsupervised learning by observation rather than by example.
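
As a rough illustration of grouping without class labels, here is a minimal sketch of one common clustering method, k-means, reduced to one dimension. The function name kmeans_1d, the ages data, and the starting centers are all assumptions made up for this example; the article does not prescribe a particular algorithm.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the nearest of the given starting centers."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical, unlabeled data: no class information is supplied.
ages = [18, 19, 21, 45, 47, 50]
centers, clusters = kmeans_1d(ages, centers=[20, 40])
print(clusters)
```

Values inside each resulting group are close to one another and far from the other group, which is the similarity principle described above. The result depends on the starting centers, so real uses typically try several.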

Classification

Classification is an important form of data analysis that extracts models describing important data classes. Such models, called classifiers, predict discrete, unordered class labels. Classification is a form of supervised learning: the learning of the classifier is “supervised” in that it is told which class each training tuple belongs to.
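
To make the idea of labeled training tuples concrete, the sketch below uses a 1-nearest-neighbor rule, one of the simplest classifiers. The function name, the feature values, and the labels "low" and "high" are hypothetical and chosen only for illustration.

```python
def nearest_neighbor_classify(training, point):
    """Return the label of the training tuple closest to `point` (1-NN)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Each training row is (features, label); the label is the "supervision".
    _, label = min(training, key=lambda row: squared_distance(row[0], point))
    return label


# Hypothetical training tuples, each told which class it belongs to.
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.5), "high"),
]

print(nearest_neighbor_classify(training, (2.0, 1.5)))
print(nearest_neighbor_classify(training, (8.5, 8.0)))
```

Unlike clustering, the class labels are known in advance here, and the model only has to assign new points to one of them.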

Correlation analysis

If there is some regularity between the values of two or more variables, it is called an association (or correlation). Associations can be categorized as simple, temporal, causal, and so on. The purpose of correlation analysis is to find the hidden network of associations in the data. Sometimes the correlation function of the data in the database is not known, and even when it is known it is uncertain, so each rule generated by correlation analysis carries a measure of credibility, usually called its confidence.
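
One common way to quantify how credible a discovered rule is uses two measures, support and confidence. The sketch below computes both for a rule of the form "antecedent implies consequent"; the shopping baskets and the function names are assumptions invented for this example.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical transaction data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

Here "bread implies milk" holds in two of the three baskets that contain bread, so the rule is uncertain rather than absolute, which is why such rules are reported together with their confidence.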

Data summarization

Data summarization evolved from statistical analysis; its purpose is to condense the data and give a compact description of it. One form, data description, describes the nature of a certain class of objects and summarizes their relevant features. Data description is divided into characteristic description, which describes the common features of a class of objects, and distinguishing description, which describes the differences between classes of objects.
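
A minimal sketch of both kinds of description, assuming two made-up classes of test scores: characterize condenses one class into a few statistics, and the gap between the class means is a very simple distinguishing description. All names and numbers are illustrative.

```python
from statistics import mean, median


def characterize(values):
    """Condense one class of objects into a compact description."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


# Hypothetical data for two classes of objects.
scores = {"class_a": [70, 75, 80], "class_b": [50, 60, 70]}

# Characteristic description: the common features of each class.
summary = {name: characterize(values) for name, values in scores.items()}

# Distinguishing description: how the classes differ from each other.
mean_gap = summary["class_a"]["mean"] - summary["class_b"]["mean"]
print(summary["class_a"], mean_gap)
```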

Deviation detection

Deviations carry a great deal of potential knowledge, such as anomalous instances in classification, special cases that do not satisfy the rules, differences between observed results and the values predicted by a model, and changes in quantities over time. The basic method of deviation detection is to look for meaningful differences between observations and reference values, to describe the few extreme exceptions among the analyzed objects, and to explain their underlying causes.
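
The comparison of observations against a reference value can be sketched as follows. This example assumes the reference is the mean of the observations and that "meaningful" means more than two standard deviations away; both choices, along with the sensor readings, are assumptions for illustration rather than a rule from the article.

```python
from statistics import mean, pstdev


def find_deviations(observations, threshold=2.0):
    """Return observations far from the reference value (here, the mean)."""
    reference = mean(observations)
    spread = pstdev(observations)
    if spread == 0:
        return []  # all values identical, so nothing deviates
    # Keep values whose distance from the reference exceeds
    # `threshold` standard deviations.
    return [x for x in observations if abs(x - reference) / spread > threshold]


# Hypothetical readings with one extreme exception.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_deviations(readings)
print(outliers)
```

The code only finds the exception; explaining its cause still requires looking at how the data was produced.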

Prediction

Prediction is the process of learning a model from the relationship between the input and output values of sample (historical) data, and then using that model to predict the output values for inputs from the wider population (future data).
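
As a sketch of learning from historical input and output values, the example below fits a straight line by ordinary least squares and applies it to an input that has not been seen yet. The linear model, the function name fit_line, and the monthly sales figures are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: input (month) and output (sales).
months = [1, 2, 3, 4]
sales = [12, 14, 16, 18]

# Learn the model from the sample, then predict a future input.
slope, intercept = fit_line(months, sales)
forecast = slope * 5 + intercept
print(forecast)
```

The model is only as reliable as the assumption that the future behaves like the historical sample.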

Your task

Summarize the role of data mining.

Share your thoughts and ideas in the comments below.

© Communication University of China
This article is from the free online

Introduction to Digital Media

Created by
FutureLearn - Learning For Life
