Introduction to NLP

Data Analytics and Natural Language Processing

Data analytics is particularly useful when applied to unstructured data. Data comes in two types: structured and unstructured. Structured data includes any format with a highly organised structure, such as numerical data in a spreadsheet. Unstructured data is much more complicated (but more interesting), as it has no clear structure that can be used to identify patterns or even the type of data.

Examples of unstructured data include text, images, and sounds. Our brains are highly effective at interpreting, assessing, and extracting usable information from such data, and we do so with very little effort. However, replicating the same process on a computer is extremely hard.
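The contrast between the two types of data can be sketched in a few lines of Python. This is only an illustration: the product names, ratings and review text below are invented, and the regular expression is a deliberately naive stand-in for real text analysis.

```python
import csv
import io
import re

# Structured data: every value sits in a named column,
# so answering a question is a direct lookup.
# (The table contents are made up for this example.)
structured = io.StringIO("product,rating\nkettle,4\ntoaster,2\n")
rows = list(csv.DictReader(structured))
average_rating = sum(int(row["rating"]) for row in rows) / len(rows)
print("Average rating:", average_rating)

# Unstructured data: similar information is buried in free text.
# (The review is also made up for this example.)
review = (
    "The kettle is great, I would give it 4 out of 5. "
    "The toaster broke after 2 days."
)

# A naive attempt: pull out every number in the text.
numbers = [int(n) for n in re.findall(r"\d+", review)]
print("Numbers found:", numbers)
```

The structured table yields an average directly. The naive search over the review returns three numbers, but nothing in the text marks which of them is a rating, which is a scale, and which is a number of days. A human reader resolves this instantly; a program needs additional tools to do the same.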

There have been notable advances in image and text recognition, and in some specific tasks machines have surpassed humans at correctly identifying information. But such cases are still few and far between, and are certainly not the norm. As a consequence, significant effort is going into developing big data analytics solutions that can handle unstructured data.

Natural Language Processing (NLP) is a branch of computer science that deals specifically with text analysis and aims to facilitate interactions between computers and human language. Its ultimate goal is a computerised understanding of textual data, including the contextual ambiguities of any language. Current state-of-the-art NLP technology has demonstrated good accuracy in information extraction tasks, including topic recognition, text summarisation, systematic reviews, sentiment analysis, and social network analysis. A full discussion of NLP is beyond the scope of this course.

In this context, NLP will be regarded as a set of tools for extracting insights from text. In particular, we will work through some examples that extract specific, actionable information from texts that would otherwise be difficult to analyse manually because of their volume, complexity and diversity.
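As a first taste of what "extracting insights from text" can mean, here is a minimal sketch that counts the most frequent meaningful words across a handful of short texts. It uses only the Python standard library. The sample texts, the stop-word list, and the helper names `tokenize` and `top_terms` are all invented for illustration; they are not taken from the course material, and real NLP libraries offer far more robust versions of each step.

```python
import re
from collections import Counter

# Hypothetical customer comments, invented for this example.
documents = [
    "The delivery was late and the packaging was damaged.",
    "Great product, but the delivery was late again.",
    "Late delivery. The product itself works well.",
]

# A tiny hand-picked stop-word list; real projects use a much longer one.
STOP_WORDS = {"the", "was", "and", "but", "itself", "a", "an", "is"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(texts, n=3):
    """Return the n most frequent non-stop-word tokens across all texts."""
    counts = Counter(
        token
        for text in texts
        for token in tokenize(text)
        if token not in STOP_WORDS
    )
    return counts.most_common(n)


top = top_terms(documents, n=2)
for term, count in top:
    print(term, count)
```

With three comments the recurring complaint is obvious to a human reader. The value of automating the count appears when there are thousands of comments, where the same few lines would surface the dominant terms without anyone reading each text.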

This article is from the free online

Introduction to Python for Big Data Analytics

Created by
FutureLearn - Learning For Life
