Introduction to Data Analytics libraries

Introduction to NLP

Data Analytics and Natural Language Processing

Data analytics is particularly valuable when it is applied to unstructured data. Data falls into two broad types: structured and unstructured. Structured data includes any format with a highly organised layout, such as numerical data in a spreadsheet. Unstructured data is much more complicated (but more interesting), because it has no clear structure that can be used to identify patterns or to determine the type of data.
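The difference can be illustrated with a minimal Python sketch. The city names and temperature values below are invented for illustration and do not come from the course material. The same facts are stored once as a small table and once as free text; only the table can be queried directly by field name.

```python
import csv
import io

# Structured data: every row follows the same schema, so fields can be
# addressed by name. (Illustrative, made-up values.)
structured = io.StringIO("city,temperature_c\nLeeds,14\nYork,16\n")
rows = list(csv.DictReader(structured))
mean_temperature = sum(float(r["temperature_c"]) for r in rows) / len(rows)

# Unstructured data: the same facts written as free text. There are no
# columns to index, so a program must first work out where the values are.
unstructured = "It was about 14 degrees in Leeds today, though York felt warmer at 16."
words = unstructured.split()

print(mean_temperature)
print(words[:5])
```

Computing the mean from the table takes one line, whereas the sentence would first need to be parsed to find out which numbers are temperatures and which city each one belongs to.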

Examples of unstructured data include text, images, and sound. Our brain is highly effective at interpreting, assessing, and extracting usable information from such data, and it does so with very little effort. However, replicating the same process within a computing environment is extremely hard.

There have been notable advances in image and text recognition, and in some specific tasks machines have surpassed humans at correctly identifying information. However, such cases are still few and far between, and they are certainly not the norm. As a consequence, significant effort is going into developing big data analytics solutions that can handle unstructured data.

Natural Language Processing (NLP) is a branch of computer science that deals specifically with text analysis and aims to facilitate interaction between computers and human language. Its ultimate goal is a computerised understanding of textual data, including the contextual ambiguities of any language. Current state-of-the-art NLP technology has demonstrated good accuracy in information extraction tasks, including topic recognition, text summarisation, systematic reviews, sentiment analysis, and social network analysis. A full discussion of NLP is beyond the scope of this course.

In this course, NLP will be treated as a set of tools for extracting insights from text. In particular, we will work through some examples that extract specific, actionable information from texts whose volume, complexity and diversity would make the task difficult to carry out manually.
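As a rough idea of what "extracting insights from text" can look like in practice, here is a minimal sketch that uses only the Python standard library. The sample reviews, the stop-word list and the function name `top_terms` are all hypothetical and chosen for illustration. They are not part of the course material, and real NLP libraries offer far more robust tokenisation.

```python
import re
from collections import Counter

# Hypothetical customer reviews (invented for this example).
reviews = [
    "The delivery was late and the packaging was damaged.",
    "Great product, but the delivery was late again.",
    "Fast delivery and great customer service.",
]

# A tiny, hand-picked stop-word list; an assumption, not a standard one.
STOPWORDS = {"the", "and", "was", "but", "a"}

def top_terms(texts, n=3):
    """Return the n most frequent non-stop-word terms across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of letters/apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

print(top_terms(reviews))
```

Even this crude frequency count shows that "delivery" is the dominant theme across the reviews. Reading three reviews by hand is easy, but the same function applies unchanged to thousands of documents.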

This article is from the free online

Introduction to Python for Big Data Analytics

Created by
FutureLearn - Learning For Life
