Batch Processing with HDInsight

In this step, we’ll start exploring Azure HDInsight and some of the open-source technologies available to us when batch processing data. You’ll also be introduced to Jupyter Notebooks.

HDInsight & Apache Open Source

Azure HDInsight allows us to run popular open-source frameworks (including Apache Hadoop, Spark, Hive, Kafka, and more) within the Azure environment using a customisable, enterprise-grade service for open-source analytics.

You can process massive amounts of data and draw on the broad open-source project ecosystem at the global scale of Azure, which makes it easier to migrate big data workloads and processing to the cloud.

Load-balancing

To reduce the time our processes take to run, we need to increase the amount of processing power available. However, there is often a limit to how much power a single machine in the environment can provide.

Load-balancing allows us to spread responsibilities and tasks across multiple machines to increase processing capacity. In Azure HDInsight, Spark clusters can be used to process tasks in parallel, which suits high-performance, interactive querying.
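To make the idea of spreading work across machines more concrete, here is a minimal sketch in Python. It is not HDInsight or Spark code: local worker threads stand in for the machines in a cluster, and the function names (`split_into_partitions`, `process_partition`, `run_batch`) and the summing task are hypothetical, chosen only for illustration. On a real cluster, a framework such as Spark would handle the partitioning and distribution for us.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Split a list into roughly equal, contiguous chunks."""
    size, remainder = divmod(len(records), num_partitions)
    partitions, start = [], 0
    for i in range(num_partitions):
        # The first `remainder` partitions take one extra record each.
        end = start + size + (1 if i < remainder else 0)
        partitions.append(records[start:end])
        start = end
    return partitions


def process_partition(partition):
    """Hypothetical per-worker task: total the values in one partition."""
    return sum(partition)


def run_batch(records, num_workers=4):
    """Process each partition on its own worker, then combine the results."""
    partitions = split_into_partitions(records, num_workers)
    # Threads are stand-ins for cluster nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_totals = list(pool.map(process_partition, partitions))
    return sum(partial_totals)


if __name__ == "__main__":
    print(run_batch(list(range(1, 101))))  # prints 5050
```

Each worker only sees its own slice of the data, and the partial results are combined at the end. This split, process, and combine pattern is the same basic shape that cluster frameworks apply at much larger scale.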

Note: If you’d like to delve deeper into the open-source technologies mentioned in this step, take a look at the links posted in the See also section below.

In the final step of this activity, we’ll look at the real-time processing technologies available to us in Azure HDInsight.

This article is from the free online course Microsoft Future Ready: Fundamentals of Big Data

Created by
FutureLearn - Learning For Life
