Introduction to Big Data Technologies

This activity will describe the technological solutions available within the Azure platform to tackle the problems highlighted in the previous section.

In previous steps, we explored batch processing, which typically happens when you have large amounts of static data that you need to process regularly to aggregate or clean them.

We also spoke about real-time processing, where we take a continuous stream of data and process it in real time as it arrives, in order to analyse or store it.

There are many technological solutions within Microsoft Azure to achieve this.

Batch Processing

An example of a workflow (sketched in code below):

  1. Extract data
  2. Clean up
  3. Load into the analytical store
  4. Report from that store.
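
As a rough illustration, here is a minimal PySpark sketch of those four steps; the storage paths, column names and output location are hypothetical placeholders, and the same workflow could equally be built with the other tools described below.

    # A minimal PySpark sketch of the four workflow steps above.
    # Storage paths, column names and the output location are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("batch-etl-sketch").getOrCreate()

    # 1. Extract data from raw storage (here, a hypothetical data lake folder)
    raw = spark.read.csv(
        "abfss://raw@examplelake.dfs.core.windows.net/sales/",
        header=True, inferSchema=True,
    )

    # 2. Clean up: drop incomplete rows and normalise the amount column
    clean = (raw.dropna(subset=["order_id", "amount"])
                .withColumn("amount", F.col("amount").cast("double")))

    # 3. Load into the analytical store (written as Parquet here; in practice
    #    this could be a dedicated SQL pool or another warehouse)
    clean.write.mode("overwrite").parquet(
        "abfss://curated@examplelake.dfs.core.windows.net/sales/")

    # 4. Report from that store: a simple aggregate that could feed a dashboard
    clean.groupBy("region").agg(F.sum("amount").alias("total_sales")).show()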

Orchestration of Analytics

Options available to us within the Azure platform for processing the data:

  • Apache Hadoop or Spark (open-source)
  • Native Platform as a Service option: Azure Data Lake Analytics.

Both of these will be explored in later steps.

Data Store for Analytics post-processing

Azure Synapse Analytics is an analytics service that brings together enterprise data warehousing and big data analytics. Dedicated SQL pool (referred to in this video as Azure SQL Data Warehouse) refers to the enterprise data warehousing features that are available in Azure Synapse Analytics.

Dedicated SQL pool (formerly SQL DW) represents a collection of analytic resources that are provisioned when using Synapse SQL. The size of a dedicated SQL pool (formerly SQL DW) is determined by Data Warehousing Units (DWU).
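
As a rough illustration only, the sketch below queries a dedicated SQL pool from Python using pyodbc; the server name, database, credentials and table are placeholders and would differ in a real workspace.

    # A hypothetical sketch of querying a Synapse dedicated SQL pool from Python
    # with pyodbc. Server, database, credentials and the table are placeholders.
    import pyodbc

    conn = pyodbc.connect(
        "Driver={ODBC Driver 17 for SQL Server};"
        "Server=tcp:example-workspace.sql.azuresynapse.net,1433;"
        "Database=exampledw;Uid=example_user;Pwd=example_password;"
        "Encrypt=yes;"
    )

    cursor = conn.cursor()
    # A simple aggregate over a hypothetical fact table in the warehouse
    cursor.execute(
        "SELECT region, SUM(amount) AS total_sales "
        "FROM dbo.FactSales GROUP BY region"
    )
    for region, total_sales in cursor.fetchall():
        print(region, total_sales)
    conn.close()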

Analytics

Once we’ve created these processes, we have options for displaying the outputs within Azure and the wider Microsoft family. We could export to Excel and generate charts to explore the data further there, or use the integrated functionality to display results within Power BI.
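
For instance, a minimal sketch of the Excel route might look like the following, using pandas; the figures and file name are purely illustrative.

    # A minimal sketch of exporting aggregated results to an Excel file with
    # pandas; the figures are made up, and writing .xlsx requires openpyxl.
    import pandas as pd

    results = pd.DataFrame({
        "region": ["North", "South", "West"],
        "total_sales": [1200.0, 950.5, 780.25],
    })
    results.to_excel("sales_summary.xlsx", index=False)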

Real-time Data

As discussed in the last step, we first need to capture the data. Within Azure, we have tools such as the following (a minimal Event Hubs example is sketched after the list):

  • Message Broker
  • Azure Event Hubs
  • IoT Hub.
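
As an illustration, the sketch below sends a single event to Azure Event Hubs using the azure-eventhub Python SDK; the connection string, hub name and payload are placeholders.

    # A minimal sketch of sending one event to Azure Event Hubs using the
    # azure-eventhub Python SDK. Connection string, hub name and payload
    # are placeholders.
    import json
    from azure.eventhub import EventHubProducerClient, EventData

    producer = EventHubProducerClient.from_connection_string(
        conn_str="Endpoint=sb://example-namespace.servicebus.windows.net/;"
                 "SharedAccessKeyName=send-policy;SharedAccessKey=<key>",
        eventhub_name="telemetry",
    )

    # Batch up one or more events and send them to the hub
    batch = producer.create_batch()
    batch.add(EventData(json.dumps({"device_id": "sensor-01", "temperature": 21.4})))
    producer.send_batch(batch)
    producer.close()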

There are other tools, such as Azure Stream Analytics, or open-source Apache solutions such as Kafka, Storm, Spark, or HBase, which we can use to create bespoke solutions for our problems. These will be explored further in later steps.
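
To give a flavour of what a bespoke solution might look like, here is a minimal sketch using Spark Structured Streaming to consume a Kafka topic; the broker address and topic are placeholders, and it assumes the Spark Kafka connector package is available on the cluster.

    # A hypothetical sketch of a bespoke real-time pipeline: Spark Structured
    # Streaming consuming a Kafka topic. Broker address and topic are
    # placeholders, and the Spark Kafka connector package must be available.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

    # Read a continuous stream of messages from Kafka
    stream = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "telemetry")
              .load())

    # Decode the payload and write it to the console as it arrives; a real
    # solution would write to storage or a real-time dashboard instead.
    query = (stream.selectExpr("CAST(value AS STRING) AS payload")
                   .writeStream.format("console")
                   .start())
    query.awaitTermination()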

Processing

While Azure Stream Analytics can output results for real-time reporting, Azure Machine Learning can also be used to process the streamed data.

In the next step, Graeme will demonstrate some tools for the batch processing of data in the Azure environment.

This article is from the free online course Microsoft Future Ready: Fundamentals of Big Data, created by FutureLearn.
