Apache Spark, Scala and Kafka

Price : ₹45,000.00

Offer Price : ₹40,000.00

Estimated Hours : 90 HOURS

Overview

Apache Spark is an open-source data processing engine. Its ecosystem includes Spark SQL, Spark Streaming, MLlib and GraphX. This course uses Scala as the programming language for processing data with Spark. Apache Kafka is a distributed platform for streaming data in big data systems, and applications integrate with it through its Java APIs.

Objectives

  • Understand Scala and its implementation
  • Install Spark and implement Spark operations on Spark Shell
  • Understand the role of Spark RDD
  • Implement Spark applications on YARN (Hadoop)
  • Learn Spark Streaming API
  • Implement machine learning algorithms in Spark MLlib API
  • Analyse Hive and Spark SQL architecture
  • Understand the Spark GraphX API and implement graph algorithms
  • Understand Kafka and its components
  • Deploy a Kafka cluster on Hadoop and YARN
  • Understand real-time Kafka streaming
  • Integrate Kafka with real-time streaming systems such as Spark Streaming
  • Get an introduction to the Kafka API
  • Project

Audience

  • Professionals aspiring to work in big data analytics
  • Spark developers
  • Data scientists
  • Individuals looking for a change in career
  • Project managers and messaging and queuing system professionals

Prerequisites

Basic knowledge of big data, HDFS and a programming language such as Java or Python is helpful but not mandatory.

Course Curriculum

  • Introduction to Spark: Getting Started
  • Resilient Distributed Datasets and DataFrames
  • Spark application programming
  • Introduction to the Spark Ecosystem (Spark SQL)
  • Spark Streaming
  • Spark MLlib
  • Spark GraphX