Kafka Fundamentals

Master the essentials of Kafka with our "Kafka Fundamentals" course. Learn about Kafka architecture, topics, and APIs, and gain hands-on experience with Avro, Schema Registry, Spring Boot, and streaming pipelines. Perfect for developers and data engineers looking to build and optimize real-time data processing applications.

  • Duration: 24 hours
  • Language: English
  • Format: Online
  • Code: EAS-026
  • Price: € 650 *

Available sessions

To be determined

Description

Kafka Fundamentals is a comprehensive course designed to provide you with a deep understanding of Apache Kafka, one of the most popular platforms for building real-time data pipelines and streaming applications. This course is ideal for developers, data engineers, and system architects who want to learn how to design, build, and manage Kafka-based solutions effectively.

The course begins with an exploration of Kafka’s architecture, where you’ll learn how to plan and design your own distributed queue. You’ll address key questions related to message format, consumption patterns, data persistence, and retention, and understand how to support multiple producers and consumers efficiently.

Next, you’ll dive into Kafka topics, console producers, and console consumers, learning how to create topics with multiple partitions, ensure data replication, and manage message order and data skew. You’ll also gain practical experience in optimizing message writing and reading for different use cases, including low latency and maximum compression scenarios.

The course then covers working with Kafka using various programming languages, including Java, Scala, and Python. You’ll build simple consumers and producers, manage consumer groups, and handle transactions. This module also explores integration with Web UI and REST APIs.

A dedicated module on Avro and Schema Registry follows, where you’ll learn how to add and manage Avro schemas, build Avro consumers and producers, and use Schema Registry to ensure data consistency. You’ll also learn how to handle errors using dedicated error topics.

In the Spring Boot and Spring Cloud module, you’ll learn how to integrate Kafka with Spring applications. You’ll write templates for Spring apps, add Kafka Templates for producers and consumers, and modify Spring Boot to work in asynchronous mode. The course also covers streaming pipelines, where you’ll compare Kafka Streams, KSQL, Kafka Connect, Akka Streams, Spark Streaming, and Flink. You’ll learn how to build robust streaming pipelines and work with checkpoints, backpressure, and executor management.

The course concludes with Kafka monitoring, where you’ll learn how to build and manage Kafka metrics using tools like Grafana, so that your Kafka deployments are optimized and well-monitored.

Learning Outcomes: By the end of this course, participants will:

  • Understand Kafka’s architecture and design distributed queues with optimal performance.
  • Efficiently manage Kafka topics, producers, and consumers, ensuring data consistency and performance.
  • Integrate Kafka with Java, Scala, Python, and other languages via REST, and handle transactions effectively.
  • Implement Avro schemas and use Schema Registry to manage data serialization and deserialization.
  • Build and optimize streaming pipelines using Kafka Streams, KSQL, and other streaming frameworks.
  • Monitor Kafka clusters effectively using Grafana and other monitoring tools.

On completing the course, participants receive a certificate on the Luxoft Training form.

Objectives

Upon completion of the "Kafka Fundamentals" course, trainees will be able to:

  • Design and implement distributed queues using Kafka, with a focus on message format, order, and persistence.
  • Create and manage Kafka topics, partitions, and replicas, ensuring optimal performance and reliability.
  • Develop and integrate Kafka consumers and producers in Java, Scala, Python, and through REST APIs.
  • Use Avro and Schema Registry to manage data serialization and ensure compatibility across services.
  • Build and manage robust streaming pipelines with Kafka Streams, KSQL, and other streaming frameworks.
  • Monitor Kafka clusters, set up metrics, and optimize Kafka performance using tools like Grafana.

Target Audience

Developers, Architects, Data Engineers

Prerequisites

Development experience in Java (over 6 months)


Roadmap

1. Module 1: Kafka Architecture: theory 2h / practice 1.5h

  • Plan your own distributed queue in pairs: writing, reading, and keeping data in parallel mode.
  • What's the format and average size of messages?
  • Can messages be repeatedly consumed?
  • Are messages consumed in the same order they were produced?
  • Does data need to be persisted?
  • What is data retention?
  • How many producers and consumers are we going to support?
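
The questions about repeated consumption, ordering, and retention can be explored with a small in-memory model before touching a real cluster. The following is a minimal sketch, not Kafka's API: the `LogSketch` class, its method names, and the idea of expressing retention as a record count are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a single partition; not Kafka's API. */
class LogSketch {
    private final List<String> records = new ArrayList<>();
    private long baseOffset = 0;    // offset of the oldest retained record
    private final int maxRetained;  // assumption: retention expressed as a record count

    LogSketch(int maxRetained) {
        this.maxRetained = maxRetained;
    }

    /** Appends a record and returns the offset assigned to it. */
    long append(String value) {
        records.add(value);
        long offset = baseOffset + records.size() - 1;
        if (records.size() > maxRetained) { // retention: drop the oldest record
            records.remove(0);
            baseOffset++;
        }
        return offset;
    }

    /** Returns every retained record from the given offset onward, in produced order. */
    List<String> readFrom(long offset) {
        int start = (int) Math.max(0, offset - baseOffset);
        if (start >= records.size()) {
            return List.of();
        }
        return new ArrayList<>(records.subList(start, records.size()));
    }

    long earliestOffset() {
        return baseOffset;
    }

    public static void main(String[] args) {
        LogSketch log = new LogSketch(3);
        for (String value : List.of("a", "b", "c", "d")) {
            log.append(value);
        }
        // Reading does not remove anything, so both calls print the same records.
        System.out.println(log.readFrom(0)); // [b, c, d]
        System.out.println(log.readFrom(0)); // [b, c, d]
    }
}
```

In this model a read never deletes data, so any consumer can re-read from an earlier offset as long as retention has not dropped the record.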

2. Module 2: kafka-topics, console-consumer, console-producer: theory 2h / practice 1.5h

  • Use Kafka's built-in kafka-topics, console-consumer, and console-producer tools
  • Create a topic with 3 partitions and a replication factor (RF) of 2
  • Send a message and check the in-sync replicas (ISR)
  • Write and read messages while preserving message order
  • Write and read messages without preserving message order and hash partitioning
  • Write and read messages without data skew
  • Read messages from the beginning, from the end, and from a specific offset
  • Read a topic with 2 partitions using 2 consumers in one consumer group (and in different consumer groups)
  • Choose the optimal number of consumers for reading a topic with 4 partitions
  • Write messages with minimum latency
  • Write messages with maximum compression
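
The exercises on message order and data skew both come down to how a record key is mapped to a partition. The following is a simplified sketch: Kafka's Java client hashes the serialized key bytes with murmur2, whereas this example uses `String.hashCode()` as a stand-in, and the `PartitionSketch` class and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class PartitionSketch {
    /** Simplified keyed partitioning: the same key always maps to the same partition. */
    static int partitionFor(String key, int numPartitions) {
        // Stand-in hash; the real client uses murmur2 over the serialized key bytes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Counts how many of the given keys land on each partition, which exposes skew. */
    static int[] distribution(List<String> keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // One dominant key sends most of the traffic to a single partition.
        List<String> skewedKeys = List.of("hot", "hot", "hot", "hot", "cold");
        System.out.println(Arrays.toString(distribution(skewedKeys, 3)));
    }
}
```

Because equal keys share a partition, per-key ordering is preserved, and for the same reason a single very frequent key produces skew.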

3. Module 3: Web UI + Java, Scala, Python API + other languages (via REST): theory 2h / practice 1.5h

  • Build a simple consumer and producer
  • Add one more consumer to the consumer group
  • Write a consumer which reads 3 records from the 1st partition
  • Add writing to another topic
  • Add a transaction
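
Adding a consumer to a group changes which partitions each member reads. The following is a minimal sketch of that redistribution, assuming a plain round-robin deal of partitions to members. It is a stand-in for Kafka's real assignors, and the `GroupSketch` class and the consumer names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupSketch {
    /** Deals partitions 0..numPartitions-1 out to the consumers in turn. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("a group needs at least one consumer");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(List.of("c1", "c2"), 4));
        // {c1=[0, 2], c2=[1, 3]}
        System.out.println(assign(List.of("c1", "c2", "c3", "c4", "c5"), 4));
        // {c1=[0], c2=[1], c3=[2], c4=[3], c5=[]}
    }
}
```

Within one group each partition has exactly one owner, so a fifth consumer on a 4-partition topic stays idle.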

4. Module 4: Avro + Schema Registry: theory 2h / practice 1.5h

  • Add an Avro schema
  • Compile the Java class
  • Build an Avro consumer and producer with a specific record
  • Add Schema Registry
  • Add error handling with an error topic and Schema Registry
  • Build an Avro consumer and producer with a generic record
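
The error-topic step can be pictured as a routing decision: records that match the expected schema go to the main topic, and everything else goes to an error topic. The following is a minimal sketch that uses no Avro or Schema Registry calls. The `ErrorTopicSketch` class, the field-name-to-type map standing in for an Avro schema, and the topic names `users` and `users.errors` are all assumptions.

```java
import java.util.Map;

class ErrorTopicSketch {
    // Hypothetical schema: field name -> required Java type (a stand-in for an Avro schema).
    static final Map<String, Class<?>> SCHEMA = Map.of("id", Long.class, "name", String.class);

    /** Returns true when the record has exactly the schema's fields with matching types. */
    static boolean conforms(Map<String, Object> record) {
        if (!record.keySet().equals(SCHEMA.keySet())) {
            return false;
        }
        for (Map.Entry<String, Class<?>> field : SCHEMA.entrySet()) {
            if (!field.getValue().isInstance(record.get(field.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Picks the destination topic: valid records to the main topic, the rest to the error topic. */
    static String topicFor(Map<String, Object> record) {
        return conforms(record) ? "users" : "users.errors";
    }

    public static void main(String[] args) {
        Map<String, Object> valid = Map.of("id", 1L, "name", "Ann");
        Map<String, Object> wrongType = Map.of("id", "oops", "name", "Ann");
        System.out.println(topicFor(valid));     // users
        System.out.println(topicFor(wrongType)); // users.errors
    }
}
```

Keeping bad records on a separate topic lets the main consumer continue while the failures are inspected or replayed later.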

5. Module 5: Spring Boot + Spring Cloud: theory 2h / practice 1.5h

Homework:

  • Write a template for a Spring app
  • Add a Kafka Template with a producer
  • Add a Kafka Template with a consumer
  • Add a REST controller
  • Modify Spring Boot to work in async (parallel) mode
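
The last homework item, switching to async (parallel) mode, can be tried out in plain Java before wiring it into Spring. The following is a minimal sketch using only the JDK's `CompletableFuture`. It does not use Spring or Kafka classes, and the `AsyncSketch` class and its `handle` method are hypothetical placeholders for real message processing.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class AsyncSketch {
    /** Placeholder for real per-message work. */
    static String handle(String message) {
        return "handled:" + message;
    }

    /** Submits every message to a thread pool, then collects results in submission order. */
    static List<String> handleAll(List<String> messages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> pending = messages.stream()
                    .map(message -> CompletableFuture.supplyAsync(() -> handle(message), pool))
                    .collect(Collectors.toList());
            // join() waits for each task; the work itself runs in parallel on the pool.
            return pending.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleAll(List.of("a", "b", "c"), 2));
        // [handled:a, handled:b, handled:c]
    }
}
```

All tasks are submitted before any result is awaited, which is what allows the work to overlap.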

6. Module 6: Streaming Pipelines (Kafka Streams + KSQL + Kafka Connect vs Akka Streams vs Spark Streaming vs Flink): theory 2h / practice 1.5h

Homework:

  • Choose how to read data from a Kafka topic with 50 partitions
  • Try to use the checkpoint mechanism
  • Start five executors and kill some of them
  • Check the backpressure
7. Module 7: Kafka Monitoring: theory 2h / practice 1.5h

Homework:

  • Build several metrics in Grafana
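
One metric commonly charted for Kafka consumers is lag: the distance between a partition's log end offset and the group's committed offset. The following is a minimal sketch of that arithmetic. The `LagSketch` class is hypothetical, the offsets are made-up sample values, and treating a missing committed offset as 0 is a simplifying assumption.

```java
import java.util.Map;
import java.util.TreeMap;

class LagSketch {
    /** Computes lag per partition as log end offset minus committed offset, never below zero. */
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> logEndOffsets,
                                              Map<Integer, Long> committedOffsets) {
        Map<Integer, Long> lag = new TreeMap<>();
        for (Map.Entry<Integer, Long> end : logEndOffsets.entrySet()) {
            // Assumption: a partition with no committed offset counts from 0.
            long committed = committedOffsets.getOrDefault(end.getKey(), 0L);
            lag.put(end.getKey(), Math.max(0L, end.getValue() - committed));
        }
        return lag;
    }

    static long totalLag(Map<Integer, Long> lag) {
        return lag.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<Integer, Long> lag = lagPerPartition(
                Map.of(0, 100L, 1, 250L, 2, 40L),   // sample log end offsets
                Map.of(0, 90L, 1, 250L));           // sample committed offsets
        System.out.println(lag);           // {0=10, 1=0, 2=40}
        System.out.println(totalLag(lag)); // 50
    }
}
```

A lag value that keeps growing on a dashboard usually means the consumers are falling behind the producers.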

Total: theory 14h (58%) / practice 10h (42%)


Leszek Gawron
  • Trainer

Team leader, Java/Kotlin developer

