Apache Spark Fundamentals
Duration
26 hours
Location
Online
Language
English
Code
EAS-017
Training for 7-8 or more people?
Customize the training to your specific needs
Description
Upon completion of the course, a certificate is issued on the Luxoft Training form.
Objectives
During the training, participants will:
- Write a Spark pipeline using functional Python and RDDs;
- Write a Spark pipeline using Python, the Spark DSL, Spark SQL, and DataFrames;
- Draw an architecture with different data sources;
- Write a Spark pipeline that integrates with external systems (Kafka, Cassandra, Postgres) and runs in parallel;
- Resolve problems with slow joins.
After the training, participants will be able to build a simple PySpark application and run it on a cluster in parallel mode.
Target audience
- Software developers
- Software architects
Prerequisites
Basic programming skills (Java, Python, Scala). Familiarity with the Unix/Linux shell. Experience with databases is optional.
Roadmap
- Spark concepts and architecture
- Programming with RDDs: transformations and actions
- Using key/value pairs
- Loading and storing data
- Accumulators and broadcast variables
- Spark SQL, DataFrames, Datasets
- Spark Streaming
- Machine Learning using MLlib and Spark ML
- Graph analysis using GraphX