Introduction to Python for Big Data Engineering with PySpark


The key objectives of this course are as follows:

  • Learn Spark Architecture

  • Learn Spark Execution Concepts

  • Learn Spark Transformations and Actions using the Structured API

  • Learn Spark Transformations and Actions using the RDD (Resilient Distributed Datasets) API

  • Learn how to set up your own local PySpark Environment

  • Learn how to interpret the Spark Web UI

  • Learn how to interpret the DAG (Directed Acyclic Graph) for Spark Execution

The Python Spark project that we are going to do together:

Sales Data

  • Create a Spark Session

  • Read a CSV file into a Spark DataFrame

  • Learn to Infer a Schema

  • Select data from the Spark DataFrame

  • Produce analytics that show the top sales orders per Region and Country

Author : Thulani Mngadi

Ratings : 0.0 / 5.0

Students : 120

Category : IT & Software, Other IT & Software, PySpark