Transformations Narrow Vs Wide

Lesson objectives

By the end of this lesson, you will:

  • Learn about the two types of Spark transformations: narrow and wide.
  • Understand the characteristics and benefits of narrow transformations.
  • Explore the implications and performance considerations of wide transformations.

Narrow and Wide Transformations

Introduction to Spark Transformations

  • Transformations create new RDDs from existing ones.
  • Spark has two types of transformations: Narrow and Wide.

What are Narrow Transformations?

  • Transformations that do not require data shuffling between partitions.
  • Examples: map(), filter().
  • Each output partition is computed from a single input partition, so processing never crosses partition boundaries.
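
The single-partition property can be illustrated without a cluster. What follows is a minimal pure-Python sketch, not Spark code: partitions are modeled as plain lists, and the helper names `narrow_map` and `narrow_filter` are hypothetical stand-ins for what map() and filter() do within each partition.

```python
# Model a dataset as a list of partitions; each partition is a list of records.
partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

def narrow_map(parts, fn):
    """Apply fn to every record, one partition at a time (hypothetical helper)."""
    return [[fn(x) for x in part] for part in parts]

def narrow_filter(parts, pred):
    """Keep records matching pred, one partition at a time (hypothetical helper)."""
    return [[x for x in part if pred(x)] for part in parts]

doubled = narrow_map(partitions, lambda x: x * 2)
evens = narrow_filter(partitions, lambda x: x % 2 == 0)

print(doubled)  # [[2, 4, 6], [8, 10], [12, 14, 16, 18]]
print(evens)    # [[2], [4], [6, 8]]
```

In both results the number of partitions is unchanged and every record stays in the partition it started in, which is what makes these operations narrow.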

Figure 1: Spark Narrow Transformations.

Benefits of Narrow Transformations

  • Efficient, because no data has to move between partitions.
  • Best for tasks where each record can be processed independently of the others.

What are Wide Transformations?

  • Transformations that involve shuffling data across partitions.
  • Examples: groupBy(), reduceByKey().

Figure 2: Spark Wide Transformations.

Wide Transformations and Dependencies

  • Wide Dependencies: Require data from multiple partitions, often involving shuffling.
  • Examples: groupBy(), orderBy() - records are redistributed across partitions, which affects performance.
  • Use: These transformations are required for operations that need a dataset-wide view, such as counting occurrences of each value across a dataset.
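
Counting occurrences shows why data from several partitions is needed. The sketch below is a pure-Python model, not the Spark API: `reduce_by_key` and `target_partition` are hypothetical helpers, and the partitioner assumes one-character keys so the result is deterministic.

```python
from collections import defaultdict
from functools import reduce

# Key-value records spread over two input partitions.
partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("b", 1), ("c", 1), ("a", 1)],
]

def target_partition(key, num_partitions):
    # Deterministic stand-in for a hash partitioner; assumes one-character keys.
    return ord(key) % num_partitions

def reduce_by_key(parts, fn, num_partitions):
    # Shuffle: route every record to the partition that owns its key.
    shuffled = [defaultdict(list) for _ in range(num_partitions)]
    for part in parts:
        for key, value in part:
            shuffled[target_partition(key, num_partitions)][key].append(value)
    # Reduce: all values for a key now sit together, so combine them locally.
    return [
        {key: reduce(fn, values) for key, values in bucket.items()}
        for bucket in shuffled
    ]

counts = reduce_by_key(partitions, lambda x, y: x + y, num_partitions=2)
print(counts)  # [{'b': 2}, {'a': 3, 'c': 1}]
```

The key "a" appears in both input partitions, so its total cannot be known until the records are brought together. That regrouping step is the shuffle.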

Implications of Wide Transformations

  • Shuffling can be expensive in terms of time and network I/O.
  • Despite the cost, wide transformations are essential for aggregation and grouping operations.
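
As a rough proxy for shuffle cost, one can count how many records must leave their current partition. This is a simplified pure-Python model under stated assumptions: `records_moved` and `target_partition` are hypothetical helpers, and a real shuffle also depends on where partitions are physically located.

```python
# Two input partitions of (key, value) records.
partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("b", 1), ("c", 1), ("a", 1)],
]

def target_partition(key, num_partitions):
    # Deterministic stand-in for a hash partitioner; assumes one-character keys.
    return ord(key) % num_partitions

def records_moved(parts):
    """Count records whose target partition differs from their current one."""
    moved = 0
    for index, part in enumerate(parts):
        for key, _ in part:
            if target_partition(key, len(parts)) != index:
                moved += 1
    return moved

total = sum(len(part) for part in partitions)
moved = records_moved(partitions)
print(f"{moved} of {total} records change partition")  # 3 of 6 records change partition
```

A narrow transformation on the same data would move zero records, which is the source of the performance difference.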

Narrow vs. Wide Dependencies

  • Narrow Dependencies: A single output partition can be computed from a single input partition without data exchange.
  • Examples: map(), filter() (for instance, a filter() that uses a contains() condition) - each partition is processed independently.
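
The difference between the two kinds of dependency can be made concrete by listing, for each output partition, which input partitions it reads from. This is a pure-Python sketch that follows the definition above; `narrow_parents`, `wide_parents`, and `is_narrow` are hypothetical names, and the key-based routing assumes one-character keys.

```python
# Two input partitions of (key, value) records.
partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("b", 1), ("c", 1), ("a", 1)],
]

def narrow_parents(num_partitions):
    # map()/filter() style: output partition i reads only input partition i.
    return {i: {i} for i in range(num_partitions)}

def wide_parents(parts, num_out):
    # Key-based regrouping: an output partition reads from every input
    # partition that holds at least one of its keys.
    parents = {i: set() for i in range(num_out)}
    for index, part in enumerate(parts):
        for key, _ in part:
            parents[ord(key) % num_out].add(index)
    return parents

def is_narrow(parents):
    # Narrow here means each output partition has at most one input partition.
    return all(len(inputs) <= 1 for inputs in parents.values())

narrow = narrow_parents(len(partitions))
wide = wide_parents(partitions, num_out=2)

print(narrow, is_narrow(narrow))  # {0: {0}, 1: {1}} True
print(wide, is_narrow(wide))      # {0: {0, 1}, 1: {0, 1}} False
```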
