Spark Executor and its memory

In the Spark Application Architecture post, we discussed Apache Spark architecture concepts. As we saw, tasks are the fundamental unit of work in Spark, and we are going to use them here to talk about the Spark Executor and its memory. In the first section, “Tasks and Partitions”, we are going to see the relation among tasks, partitions and the hardware. In the second section, “On-Heap and Off-Heap Memory”, we talk about the executor memory with a special focus on the On-Heap memory. In the third section, “Reserved, Unified and User Memories”, we describe the On-Heap memory in more detail and how it’s used. In the fourth, “Unified Memory: Storage and Execution”, we unveil some details about how this memory behaves according to the size of the objects being stored in it. ...

September 24, 2024 · Leandro Kellermann de Oliveira

Apache Spark Application Architecture

In this post, I’d like to present some concepts for a better understanding of Apache Spark applications. Most of the content here is available in many books, blog posts, paid courses and free YouTube videos. Here, I just compiled these materials and added some important details from my own experience. This text is divided into three sections. In the first section, “Apache Spark Components Overview”, I present the basic Apache Spark components and their respective roles when executing an application, as well as the composition of an Apache Spark application. In the second section, “Actions, Transformations and Lazy Evaluation”, I discuss these three important concepts, which are frequently mentioned in the first section as well as in every text about Apache Spark. The third section is the Conclusion, where I wrap up the previous sections. ...

September 22, 2024 · Leandro Kellermann de Oliveira