What is RDD in Spark | Hadoop Interview Questions and Answers | Big Data Interview Questions
We know that Spark is an in-memory compute system, so we need to understand the components that give Spark the ability to process data in memory. The first such component is the RDD.
Welcome to Apache Spark interview questions and answers, powered by Acadgild. In this video, Mr. Sudhanshu, a Data Scientist, explains Hadoop interview questions and answers specifically on Apache Spark. If you have missed the master video of interview questions and answers, kindly click the following link.
Top 20 Apache Spark Interview Questions – https://www.youtube.com/watch?v=Y8LKEDyA5iY
In this video, the mentor explains the answer to the following question.
What is RDD in Spark?
RDD stands for Resilient Distributed Dataset.
What is Resilient?
Resilient refers to Spark's ability to achieve fault tolerance in case of a job failure. To achieve fault tolerance, Spark builds a lineage graph: metadata describing the chain of transformations that Spark will execute inside its executors. If part of the data is lost, Spark can recompute it by replaying the transformations recorded in the lineage graph, rather than relying on data replication.
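To make the lineage idea concrete, here is a minimal sketch in plain Python (not Spark's actual API; the `TinyRDD` class and its methods are illustrative assumptions). Each dataset records only its parent and the transformation applied, so any result can be recomputed from the source by replaying the lineage.

```python
# Minimal sketch (plain Python, no Spark) of the idea behind an RDD's
# lineage graph: each dataset remembers the parent it was derived from
# and the transformation applied, so a lost result can be recomputed
# instead of being restored from a replica. Names here are illustrative,
# not Spark's real API.

class TinyRDD:
    def __init__(self, parent=None, transform=None, source=None):
        self.parent = parent        # upstream dataset in the lineage graph
        self.transform = transform  # function applied to the parent's data
        self.source = source        # raw data, only for the root dataset

    def map(self, fn):
        # Record the transformation lazily; nothing is computed yet.
        return TinyRDD(parent=self, transform=lambda data: [fn(x) for x in data])

    def filter(self, pred):
        return TinyRDD(parent=self, transform=lambda data: [x for x in data if pred(x)])

    def compute(self):
        # Walk the lineage back to the source and replay the transformations.
        if self.parent is None:
            return list(self.source)
        return self.transform(self.parent.compute())

root = TinyRDD(source=[1, 2, 3, 4, 5])
doubled = root.map(lambda x: x * 2)
big = doubled.filter(lambda x: x > 4)

# Even if a computed result is "lost", it can be rebuilt from lineage:
print(big.compute())  # [6, 8, 10]
```

Real Spark RDDs work the same way in spirit: transformations like `map` and `filter` are lazy and only extend the lineage graph, and a lost partition is recovered by recomputing it from that graph.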
Thank you for watching the video. Please like, comment, and subscribe to the channel for more videos.
For more updates on courses and tips follow us on: