The Big Data Show
  • Videos: 136
  • Views: 855,664
Big Data Mock Interview | Data Engineering Interview | First Round of Interview
Data Engineering Mock Interview
Join Nisha, a Data Engineering professional with over 5 years of experience, and Sai Varun Kumar Namburi for an exciting and informative Data Engineering mock interview session.
If you're preparing for a Data Engineering interview, this is the perfect opportunity to enhance your skills and increase your chances of success. The mock interview simulates a real-life interview scenario and provides valuable insights and guidance. The topics covered include #apachespark SQL, #snowflake, ETL pipelines, data modelling, database technologies, cloud platforms, and more. You'll get to see how professionals tackle technical questions and problem-solving cha...
Views: 268

Videos

How to read from APIs in PySpark codebase...
896 views · 14 days ago
PySpark mini project: Dive into the world of big data processing with our PySpark Practice playlist. This series is designed for both beginners and seasoned data professionals looking to sharpen their Apache Spark skills through scenario-based questions and challenges. Not all inputs come from storage files like JSON, CSV and other formats. There can be cases where you are given a scenario ...
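As a rough illustration of the pattern (not the video's actual code; the endpoint and field names are hypothetical), a minimal PySpark sketch that fetches JSON records from an API on the driver and turns them into a DataFrame:

    # Minimal sketch: ingest a small JSON API response into a Spark DataFrame.
    import requests
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("api-ingest").getOrCreate()

    # Fetch on the driver; suitable for small payloads only.
    response = requests.get("https://api.example.com/stores")  # hypothetical endpoint
    response.raise_for_status()
    records = response.json()  # expected shape: a list of JSON objects

    # Let Spark infer the schema from the parsed records.
    df = spark.createDataFrame(records)
    df.show()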
Data Engineering Interview at top product based company | First Round
3.9K views · 21 days ago
Data Engineering Mock Interview In top product-based companies like #meta #amazon #google #netflix etc, the first round of Data Engineering Interviews checks problem-solving skills. It mostly consists of screen-sharing sessions, where candidates are expected to solve multiple SQL and DSA problems, particularly in #python. We have tried to replicate the same things by asking multiple good SQL an...
First round of Data Engineering Interview at product based company
1 view
Data Engineering Mock Interview Join a Staff Data Engineer & Senior Data Engineer for a wonderful Data Engineering Mock Interview. If you're preparing for a Data Engineering interview, this is the perfect opportunity to enhance your skills and increase your chances of success. The mock interview simulates a real-life scenario and provides valuable insights and guidance. It includes discussion o...
What is topic, partition and offset in Kafka?
464 views · a month ago
This is the third video of our "Kafka for Data Engineers" playlist. In this video, we try to understand topics, partitions and offsets in Apache Kafka in depth. Understanding and visualizing Apache Kafka at its core is very important for grasping its concepts deeply. Stay tuned to this playlist for all upcoming videos. Join me on Social Media: 🔅 Topmate (For collaboration and Scheduli...
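As a rough companion to the video's concepts (the topic name and broker address below are assumptions), a minimal kafka-python consumer that prints each record's topic, partition and offset:

    # Minimal sketch: every record belongs to one topic, lives in exactly one
    # partition, and has a monotonically increasing offset within that partition.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "orders",                           # topic: a named stream of records
        bootstrap_servers="localhost:9092",
        auto_offset_reset="earliest",       # start from the oldest retained offset
    )
    for record in consumer:
        print(record.topic, record.partition, record.offset, record.value)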
Brokers in Apache Kafka | Replication factor & ISR in Kafka
282 views · a month ago
This is the fourth video of our "Kafka for Data Engineers" playlist. In this video, we try to understand brokers, the replication factor and ISR. Understanding and visualizing Apache Kafka at its core is very important for grasping its concepts deeply. Stay tuned to this playlist for all upcoming videos. Join me on Social Media: 🔅 Topmate (For collaboration and Scheduling calls) - t...
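As a hedged illustration (broker address and topic name are assumptions), creating a topic with a replication factor of 3 via kafka-python's admin client; each partition then gets one leader broker plus two followers, and the fully caught-up followers form the ISR (in-sync replicas):

    # Minimal sketch: a replication factor of 3 needs at least 3 brokers.
    from kafka.admin import KafkaAdminClient, NewTopic

    admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
    admin.create_topics([
        NewTopic(name="orders", num_partitions=3, replication_factor=3)
    ])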
Job, Stage and Task in Apache Spark | PySpark interview questions
946 views · a month ago
In this video, we explain the concept of Jobs, Stages and Tasks in Apache Spark and PySpark. We have gone in-depth to help you understand the topic, but it's important to remember that theory alone may not be enough. To reinforce your knowledge, we've created many problems for you to practice on the same topic in the community section of our YouTube channel. You can find a link to all the questions...
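As a minimal sketch of the idea (not the video's code): one action triggers one job, a shuffle boundary splits the job into stages, and each stage runs one task per partition:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("job-stage-task").getOrCreate()

    df = spark.range(1_000_000)                                # narrow: no shuffle
    counts = df.groupBy((df.id % 10).alias("bucket")).count()  # wide: shuffle boundary

    counts.collect()  # the action: triggers one job with (roughly) two stages
    # Stage 1 partially aggregates each input partition (one task per partition);
    # stage 2 reads the shuffled data and produces the final counts. The Spark UI
    # (http://localhost:4040 by default) shows the job/stage/task breakdown.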
Unlocking Apache Kafka: The Secret Sauce of Event Streaming
622 views · a month ago
This is the second video of our "Apache Kafka for Data Engineers" playlist. In this video, we give a brief overview of Apache Kafka and then explore the real meaning of events and event streaming. Understanding and visualizing Apache Kafka at its core is very important for grasping its concepts deeply. Stay tuned to this playlist for all upcoming videos. Join me o...
Unleashing #kafka Magic: What Data Engineers Do with Apache Kafka?
1.3K views · a month ago
This is the first video of our "Apache Kafka for Data Engineers" playlist. In this video, we discuss a real use case: a big data pipeline involving Kafka of the kind often used in the e-commerce industry by companies like Amazon, Walmart, etc. It is very important to understand some of the real use cases of Apache Kafka in the Data Engineering domain. I hope this video will set up the tone for t...
Repartition vs. Coalesce in Apache Spark | PySpark interview questions
495 views · a month ago
During a Data Engineering interview, you may be asked about concepts related to #apachespark. In this video, we explain the difference between Repartition and Coalesce in Apache Spark and PySpark. We go in-depth to help you understand the topic, but it's important to remember that theory alone may not be enough. To reinforce your knowledge, we've created over ten problems for you to practice on ...
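A minimal sketch of the difference (not the video's code): repartition(n) performs a full shuffle and can increase or decrease the partition count, while coalesce(n) only merges existing partitions, so it can only decrease it:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("repartition-vs-coalesce").getOrCreate()

    df = spark.range(1000).repartition(8)      # full shuffle into 8 partitions
    print(df.rdd.getNumPartitions())           # 8

    merged = df.coalesce(2)                    # merges partitions, no full shuffle
    print(merged.rdd.getNumPartitions())       # 2

    grown = merged.coalesce(16)                # coalesce cannot grow the count
    print(grown.rdd.getNumPartitions())        # still 2

    reshuffled = merged.repartition(16)        # repartition can grow it
    print(reshuffled.rdd.getNumPartitions())   # 16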
Apache Spark End-To-End Data Engineering Project | Apple Data Analysis
21K views · a month ago
Dive into the world of big data processing with our PySpark Practice playlist. This series is designed for both beginners and seasoned data professionals looking to sharpen their Apache Spark skills through scenario-based questions and challenges. Each video provides step-by-step solutions to real-world problems, helping you master PySpark techniques and improve your data-handling capabilities....
Sports Data Analysis using PySpark - Part 02
931 views · a month ago
Dive into the world of big data processing with our PySpark Practice playlist. This series is designed for both beginners and seasoned data professionals looking to sharpen their Apache Spark skills through scenario-based questions and challenges. Each video provides step-by-step solutions to real-world problems, helping you master PySpark techniques and improve your data-handling capabilities....
Narrow vs. Wide Transformation in Apache Spark | PySpark interview questions
647 views · a month ago
Sports Data Analysis using PySpark - Part 01
1.3K views · a month ago
Big Data Mock Interview | Data Engineering Interview | First Round of Interview
6K views · a month ago
Data Engineering Interview
4K views · 2 months ago
Data Engineering Interview | PySpark Questions | Manager behavioural questions
6K views · 2 months ago
Data Engineering Interview at top product based company | First Round
11K views · 2 months ago
Big Data Mock Interview | Data Engineering Interview | First Round of Interview
7K views · 3 months ago
Big Data Mock Interview | Data Engineering Interview
16K views · 3 months ago
AWS Data Engineering Interview
21K views · 3 months ago
Data Engineering Interview | System Design
22K views · 3 months ago
System Design round of #dataengineering interview
15K views · 3 months ago
First round of Big Data Engineering #interview
2.4K views · 4 months ago
System Design round of Data Engineering #interview at top product-based company
41K views · 4 months ago
Big Data Mock Interview | First Round
27K views · 4 months ago
Data Engineering Mock Interview at Top Product Based Companies
10K views · 5 months ago
Data Engineering #mockinterview | Myntra | Part 2
17K views · 5 months ago
Data Engineering Mock Interview | Myntra | Part 1
6K views · 5 months ago
Data Engineering Mock Interview | Myntra
32K views · 7 months ago

Comments

  • @adityeshchaturvedi6553 · a day ago

    Great explanation Ankur !!

  • @adityeshchaturvedi6553 · 2 days ago

Great video Ankur. Been following your content and blogs via LinkedIn. Congrats!!

  • @dhruvingandhi1114 · 2 days ago

Hello, I am getting an error when reading the Delta table that is in the default database, at 01:21:50: IllegalArgumentException: Path must be absolute: default.customer_delta_table_persist. Please help me through that.
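    For anyone hitting the same thing: format("delta").load() expects an absolute filesystem path, so passing a metastore table name raises this exception. A minimal sketch of the likely fix, assuming the table is registered in the default database:

      # Read a metastore-registered Delta table by name instead of by path.
      df = spark.read.table("default.customer_delta_table_persist")
      # The path-based form needs an absolute path (hypothetical example):
      # df = spark.read.format("delta").load("/tmp/delta/customer_delta_table_persist")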

  • @unknown_fact1586 · 2 days ago

Please mention the interviewee's experience either in the caption or in the thumbnail. It would be helpful.

  • @Ravi-oy8zl · 3 days ago

The best playlist for anyone who wants to learn Kafka in the data engineering domain. Every video has a clear-cut explanation. Hope it will be continued. It can be a one-stop tutorial for those who want to learn Kafka.

    • @TheBigDataShow · 3 days ago

Thank you for your kind words. I will continue this in a few days. I have been stuck with work for many days. Hope I get free time soon.

  • @teox4571 · 3 days ago

    thanks!

  • @sandeepmodaliar6980 · 6 days ago

The Python program is an interesting one. Assuming a value in the list is a store ID and k is the distance or proximity within which another store with the same ID shouldn't exist, we can store a list as the dict value holding count, start_indx and end_indx. If the count is 2, we check whether the diff, i.e. end_indx - start_indx, is <= k; if it is, we return that value, and otherwise we iterate the dict for the other values.

  • @tejachillapalli8812 · 11 days ago

    dictionary = dict()
    for index, value in enumerate(nums):
        if value in dictionary:
            if abs(dictionary[value] - index) <= k:
                print(True)
                break
        dictionary[value] = index
    else:  # for-else: runs only if the loop never breaks
        print(False)

  • @siddheshchavan2069 · 12 days ago

    Great series, can you upload more such videos with complex problems and bigger datasets

    • @TheBigDataShow · 12 days ago

Please check the other videos from the same playlist. We have uploaded nearly 4 videos.

  • @stylishsannigrahi · 13 days ago

    sum_val = 0

    def sum_of_vals_recursive(my_dict):
        global sum_val
        if type(my_dict) == dict:  # processing logic for a dictionary
            for k, v in my_dict.items():
                if type(v) == int or type(v) == float:  # the value is a simple int/float
                    sum_val = sum_val + v
                else:
                    sum_of_vals_recursive(v)  # invoked for the 1st nested level
        else:  # nested case: the value is a list, set or tuple
            for x in my_dict:  # iteration logic for set, tuple, list
                if type(x) == int or type(x) == float:  # type check of each element
                    sum_val = sum_val + x
                else:
                    sum_of_vals_recursive(x)  # handles the next level of nesting
        return sum_val

    inp_dict = {'ireland': 100, 'india': [200, 300, [400, [200, 200], 400]], 'uk': {'scotland': [50]}}
    print(sum_of_vals_recursive(inp_dict))

  • @014_amitdwivedi6 · 15 days ago

Sir, in the first pipeline I am getting an error that a 'str' object has no attribute 'write'.

    • @TheBigDataShow · 14 days ago

Share the code snippet where you are getting errors. Also, have you searched Stack Overflow for it?

    • @dante421 · 12 days ago

Sir, can you please reply to my question? @TheBigDataShow

  • @user-rh1hr5cc1r · 16 days ago

Parquet stores the file in a hybrid format, not a purely column-based one (80% true).

  • @650jitu · 17 days ago

Is the next video available?

    • @TheBigDataShow · 14 days ago

I am a little busy these days. It will be released in a few days.

    • @RohanKumar-gx3iy · 4 days ago

@TheBigDataShow It's OK, please take your time, but please do continue with the next video; you are teaching it great. Also, please show some practical implementation of Apache Kafka along with the theoretical concepts. It would be very helpful.

  • @dante421 · 18 days ago

Will I be able to switch into data engineering after watching and practicing the project? Will I be able to tell my interviewer that I did this project in my current company?

    • @TheBigDataShow · 12 days ago

Yes, but you have to work hard and learn all the concepts. Just completing one project will not help you get a job. You have to learn multiple technologies and frameworks to get into the Data Engineering domain.

  • @saladilakshminarayana9871 · 19 days ago

Can you please share the code, dataset and API endpoint? Also, we are expecting one more session on the optimal approach for this problem.

  • @VenkatesanVenkat-fd4hg · 20 days ago

Can you share the dataset links?

  • @sarathkumar-tr3is · 21 days ago

    def distinct_ind(l, k):
        seen = {}  # value -> most recent index (avoids shadowing the dict builtin)
        for i in range(len(l)):
            if l[i] in seen and abs(i - seen[l[i]]) <= k:
                return True
            seen[l[i]] = i  # update even when already seen, so the index stays current
        return False

  • @abhiksaha3451 · 21 days ago

Can you also set up data engineering interviews with respect to the GCP ecosystem?

  • @mohitbhandari1106 · 21 days ago

I think the first SQL problem can be done using GROUP BY as well, instead of a window function.

  • @sarathkumar-tr3is · 23 days ago

    2. SQL solution:

    SELECT name
    FROM (
        SELECT e.name,
               DATEDIFF(day, p.promotion_date, l.leave_start) AS d_diff
        FROM employee e
        JOIN promotions p ON e.employee_id = p.employee_id
        JOIN leaves l ON e.employee_id = l.employee_id
    ) A
    WHERE d_diff = 1;

    • @kiranmudradi3927 · 21 days ago

I think d_diff should be >= 1. Let's say an employee got promoted on a date that falls on a Friday and takes leave from the following Monday; the d_diff is then greater than 1, so this record won't be counted, right? Just sharing my thoughts on an edge case.

    • @sarathkumar-tr3is · 21 days ago

@kiranmudradi3927 Hey, thanks for covering that.

  • @sarathkumar-tr3is · 23 days ago

    1. SQL solution (aliasing the rank as rnk, since RANK is a reserved keyword in many dialects):

    SELECT name
    FROM (
        SELECT e.name,
               d.department_name,
               DATEDIFF(day, e.hire_date, p.promotion_date) AS day_count,
               RANK() OVER (
                   PARTITION BY d.department_name
                   ORDER BY DATEDIFF(day, e.hire_date, p.promotion_date) DESC
               ) AS rnk
        FROM employee e
        JOIN promotions p ON e.employee_id = p.employee_id
        JOIN departments d ON e.department_id = d.department_id
    ) A
    WHERE rnk = 1;

  • @DE_Pranav · 23 days ago

    great questions, thank you for this video

  • @VishalSharma-lz6ky · 23 days ago

Awesome mock interview. And the last question was very good: how does it save time if you are reading from disk?

  • @sarathkumar-tr3is · 26 days ago

It would be great if you attached the SQL and DSA questions in the comments or the description.

    • @TheBigDataShow · 26 days ago

      Are the questions not clear from the video?

  • @shubhamkashid6919 · 26 days ago

    Please break down the video into topics.

  • @vishalbhandari8875 · 27 days ago

    WHAT AN EXPLANATION. KEEP UP THE GOOD WORK

  • @santypanda4903 · a month ago

    Is this the full video? Where is the link? I thought it got abruptly cut at the end.

  • @cuccuckute7758 · a month ago

    waiting for 70 days...

  • @Someonner · a month ago

    AMEX also asks the same question

  • @ashwinraje6520 · a month ago

Just completed this project after a lot of debugging. Got to learn about the factory design pattern. Is this pattern typically used in production environments? Thank you Ankur for creating such a quality project!

    • @TheBigDataShow · a month ago

Yes, a lot. Try learning the builder, singleton and companion patterns, and low-level design next.
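      For readers curious about the pattern discussed here, a minimal factory sketch in PySpark; class and function names are hypothetical, not the project's actual code:

        # Factory pattern: callers ask for a reader by format name and never
        # depend on the concrete class they get back.
        from pyspark.sql import SparkSession, DataFrame

        spark = SparkSession.builder.appName("reader-factory").getOrCreate()

        class CSVReader:
            def read(self, path: str) -> DataFrame:
                return spark.read.format("csv").option("header", "true").load(path)

        class ParquetReader:
            def read(self, path: str) -> DataFrame:
                return spark.read.format("parquet").load(path)

        def reader_factory(file_format: str):
            readers = {"csv": CSVReader, "parquet": ParquetReader}
            if file_format not in readers:
                raise ValueError(f"Unsupported format: {file_format}")
            return readers[file_format]()

        # df = reader_factory("csv").read("/data/input.csv")  # hypothetical path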

  • @gagansingh3481 · a month ago

Where can we learn PySpark from scratch to advanced level, with Databricks?

  • @anshusharaf2019 · a month ago

Hey Ankur, Anshu this side. First of all, thanks for your amazing effort. I'm a little bit confused about the source file (the extraction part) you explained in the videos. We have used sources like CSV, Parquet, and Delta tables, but these are the file types where you keep the data as a source; what is the actual source of the data? For example, say we have some ABC database and I export the data in CSV, Parquet, and other file formats; my data source would then be the ABC database. Is that the right way to think about it? @Ankur

  • @MohitKumar-ex1pk · a month ago

Do interviewers really give this much leverage? In most of the interviews I gave, I was not even allowed to use any IDE; I had to write the code in Notepad. I'd suggest all budding Data Engineers practice and remember at least the basic syntax: window, to_date(), groupBy() are very common and used extensively.
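    A quick practice sketch exercising exactly that syntax (data and column names are made up):

      from pyspark.sql import SparkSession, Window
      from pyspark.sql.functions import to_date, col, rank, sum as sum_

      spark = SparkSession.builder.appName("syntax-practice").getOrCreate()

      df = spark.createDataFrame(
          [("s1", "2024-01-01", 10), ("s1", "2024-01-02", 30), ("s2", "2024-01-01", 20)],
          ["store_id", "sale_date", "amount"],
      ).withColumn("sale_date", to_date(col("sale_date"), "yyyy-MM-dd"))

      # groupBy(): total amount per store
      df.groupBy("store_id").agg(sum_("amount").alias("total_amount")).show()

      # window + rank(): order each store's sales by amount
      w = Window.partitionBy("store_id").orderBy(col("amount").desc())
      df.withColumn("rnk", rank().over(w)).show()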

    • @TheBigDataShow · a month ago

Not always, but in some interviews they do allow it. Our aim is to demonstrate more interview-related problems so that they can help interviewees in their preparation.

    • @MohitKumar-ex1pk · a month ago

@TheBigDataShow No doubt about that. Being an experienced DE, I find these mock interview questions very relatable. You guys are doing a great job :-)

  • @debabratabar2008 · a month ago

    Is the below correct?

    df_count = example_df.count()  ----> transformation
    example_df.count()             ----> job?
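    For context, a small sketch (assuming an existing SparkSession named spark): count() is an action, not a transformation, so each call triggers a job; the assignment only stores the returned integer.

      df = spark.range(100)

      df_count = df.count()   # action: triggers job #1, returns a Python int
      df.count()              # action: triggers job #2 (nothing cached in between)

      # A transformation, by contrast, is lazy and triggers no job by itself:
      doubled = df.selectExpr("id * 2 AS id")  # no job until an action runs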