List: Data Lake | Curated by Krut Patel | Medium

Jun 6, 2024

6 stories

Data Lake

Daniel Craciun

Stop Using UUIDv4 in Your Database

How UUIDs can Destroy SQL Database Performance

May 16, 2024

In Data Engineer Things by Kerrache Massipssa

Apache Spark Partitioning and Bucketing

Learn partitioning and bucketing with Apache Spark (PySpark), and understand how and when to use each.

Dec 14, 2023

Shruti Ghoradkar

Spark Scenario-Based Interview Questions Part II.

Link to Part I: https://medium.com/me/stats/post/1fd3485c2911

Feb 4, 2024

Vengateswaran Arunachalam

Mastering Spark Memory Allocation for 1 Billion Rows

Processing big data efficiently in Spark is an art. Here’s how you can estimate the memory needed for processing a 1 billion row table with…

Nov 14, 2023

In Towards Data Mesh by Amine Kaabachi

2023 — Rockstar Data Engineer Roadmap

This article presents a roadmap for those who want to become Data Engineers in 2023. It also serves as a reference to learn and improve…

Jan 1, 2023

In TDS Archive by 💡Mike Shakhomirov

Data pipeline design patterns

Choosing the right architecture with examples

Jan 2, 2023

Krut Patel

Machine Learning Engineer | Computer Vision | iamkrut.github.io
