Hey you all! 👋🏻
It's Josep, back again this week!
Today I’m writing from Ito, relaxing for a couple of days in this gorgeous Onsen land! However, during the last week I’ve been visiting Kyoto and Osaka (after spending a whole week in Tokyo).
My trip is about to come to an end (the 28th I’m flying back to Europe) and I couldn’t be more excited to return to my routine!!
Today, stay with me for 5 minutes to dive deeper into SQL—trust me, it’ll be worth your time!
Now that we’ve caught up on life updates, let's dive into the important stuff 👨🏻💻.
In my previous issue, we discussed the importance of understanding SQL’s execution order— today, I want to further understand SQL with another core concept.
Modular queries.
Actually this is a concept that applies to all forms of coding.
Before we dive in, I strongly recommend checking out my previous issue on Understanding SQL’s Execution Order (if you haven't already).
And now…
Let’s get started with the funny part 💥
Reusable and Understandable Queries
Modular coding is a best practice in any programming language, and SQL is no exception.
Have you ever wondered how to create SQL queries that are both reusable and easy to understand?
Enter Common Table Expressions (CTEs) — the secret to modular SQL!
But first… why does this matter?
Imagine you're tasked with calculating key metrics for a department in your company, but you’ve inherited a mess of queries from a former colleague who left months ago. Suddenly, you’re facing a nightmare: the queries are disorganized, and you can’t make sense of any of them.
Here are the typical issues you’ll encounter:
Lack of documentation: Every query follows its own unique structure, making it hard to trace logic.
Inconsistent metrics: Without standardized CTEs, metrics are computed in different ways, leading to errors.
Poor collaboration: If others can’t understand your queries, it’s hard to work together effectively.
Think of CTEs as temporary helpers that cleanly organize your query’s logic into structured, readable sections, making your SQL both reusable and easy to follow.
So let’s start from the beginning…
#1. What are CTEs?
CTEs, or Common Table Expressions, are temporary result sets that can be referenced multiple times within the same SQL query.
They always start with the WITH command, which helps make SQL queries modular and more readable.
CTEs act as the building blocks for SQL modules.
Each module, defined within a CTE, can be considered a named result set. These temporary tables can be referenced multiple times within the same query, giving them modular capabilities.
You can chain multiple CTEs together, and even nest them, as in the following example:
Here, Table1 and Table2 can be reused throughout the query, turning SQL into self-documenting code.
And what is more important, they can be used in other queries to calculate the exact same metrics… which standardize the computing process and makes sure all results are exactly the same.
#2. The WITH command
The WITH command is a powerful tool in SQL that allows you to define CTEs at the beginning of a query. This enables you to break down complex logic into multiple, reusable temporary tables.
In this example:
Table1 and Table2 are defined using the WITH command.
They act like modular building blocks.
The main query can reference these CTEs as temporary tables, joining them together or applying additional logic.
#3. Reusable Temporal Tables
CTEs also enable temporal table creation.
This means that a new CTE can be created using data from previously defined CTEs. This makes CTEs highly reusable and adaptable to changing logic.
For example, a third table can be generated based on the results of previous CTEs:
These modular queries allow you to create, reference, and manipulate multiple datasets on the fly.
So you can think of them as LEGO blocks, snapping together to form the bigger picture—one that's intuitive, maintainable, and oh-so-slick.
Summary
To sum it up, the main benefits of using CTEs in your queries are:
Modularity: Queries are easier to manage because you can break them into logical units.
Reusability: Once defined, you can use a CTE multiple times within the same query.
Readability: It simplifies large, complex queries by dividing them into discrete, manageable steps.
By starting with the WITH command, you can chain multiple CTEs, giving each one a clear alias. This turns SQL queries into self-explanatory code, making them easier to understand, debug, and maintain.
Next Sunday I’ll share with you an example of modular SQL query… so stay tuned!!
Are you still here? 🧐
👉🏻 I want this newsletter to be useful, so please let me know your feedback!
Additionally, you can let me know any preference for future content or any idea you think might add more value to you!
My latest articles 📝
You can go check the whole theory in the following article and you can find the cheat sheet in the following post.
30 Azure Interview Questions: From Basic to Advanced in DataCamp.
The Transformers Architecture - What’s the magic behind LLMs? in AIgents.
Nice articles (my weekly favs)! ♥
This week I haven’t had time to read any, but next issue will come with its habitual set of weekly favs 😉
My Favorite Resources! 👨🏻💻
I strongly recommend following AIgents roadmaps if you want to become a full Data Scientist, ML Engineer, or Data Engineer,
Understand ML models following my MLBasics series!
You can follow the Data Engineer Zoomcap on GitHub to become a fully proficient Data Engineer.
Want to learn GCP? You can follow The Cloud Girl and learn using her intuitive illustrations!
Want to get more of my content? 🙋🏻♂️
Reach me on:
LinkedIn, X (Twitter), or Threads to get daily posts about Data Science.
My Medium Blog to learn more about Data Science, Machine Learning, and AI.
Just email me at rfeers@gmail.com for any inquiries or to ask for help! 🤓
Remember now that DataBites has an official X (Twitter) account and LinkedIn page. Follow us there to stay updated and help spread the word! 🙌🏻