Hey you all!
It's Josep here one more week!
Today, I'm writing from a charming beach town just an hour away from Barcelona. I needed some peace and rest to plan the upcoming weeks (especially after August 15th 🔥), so I'll be staying here for a while!
The most important news of the week?
I am starting brand-new X (Twitter) and LinkedIn accounts for DataBites! Follow them for the latest insights and to help spread the word about our newsletter.
In other exciting news, I'm collaborating with my colleague Abid Ali Awen on an interesting - and exciting - new ML project. Stay tuned for more updates on this endeavor.
Another significant milestone is my upcoming role as the digital ambassador for a company I truly admire. I'll share more details next week once everything is finalized.
Lastly, a HUGE THANK YOU to the 250 of you who voted in my previous issue poll.
So… what were the results?
200 of you voted to get new content on YouTube!
This is why I'm thrilled to announce that I'll be planning my first-ever YouTube video this week, with a target release in the second week of August. Wish me luck!
Now that we've caught up on life updates, let's dive into the important stuff 👨🏻‍💻.
Following up on my second DataBites issue, today I want to delve into the art of mastering Data Visualization (or DataViz for friends).
Why is this important?
Understanding data visualization is essential for anyone working with data. No matter how technically skilled you are, if you can't effectively communicate your insights and findings to others, your efforts may fall short.
Data visualization, or making data look good in charts and graphs, might not seem as cool as stuff like machine learning.
But it's really a key part of what a Data Scientist does.
It serves as a crucial bridge, allowing us to present our work to non-technical audiences in an understandable way. Even those with technical expertise benefit from clear, well-designed charts and graphs.
When discussing DataViz, certain questions frequently arise.
But what's the magic behind a good visualization?
Why does one visualization enlighten while another confuses?
Today, we're returning to the basics to understand the fundamentals of data visualization.
Breaking DataViz Down to Its Basics
Telling a story efficiently is one of the hardest skills to master as a data professional. If we look up the term Data Visualization in a dictionary, we find the following definition:
"The act of representing information as a picture, diagram or chart, or a picture that represents information in this way"
This basically means that Data Visualization aims to craft a story from the dataset, presenting insights in a form that's digestible, appealing, and impactful.
Any chart is always composed of two main components:
#1. Data Types
I bet you are thinking of data as numbers, but numerical values are only one of several types of data we may encounter.
This is why, whenever we visualize data, we should always consider what types of data we are dealing with.
Any displayed data can always be described in one of the following seven types of data:
Numerical or quantitative
Discrete: Countable values, often whole numbers. Example: Number of students in a class, number of cars in a parking lot.
Continuous: Values within a range, can be measured but not counted. Example: Height, weight, temperature.
Categorical or qualitative
Nominal: Categories with no inherent order. Example: Colors (red, blue, green), gender (male, female).
Ordinal: Categories with a meaningful order but no fixed intervals between them. Example: Rankings (first, second, third), satisfaction levels (satisfied, neutral, dissatisfied).
Date and Time: Data points collected or recorded at specific points in time. Example: Stock prices over time, daily temperature readings.
Geographical: Data related to locations on the Earth. Example: Coordinates, geographic maps.
Text: Unstructured data in the form of text. Example: Emails, social media posts, articles.
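To make the seven types a bit more tangible, here is a minimal Python sketch that guesses the type of a column from its values. Everything in it is my own illustration: the `DataType` enum, the `guess_data_type` helper, and its thresholds (`max_categories`, the three-word limit for labels) are hypothetical heuristics, not a standard rule, and real datasets will need your own judgment.

```python
from datetime import date
from enum import Enum


class DataType(Enum):
    DISCRETE = "numerical / discrete"
    CONTINUOUS = "numerical / continuous"
    NOMINAL = "categorical / nominal"
    ORDINAL = "categorical / ordinal"
    DATETIME = "date and time"
    GEOGRAPHICAL = "geographical"
    TEXT = "text"


def _is_number(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def _is_lat_lon(v):
    # Assumption: geographical values arrive as (latitude, longitude) tuples.
    return (
        isinstance(v, tuple)
        and len(v) == 2
        and all(_is_number(c) for c in v)
        and -90 <= v[0] <= 90
        and -180 <= v[1] <= 180
    )


def guess_data_type(values, ordered_levels=None, max_categories=10):
    """Guess one of the seven data types from a list of values (rough heuristic)."""
    values = list(values)
    if not values:
        raise ValueError("cannot guess the type of an empty column")
    if all(isinstance(v, date) for v in values):  # datetime is a subclass of date
        return DataType.DATETIME
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return DataType.DISCRETE
    if all(_is_number(v) for v in values):
        return DataType.CONTINUOUS
    if all(_is_lat_lon(v) for v in values):
        return DataType.GEOGRAPHICAL
    if all(isinstance(v, str) for v in values):
        # Ordinal data needs the order to be supplied: it cannot be inferred.
        if ordered_levels is not None and set(values) <= set(ordered_levels):
            return DataType.ORDINAL
        # Few distinct, short labels look like categories; anything else is free text.
        if len(set(values)) <= max_categories and all(len(v.split()) <= 3 for v in values):
            return DataType.NOMINAL
        return DataType.TEXT
    raise ValueError("mixed or unsupported values")


print(guess_data_type([23, 31, 28]))            # students per class
print(guess_data_type([1.72, 1.80, 1.65]))      # heights
print(guess_data_type(["red", "blue", "green"]))
```

Note that the ordinal case only works when you pass the order yourself through `ordered_levels`, which mirrors the point above: the order is meaning that lives in your head, not in the raw values.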
Once we are clear about what kind of data we have, we need to understand how to encode this data into final charts.
#2. Encoding Information: The Visual Lexicon
Visual encoding is at the core of data visualization. It translates abstract numbers into graphical representations, a language we're all fluent in.
Showing shapes and visuals is like speaking a universal visual language, where everyone, regardless of their background in data analysis, can interpret the information at a glance.
But… as you must already be aware…
There are thousands of ways to encode numbers!
There are two main groups:
Retinal Encodings: Shape, size, color, and intensity are elements our eyes catch instantly. They are inherent to the element itself.
Spatial Encodings: They exploit our brain's spatial awareness to encode information. This kind of encoding can be achieved through position on a scale, a defined order, or relative sizes.
We could use all of these encodings in one chart, but it would be hard for the reader to grasp all the information quickly.
Overloading a chart with multiple encodings can be confusing, so one or two retinal encodings per chart usually works best.
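If you like to see ideas as code, here is a small Python sketch of the two groups and the "one or two retinal encodings" rule of thumb. It describes a chart as a plain dictionary mapping an encoding channel to a data field. The channel names (`position_x`, `relative_size`, and so on), the `summarize_encodings` helper, and the `max_retinal` parameter are all hypothetical names I made up for illustration; they do not come from any plotting library.

```python
# Channel names are assumptions for this sketch, grouped as described above.
RETINAL = {"shape", "size", "color", "intensity"}
SPATIAL = {"position_x", "position_y", "order", "relative_size"}


def summarize_encodings(encodings, max_retinal=2):
    """Split a chart's channels into retinal and spatial, and flag overloaded charts.

    `encodings` maps a channel name to the data field it represents.
    """
    unknown = set(encodings) - RETINAL - SPATIAL
    if unknown:
        raise ValueError(f"unknown channels: {sorted(unknown)}")
    retinal_used = sorted(ch for ch in encodings if ch in RETINAL)
    spatial_used = sorted(ch for ch in encodings if ch in SPATIAL)
    return {
        "retinal": retinal_used,
        "spatial": spatial_used,
        # More retinal channels than the threshold: probably too much for one chart.
        "overloaded": len(retinal_used) > max_retinal,
    }


scatter = {"position_x": "height", "position_y": "weight", "color": "gender"}
busy = {**scatter, "size": "age", "shape": "nationality"}

print(summarize_encodings(scatter)["overloaded"])  # one retinal channel
print(summarize_encodings(busy)["overloaded"])     # three retinal channels
```

The threshold of two is just the rule of thumb from this issue turned into a default value, so feel free to tune it to your audience.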
Always remember that less is often more, so try to create minimalist and easy-to-understand charts.
Now I know most of you must be wondering…
Which encoding should I choose?
That, my friends, depends on the story you want to weave.
So a better question might be…
What Works and What Doesn't?
While the visual arsenal at our disposal is vast, not all weapons are fit for every battle.
Think about which encodings are best for which kind of variable.
For continuous variables, like weight and height, the best representation often comes from positioning them on a scatter plot. This approach effectively communicates variations and relationships within the data.
Categorical ones, such as gender or nationality, shine when depicted by colors or spatial regions. This distinction allows these variables to stand out clearly and be easily interpreted in the context of the visualization.
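The pairing above can be sketched in a few lines of Python: continuous variables go to positions, and a categorical (nominal) variable such as gender or nationality goes to color. The `build_scatter_spec` helper and the channel names are hypothetical, and the output is only a plain dictionary describing the chart, not a call into any real plotting library.

```python
def build_scatter_spec(variables):
    """Assign variables to scatter-plot channels based on their kind.

    `variables` maps a variable name to "continuous" or "categorical".
    The first two continuous variables become the axes; the first
    categorical variable, if any, is shown through color.
    """
    continuous = [name for name, kind in variables.items() if kind == "continuous"]
    categorical = [name for name, kind in variables.items() if kind == "categorical"]
    if len(continuous) < 2:
        raise ValueError("a scatter plot needs at least two continuous variables")

    spec = {"position_x": continuous[0], "position_y": continuous[1]}
    if categorical:
        spec["color"] = categorical[0]
    return spec


spec = build_scatter_spec(
    {"height": "continuous", "weight": "continuous", "nationality": "categorical"}
)
print(spec)
```

Treat it as a starting point: as the next section argues, a less "effective" encoding can still be the right choice when it serves the story.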
The Essence of a Good Visualization
Striking a balance between aesthetics and functionality is pivotal. And while some encodings may seem less effective, they can be chosen deliberately to make a statement or evoke an emotion.
In our age of an ever-increasing flow of data, the importance of crafting meaningful visual narratives cannot be overstated, especially when trying to communicate our insights to non-data professionals.
Good data visualization isn't just about presenting numbers. It's about articulating our data around a story: bringing it to life and forging connections between raw information and real-world implications.
As data lovers, it's our art, our language, and our bridge to the whole world.
So next time you craft a chart, remember…
You're not just a data professional, you're a storyteller!
Are you still here?
I want this newsletter to be useful for everyone, so…
If you have any suggestions or preferences for future content, feel free to let me know!
My latest articles
5 Simple Steps to Automate Data Cleaning with Python in KDnuggets
7 Steps to Mastering Large Language Model Fine-tuning in KDnuggets.
Getting Started with LLMOps: The Secret Sauce Behind Seamless Interactions in KDnuggets.
How to Fine-tune and Deploy LLMs 10x faster using Natural Language with MonsterGPT in Medium.
Recommendations! ♥
What Would I do to LLM from the Beginning by
What's The Difference Between NumPy's `arange()` and `linspace()` by
This SQL mistake fools even experienced Data Scientists by
Want to get more of my content?
Reach me on:
Great issue Josep! And thank you for the shoutout ✨