Quick intro! :D
Hello, I am Josep
From Barcelona
I am a full-time analytics engineer & part-time content creator on Medium
Follow me for more content on SQL
Python
Analytics
Data Science
My number one #Mastodon tip for people is to boost more!
Discovery sucks on Mastodon, which is partially by design, but it also means it’s hard to find new and interesting people here. If you see something cool, reach for that boost button as well!
Do you need to program in Python but would like the simplicity of SQL syntax?
Use pandasql! It is a Python library that lets you query any pandas DataFrame directly using SQLite syntax.
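A minimal sketch of the idea, assuming a made-up `orders` DataFrame. To stay runnable with only pandas and the standard library, it does by hand what pandasql does for you: load the DataFrame into an in-memory SQLite database and run the query there. With pandasql installed, the commented `sqldf` call should give the same result in one line.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, invented for this example.
orders = pd.DataFrame({
    "customer": ["ana", "ben", "ana", "cara"],
    "amount": [20.0, 35.5, 14.5, 50.0],
})

query = """
SELECT customer, SUM(amount) AS total
FROM orders
GROUP BY customer
ORDER BY total DESC
"""

# With pandasql installed, the equivalent would be:
#   from pandasql import sqldf
#   result = sqldf(query, locals())

# Swapped-in technique: plain sqlite3 + pandas instead of pandasql itself.
conn = sqlite3.connect(":memory:")
orders.to_sql("orders", conn, index=False)  # table name matches the query
result = pd.read_sql_query(query, conn)
conn.close()

print(result)
```

The table name in the SQL matches the DataFrame's variable name, which is the convention pandasql relies on when it looks up DataFrames in the environment you pass to it.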
Do you want to start writing online but do not know where to start?
Start writing on Medium!
It is the easiest platform to start from scratch on, with a pre-built audience.
What I've learned working at a startup
- Expectations will determine your commitment.
- You will have a wide view of the company - which is an extra motivation boost!
- But... you will be working extra hours. If you do not do the work, no one else will.
5/ "Data Science is just hype"
Far from it - data-driven decision-making is here to stay
4/ "Correlation = Causation"
Remember, correlation doesn't imply causation!
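A small simulated illustration of this point, using only the standard library. The scenario (temperature driving both ice cream sales and sunburns) and all the numbers are invented for the sketch: the two series end up strongly correlated even though neither causes the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    std_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (std_x * std_y)


random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical confounder: daily temperature drives both series.
temperature = [random.uniform(10, 35) for _ in range(500)]
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temperature]
sunburns = [0.5 * t + random.gauss(0, 2) for t in temperature]

r_raw = pearson(ice_cream, sunburns)

# Remove the temperature effect. We can use the true coefficients only
# because we simulated the data; on real data you would have to estimate them.
ice_cream_resid = [y - 2.0 * t for y, t in zip(ice_cream, temperature)]
sunburns_resid = [y - 0.5 * t for y, t in zip(sunburns, temperature)]
r_adjusted = pearson(ice_cream_resid, sunburns_resid)

print(f"raw correlation:      {r_raw:.2f}")
print(f"after removing temp:  {r_adjusted:.2f}")
```

Once the shared driver is accounted for, the correlation between the two series should drop to roughly zero, which is the sign that the original relationship was not causal.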
3/ "All ML models are equally good"
Different models serve different purposes.
Always choose wisely
2/ "Data scientists are glorified statisticians"
It is a multidisciplinary field combining stats, programming, domain expertise & more
1/ "More data = better results"
Quality > Quantity
Always!
Clean, relevant data is key
What have you been learning recently?
This past week I started working on my GitHub skills again! :D
A nice random project I found online: an interactive world map that lets you know where other indie makers are! :)
firstinternetdollar.com
That's a wrap!
1. Follow me for more of these.
2. Share this thread with your audience.
5/ By avoiding these common Data Engineering pitfalls, you can ensure your projects are efficient, reliable, and scalable.
Prioritize data quality, privacy, security, and governance while focusing on scalability and modern architecture.
4/ Don't discount Scalability and Modern Architecture
Design data platforms and processes with scalability and stability in mind, considering the increasing data volumes and use cases.
Adopt cloud-based solutions like Data Lakehouses and Data Mesh.
3/ Don't overlook Data Governance
Implement robust data governance procedures to ensure data accuracy, consistency, and compliance with regulations and standards.
Good data governance helps prevent data inconsistencies, duplication, and poor quality.
2/ Don't ignore Data Privacy and Security
Use secure methods to transmit, store, and process data to avoid breaches and comply with regulations like GDPR and CCPA.
Implement encryption, access controls, and monitoring tools.
1/ Don't overlook Data Quality
Ensure accurate, consistent, and reliable data through validation, testing, profiling, cleansing, and monitoring.
High-quality data is essential for accurate insights and decision-making.
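As a rough sketch of what basic validation and profiling could look like in pandas: the `quality_report` helper, the column names, and the sample rows below are all hypothetical, and a real pipeline would likely use a dedicated testing or monitoring tool on top of checks like these.

```python
import pandas as pd


def quality_report(df, key, non_negative):
    """Count a few common data quality problems in a DataFrame.

    key: column expected to be unique.
    non_negative: columns whose values should never be below zero.
    """
    return {
        "rows": len(df),
        "missing_values": int(df.isna().sum().sum()),
        "duplicate_keys": int(df[key].duplicated().sum()),
        "negative_values": {
            col: int((df[col] < 0).sum()) for col in non_negative
        },
    }


# Hypothetical sample with one missing amount, one repeated order_id,
# and one negative amount.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 4],
    "amount": [10.0, None, 25.0, -5.0],
})

report = quality_report(orders, key="order_id", non_negative=["amount"])
print(report)
```

Running a report like this on every load, and failing the job when a count is above zero, is one simple way to turn "monitoring" into something enforceable.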
Discover 4 common pitfalls to avoid in Data Engineering to ensure efficient, reliable, and scalable projects.
- A thread -
6/ Key Takeaways
There's no one-size-fits-all answer.
Reflect on your priorities, career stage, and the specific company culture.
Remember, you can always switch if something isn't working out.
Let me know your experiences in the comments!