Reading Update
A lot has happened since March. I’ve moved from Talkdesk to a startup (Fidel API) to help bootstrap a data team, and I got to read a lot of articles on a lot of topics (I’ll leave those in the Readings section). I’ve also got my hands on a personal project to help set up a POS and update a Windows Forms app to the most recent .NET version (why not XD). I’ve been revisiting my CSS + React skills and will dive a bit into Framer Motion and react-spring to see if I can create a nice UI. Finally, I’ll be picking up a new project to try and deploy a full data stack that any team starting up can use (and that should be able to scale up to some degree). For now the plan is something built mainly on dbt with an open source database.
Readings
I think I’ve missed a couple of interesting ones, but here are most of them (I’m thinking of automating this list through the Instapaper API 🤔🧐)
Data Engineering
- Zapier: The $5B unbundling opportunity
- Humans destroyed forests for thousands of years
- SQL Notebooks: Combining the power of Jupyter and SQL editors for data analytics
- Build your data pipeline in your AWS modern data platform using AWS Lake Formation, AWS Glue, and dbt Core
- Evolution of Redash at Blinkit
- Parquet and Postgres in the Data Lake
- Data tests and the broken windows theory
- Apache Flink Table Store 0.1.0 Release Announcement
- Announcing D1: our first SQL database
- Friendlier SQL with DuckDB
- Choosing a Data Catalog
- JSON and virtual columns in SQLite
- Managing PII in DataHub: A Practitioner’s Guide
- Improving speed and stability of checkpointing with generic log-based incremental checkpoints
- The State of Data Engineering 2022
- Haunted by Data - Maciej Ceglowski
- Why We Need Hive Metastore
- Data Quality and Testing Frameworks
- The metrics layer has growing up to do
- Meet Dash-AB — The Statistics Engine of Experimentation at DoorDash
- Comparison of Data Lake Table Formats (Iceberg, Hudi and Delta Lake)
- Data Engineering Best Practices: How Netflix Keeps Its Data Infrastructure Cost-Effective
- Using Apache Kafka to process 1 trillion inter-service messages
- Democratizing Metrics at Airbnb
Engineering
- Switching from pyenv, rbenv, goenv and nvm to asdf
- You should be reading academic computer science papers
- The Joy of Small Projects
- How we lost 54k GitHub stars
- How to Write a Git Commit Message
- Why we don’t use a staging environment
- I’m All-In on Server-Side SQLite
- Shipping to Production
- Are You a Cargo Cult Programmer?
- What’s new in Python 3.11?
- Remote Development at Slack
- Using architectural decision records to streamline technical decision-making for a software development project
- A Minimum Viable Product Needs a Minimum Viable Architecture
As a side note, I’ve been binge-watching a lot of Critical Role. I really love it, but it’s really time-consuming.