Reading Update
Hi there! I’ve gathered some articles, and in the meantime I’ve been reading a bit about Scala and saving some papers for a “light read”.
Architectures
The topics here are overarching, but when working with data systems, knowing distributed systems is a must.
- The Pros and Cons of DRY Code
- The Evolution of Precomputation Technology and its Role in Data Analytics
- Patterns of Distributed Systems
Python
A new proposal around Python dependencies is presented in Introducing PDM. Another good option is Poetry, with some of its advantages presented in A poetic apology.
Streaming
Kafka is one of those critical systems that has grown so much that we are now having discussions like Kafka As A Database? Yes Or No. I liked this overview of both sides and, although I still need to read the book Kafka: The Definitive Guide, the presentation Kafka as a Platform: The Ecosystem from the Ground Up by Robin Moffatt has given me a pretty good bird’s-eye view.
Databases
I’m always on the lookout for new databases like ScyllaDB (although I keep using the trusty PostgreSQL), and for those who keep using select *, the second article might be a good reason to avoid it.
- MongoDB vs Scylla at Numberly
- https://tanelpoder.com/posts/reasons-why-select-star-is-bad-for-sql-performance/
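One commonly cited cost of select * is that it fetches columns the caller never uses. Here is a minimal sketch of that single point, using Python's built-in sqlite3 module. The events table, its columns and the fetched_chars helper are made up for illustration and do not come from the linked article.

```python
import sqlite3

# Hypothetical in-memory table with one wide column we rarely need.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (name, payload) VALUES (?, ?)",
    [(f"event-{i}", "x" * 1000) for i in range(100)],
)


def fetched_chars(query):
    """Rough proxy for data transferred: total characters across all values."""
    rows = conn.execute(query).fetchall()
    return sum(len(str(value)) for row in rows for value in row)


# Same rows, but select * also drags along the wide payload column.
star_size = fetched_chars("SELECT * FROM events")
narrow_size = fetched_chars("SELECT id, name FROM events")

print(f"select *        -> {star_size} characters fetched")
print(f"select id, name -> {narrow_size} characters fetched")
```

On a real client/server database the gap can matter more, since the extra columns also travel over the network, but the effect depends on the engine and schema.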
Orchestration
Coming back to Airflow, Cloudflare has shown how truly great this technology can be for all kinds of needs in Automating data center expansions with Airflow.
Organization
In the tech environment I’m working in, I’ve actually felt the consequences of the different types of companies described in What Silicon Valley “Gets” about Software Engineers that Traditional Companies Do Not.
On a note more related to data teams, I can concur with the views in How to Drive Effective Data Science Communication with Cross-Functional Teams, which shows how important the communication of insights is, something that is often overlooked.
Be well and stay safe :-)