Reading Feed

Articles I've read with my notes and highlights

Why I’m not a fan of zero-copy Apache Kafka-Apache Iceberg by Jack Vanlightly
Introducing OpenZL: An Open Source Format-Aware Compression Framework by Chris Wiltz
  • However, while it was improved over time, remaining within the Zstandard framework offers diminishing returns. So we started looking for the next great leap in data compression.
  • General compressors rely on a one-size-fits-all processing strategy, or alternatively spend a lot of their cycles guessing which techniques to use
  • As a user, you provide OpenZL with the data shape (via a preset or a thin format description). Then the trainer, an offline optimization component, builds an effective compression config that can be re-employed for similar data. During encoding that config resolves into a concrete decode recipe that’s embedded into the frame. The universal decoder will directly execute that recipe, without any out-of-band information.
  • Describe the input: With the Simple Data Description Language (SDDL), you sketch how the bytes map to fields — rows, columns, enums, nested records. SDDL is for parsing only; it just tells OpenZL the shape of your data. Alternatively, you can write your own parser function directly using one of the supported languages, and register it with OpenZL to delegate the logic.
Data Inlining by GitHub User
  • It can be wasteful to write each changeset to an individual Parquet file.
State of AI | OpenRouter
How Tables Grew a Brain: Iceberg, Hudi, Delta, Paimon, DuckLake by Anton Borisov
  • The lake stops acting like a filesystem with extras and starts behaving like a database fronting an object store
  • Choose your ride (intent over brand): Need stable, open analytics across engines? Start with Iceberg’s snapshot model. Need continuous upserts/deletes with low-lag views? Add Paimon/Hudi’s incremental/LSM ideas where it matters. Need multi-table ACID and fast planning? Move the brain into a catalog database (DuckLake-style).
Stacked Diffs with git rebase --onto by Dinesh Pandiyan
  • Stacked diffs solve this by breaking your work into smaller, dependent PRs:
  • git rebase --onto <new-parent> <old-parent> <branch-to-rebase>
  • The marker branch pattern takes the guesswork out of tracking the old base. Use it, update it, and your stacked diffs will stay clean.
  • Force pushes are required: Every rebase changes commit hashes, so you’ll be doing git push --force-with-lease a lot.
  • Marker branches need discipline: If you forget to update your marker, your next sync will be painful. Consider aliasing the full command: alias gsync='git rebase --onto $1 $2-base $2 && git branch -f $2-base $1'
  • Merge conflicts multiply: If you have conflicts when rebasing feature-1, you might hit them again when rebasing feature-2. That’s the nature of the beast.
  • Don’t stack too deep: Two or three levels is manageable. Beyond that, the maintenance overhead outweighs the benefits. I personally try to keep it at 2 levels max.
How agents can use filesystems for context engineering by LangChain Accounts
The Continual Learning Problem
Welcome to the age of $10/month Lakehouses by Tobias Müller
Mistakes I see engineers making in their code reviews
The DuckLake Manifesto: SQL as a Lakehouse Format by GitHub User
  • Data Compaction Avoidance: DuckLake requires far fewer compaction operations than comparable formats. DuckLake supports efficient compaction of snapshots.
  • For small changes to the data, DuckLake can optionally use the catalog database to store those small changes directly to avoid writing many small files.
Amazon Bedrock now supports reinforcement fine-tuning delivering 66% accuracy gains on average over base models
  • Models learn to align with your specific requirements using a small set of prompts rather than the large amounts of data needed for traditional fine-tuning methods, enabling teams to get started quickly
  • You can define reward functions using verifiable rule-based graders or AI-based judges along with built-in templates to optimize your models for both objective tasks such as code generation or math reasoning, and subjective tasks such as instruction following or chatbot interactions. Your proprietary data never leaves AWS’s secure, governed environment during the entire customization process, mitigating security and compliance concerns.
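
The reward-function note above is easier to picture with a toy example. This is a purely illustrative sketch of what a verifiable, rule-based grader can look like, not Bedrock's actual reward-function interface; the function name and signature are made up.

```python
import re

def grade_math_answer(completion: str, expected: str) -> float:
    """Toy verifiable grader: full reward iff the final number in the completion matches."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return 1.0 if numbers and numbers[-1] == expected else 0.0

print(grade_math_answer("6 * 7 = 42, so the answer is 42", "42"))  # 1.0
print(grade_math_answer("I think it is 54", "42"))                  # 0.0
```
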
Introducing AWS DevOps Agent (preview), frontier agent for operational excellence
AWS Lambda announces durable functions for multi-step applications and AI workflows
Amazon EMR Serverless eliminates local storage provisioning for Apache Spark workloads
Amazon S3 Vectors is now generally available with 40 times the scale of preview
  • With general availability, you can store and query up to two billion vectors per index and elastically scale to 10,000 vector indexes per vector bucket
  • Infrequent queries continue to return results in under one second, with more frequent queries now resulting in latencies around 100 milliseconds or less
  • Your application can achieve write throughput of 1,000 vectors per second when streaming single-vector updates into your indexes, retrieve up to 100 search results per query, and store up to 50 metadata keys alongside each vector for fine-grained filtering in your queries.
  • You can also tag vector buckets and indexes for attribute-based access control (ABAC) as well as to track and organize costs using AWS Billing and Cost Management
How Would You Like Your Iceberg Sir? Stream or Batch Ordered? by Jack Vanlightly
  • We call the reading from the historical source bootstrapping
  • Fluss is a streaming tabular storage layer built for real-time analytics which can serve as the real-time data layer for lakehouse architectures
  • Fluss uses its own offset (akin to the Kafka offset) as the Iceberg sort order. This ensures that when Flink reads from Iceberg, it sees a temporally ordered sequence
  • It’s true some late-arrival management is harder but that’s usually overengineering
AI vs Gen Z: How AI has changed the career pathway for junior developers - Stack Overflow
Simple Control Flow for Automatically Steering Agents
  • Embedding environment state validation directly into control flow ensures the agent continues until either the task is genuinely complete or the token budget is exhausted.
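
A minimal sketch of that loop, assuming hypothetical `call_model` and `validate_environment` helpers (neither comes from the article):

```python
def run_agent(task, call_model, validate_environment, token_budget=100_000):
    messages = [{"role": "user", "content": task}]
    tokens_used = 0
    while tokens_used < token_budget:
        reply, tokens = call_model(messages)           # one model step
        tokens_used += tokens
        messages.append({"role": "assistant", "content": reply})
        ok, feedback = validate_environment()          # check real state, not the model's claim
        if ok:
            return reply                               # task genuinely complete
        # Steer the agent with the validator's feedback and keep going.
        messages.append({"role": "user", "content": feedback})
    raise RuntimeError("Token budget exhausted before the task was complete")
```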

Streaming Patterns with DuckDB by Guillermo Sanchez
Apache Parquet vs. Newer File Formats (BtrBlocks, FastLanes, Lance, Vortex) by Dipankar Mazumdar
  • AI pipelines require fast feature retrieval, vector search, and low-latency scoring.
  • Storage has changed — NVMe-backed systems and memory-mapped datasets call for fine-grained, cache-friendly data access.
  • Row groups and pages: Within this columnar layout, Parquet organizes data into row groups (commonly ~128 MB), which are further divided into column chunks and smaller pages. This structure defines clear, fixed-size chunks of data and makes it easier for query engines to parallelize scans and skip over unneeded sections, improving efficiency at scale.
  • Encodings and compression: Each page can use type-specific encodings, such as Dictionary, Run-Length Encoding (RLE) or Delta encoding, combined with block compression (Snappy, Zstd, LZ4, GZIP). This two-layer design provides both speed and compactness.
  • Statistics and filtering: Parquet stores per-page and per-column statistics such as min/max values, null counts, and distinct counts. These allow query engines to skip pages or entire row groups when predicates fall outside recorded ranges. Parquet also supports dictionary filtering (using dictionary values for comparisons) and optional bloom filters for selective reads. Together, these features make predicate pushdown highly effective (see the pyarrow sketch after these notes).
  • RAG workloads are especially sensitive to random access performance, since each query may need to fetch small slices of data across massive corpora stored on NVMe
  • BtrBlocks, developed at TUM, introduces the idea of cascaded lightweight compression (LWC). Instead of relying on heavyweight compressors like Zstd, it uses chains of lightweight encodings (bit-packing, dictionary, frame-of-reference). A greedy, sample-based algorithm selects the best chain per column segment.
  • Compressed execution. Instead of fully materializing decoded vectors, FastLanes returns compressed vectors directly to engines (e.g. DuckDB, Velox), allowing SIMD/GPU-friendly execution on compressed data.
  • Repetition index. Enables random access in 1–2 IOPS per lookup, independent of nesting depth. This is a dramatic improvement over Parquet, which scales poorly for nested data.
  • Meta introduced Nimble, a new columnar file format optimized for machine learning feature stores and very wide tables (thousands of columns). Its design goals: lightweight metadata for handling extremely wide schemas; cascaded encodings, with support for SIMD and GPU acceleration; and a portable implementation for consistent decoding across engines.
  • For practitioners, the question becomes when to rely on Parquet’s universality and when to reach for a specialized format to unlock specific benefits
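
The row-group and statistics notes above translate directly into a small pyarrow experiment. This is just one way to poke at the metadata, not anything from the article; the file name and sizes are arbitrary:

```python
import pyarrow as pa
import pyarrow.parquet as pq

n = 100_000
table = pa.table({"id": list(range(n)), "value": [i * 2 for i in range(n)]})
pq.write_table(table, "example.parquet", row_group_size=10_000)   # 10 row groups

meta = pq.ParquetFile("example.parquet").metadata
stats = meta.row_group(0).column(0).statistics                    # per-column min/max
print(meta.num_row_groups, stats.min, stats.max)                  # 10 0 9999

# Predicate pushdown: row groups whose min/max cannot match the filter are skipped.
subset = pq.read_table("example.parquet", filters=[("id", ">=", 95_000)])
print(subset.num_rows)                                            # 5000
```
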
Anthropic acquires Bun by Simon Willison
Amazon S3 Tables now offer the Intelligent-Tiering storage class
Benchmarking read latency of AWS S3, S3 Express, EBS and Instance store by Roman Grebennikov
10 Smart Performance Hacks For Faster Python Code | The PyCharm Blog by Evgenia Verbina
  • Hack 2: Avoid unnecessary copies

Copying large objects like lists, dictionaries, or arrays can be costly in both time and memory. Each copy creates a new object in memory, which can lead to significant overhead, especially when working with large datasets or within tight loops.
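
A quick sketch of the idea with plain lists (my example, not the article's):

```python
big = list(range(1_000_000))

# Copying variants allocate a fresh million-element list each time:
snapshot = big[:]          # full copy
ordered = sorted(big)      # copy, then sort

# In-place variants reuse the existing storage:
big.sort()                 # no second list
big.reverse()
# For read-only access, iterate (or use itertools.islice) rather than
# materializing sliced copies inside a loop.
```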

  • Hack 4: Use math functions instead of operators

For numerical computations, Python’s math module provides functions that are implemented in C, offering better performance and precision than equivalent operations written in pure Python.
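
For instance (a small illustration of the point, not taken from the post):

```python
import math

values = [0.1] * 10

print(math.sqrt(2.0))          # usually faster than 2.0 ** 0.5
print(math.hypot(3.0, 4.0))    # 5.0, avoids (3**2 + 4**2) ** 0.5
print(sum(values))             # 0.9999999999999999
print(math.fsum(values))       # 1.0 (better precision, implemented in C)
print(math.prod(range(1, 6)))  # 120
```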

  • Hack 6: Avoid exception handling in hot loops

While Python’s exception handling is powerful and clean for managing unexpected behavior, it’s not designed for high-frequency use inside performance-critical loops. Raising and catching exceptions involves stack unwinding and context switching, which are relatively expensive operations
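
A rough illustration (mine, not the post's): when bad rows are common, a cheap check up front avoids the exception machinery on a large share of iterations:

```python
raw = ["12", "n/a", "7", "-", "30"] * 100_000

# Exception-heavy: a ValueError is raised and caught for every bad item.
total = 0
for item in raw:
    try:
        total += int(item)
    except ValueError:
        pass

# Check first: filter the bad rows without raising at all.
total = sum(int(item) for item in raw if item.lstrip("-").isdigit())
```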

  • Hack 9: Use bisect for sorted list operations

When working with sorted lists, using linear search or manual insertion logic can be inefficient – especially as the list grows. Python’s bisect module provides fast, efficient tools for maintaining sorted order using binary search.
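
A small example of the module in action (not from the post):

```python
import bisect

prices = [10, 20, 30, 50]

bisect.insort(prices, 40)               # insert while preserving sort order
idx = bisect.bisect_left(prices, 30)    # binary search: index of 30 -> 2
print(prices, idx)                      # [10, 20, 30, 40, 50] 2
```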

  • Hack 10: Avoid repeated function calls in loops

Calling the same function multiple times inside a loop – especially if the function is expensive or produces the same result each time – can lead to unnecessary overhead. Even relatively fast functions can accumulate significant cost when called repeatedly in large loops.
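
One way this shows up in practice (my own illustration): hoist or cache whatever the loop keeps recomputing:

```python
import functools
import re

lines = ["error: disk full", "ok", "error: timeout"] * 100_000

# Repeated work: re.compile is called on every single iteration.
hits = 0
for line in lines:
    if re.compile(r"^error:").match(line):
        hits += 1

# Hoisted: compile once, bind the method once, then loop.
error_match = re.compile(r"^error:").match
hits = sum(1 for line in lines if error_match(line))

# For expensive pure functions that see repeated arguments, cache instead:
@functools.lru_cache(maxsize=None)
def expensive_lookup(key):
    return key.upper()     # stand-in for slow work
```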

Amazon CloudWatch incident reports now support Five Whys analysis
  • The capability leverages both human input and AI-based analysis of incident data to recommend specific measures operators can take to prevent future occurrences and improve their operations.
  • You can create an incident report by first creating a CloudWatch investigation and then clicking “Incident report”.
PostgreSQL grouping sets: ROLLUP & CUBE by Hans-Jürgen Schönig
  • ROLLUP is useful if you want to add the “bottom line”. However, you often want to see all combinations of countries and products. GROUP BY CUBE will do exactly that
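
The distinction is easy to see by running both clauses. The article uses PostgreSQL; the sketch below pushes the same GROUP BY clauses through DuckDB's Python API purely so it is self-contained, and the toy sales table is made up:

```python
import duckdb

duckdb.sql("""
    CREATE TABLE sales AS
    SELECT * FROM (VALUES
        ('DE', 'hat',  10),
        ('DE', 'shoe', 20),
        ('US', 'hat',  30)
    ) AS t(country, product, amount)
""")

# ROLLUP: per (country, product), per country, plus the overall bottom line.
print(duckdb.sql("""
    SELECT country, product, SUM(amount)
    FROM sales GROUP BY ROLLUP (country, product)
"""))

# CUBE: every combination, so it also adds per-product subtotals.
print(duckdb.sql("""
    SELECT country, product, SUM(amount)
    FROM sales GROUP BY CUBE (country, product)
"""))
```
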
On Idempotency Keys - Gunnar Morling
  • We can somewhat improve this situation by adding a timestamp to the idempotency key, for instance by using a UUIDv7 which contains both a timestamp part (first 48 bits) and a random part (remaining bits), or an ULID. That way, the consumer can detect when it receives a message with an idempotency key which is “too old”.
  • All these intricacies can be avoided when it is possible to use a monotonically increasing sequence value as the idempotency key.
  • log sequence numbers (LSN)
  • For many scenarios, using UUIDs and dropping them after some time will probably be sufficient, provided you can tolerate that messages occasionally can be processed a second time when duplicates arrive after the retention period of processed keys.
  • The more messages you need to process overall, the more attractive a solution centered around monotonically increasing sequences becomes, as it allows for space-efficient duplicate detection and exclusion, no matter how many messages you have
  • The proposed log-based approach can be an efficient solution for doing so, but it also adds operational complexity: your database needs to support logical replication, you need to run a CDC connector, etc. However, many organizations already operate CDC pipelines for other purposes (analytics, search indexing, cache invalidation, etc.). If you’re in that category, the incremental complexity is minimal
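
A small consumer-side sketch of those ideas: read the millisecond timestamp out of a UUIDv7 key's first 48 bits, reject keys older than the retention window, and dedupe the rest. The in-memory set and the key generator are simplified stand-ins, not anything from the post:

```python
import os
import time
import uuid

MAX_AGE_MS = 7 * 24 * 3600 * 1000        # retention window for processed keys
seen_keys = set()                         # stand-in for a real dedup store

def should_process(idempotency_key: uuid.UUID) -> bool:
    ts_ms = int.from_bytes(idempotency_key.bytes[:6], "big")   # UUIDv7 timestamp prefix
    if time.time() * 1000 - ts_ms > MAX_AGE_MS:
        return False              # "too old": arrived after the retention period
    if idempotency_key in seen_keys:
        return False              # duplicate delivery: skip
    seen_keys.add(idempotency_key)
    return True

def make_uuid7_like() -> uuid.UUID:
    # Not a spec-complete UUIDv7; just enough for the demo: 48-bit ms timestamp
    # followed by random bits.
    ts = int(time.time() * 1000).to_bytes(6, "big")
    return uuid.UUID(bytes=ts + os.urandom(10))

key = make_uuid7_like()
print(should_process(key))   # True
print(should_process(key))   # False (duplicate)
```
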
Accelerate data lake operations with Apache Iceberg V3 deletion vectors and row lineage
Data-at-Rest Encryption in DuckDB by Lotte Felius, Hannes Mühleisen
  • Starting with DuckDB 1.4.0, DuckDB supports transparent data encryption of data-at-rest using industry-standard AES encryption.
  • The user is responsible for key management and thus for using a secure key
What it means to get your data ready for AI by Lak Lakshmanan
The Case Against pgvector by Alex Jacobs
  • Each insert acquires locks on the graph structure. Under heavy write load, this becomes a bottleneck
  • Pre-filter works great when the filter is highly selective (1,000 docs out of 10M). It works terribly when the filter isn’t selective—you’re still searching millions of vectors.
  • Timescale has released pgvectorscale, which addresses some of these issues. It adds: StreamingDiskANN, a new search backend that’s more memory-efficient; better support for incremental index builds; and improved filtering performance.
  • But here’s what I’ve learned: for most teams, especially small teams, dedicated vector databases are actually simpler
  • Index management is hard. Rebuilds are memory-intensive, time-consuming, and disruptive. Plan for this from day one
  • Query planning matters. Filtered vector search is a different beast than traditional queries, and Postgres’s planner wasn’t built for this.
  • Real-time indexing has costs. Either in memory, in search quality, or in engineering time to manage it.
Measuring what matters: How offline evaluation of GitHub MCP Server works by Ksenia Bobrova
  • Offline evaluation catches regressions before users see them and keeps the feedback loop short, so we can ship changes that genuinely improve performance
Frozen DuckLakes for Multi-User, Serverless Data Access by Mark Harrison (Madhive Data Engineering)
Weaponizing image scaling against production AI systems
How I Use Claude Code on My Phone with Termux and Tailscale by Nicholas Khami
Stop Hardcoding Everything: Use Dependency Injection
Writes in DuckDB-Iceberg by Tom Ebergen
Introducing AWS Glue 5.1
  • AWS Glue 5.1 introduces support for Apache Iceberg format version 3.0, adding default column values, deletion vectors for merge-on-read tables, multi-argument transforms, and row lineage tracking
deepseek-ai/DeepSeek-Math-V2 by Simon Willison
Electron vs. Tauri by Eric Richardson
The Thinking Game | Full documentary | Tribeca Film Festival official selection by GPT & Me
Super fast aggregations in PostgreSQL 19 by Hans-Jürgen Schönig
Amazon Adds A2A Protocol to Bedrock AgentCore for Interoperable Multi-Agent Workflows by Vinod Goje
  • Amazon announced support for the Agent-to-Agent (A2A) protocol in Amazon Bedrock AgentCore Runtime, enabling communication between agents built on different frameworks. The protocol allows agents developed with Strands Agents, OpenAI Agents SDK, LangGraph, Google ADK, or Claude Agents SDK to communicate with one another.
  • Agentic systems require several foundational components to operate effectively. Memory functions at two levels: short-term memory maintains conversation context within active sessions, while long-term memory retains insights across multiple sessions over time
  • MCP solves the agent-to-resource connection problem, while A2A solves the agent-to-agent communication problem in multi-agent deployments.
  • The A2A protocol’s stateful behavior lets agents remember recent interactions and maintain coherent conversations. A session smuggling attack exploits this property to inject malicious instructions into a conversation, hiding them among otherwise benign client requests and server responses.
Demystifying Determinism in Durable Execution
  • Durable execution takes a function that performs some side effects, such as writing to a database, making an API call, or sending an email, and makes it reliable via recovery (which in turn depends on durability).
  • This becomes equivalent to jumping to the first unexecuted step and resuming from there.
  • Re-execution of the control flow requires determinism: it must execute based on the same decision state every single time and it must also pass the same arguments to side effect code every single time. However, side effects themselves do not need to be deterministic, they only require idempotency or duplication tolerance.
  • The side effects absolutely can and should be non-deterministic, which is fine because they should generally only be executed once, even if the function itself is executed many times. For those failure cases where the result is not durably stored, we rely on idempotency or duplication tolerance.
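
A toy sketch of that replay model, with a dict standing in for durable journal storage and made-up side-effect stubs; real durable-execution engines differ in the details:

```python
journal = {}            # step name -> recorded result (stand-in for durable storage)

def step(name, side_effect, *args):
    if name in journal:                  # on replay, jump over completed steps
        return journal[name]
    result = side_effect(*args)          # may be non-deterministic; recorded once
    journal[name] = result               # persist before moving on
    return result

def charge_card(order_id):
    return {"charge_id": f"ch_{order_id}"}        # stand-in for a payment API call

def send_receipt(order_id, payment):
    print("receipt for", order_id, payment["charge_id"])

def workflow(order_id):
    # Control flow must be deterministic: same decisions, same step names,
    # same arguments into the side effects on every re-execution.
    payment = step("charge", charge_card, order_id)
    step("email", send_receipt, order_id, payment)

workflow("o-1")     # first run executes both steps
workflow("o-1")     # "recovery": replays from the journal, nothing re-runs
```
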
Agent Design Is Still Hard by Armin Ronacher
The Dark Data Tax: How Hoarding is Poisoning Your AI by Ananth Packkildurai
Technology Radar | Guide to technology landscape
New prompt injection papers: Agents Rule of Two and The Attacker Moves Second by Simon Willison
A Fork in the Road: Deciding Kafka’s Diskless Future by Jack Vanlightly
Enough With All The Raft
You Should Write An Agent
  • Take “sub-agents”. People make a huge deal out of Claude Code’s sub-agents, but you can see now how trivial they are to implement: just a new context array, another call to the model. Give each call different tools. Make sub-agents talk to each other, summarize each other, collate and aggregate. Build tree structures out of them. Feed them back through the LLM to summarize them as a form of on-the-fly compression, whatever you like.
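
Taking the note literally, a sub-agent can be sketched in a few lines; `call_model` here is a hypothetical stand-in for whichever client you use, not Claude Code's implementation:

```python
def run_subagent(call_model, instructions, task, tools=None):
    messages = [
        {"role": "system", "content": instructions},   # its own context array
        {"role": "user", "content": task},
    ]
    return call_model(messages, tools=tools or [])      # another call to the model

def research_then_summarize(call_model, question):
    findings = run_subagent(call_model, "You research thoroughly.", question)
    # Feed one sub-agent's output back through the model: on-the-fly compression.
    return run_subagent(call_model, "You summarize in three bullets.", findings)
```
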
Code research projects with async coding agents like Claude Code and Codex by Simon Willison
Waiting for SQL:202y: GROUP BY ALL by Peter Eisentraut
Thinking About Thinking With LLMs
  • In the case of this article, most people agreed that using LLMs to learn new things needs to be done with some caution.
  • It’ll lead to the continued democratization of programming and ever more people in conversation with computers.
  • But I don’t think this changes the fundamental reality that the best programmers aren’t the ones that make the widest use of the highest abstractions. That’ll continue to be those who dig down and understand what’s happening at a deeper level — that understanding will always lead to more deft use of any tools that programmers have at their disposal.
The Rise of Subagents by Philipp Schmid
  • Context Engineering is everything.
  • Models are improving very fast. Don’t over-engineer a solution today that a simpler or better model can solve tomorrow.
How To Not Plateau When Learning Python
I Stopped Using AI To Code For 30 Days
Trust is everything by Pedro Tavares
Code execution with MCP: Building more efficient agents by Simon Willison
  • identifies two challenges with MCP as it exists today. The first has been widely discussed before: all of those tool descriptions take up a lot of valuable real estate in the agent context even before you start using them. The second is more subtle but equally interesting: chaining multiple MCP tools together involves passing their responses through the context, absorbing more valuable tokens and introducing chances for the LLM to make additional mistakes.
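
A sketch of the contrast (hypothetical tools, not the post's code): instead of round-tripping a large intermediate result through the model's context, the agent emits a snippet that moves the data between tools directly and only returns the summary:

```python
def fetch_rows(query):            # imagine an MCP tool returning thousands of rows
    return [{"region": "EU", "amount": i} for i in range(10_000)]

def upload_report(summary):       # imagine another MCP tool
    print("uploaded:", summary)

# Context-chaining style: the 10,000-row result would be serialized into the
# model's context just so the model can pass it on to the next tool call.

# Code-execution style: the agent writes (roughly) this, and only the final
# one-line summary ever needs to flow back into the context.
rows = fetch_rows("SELECT region, amount FROM sales")
total = sum(r["amount"] for r in rows if r["region"] == "EU")
upload_report(f"EU total: {total}")
```
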
You’re Passing Way Too Many Arguments (and How to Fix It)