Blog

On Remote Work and Deep Thinking

Reflections on how remote work has changed the way we think, collaborate, and find meaning in our professional lives.

Read this on Substack

Deciphering Coupling in Software Architecture: Architecture Quantum Explored

Learn how independent deployability, functional cohesion, and coupling help define architecture quanta for robust distributed systems.

Read this on Substack

Ethical Data Practices for Building Better Systems

A critical look at how data-intensive systems can impact society, exploring issues like predictive analytics, surveillance, biases, and the responsibilities of engineers.

Read this on Substack

Building Correct Systems in Distributed Environments

Explore strategies to build reliable and fault-tolerant systems while handling limitations in transactions, data corruption, and distributed coordination.

Read this on Substack

Unbundling Monolithic Databases for Flexibility

Learn how unbundling databases helps achieve scalability and flexibility by combining specialized tools to meet modern data needs.

Read this on Substack

Integrating Distributed Systems for Unified Data Pipelines

Explore the intricacies of data integration in distributed applications, including synchronizing specialized systems and maintaining correctness across diverse data sources.

Read this on Substack

Unifying Batch and Stream Processing for Modern Pipelines

Examine how unbounded data streams are processed in real-time applications, including operators, time reasoning, joins, and fault tolerance.

Read this on Substack

Synchronizing Databases with Real-Time Streams

Examine how streams integrate with databases through change data capture, event sourcing, and the immutability of state, enabling real-time system synchronization.

Read this on Substack

Enabling Reliable and Scalable Event Streams in Distributed Systems

Explore how messaging systems and partitioned logs enable reliable and scalable transmission of event streams within distributed systems.

Read this on Substack

Advancing Beyond MapReduce: Modern Frameworks for Scalable Data Processing

Explore alternatives to MapReduce, including advanced dataflow engines and their benefits in efficiency, iterative graph processing, and high-level abstractions.

Read this on Substack

MapReduce and Distributed Filesystems: Foundations of Scalable Data Processing

Learn how MapReduce operates over distributed filesystems like HDFS, combining computation and storage for scalable data processing.

Read this on Substack

Leveraging Unix Tools for Efficient Batch Processing

Explore the power of Unix-based batch processing using tools like awk, sort, and grep, and how their design philosophy laid the foundation for modern big data processing.

Read this on Substack
