tezvyn:

Databases & Architecture

SQL, NoSQL, system design, microservices, APIs

32 bites

Databases & Architecture30 sec read

Fact Table: The Numbers in Your Data Warehouse

A fact table is the ledger of business events, recording what happened and how much. It's the core of a data warehouse, holding sales figures or page views. The footgun is storing descriptive text here; that belongs in linked dimension tables.

Databases & Architecture30 sec read

BASE: Trading Consistency for Availability

BASE is a database design philosophy that prioritizes availability over strict consistency, the opposite of ACID. It's used in large-scale systems like social media where uptime is key and slightly stale data is okay.

Databases & Architecture30 sec read

PostgreSQL's Free Space Map: Finding Room to Write

A Free Space Map is a table of contents for empty space in a Postgres table. It lets Postgres quickly find a page with enough room for a new row or index entry, avoiding a slow scan. It's used on every INSERT/UPDATE.

Databases & Architecture30 sec read

Database Checkpoints: Faster Recovery After a Crash

A database checkpoint creates a known good point for faster crash recovery. Instead of writing every change to disk, it periodically flushes modified data from memory, reducing the amount of log data to process.

Databases & Architecture30 sec read

Slotted Page Structure: Stable Pointers on Disk

A slotted page organizes data by growing records from the end of the page and pointers from the beginning. This allows databases to handle variable-sized records without costly reshuffling, keeping pointers stable.

Databases & Architecture31 sec read

Heap File Organization: Fast Writes, Slow Reads

Heap file organization is like tossing records into a box in no particular order. It's great for bulk-loading data quickly, but searching requires a full table scan. The footgun is using it for frequently queried tables, which kills performance.

Databases & Architecture30 sec read

Database Pages: The Building Blocks of Your Data

A database page is the fundamental 8KB block for all storage. The database engine reads and writes entire pages, not single rows, for user data, indexes, and metadata. The key footgun: the physical order of rows on a page is not guaranteed.

Databases & Architecture31 sec read

Predicate Pushdown: Filter Data at the Source

Predicate pushdown tells the database to filter data at the source, not after fetching it. This speeds up queries in data warehouses and lakehouses by reducing network traffic. The main footgun: not all data sources can execute all types of filters.

Databases & Architecture30 sec read

Database Statistics: The Query Optimizer's Internal Map

Database statistics are the raw data the query optimizer uses to guess the cheapest way to run your query. It uses stats like row counts and value distribution to decide between a full table scan and an index seek.

Databases & Architecture30 sec read

Row Mode vs. Batch Mode Execution in SQL Server

Row mode processes data one row at a time, like a checklist. Batch mode processes chunks of rows together for vectorized speed. Row mode is classic for OLTP, while batch mode shines in data warehousing for large scans.

Databases & Architecture30 sec read

MVCC: Read and Write Data Without Blocking Each Other

MVCC avoids slow, traditional locks by giving each transaction its own consistent data snapshot. This allows readers and writers to work at the same time without blocking each other, boosting performance in databases like PostgreSQL.

Databases & Architecture30 sec read

Shared & Exclusive Locks: The Read vs. Write Rule

A Shared (S) lock is like many people reading a library book at once; an Exclusive (X) lock is one person writing in it alone. Databases use S/X locks to manage concurrency, preventing writes from corrupting reads.