The Fragmentation Illusion: Why Your Cloud Database Might Not Be as Big as It Looks
1. Introduction: Terabytes of Storage, Gigabytes of Data — The Fragmentation Illusion
If you've worked with cloud databases long enough, you've probably seen one report several terabytes of storage, only to discover during a migration that most of it is empty space. The data hadn't changed; the physical layout had.
This is database fragmentation: a quiet but impactful consequence of years of updates, deletes, and idle tables. Fragmentation doesn't just inflate your storage footprint. It can degrade performance, increase backup size, and quietly drive up cloud costs.
In this article, I'll explore:
- What database fragmentation is and its types
- Why it happens over time
- How to identify it using real-world queries
- And most importantly — how to fix it before it becomes an operational burden
2. What Is Database Fragmentation?
Database fragmentation refers to the inefficient organization of data on disk that leads to wasted space and degraded performance. Over time, as rows are inserted, updated, and deleted, the physical layout of that data becomes increasingly disorganized.
It's similar to how an operating system handles file storage: as files are created, deleted, and modified, data blocks become scattered across the disk. This fragmentation slows access and wastes space — and databases behave the same way. When rows are no longer stored compactly, performance drops and storage usage grows disproportionately.
There are two main types of fragmentation:
- Internal Fragmentation
- External Fragmentation
Visualizing Database Fragmentation
The diagram below (reconstructed here as text) illustrates how fragmentation occurs and its impact:

Before fragmentation (compact state):
  Page 1: [Row1][Row2][Row3][Row4]      100% used
  Page 2: [Row5][Row6][Row7][Row8]      100% used
  Page 3: [Row9][Row10][Row11][Row12]   100% used
  Page 4: [Row13][Row14][Row15][Row16]  100% used

Operations over time:
- UPDATE Row2 (grows in size)
- DELETE Row5
- INSERT Row17 (non-sequential)
- UPDATE Row9 (moves to a new page)

After fragmentation (fragmented state):
  Page 1: [Row1][GAP][Row3][Row4]       75% used
  Page 2: [GAP][Row6][Row7][Row8]       75% used
  Page 3: [GAP][Row10][Row11][Row12]    75% used
  Page 4: [Row13][Row14][Row15][Row16]  100% used
  Page 5: [Row2_new][Row9_new][Row17]   60% used
  (logical row order is now scattered across pages)

Impact:
- Storage: 4 pages → 5 pages (+25% storage used)
- Performance: sequential I/O → random I/O (slower queries)
- Cost: paying for empty space (higher cloud costs)
2.1 Internal Fragmentation
Occurs when unused space builds up within data pages — typically due to DELETEs or UPDATEs that enlarge rows and relocate them. Some engines (like PostgreSQL) retain old row versions (dead tuples), which stay in place until explicitly cleaned up.
Seen in: PostgreSQL, SQL Server, Oracle
How Internal Fragmentation Happens
- Row Updates: When a row is updated and grows in size, it may not fit in its original page location
- Row Deletes: Deleted rows leave gaps in pages that aren't immediately reclaimed
- Dead Tuples: PostgreSQL's MVCC (Multi-Version Concurrency Control) keeps old row versions until VACUUM runs
- Page Padding: Pages may have unused space due to alignment requirements
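A minimal sketch of the dead-tuple mechanism, assuming a hypothetical PostgreSQL table named accounts with a numeric balance column:
-- Every updated row gets a new version; the old version becomes a dead tuple
UPDATE accounts SET balance = balance * 1.01;
-- Dead tuples linger until (auto)VACUUM reclaims them
-- (the statistics view may lag a moment behind the actual operation)
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'accounts';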
Impact of Internal Fragmentation
- Wasted Storage: Unused space within pages increases storage costs
- Reduced Cache Efficiency: More pages need to be read into memory for the same amount of data
- Slower Queries: More I/O operations required to access the same logical data
2.2 External Fragmentation
Happens when data pages are scattered across disk instead of being stored sequentially. This slows down range scans and full table reads. It's caused by non-sequential inserts, page splits, and repeated growth/shrink cycles.
Seen in: PostgreSQL, SQL Server, Oracle, and distributed systems like Cassandra or DynamoDB
How External Fragmentation Happens
- Non-Sequential Inserts: Inserting rows with non-sequential primary keys causes pages to be scattered
- Page Splits: B+Tree indexes split pages when they become full, causing physical disorganization
- Growth/Shrink Cycles: Tables that grow and shrink repeatedly create gaps in the file layout
- Random Updates: Updates that change indexed columns can cause index page splits
Impact of External Fragmentation
- Slower Sequential Scans: Range queries and full table scans become slower
- Increased I/O: More disk seeks required to read scattered pages
- Reduced Throughput: Lower overall database performance for read-heavy workloads
3. Why Fragmentation Happens Over Time
Fragmentation rarely shows up overnight. It builds quietly — often over years — as applications evolve, data volumes grow, and access patterns shift.
In modern relational databases like MySQL, PostgreSQL, and SQL Server, fragmentation is a natural byproduct of common operations:
- UPDATEs that increase row size may move rows to new pages
- DELETEs leave behind reusable, but unclaimed space
- Non-sequential inserts cause page splits in B+Trees
- Table growth and shrink cycles create gaps in file and page layouts
Across nearly every long-running system I've worked with, this pattern holds true: fragmentation builds silently until the day you migrate, back up, or export — and suddenly the database is a fraction of the size you thought it was.
Real-World Example: Aurora MySQL
One of the clearest examples I encountered was with an Aurora MySQL instance. It reported several terabytes of allocated storage. But after exporting and reloading the data into a fresh environment, the active footprint was less than 5% of the original size. The data hadn't changed — just the layout.
The root causes:
- InnoDB page splits from non-sequential inserts
- Updates and deletes that left behind unused space
- Lack of regular compaction or table rebuilds
The system continued to perform well on a day-to-day basis — but only because compute and storage were continuously scaled up to compensate. When it came time to migrate, the true cost of that accumulated fragmentation became painfully clear — in time, effort, and unnecessary storage spend.
Common Scenarios Leading to Fragmentation
1. High-Volume OLTP Systems
Systems with frequent INSERTs, UPDATEs, and DELETEs accumulate fragmentation quickly:
-- Example: e-commerce order system (hypothetical orders table)
INSERT INTO orders (customer_id, status) VALUES (42, 'created');  -- frequent order creation
UPDATE orders SET status = 'shipped' WHERE id = 1001;             -- order status updates
DELETE FROM orders WHERE id = 1002;                               -- order cancellations
-- All of these contribute to fragmentation over time
2. Time-Series Data with Retention Policies
Databases that delete old data periodically:
-- Example: Logging system
DELETE FROM logs WHERE created_at < DATE_SUB(NOW(), INTERVAL 90 DAY);
-- Leaves gaps that aren't immediately reclaimed
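One common way to avoid those gaps entirely is to range-partition by date, so retention becomes a metadata operation instead of a mass DELETE. A sketch under assumed requirements (the table name and partition boundaries are illustrative):
-- Partition the log table by date range
CREATE TABLE logs_partitioned (
  id BIGINT AUTO_INCREMENT,
  created_at DATETIME NOT NULL,
  message TEXT,
  PRIMARY KEY (id, created_at)  -- the partition key must be part of every unique key
) PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION p2024_q1 VALUES LESS THAN (TO_DAYS('2024-04-01')),
  PARTITION p2024_q2 VALUES LESS THAN (TO_DAYS('2024-07-01')),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);
-- Dropping a partition frees its space immediately, with no gaps left behind
ALTER TABLE logs_partitioned DROP PARTITION p2024_q1;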
3. UUID Primary Keys
Using UUIDs or random strings as primary keys causes non-sequential inserts:
-- Non-sequential inserts cause page splits
CREATE TABLE events (
id CHAR(36) PRIMARY KEY, -- UUID causes random inserts
data TEXT
);
4. Frequent Schema Changes
ALTER TABLE operations can cause fragmentation:
-- Adding columns, changing data types
ALTER TABLE users ADD COLUMN metadata JSON;
-- May require row relocation
4. Detecting and Fixing Fragmentation
Let's take MySQL with the InnoDB storage engine as an example; the same patterns apply to Aurora MySQL, which inherits InnoDB's behavior under the hood. You can detect and fix fragmentation using simple built-in tools.
4.1 Detecting Fragmentation
MySQL/InnoDB Detection
Use the SHOW TABLE STATUS command to identify fragmentation:
SHOW TABLE STATUS LIKE 'your_table';
Key Columns to Understand
- Data_length - The total space (in bytes) used to store the actual table rows — including padding, gaps, and any unused room within pages.
- Data_free - The amount of allocated but currently unused space. This is your primary indicator of internal fragmentation. It includes gaps left by deletes, updates, or page splits that haven't been compacted.
- Index_length - The total space used by all indexes associated with the table — both primary (clustered) and secondary (non-clustered).
Example Analysis
If Data_length = 10 GB and Data_free = 3 GB, roughly 23% of the allocated space (3 GB out of 13 GB, ignoring indexes) is not actively used: a clear sign of fragmentation.
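As a sanity check, the ratio used by the detection query below can be computed by hand (numbers taken from the example above, index size assumed to be zero):
-- 3 GB free out of 13 GB allocated
SELECT ROUND(3 / (10 + 3) * 100, 2) AS fragmentation_pct;  -- 23.08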
Comprehensive Fragmentation Query
Here's a query to identify all fragmented tables in your database:
SELECT
TABLE_NAME,
ROUND(((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024), 2) AS 'Size (MB)',
ROUND((DATA_FREE / 1024 / 1024), 2) AS 'Free (MB)',
ROUND((DATA_FREE / (DATA_LENGTH + INDEX_LENGTH + DATA_FREE) * 100), 2) AS 'Fragmentation %'
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database_name'
AND DATA_FREE > 0
ORDER BY DATA_FREE DESC;
This query shows:
- Tables with the most free space
- Fragmentation percentage
- Tables that would benefit most from optimization
PostgreSQL Detection
For PostgreSQL, use the following queries:
-- Check table and index sizes
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS total_size,
pg_size_pretty(pg_relation_size(schemaname||'.'||tablename)) AS table_size,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) -
pg_relation_size(schemaname||'.'||tablename)) AS index_size
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
-- Check for dead tuples (internal fragmentation indicator)
SELECT
schemaname,
relname,
n_dead_tup,
n_live_tup,
ROUND(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_tuple_percent
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC;
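If you need exact numbers rather than sampled statistics, the pgstattuple extension (part of standard PostgreSQL contrib and supported by most managed offerings, where your privileges allow it) performs a full scan and reports precise free-space and dead-tuple percentages:
-- Assumes the pgstattuple extension is available in your environment
CREATE EXTENSION IF NOT EXISTS pgstattuple;
-- free_percent and dead_tuple_percent quantify internal fragmentation precisely
-- (note: this scans the whole table, so avoid running it on huge tables at peak load)
SELECT * FROM pgstattuple('your_table');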
4.2 Fixing Fragmentation
MySQL/InnoDB: OPTIMIZE TABLE
If you identify fragmentation in a table, here are safe and commonly used methods to reclaim space:
OPTIMIZE TABLE your_table;
What it does:
- Rebuilds the table to eliminate internal fragmentation
- Requires creating a temporary copy of the table
- May lock the table during execution; best used during off-peak hours
- Ideal for medium-sized tables
Considerations:
- Requires sufficient disk space (a temporary copy is created)
- The table may be locked during optimization
- Can take significant time for large tables
- Automatically updates table statistics
MySQL/InnoDB: ALTER TABLE ... ENGINE=InnoDB
ALTER TABLE your_table ENGINE=InnoDB;
What it does:
- Functionally similar to OPTIMIZE TABLE
- Useful in automation pipelines or where explicit OPTIMIZE is restricted
- Can also be used to trigger a rebuild in MySQL-compatible environments like Aurora
Advantages:
- Can be scripted more easily
- Works in environments where OPTIMIZE TABLE might be restricted
- Same end result as OPTIMIZE TABLE
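On MySQL 5.6 and later, the rebuild can usually run as an online DDL operation that allows concurrent reads and writes. A sketch, with the caveat that behavior may differ on Aurora and older versions:
-- Request an in-place, online rebuild; MySQL raises an error
-- instead of silently locking if it cannot honor these clauses
ALTER TABLE your_table ENGINE=InnoDB, ALGORITHM=INPLACE, LOCK=NONE;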
Rebuild Indexes (if index fragmentation is suspected)
ALTER TABLE your_table DROP INDEX idx_name, ADD INDEX idx_name (col);
What it does:
- Useful for secondary indexes that show degraded performance
- Requires downtime depending on table size and workload
When to use:
- When specific indexes show poor performance
- After bulk data loads
- When index statistics are stale
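Before rebuilding, it's worth checking whether stale statistics, rather than physical fragmentation, are the real problem. Both statements below are standard MySQL:
-- Inspect index cardinality; numbers far from reality suggest stale statistics
SHOW INDEX FROM your_table;
-- Refresh statistics without the cost of a full rebuild
ANALYZE TABLE your_table;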
PostgreSQL: VACUUM and REINDEX
For PostgreSQL, use VACUUM to reclaim space from dead tuples:
-- Standard VACUUM (doesn't lock table)
VACUUM ANALYZE your_table;
-- Full VACUUM (locks table, more thorough)
VACUUM FULL your_table;
-- Rebuild indexes
REINDEX TABLE your_table;
VACUUM vs VACUUM FULL:
- VACUUM: marks dead-tuple space as reusable within the table, but does not shrink the file on disk; runs without locking out reads and writes
- VACUUM FULL: rewrites the table into a new, compact file and returns space to the operating system, but takes an exclusive lock for the duration
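A simple before/after measurement makes the difference tangible. A minimal sketch, assuming a table named your_table:
-- File size before: still includes dead tuples and free space
SELECT pg_size_pretty(pg_relation_size('your_table')) AS before_size;
-- Rewrite the table into a new, compact file (takes an exclusive lock)
VACUUM FULL your_table;
-- File size after: freed space has been returned to the operating system
SELECT pg_size_pretty(pg_relation_size('your_table')) AS after_size;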
4.3 Best Practices for Fragmentation Management
1. Regular Monitoring
Set up regular checks for fragmentation:
-- Create a stored procedure to check fragmentation
DELIMITER //
CREATE PROCEDURE CheckFragmentation()
BEGIN
SELECT
TABLE_NAME,
ROUND((DATA_FREE / (DATA_LENGTH + INDEX_LENGTH + DATA_FREE) * 100), 2) AS 'Fragmentation %'
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
AND DATA_FREE > 0
AND (DATA_FREE / (DATA_LENGTH + INDEX_LENGTH + DATA_FREE) * 100) > 10
ORDER BY DATA_FREE DESC;
END //
DELIMITER ;
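Once created, the procedure can be run on demand or scheduled with MySQL's event scheduler (assuming event_scheduler is enabled and you hold the EVENT privilege; the event name is illustrative):
-- Run on demand
CALL CheckFragmentation();
-- Or schedule a weekly check (requires SET GLOBAL event_scheduler = ON)
CREATE EVENT weekly_fragmentation_check
ON SCHEDULE EVERY 1 WEEK
DO CALL CheckFragmentation();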
2. Scheduled Maintenance
Automate optimization during low-traffic periods:
#!/bin/bash
# Optimize tables with >20% fragmentation during a maintenance window.
# Credentials are assumed to come from ~/.my.cnf so the script can run unattended;
# an interactive -p password prompt would break scheduled (cron) execution.
# The -N flag suppresses the column-name header, so no filtering is needed.
mysql -u admin -N -e "
SELECT CONCAT('OPTIMIZE TABLE ', TABLE_NAME, ';')
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database'
  AND (DATA_FREE / (DATA_LENGTH + INDEX_LENGTH + DATA_FREE) * 100) > 20
" | mysql -u admin your_database
3. Prevent Fragmentation
Design your schema to minimize fragmentation:
-- Use sequential primary keys when possible
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY, -- Sequential inserts
order_data JSON
);
-- Avoid random UUIDs as primary keys
-- Instead, use UUIDs as secondary unique keys if needed
CREATE TABLE events (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
event_uuid CHAR(36) UNIQUE, -- UUID as secondary key
data TEXT
);
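If the application requires UUIDs as the primary key, MySQL 8.0 can store them in a roughly time-ordered binary form, which keeps inserts close to sequential. A sketch assuming MySQL 8.0.13 or later (needed for expression defaults):
-- UUID_TO_BIN(uuid, 1) swaps the timestamp components to the front,
-- so consecutively generated values sort (and insert) in near-sequential order
CREATE TABLE events_ordered (
  id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID(), 1)),
  data TEXT
);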
4. Monitor After Optimization
Track the impact of optimization:
-- Before optimization
SHOW TABLE STATUS LIKE 'your_table';
-- After optimization
SHOW TABLE STATUS LIKE 'your_table';
-- Compare Data_free values to measure improvement
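For a single number that is easy to track over time, you can also read Data_free directly from information_schema:
-- Free (allocated but unused) bytes for one table
SELECT TABLE_NAME, DATA_FREE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'your_table';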
5. Cloud Database Considerations
Aurora MySQL/PostgreSQL
Aurora databases inherit fragmentation patterns from their underlying engines but with some differences:
- Managed Storage: Aurora manages storage automatically, but fragmentation still accumulates
- Background Maintenance: Some maintenance happens automatically, but it is not always sufficient
- Snapshot Costs: Fragmentation increases snapshot sizes and costs
- Migration Impact: Fragmentation becomes obvious during cross-region migrations
RDS MySQL/PostgreSQL
RDS instances require manual optimization:
- Maintenance Windows: Use scheduled maintenance windows for optimization
- Read Replicas: Optimize replicas during low-traffic periods
- Multi-AZ: Consider failover timing when optimizing primary instances
Cost Impact
Fragmentation directly impacts cloud costs:
- Storage Costs: Paying for unused space
- Backup Costs: Larger backups = higher backup storage costs
- Snapshot Costs: Fragmentation inflates snapshot sizes
- Migration Costs: Data transfer costs based on allocated, not actual, size
Final Thoughts
Database fragmentation often goes unnoticed — until it becomes too big to ignore. It rarely breaks things outright, but it silently inflates your storage footprint, slows down queries, increases backup and snapshot sizes, and wastes money — especially in cloud environments where storage costs scale automatically.
Even in systems like Aurora MySQL, which advertise managed storage and background maintenance, fragmentation can accumulate for years without triggering alerts. It's only during migrations, exports, or cost audits that its impact becomes obvious.
Regularly checking large or frequently updated tables is one of the simplest yet most effective ways to maintain database health. By detecting and fixing fragmentation early, you can:
- Reduce storage costs significantly
- Improve query performance
- Decrease backup and snapshot sizes
- Prepare for smoother migrations
- Optimize overall database efficiency
The key is to make fragmentation monitoring part of your regular database maintenance routine. A few minutes of proactive optimization can save hours of troubleshooting and significant cloud costs down the line.