
Data Storage Solution: Rationale (Part 1 of 3)

Introduction
Time to upgrade my ailing, 99%-full 350GB spinning-disk mirror.

There were several end user requirements:

  1. Capacity: The final solution must provide enough storage to cover my current requirements: backing my entire music collection in FLAC plus 100 Blurays. Because data will grow over time (probably exponentially), I will provision at most 50% of capacity at deployment.
  2. Reliability: The final solution must provide total protection against failure. In other words, data loss is not acceptable.
  3. Security: Data access (sharing) must be from authorized sources only. The final solution must provide an adequate deterrent against prying eyes and criminal elements.
  4. Quality of Life: The solution must ensure a high quality of life: no VOCs, noise, or anything else obnoxious. In practice this means SSDs.

There are also a number of optional features that should be met, within reason:

  • Availability
  • Performance
  • Usability

A review of each is below, starting with data classification.

Data Classification
Explicit data classification is necessary to ensure correct and adequate treatment of data, whether for security, availability, backup, or performance requirements.

Data is organized by class and priority, and all data within a priority receives the same treatment. This ensures consistent and predictable handling of data, particularly for security.

Class     Priority          Definition                                                                     Example
personal  Mission Critical  maximum availability, must be protected at all cost                            financial and health records, project and design docs
config    Critical          availability not critical, must be protected at all cost                      /etc, parts of /var
media     Important         maximum availability, too large to back up as desired, streaming performance  FLAC and Bluray rips, podcast archives
other     Unessential       not important, easily replaceable                                              everything not captured above

Data Management
Mission Critical - maximum availability, must be protected at all cost

Task        Description
Mirror      protect against single hard failure
filesystem  proven code; access control
AIDE        protect against bit rot
AIDE        protect against file system bugs / site power loss / unauthorized modification
rdiff       protect against unintentional/unauthorized modification/deletion
cloud       available locally and remotely; protected against site compromise
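For illustration, the AIDE and rdiff tasks map to commands like the following. A minimal sketch, assuming Debian's default AIDE database paths and a hypothetical rdiff-backup destination:

    # initialize the AIDE database, then rotate it into place (Debian default paths)
    aide --init
    cp /var/lib/aide/aide.db.new /var/lib/aide/aide.db

    # compare the filesystem against the database to catch bit rot or tampering
    aide --check

    # keep an incremental, reversible history of the documents tree (destination assumed)
    rdiff-backup /home/will/Documents /mnt/files/backup/Documents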

Critical - availability not critical, must be protected at all cost

Task        Description
Mirror      protect against single hard failure
filesystem  proven code; access control
AIDE        protect against bit rot
AIDE        protect against file system bugs / site power loss / unauthorized modification
rdiff       protect against unintentional/unauthorized modification/deletion

Important - maximum availability, too large to backup as desired, streaming performance

Task        Description
Mirror      protect against single hard failure
filesystem  proven code; access control
AIDE        protect against bit rot
AIDE        protect against file system bugs / site power loss / unauthorized modification
cloud       available locally and remotely; protected against site compromise

Unessential - this data is not important, easily replaceable

Task  Description
none  do nothing

Usage
These locations have specific functions:

  • /home/will/Documents: all projects and personal data
  • /mnt/scratch: all experimental data and svn checkouts
  • /mnt/scratch/Download: all web browser downloads
  • /mnt/files: all media goes here (e.g. photos, videos, music, books)
  • /mnt/files/backup: backup information and temporary backup data goes here
  • /opt: all packages that are compiled or do not come from a Debian archive go here
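For reference, a hedged sketch of how these locations might map into /etc/fstab; the device names and options are assumptions rather than a copy of the live system (relatime is discussed under Longevity):

    # /etc/fstab excerpt (illustrative only)
    /dev/sda2  /           ext4  relatime,errors=remount-ro  0  1
    /dev/md0   /home       ext4  relatime                    0  2
    /dev/md1   /mnt/files  xfs   relatime                    0  2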

Capacity Review
The requirement is to store my entire FLAC collection and 100 Bluray discs. Everything else I have is small by comparison. Provisioning is at most 50% at time of deployment.

Item          Capacity
FLAC          300GB
Bluray        100 * 35GB = 3500GB
Provisioning  50%
Total         7600GB (7.6TB)
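The total follows from doubling the raw requirement to satisfy 50% provisioning; a quick shell sanity check:

    # (300GB FLAC + 100 Blurays * 35GB) * 2 for 50% provisioning
    echo $(( (300 + 100 * 35) * 2 ))GB    # prints 7600GB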

A quick look at pricing indicates that a 100% SSD solution is prohibitive. At the time of writing (2016), FLAC plus miscellaneous data requires 460GB. Doubling that gives 920GB, and deferring the Bluray collection (a project I plan to start in approximately three years), 1TB will do. 1TB SSDs are fairly affordable, so that is what I will purchase.

Performance Review
Maximize performance without breaking the budget. In most cases improving performance also mitigates wear.

  • Disk: These are high-performance SSDs. They do require maintenance (see below).
  • Bus: My server has 6 IO ports. These are organized as follows:
    	1x 6Gb/s port for the 500GB SSD (/ mounted here)
    	2x 3Gb/s ports for the 1TB SSD mirror (/home and /var mounted on the mirror)

    The mirror layout can be verified with:

    mdadm --detail <md_dev>

  • Partition: all partitions are aligned (a verification sketch follows this list):
    		HDD	phys_sector	partitions aligned?	raid aligned?	fs aligned?
    		sda	512byte		yes			NA		4096bytes
    		sdb	512byte		yes			yes		4096bytes
    		sdc	512byte		yes			yes		4096bytes/64k stripe
    
  • Filesystem
    • filesystem type is chosen based on performance demands, file size, access type, etc.
    • filesystem and mdadm are aligned to the disk.
    • Chunk size for the media partition is 64k (instead of 4k); see the sketch after this list.
  • Operating System
    • Parameters are mostly default; tuned where needed
    • IO scheduler is Deadline
  • Network
    • Local gigabit fabric
    • High speed local wifi for mobile devices (currently only laptop and smartphone)
    • High speed WAN connection with fast uplink and downlink
  • User: none
  • Site: none
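As referenced above, alignment and geometry can be verified with standard tools. A minimal sketch, assuming the device names from the alignment table, /dev/md0 as the mirror device, and that the media filesystem mounted at /mnt/files is the XFS one:

    # partitions report "1 aligned" when optimally aligned
    parted /dev/sdb align-check optimal 1
    parted /dev/sdc align-check optimal 1

    # confirm the mirror's members and state
    mdadm --detail /dev/md0

    # for XFS, sunit/swidth report the stripe geometry (64k on the media filesystem)
    xfs_info /mnt/files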

Longevity Review
I'd like to ensure that SSD read/write is done sparingly and intelligently. This can be managed at a number of levels:

  • Disk: none; this is handled internally by the drive and is a bit of a black box, really
  • Partition:
    • partitions are write aligned to disk
  • File System:
    • filesystem & SSD are aligned
    • mdadm aligned to fs
    • partitions are mounted relatime to significantly reduce incidental metadata writes caused by the filesystem
    • large chunk size used on media partitions
  • OS:
    • ramdisk: Debian does an excellent job of offloading many file system services to RAM disks. Most system resources are mounted as tmpfs or other types of RAM disk. Swappiness is also tuned (see Performance). Use mount | column -t to see the list of mounted partitions.
    • IO scheduler: the deadline scheduler optimizes SSD cache usage and lets the disk schedule its writes optimally (a sysfs sketch follows this list). See Performance.
  • Services:
    • The meat in the pie. These are allowed to do what they need to get the job done, albeit with some tuning for egregious services
    • Logging: Logging on this system is somewhat heavy. I have mitigated this by fixing system bugs and in some cases reducing log levels where it wasn't providing much useful information.
    • cron: most of these are benign or already discussed elsewhere
    • backup: this is a write-heavy function. Files are prepared once a month for offsite backup, and both services and user data have local copies made daily.
    • monitoring: These services can be quite taxing. I have reduced file system integrity checking to once a month. That's an entire disk read. The rest of the monitors aren't terribly taxing except when they log.
  • Network: none
  • User: none
  • Site: none
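The deadline scheduler mentioned above is inspected and set through sysfs. A minimal sketch, assuming /dev/sda; the change does not persist across reboots without a udev rule or kernel parameter:

    # the bracketed entry is the active scheduler
    cat /sys/block/sda/queue/scheduler

    # switch to deadline for this session only
    echo deadline > /sys/block/sda/queue/scheduler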

Security Review
Ideally, create a deterrent strong enough to stop drive-by and opportunistic crime. More than this creates an exponentially larger maintenance overhead with little added benefit: those with sufficient resources will gain access to your data regardless.

Access Control

  • Disk: none
  • Partition: none
  • File System: standard Linux permissions and access control; other users get read-only access or none at all. A special group grants full access.
  • Server: none
  • OS
    • protected kernel memory space
    • protected process memory space
    • services are run in containers or are run by non-root limited access users/groups
  • Network:
    • 2x Firewalls
    • MAC, IP and port restrictions/whitelisting (an iptables sketch follows this list)
    • Blacklisting bad actors
  • User: only trusted users are given access to the network or physical location
  • Site: Standard lock and key; largest risk is theft.
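To give a flavour of the whitelisting above, a minimal iptables sketch; the subnet and port are illustrative assumptions, not my actual rules:

    # default-deny inbound; allow established flows and LAN-only SSH
    iptables -P INPUT DROP
    iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 22 -j ACCEPT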

Confidentiality/Privacy

  • Disk: none
  • Partition: encrypted (a cryptsetup sketch follows this list)
  • File System: access control and permissions
  • Server: none
  • OS: none
  • Network:
    • data on the wire is encrypted, locally and over WAN
  • User: policy against sharing passwords, access details, etc
  • Site: location is not public
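The encrypted partitions above would typically be dm-crypt/LUKS on Debian (my assumption; the mechanism is not named here). A minimal sketch of creating and opening a container, with device and mapper names assumed:

    # create a LUKS container, open it, then format the mapped device
    cryptsetup luksFormat /dev/sdb1
    cryptsetup open /dev/sdb1 crypt_home
    mkfs.ext4 /dev/mapper/crypt_home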

Integrity

  • Disk: SSD contains ECC and wear leveling to ensure integrity is maintained at a bit for bit level
  • Partition: none
  • File System:
    • XFS and EXT4 have a proven track record of robustness and include many checks to ensure data integrity
    • file scanners detect unauthorized changes to files and metadata
  • Server: hardware/BIOS updates only from trusted sources (i.e. vendor)
  • OS: OS and apps receive regular security service only from trusted sources; See Confidentiality, See Data Management
  • Network:
    • almost all connections are ethernet (less prone to tampering)
    • most transactions over the wire have CRC to ensure integrity
    • encryption is a natural deterrent against tampering
  • User: none
  • Site: none

Non-repudiation

  • Disk: none
  • Partition: none
  • File System: metadata logs modification times
  • OS: none
  • Services: extensive logging of system activity
  • Network: 2x firewalls log rule violations
  • User: access/logins and login failures are logged
  • Site: none

Availability Review

  • Disk: RAID mirror
  • Partition: none
  • File System: mirrored
  • Server: reputable, high reliability hardware
  • OS: Linux is one of the most robust operating systems available (arguably only Solaris and the BSDs are better?); hardware is monitored for failures and downtime
  • Network: reputable, high reliability hardware
  • User: none
  • Site: off-site backups

Maintenance

  • Queued TRIM can be performed on drives that support it, but it is not recommended on my Samsung SSDs (and is blacklisted in the kernel).
  • Manual trim: run a weekly trim (via systemd; see the sketch after this list). Do not use the continuous discard mount option.
  • Defragmenting an SSD is not recommended; if ever needed, both XFS and EXT4 support online defragmentation:
  • EXT4: e4defrag -c -v /dev/sdxX
  • XFS: xfs_fsr -v /mount/point -OR- xfs_fsr -v /path/to/specific/file
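On Debian with systemd, the weekly trim above is typically handled by the fstrim timer shipped with util-linux; a minimal sketch:

    # enable the weekly trim timer
    systemctl enable --now fstrim.timer

    # or trim a mounted filesystem by hand, reporting bytes discarded
    fstrim -v /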

Future Plans
There are a few items I could improve upon in the future.

  1. Security Review
  2. Encryption: encrypt /home and /var. This is risky and requires a lot of time.
  3. Offsite backup: requires a faster internet connection and a larger cloud budget. There is currently no way to cloud-backup the large AV files and design folders; there is simply too much data.
  4. XFS and EXT4 do not have a good provision for online periodic health detection or repairing (e.g. an online fsck), thus requiring a reboot.
  5. Mobile and cloud solution: requires a new OS and software. I want local cloud hosting or more folder syncing (syncthing) for data-to-device independence, and to more fully incorporate mobile devices and networks outside my own. At the moment I manually copy some folders I want to share via syncthing; they inevitably drift and I have to re-sync them. With a large backup repository I could use syncthing (which I do not trust) to share these folders directly. Some folders I do not want to sync at all: I just want to cache a specific folder and stream the rest on demand.

Appendix: Wear Level Monitoring
smartctl -a <dev>:
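The attributes below can be pulled per drive with a one-liner; a small sketch of the extraction:

    # extract the wear-related attributes for one drive
    smartctl -A /dev/sda | grep -E 'Power_On_Hours|Power_Cycle_Count|Wear_Leveling_Count|Total_LBAs_Written'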
START Dec 11 2016:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  5854            418                11                   5968708308
/dev/sdb  129             6                  0                    864487023
/dev/sdc  129             6                  1                    2817683918

INITIAL CHECKPOINT Jan 11 2017:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  6602            420                12                   7126658407
/dev/sdb  877             8                  0                    1484208418
/dev/sdc  877             8                  1                    3437404118

2ND CHECKPOINT Dec 13 2017:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  14655           423                19                   12032358308
/dev/sdb  8930            11                 2                    8058372986
/dev/sdc  8930            11                 3                    10011333856

On Dec 14, 2017 the partition table of /dev/sda was corrected and aligned to sector 2048 (instead of 63).

3RD CHECKPOINT Dec 2 2018:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  22815           443                23                   14925486911
/dev/sdb  17090           34                 5                    15374381385
/dev/sdc  17090           34                 6                    17327381417

4TH CHECKPOINT Dec 4 2019:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  31621           447                26                   16753269226
/dev/sdb  25896           38                 7                    22056318529
/dev/sdc  25896           38                 8                    24112734909

5TH CHECKPOINT Dec 6 2020:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  40422           459                29                   18204062081
/dev/sdb  34696           50                 13                   31256954903
/dev/sdc  34696           50                 13                   33352406665

6TH CHECKPOINT Dec 27 2021:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  49564           468                40                   22534818155
/dev/sdb  43838           59                 24                   43882936851
/dev/sdc  43839           59                 22                   45978346490
