Data Storage Solution: Rationale

Introduction
Time to upgrade my ailing, 99% filled 350GB spinning disk mirror.

There were several end user requirements:

  1. Must be silent. So this means SSD.
  2. The final solution must provide total protection against failure.
  3. The final solution must provide an adequate deterrent against prying eyes and criminal elements.
  4. The final solution must provide enough storage to cover my current requirements: backing up my entire music collection in FLAC and 100 Blu-rays. This works out to between 1 TB and 4 TB.

It was important to satisfy a number of functional requirements beyond the basic capacity requirement:

  • Availability
  • Security
  • Performance

A review of each is below, starting with data classification.

Data Classification
Explicit data classification is necessary to provide correct and adequate treatment of data whether this be security requirements, availability, backup requirements, performance requirements, etc.

Data is organized by class and priority, and all data at a given priority receives the same treatment. This ensures consistent and predictable handling of data, particularly for security.

Class		Priority
personal	Mission Critical
media		Critical
config		Important
other		Unessential

Data Management
Mission Critical - maximum availability, must be protected at all cost

Task	Description
AIDE	protect against bit rot
Mirror	protect against single hard failure
AIDE	protect against file system bugs / site power loss / unauthorized modification
rdiff	protect against unintentional/unauthorized modification/deletion
cloud	protect against site compromise

Critical - maximum availability, too large to back up as desired

Task	Description
AIDE	protect against bit rot
Mirror	protect against single hard failure
AIDE	protect against file system bugs / site power loss / unauthorized modification

Important - availability not critical, must be protected at all cost

Task	Description
Mirror	protect against single hard failure
AIDE	protect against file system bugs / site power loss / unauthorized modification
rdiff	protect against unintentional/unauthorized modification/deletion
cloud	protect against site compromise

Unessential - this data is not important, easily replaceable

Task Description
Do nothing
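
The class-to-priority table and the per-priority task lists above can be captured in a small lookup, which may be handy if the treatment is ever scripted. This is only a sketch: the names CLASS_PRIORITY, PRIORITY_TASKS and tasks_for are made up for illustration, and the task names are just the labels from the tables, not commands.

```python
# Illustrative only: the names are hypothetical; the contents mirror the tables above.
CLASS_PRIORITY = {
    "personal": "Mission Critical",
    "media": "Critical",
    "config": "Important",
    "other": "Unessential",
}

# Each task is listed once per priority, even where the table lists AIDE twice.
PRIORITY_TASKS = {
    "Mission Critical": ["AIDE", "Mirror", "rdiff", "cloud"],
    "Critical": ["AIDE", "Mirror"],
    "Important": ["Mirror", "AIDE", "rdiff", "cloud"],
    "Unessential": [],
}


def tasks_for(data_class):
    """Return the protection tasks that apply to a data class."""
    return PRIORITY_TASKS[CLASS_PRIORITY[data_class]]


if __name__ == "__main__":
    for data_class in CLASS_PRIORITY:
        tasks = ", ".join(tasks_for(data_class)) or "do nothing"
        print(f"{data_class}: {tasks}")
```

Keeping the mapping in one place means a new data class only needs a priority; the treatment follows from that.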

Performance Review
Maximize performance without breaking the budget. In most cases, tuning for performance also helps with wear mitigation.

  • Disk: These are high-performance SSDs. SSD maintenance (TRIM) is deferred for now.
  • Server: My server has 6 I/O ports. These are organized as follows:
    	1x6G port for 500GB SSD
    		/ mounted here
    	2x3G ports for 1TB SSD mirror
    		/home /var mounted to mirror
    
  • Partition: (most) partitions are aligned:
    		HDD	phys_sector	partitions aligned?	raid aligned?	fs aligned?
    		sda	512byte		no			NA		4096bytes
    		sdb	512byte		yes			yes		4096bytes
    		sdc	512byte		yes			yes		4096bytes/64k stripe
    
  • Filesystem: filesystem and mdadm are aligned to disk. Chunk size for media partition is 64k (instead of 4k).
  • IO scheduler: the deadline scheduler gives the best performance and lowest latency.
  • Server: incidental writes are reduced as much as possible. See Longevity Review.
  • OS: swappiness is reduced to a near-minimum value, maximizing RAM usage:
    	vm.swappiness=1
    
  • Network: none
  • User: none
  • Site: none
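
The alignment column in the partition table above comes down to simple arithmetic: a partition is aligned when its starting byte offset is a multiple of the boundary of interest (4096 bytes for the filesystem block, 64k for the media stripe). A minimal sketch, assuming 512-byte sectors as listed in the table; the function name is made up, and the start sector of 63 is an assumed example of a legacy layout, not a value read from sda.

```python
def is_aligned(start_sector, sector_bytes=512, boundary_bytes=4096):
    """True if the partition's starting byte offset is a multiple of the boundary."""
    return (start_sector * sector_bytes) % boundary_bytes == 0


if __name__ == "__main__":
    # 63 is an assumed legacy start sector, used here only as a misaligned example.
    print(is_aligned(63))                             # False
    # A start at sector 2048 is 1 MiB into the disk.
    print(is_aligned(2048))                           # True
    # The same start also lands on a 64k stripe boundary.
    print(is_aligned(2048, boundary_bytes=64 * 1024)) # True
```

On a live system the start sector can usually be read from sysfs (for example /sys/block/sda/sda1/start), which reports it in 512-byte units.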

Longevity Review
I'd like to ensure that SSD read/write is done sparingly and intelligently. This can be managed at a number of levels:

  • Disk: none
  • Partition: partitions are write aligned to disk
  • File System: filesystem and SSD are aligned; partitions are mounted relatime to significantly reduce incidental writes (overhead) caused by the filesystem; similarly, a large chunk size is used on the media partition. mdadm is aligned to the filesystem.
  • Server: none
  • OS:
    • IO scheduler: the deadline scheduler optimizes SSD cache usage and allows the disk to schedule its writes best.
    • system: Debian system resources are well managed. Most system resources are mounted as tmpfs or other types of RAM disks. Remaining I/O comes from system services, maintenance tasks and monitoring. Swappiness is reduced from its default setting.
    • services: The meat in the pie. These are allowed to do what they need to get the job done.
    • user: I am monitoring desktop usage to see how much the desktop writes to disk. Baloo is responsible for pretty hefty amounts of read/write. We will see the impact on disk wear.
    • Logging: Logging on this system is somewhat heavy. I have mitigated this by fixing system bugs and, in some cases, reducing log levels where the logs weren't providing much useful information.
    • cron: most of these are benign or already discussed elsewhere
    • backup: this is a heavy writing function. Files are prepared once a month for offsite backup and both services and user data have local copies made daily.
    • monitoring: These services can be quite taxing. I have reduced file system integrity checking to once a month. That's an entire disk read. The rest of the monitors aren't terribly taxing except when they log.
  • Network: none
  • User: none
  • Site: none
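
One way to confirm the relatime setting mentioned above is to scan the mount table for disk-backed filesystems that lack the option. This is a rough sketch: the function name is hypothetical, and the sample text is invented in the format of /proc/mounts (the device names and options are not taken from this server).

```python
def mounts_missing_option(mounts_text, option="relatime", fstypes=("xfs", "ext4")):
    """Return mount points of the given filesystem types that lack a mount option."""
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fstype, options = fields[:4]
        if fstype in fstypes and option not in options.split(","):
            missing.append(mount_point)
    return missing


# Invented sample in /proc/mounts format; not copied from the real system.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0
/dev/md0 /home xfs rw,relatime,attr2,inode64 0 0
/dev/md1 /var xfs rw,attr2,inode64 0 0
tmpfs /run tmpfs rw,nosuid,noexec 0 0
"""

if __name__ == "__main__":
    print(mounts_missing_option(SAMPLE))  # ['/var']
```

To check the running system instead of the sample, pass the contents of /proc/mounts as mounts_text.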

Security Review
Ideally, create a deterrent strong enough to stop drive-by or opportunistic crime. Much more than this creates an exponentially larger maintenance overhead with little added benefit. Those with the resources will gain access to your data regardless.

Confidentiality

  • Disk: none
  • Partition: none
  • File System: standard Linux permissions; other users are read-only or not at all. Special group for full access.
  • Server: none
  • OS: Apps are in containers, per user per service
  • Network: 2xFirewalls
  • User: authentication to device required
  • Site: Household; largest risk is theft. Legal activity is basically 0%.
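
The file system rule above (other users are read-only or have no access) can be spot-checked from a file's permission bits. A small sketch using the standard stat module; the function names are made up, and the example modes are illustrative rather than taken from the server.

```python
import stat


def other_access(mode):
    """Classify what 'other' users may do, given a file's st_mode permission bits."""
    if mode & stat.S_IWOTH:
        return "write"
    if mode & (stat.S_IROTH | stat.S_IXOTH):
        return "read-only"
    return "none"


def violates_policy(mode):
    """The stated policy allows 'other' to be read-only or to have no access."""
    return other_access(mode) == "write"


if __name__ == "__main__":
    # Example modes only (setgid group directories and a world-writable file).
    print(other_access(0o2775))   # read-only
    print(other_access(0o2770))   # none
    print(violates_policy(0o777)) # True
```

For a real path, the mode would come from os.stat(path).st_mode.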

Integrity

  • Disk: none
  • Partition: none
  • File System: XFS has a proven track record of robustness; file scanners detect unauthorized changes
  • Server: BIOS updates only from vendor
  • OS: OS and apps receive regular security updates; see Confidentiality and Data Management
  • Network: connections are ethernet and most transactions over the wire are encrypted
  • User: only trusted users given access to the network
  • Site: Normal household security

Availability

  • Disk: none
  • Partition: none
  • File System: mirrored
  • Server: reputable, OTS hardware
  • OS: Linux is one of the most robust OS available (only Solaris and *BSD better?)
  • Network: reputable, OTS hardware
  • User: none
  • Site: none

Non-repudiation
At all levels I have enabled logging to track machine load, network traffic, firewall rule violations, deep packet inspection results, etc.

Future Plans
There are a few items I could improve upon in the future.

  1. Correct the partition alignment on sda (to be aligned, the partition needs to start at sector 2048)
  2. Encrypt /home and /var
  3. Explore the possibility of moving /home to ext4 and /mnt/scratch to xfs
  4. Evaluate pros/cons of using btrfs or any number of flash nand file systems
  5. No way to cloud backup large av files/design folder. There is simply too much data.
  6. What is the best way to monitor filesystem health? periodic online fsck? periodic online defragmentation? wear leveling and disk health? Need to implement automatic trim maintenance.
  7. Verify wear leveling (see Wear Level Monitoring below)
  8. Provide a small UPS to hold up the server during power loss and brownouts
  9. Find a way to offsite backup >1GB of data.

Wear Level Monitoring
START Dec 11 2016:

SSD		Power_On_Hours	Power_Cycle_Count	Wear_Leveling_Count	Total_LBAs_Written
/dev/sda	5854		418			11			5968708308
/dev/sdb	129		6			0			864487023
/dev/sdc	129		6			1			2817683918

END Jan 11 2017:

SSD		Power_On_Hours	Power_Cycle_Count	Wear_Leveling_Count	Total_LBAs_Written
/dev/sda	6602		420			12			7126658407
/dev/sdb	877		8			0			1484208418
/dev/sdc	877		8			1			3437404118
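
The two snapshots above can be turned into a write rate per drive by taking the difference in Power_On_Hours and Total_LBAs_Written. The sketch below does that. One loud assumption: it treats one LBA as 512 bytes, which is common but vendor dependent, so the GiB figures are estimates only. The names START, END, LBA_BYTES and write_rate are made up for this example.

```python
# (Power_On_Hours, Total_LBAs_Written) copied from the two tables above.
START = {
    "/dev/sda": (5854, 5968708308),
    "/dev/sdb": (129, 864487023),
    "/dev/sdc": (129, 2817683918),
}
END = {
    "/dev/sda": (6602, 7126658407),
    "/dev/sdb": (877, 1484208418),
    "/dev/sdc": (877, 3437404118),
}

LBA_BYTES = 512  # ASSUMPTION: some drives report this attribute in other units


def write_rate(dev):
    """Return (hours elapsed, LBAs written, estimated GiB written per day)."""
    hours = END[dev][0] - START[dev][0]
    lbas = END[dev][1] - START[dev][1]
    gib = lbas * LBA_BYTES / 2**30
    return hours, lbas, gib / (hours / 24)


if __name__ == "__main__":
    for dev in START:
        hours, lbas, gib_per_day = write_rate(dev)
        print(f"{dev}: {hours} h, {lbas} LBAs, ~{gib_per_day:.1f} GiB/day")
```

All three drives show the same 748 powered-on hours over the period, so the per-day figures are directly comparable between the system disk and the mirror members.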
