
Data Storage Solution: Rationale

Introduction
Time to upgrade my ailing, 99%-full 350GB spinning-disk mirror.

There were several end user requirements:

  1. Capacity: The final solution must provide enough storage to cover my current requirements: my entire music collection in FLAC plus 100 Blurays. Because data will grow over time (probably exponentially), I will provision at most 50% of capacity at deployment.
  2. Reliability: The final solution must provide total protection against failure. In other words, data loss is not acceptable.
  3. Security: Data access (sharing) must be restricted to authorized sources only. The final solution must provide an adequate deterrent against prying eyes and criminal elements.
  4. Quality of Life: The solution must not degrade quality of life: no VOCs, no noise, nothing obnoxious. In practice this means SSDs.

There are also a number of optional features that should be met, within reason:

  • Availability
  • Performance
  • Usability

A review of each is below, starting with data classification.

Data Classification
Explicit data classification is necessary to ensure each piece of data receives correct and adequate treatment, whether that means security, availability, backup, or performance requirements.

Data is organized by class and priority, and each priority receives the same treatment. This ensures consistent and predictable handling of data, particularly for security.

Class     Priority          Definition                                                                    Example
personal  Mission Critical  maximum availability, must be protected at all cost                           financial and health records, project and design docs
config    Critical          availability not critical, must be protected at all cost                      /etc, parts of /var
media     Important         maximum availability, too large to backup as desired, streaming performance   FLAC and Bluray rips, podcast archives
other     Unessential       this data is not important, easily replaceable                                everything not captured above
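
As a sketch, the table above can be expressed as a lookup that drives the per-priority treatment described under Data Management (the names and task lists below are my illustrative shorthand, not part of any tooling):

```python
# Illustrative mapping of data class -> priority (from the table above).
PRIORITY = {
    "personal": "Mission Critical",
    "config": "Critical",
    "media": "Important",
    "other": "Unessential",
}

# Task lists mirror the Data Management section.
TASKS = {
    "Mission Critical": ["mirror", "filesystem", "AIDE", "rdiff", "cloud"],
    "Critical": ["mirror", "filesystem", "AIDE", "rdiff"],
    "Important": ["mirror", "filesystem", "AIDE", "cloud"],
    "Unessential": [],
}

def treatment(data_class):
    """Every item of a class gets the same, predictable handling."""
    return TASKS[PRIORITY[data_class]]

print(treatment("config"))  # ['mirror', 'filesystem', 'AIDE', 'rdiff']
```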

Data Management
Mission Critical - maximum availability, must be protected at all cost

Task        Description
Mirror      protect against single hard failure
filesystem  proven code; access control
AIDE        protect against bit rot
AIDE        protect against file system bugs / site power loss / unauthorized modification
rdiff       protect against unintentional/unauthorized modification/deletion
cloud       available locally and remotely; protected against site compromise

Critical - availability not critical, must be protected at all cost

Task        Description
Mirror      protect against single hard failure
filesystem  proven code; access control
AIDE        protect against bit rot
AIDE        protect against file system bugs / site power loss / unauthorized modification
rdiff       protect against unintentional/unauthorized modification/deletion

Important - maximum availability, too large to backup as desired, streaming performance

Task        Description
Mirror      protect against single hard failure
filesystem  proven code; access control
AIDE        protect against bit rot
AIDE        protect against file system bugs / site power loss / unauthorized modification
cloud       available locally and remotely; protected against site compromise

Unessential - this data is not important, easily replaceable

Task Description
Do nothing

Capacity Review
The requirement is to store my entire FLAC collection and 100 Bluray discs, plus a small amount of miscellaneous data (small compared to 100 Blurays). Provisioning is 50% max at time of deployment.

Item          Capacity
FLAC          300GB
Bluray        100 * 35GB = 3500GB
Provisioning  50%
Total         7600GB (7.6TB)

A quick look at pricing indicates that a 100% SSD solution is cost-prohibitive. At the time of writing (2016), FLAC plus miscellaneous data requires 460GB. Doubling that to 920GB and deferring the Bluray collection (a project I plan to start in approximately three years), 1TB will do. 1TB SSDs are fairly affordable, so that is what I will purchase.
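
The capacity arithmetic works out as follows (a quick sanity check, not deployment code):

```python
# Sanity-checking the capacity table: at 50% max provisioning,
# required raw capacity is double the projected data size.
flac_gb = 300
bluray_gb = 100 * 35            # 100 discs at ~35GB each = 3500GB
data_gb = flac_gb + bluray_gb   # 3800GB projected
required_gb = data_gb * 2       # 50% provisioning doubles the need
print(required_gb)              # 7600 (7.6TB)

# The deferred-Bluray plan: 460GB today, doubled for provisioning.
print(460 * 2)                  # 920 -> a 1TB SSD will do
```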

Performance Review
The goal is to maximize performance without breaking the budget. In most cases performance tuning also corresponds with wear-level mitigation.

  • Disk: These are high-performance SSDs. SSD maintenance (TRIM) is deferred for now.
  • Bus: My server has 6 IO ports, organized as follows:
    	1x6G port for the 500GB SSD
    		/ mounted here
    	2x3G ports for the 1TB SSD mirror
    		/home and /var mounted on the mirror

    Verify the mirror with: mdadm --detail <md_dev>

  • Partition: all partitions are aligned:
    		Disk	phys_sector	partitions aligned?	raid aligned?	fs aligned?
    		sda	512byte		yes			NA		4096bytes
    		sdb	512byte		yes			yes		4096bytes
    		sdc	512byte		yes			yes		4096bytes/64k stripe
    
  • Filesystem:
    • filesystem type based on performance demands, file size, access type, etc.
    • filesystem and mdadm are aligned to disk.
    • Chunk size for media partition is 64k (instead of 4k).
  • OS:
    • IO scheduler: the deadline scheduler gives the best performance and lowest latency:
      	Add block/sda/queue/scheduler = deadline to /etc/sysfs.conf
      	Add block/sdb/queue/scheduler = deadline to /etc/sysfs.conf
      	Add block/sdc/queue/scheduler = deadline to /etc/sysfs.conf
      
    • swappiness is reduced to its minimum value, maximizing RAM usage:
      	Add vm.swappiness=1 to /etc/sysctl.conf
      
    • services: incidental writes are reduced as much as possible. See the Longevity Review.
  • Network
    • Local gigabit fabric
    • High speed local wifi for mobile devices (currently only laptop and smartphone)
    • High speed WAN connection with fast uplink and downlink
  • User: none
  • Site: none
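
One way to confirm the deadline setting from the OS bullet above: read /sys/block/<dev>/queue/scheduler, where the kernel brackets the active scheduler (e.g. `noop [deadline] cfq`). A minimal parser sketch:

```python
import re

def active_scheduler(sysfs_text):
    """Extract the bracketed (active) scheduler from the contents of
    /sys/block/<dev>/queue/scheduler."""
    m = re.search(r"\[(\w+)\]", sysfs_text)
    return m.group(1) if m else None

print(active_scheduler("noop [deadline] cfq"))  # deadline
```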

Longevity Review
I'd like to ensure that SSD read/write is done sparingly and intelligently. This can be managed at a number of levels:

  • Disk: none; this is handled internally by the drive and is a bit of a black box, really
  • Partition:
    • partitions are write aligned to disk
  • File System:
    • filesystem & SSD are aligned
    • mdadm aligned to fs
    • partitions are mounted with relatime to significantly reduce incidental metadata writes from atime updates
    • large chunk size used on media partitions
  • OS:
    • ramdisk: Debian does an excellent job of offloading many file system services to RAM disks. Most system resources are mounted as tmpfs or other types of RAM disk. Swappiness is also tuned (see Performance). Use mount | column -t to see the list of mounted partitions.
    • IO scheduler: the deadline scheduler optimizes SSD cache usage and allows the disk to schedule its writes best. See Performance.
  • Services:
    • The meat in the pie: services are allowed to do what they need to get the job done, albeit with some tuning for the most egregious ones.
    • Logging: logging on this system is somewhat heavy. I have mitigated this by fixing system bugs and, in some cases, reducing log levels where they weren't providing much useful information.
    • cron: most jobs are benign or already discussed elsewhere.
    • backup: this is a write-heavy function. Files are prepared once a month for offsite backup, and both services and user data have local copies made daily.
    • monitoring: these services can be quite taxing. I have reduced file system integrity checking to once a month (an entire disk read). The rest of the monitors aren't terribly taxing except when they log.
  • Network: none
  • User: none
  • Site: none

Security Review
Ideally the deterrent is strong enough that drive-by or opportunistic crime moves on. More than this creates an exponentially larger maintenance overhead with little added benefit; those with sufficient resources will gain access to your data regardless.

Access Control

  • Disk: none
  • Partition: none
  • File System: standard Linux permissions and access control; other users get read-only access or none at all. A special group grants full access.
  • Server: none
  • OS: protected kernel memory space, protected process memory space; services run in containers under non-root, limited-access users/groups
  • Network:
    • 2x Firewalls
    • MAC, IP and port restrictions/whitelisting
    • Blacklisting bad actors
  • User: only trusted users are given access to the network or physical location
  • Site: Standard lock and key; largest risk is theft.

Confidentiality/Privacy

  • Disk: none
  • Partition: encrypted
  • File System: access control and permissions
  • Server: none
  • OS: none
  • Network:
    • data on the wire is encrypted, locally and over WAN
  • User: policy against sharing passwords, access details, etc
  • Site: location is not public

Integrity

  • Disk: SSD contains ECC and wear leveling to ensure integrity is maintained at a bit-for-bit level
  • Partition: none
  • File System:
    • XFS and EXT4 have a proven track record of robustness and include many checks to ensure data integrity
    • file scanners detect unauthorized changes to files and metadata
  • Server: hardware/BIOS updates only from trusted sources (i.e. vendor)
  • OS: OS and apps receive regular security updates, only from trusted sources; see Confidentiality and Data Management
  • Network:
    • almost all connections are ethernet (less prone to tampering)
    • most transactions over the wire have CRC to ensure integrity
    • encryption is a natural deterrent against tampering
  • User: none
  • Site: none

Non-repudiation

  • Disk: none
  • Partition: none
  • File System: metadata logs modification times
  • OS: none
  • Services: extensive logging of system activity
  • Network: 2x firewalls log firewall rule violations
  • User: access/logins and login failures are logged
  • Site: none

Availability Review

  • Disk: RAID mirror
  • Partition: none
  • File System: mirrored
  • Server: reputable, high reliability hardware
  • OS: Linux is one of the most robust OSes available (only Solaris and *BSD better?); hardware is monitored for failures and downtime
  • Network: reputable, high reliability hardware
  • User: none
  • Site: off-site backups

Future Plans
There are a few items I could improve upon in the future.

  1. reqs lots of time: Encrypt /home and /var.
  2. reqs budget: Provide a small UPS to hold up the server during power outages and brownouts.
  3. reqs budget: rdiff-backup should target physical media different from the source.
  4. new internet reqd: Find a way to offsite-backup >1GB of data.
  5. new internet reqd: There is no way to cloud-backup large AV files or the design folder; there is simply too much data.
  6. new internet reqd: Even the partial backup is incomplete: it misses some user (/home/user/blah) and system (/etc) config data. I also do not back up some docs located in the docs folder.
  7. new os reqd: What is the best way to monitor filesystem health? I need specific commands for all the things I am monitoring. Add periodic online fsck? Periodic online defragmentation? Wear leveling and disk health monitoring? Automatic trim maintenance still needs implementing.
  8. new os reqd: There is a desire for local cloud hosting or more folder syncing (syncthing).

Appendix: Wear Level Monitoring
smartctl -a <dev>:
START Dec 11 2016:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  5854            418                11                   5968708308
/dev/sdb  129             6                  0                    864487023
/dev/sdc  129             6                  1                    2817683918

INITIAL CHECKPOINT Jan 11 2017:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  6602            420                12                   7126658407
/dev/sdb  877             8                  0                    1484208418
/dev/sdc  877             8                  1                    3437404118

2ND CHECKPOINT Dec 13 2017:

SSD       Power_On_Hours  Power_Cycle_Count  Wear_Leveling_Count  Total_LBAs_Written
/dev/sda  14655           423                19                   12032358308
/dev/sdb  8930            11                 2                    8058372986
/dev/sdc  8930            11                 3                    10011333856
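
The checkpoints above can be turned into a rough write-rate estimate. This sketch assumes Total_LBAs_Written counts 512-byte units (the usual convention, but it varies by vendor); the /dev/sdb figures come from the Jan 11 and Dec 13 2017 tables:

```python
def tb_written(lba_start, lba_end, lba_bytes=512):
    """Approximate terabytes written between two SMART checkpoints,
    assuming Total_LBAs_Written is in 512-byte units."""
    return (lba_end - lba_start) * lba_bytes / 1e12

# /dev/sdb over roughly 11 months:
print(round(tb_written(1484208418, 8058372986), 2))  # 3.37
```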

On Dec 14, 2017 the partition table of /dev/sda was corrected and aligned to sector 2048 (instead of 63).
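
The realignment can be checked with a tiny helper: a partition is aligned when its starting byte offset is a multiple of the block size. A sketch, with start sectors as reported by fdisk and the 512-byte logical sectors / 4096-byte alignment from the Performance Review table:

```python
def is_aligned(start_sector, logical_bytes=512, alignment_bytes=4096):
    """A partition is aligned when its byte offset from the start of
    the disk is a multiple of the desired alignment size."""
    return (start_sector * logical_bytes) % alignment_bytes == 0

print(is_aligned(2048))  # True: modern default first sector
print(is_aligned(63))    # False: legacy DOS layout
```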
