Perfecting the CD Rip

You can rip any CD with just about any software and any drive and get a good, listenable copy of the original. But how close is it to exactly what the studio recorded? And can you tell the difference? Who knows, but let's discuss how to produce the perfect rip.

There are two aspects to this: the music itself and the metadata. Neither is as easy as it seems. The biggest problem is that small defects in the CD can leave you with a copy that is close to perfect but not exact, and you may never notice. Another problem is that CD drives are not very good at seeking, so the beginning or the end of a track might get cut off (usually by much less than a second). And of course, if you want a perfect copy, don't use MP3 or AAC; use FLAC or ALAC, which are both lossless codecs. They result in much larger files, but they do not throw away any data like MP3 and AAC do.

Once you conquer that problem you still need to verify the result. There is now an online database for that, which helps immensely. It stores only checksums and works by crowdsourcing, so it holds both good and bad checksums. If 100 people upload the same result, then that must be a perfect rip. Unless, of course, all 100 people made the same error (a drive offset error, for example, is very common). So to some extent you have no way of knowing whether your copy is perfect.
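
To make the offset point concrete, here is a minimal Python sketch. It is only an illustration: it uses CRC32 from the standard library rather than the checksum the online database actually uses, the "disc" is a list of made-up sample values, and the helper names `checksum` and `rip_with_offset` are hypothetical.

```python
import zlib

def checksum(samples):
    # Pack 16-bit signed samples into bytes and hash them.
    # CRC32 stands in for a real rip checksum here (assumption).
    data = b"".join(s.to_bytes(2, "little", signed=True) for s in samples)
    return zlib.crc32(data)

def rip_with_offset(disc, start, length, offset):
    # Simulate a drive that starts reading `offset` samples too late.
    begin = start + offset
    return disc[begin:begin + length]

# A fake disc: 1000 distinct sample values (made-up data).
disc = [(i * 37) % 2000 - 1000 for i in range(1000)]

exact = rip_with_offset(disc, start=100, length=500, offset=0)
drive_a = rip_with_offset(disc, start=100, length=500, offset=6)
drive_b = rip_with_offset(disc, start=100, length=500, offset=6)

# Two drives with the same uncorrected offset agree with each other...
print(checksum(drive_a) == checksum(drive_b))
# ...but neither matches the exact track data.
print(checksum(drive_a) == checksum(exact))
```

Agreement between rips only shows that the rips are identical to each other, not that they are identical to the disc.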

Metadata. It's pretty easy to download the artist/album/title data. But you run into issues with the 100 different systems for capitalization and grammar. Embedding an image? Make sure it's compatible with all your devices. And there's no perfect copy of an image. Lyrics? Good luck.

Also, what of subjective data? Here I'm primarily thinking of track ratings. This is highly subjective, but I use it to make playlists or load up my portable devices, so it's valuable. I don't want to rely on software to store this data; I'd rather it be contained within the file itself. No software that I'm aware of supports this, though.

Then, after you've made a copy of your data how can you be sure it stays that way? You can monitor for data changes or physical failures in your disk. For all the work it takes getting the perfect rip, it's worth a little more effort to maintain integrity.
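
Monitoring for data changes can be as simple as keeping a list of checksums and comparing it later. The Python sketch below is one possible approach, not the checksum software I actually run: the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

def file_sha256(path, chunk_size=1 << 20):
    # Hash the file in chunks so large FLAC or video files fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    # Map each file's path (relative to root) to its checksum.
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def compare_manifests(old, new):
    # Report files whose content changed, disappeared, or appeared.
    return {
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
        "missing": sorted(old.keys() - new.keys()),
        "added": sorted(new.keys() - old.keys()),
    }
```

Build a manifest right after archiving, store it somewhere safe, and compare it against a fresh one from time to time. A "changed" entry on a file nobody edited points to corruption, and a "missing" entry points to an accidental deletion.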

And btw I rip data tracks and extra DVDs as well, so the video and surround sound tracks that are available these days are covered too.

So here is my system:

  1. Rip & Encode
    1. General
      1. Data Format: Native
      2. File Naming: TRACK# - ARTIST – TRACK
    2. CD-DA
      1. Ripper/Encoder: morituri, rip.sh
      2. Verification: AccurateRip
      3. Codec: FLAC
      4. Metadata: MusicBrainz
    3. Downloads
      1. Codec: Native (Various)
    4. AC-3
      1. Ripper/Encoder: mplayer/ffmpeg
      2. Verification: Filesystem/CRC
      3. Codec: AC-3
      4. Metadata: NA
    5. Video
      1. Ripper/Encoder: mplayer/ffmpeg
      2. Verification: Filesystem/CRC
      3. Codec: Native (Various)
      4. Metadata: NA
  2. Album Art
    1. Type: Front Cover
    2. Format: JPG/PNG
    3. Size: 500x500 (min(audi,android,squeezebox,htpc))
    4. Utilities: albumpicresize.sh, renamepics.sh, importpics.sh
  3. Apply ReplayGain
    1. Style: Album, Track
    2. Utilities: replaygain.sh
  4. Metadata
    1. Fix Song Tags
      1. Coverage
        1. artist, album, track, track #, year, genre
        2. Use 'original artist' for remixes/covers
      2. Grammar
        1. First Letter Of Each Word Capitalized
        2. Lower articles, conjunctions, prepositions: to, a, an, the, and, of, but, or, as, at, by, for, in, on, from, into, onto, with
        3. First and last word capitalized
        4. Always capitalize after a “:”, “;” and “(”
        5. Proper nouns capitalized; leave intact
      3. Add software versions into “Encoded-By” tag
    2. Standardized directory naming
      1. ARTIST/YEAR – ALBUM [(DISC n)]/
      2. Year is year disc was originally released.
      3. Disc # is only used on releases with more than one disc.
      4. Make sure directory and flac year tag match.
      5. This breaks re-ripping with morituri, but that's okay.
    3. Do not rename files
      1. They are linked to morituri and the cue/m3u files.
    4. Utilities: easytag
  5. Archive
    1. Copy files to network share
      1. /Downloads: Legal purchases
      2. /Collection: FLAC backup of CD collection
      3. Storage is in a mirror to protect against HDD failure.
      4. Checksum software to monitor against sector defects or accidental deletions.
    2. Update catalogs
      1. Update collection in Amarok
      2. Add media to KmusicdB
      3. Add media to discogs.com
    3. Store physical media
      1. Alphabetical
      2. Drop articles, conjunctions, prepositions when sorting
      3. Leave proper names intact
  6. Rate tracks
    1. Sort by length; rate
    2. Then sort by rating; re-rate
    3. Add entire collection into a playlist and look for tracks with missing rating
  7. Verification
    1. Files
      1. Verify complete album is present in Amarok
    2. Audio
      1. Verify album art is attached and correct
      2. Copy Protection check: play last track of each album ripped
      3. Pregap: For fun listen to hidden tracks in pregap
      4. Check ReplayGain is applied in Amarok
    3. Video
      1. Play back all video to ensure no errors
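
The grammar rules under "Fix Song Tags" can be sketched in a few lines of Python. This is an illustration of the rules as listed, not a script from my toolbox; the names `SMALL_WORDS`, `_cap_first` and `title_case` are made up for the example. It cannot recognize proper nouns, so it only ever uppercases a word's first letter and otherwise leaves the word as typed.

```python
# The articles, conjunctions and prepositions from the Grammar rules.
SMALL_WORDS = {
    "to", "a", "an", "the", "and", "of", "but", "or", "as", "at",
    "by", "for", "in", "on", "from", "into", "onto", "with",
}

def _cap_first(word):
    # Uppercase the first letter, skipping leading punctuation such as "(".
    # The rest of the word is left alone so "AC/DC" stays intact.
    for i, ch in enumerate(word):
        if ch.isalpha():
            return word[:i] + ch.upper() + word[i + 1:]
    return word

def title_case(title):
    words = title.split()
    out = []
    for i, word in enumerate(words):
        core = word.strip("()[]\"'.,:;!?").lower()
        first_or_last = i == 0 or i == len(words) - 1
        after_break = i > 0 and words[i - 1][-1] in ":;"
        opens_paren = word.startswith("(")
        if core in SMALL_WORDS and not (first_or_last or after_break or opens_paren):
            out.append(word.lower())
        else:
            out.append(_cap_first(word))
    return " ".join(out)

print(title_case("the dark side of the moon"))
print(title_case("live: at the apollo (the remix)"))
```

A small word that is really part of a proper name would be lowercased by this sketch, so the result still needs a quick look by eye.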

Most of the shell scripts I wrote myself.
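
As an example of how small these helpers can be, here is the directory naming rule from the Metadata step (ARTIST/YEAR – ALBUM [(DISC n)]/) written as a Python sketch. It is not one of my shell scripts; the function name `album_dir` and its parameters are hypothetical.

```python
from pathlib import PurePosixPath

# Separator between year and album: an en dash, as written in the naming rule.
SEP = " \u2013 "

def album_dir(artist, year, album, disc=None, total_discs=1):
    # year is the year the disc was originally released.
    name = f"{year}{SEP}{album}"
    # The disc number is only added for releases with more than one disc.
    if total_discs > 1 and disc is not None:
        name += f" (DISC {disc})"
    return PurePosixPath(artist) / name

print(album_dir("Pink Floyd", 1973, "The Dark Side of the Moon"))
print(album_dir("Pink Floyd", 1979, "The Wall", disc=2, total_discs=2))
```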

There are still a few items that are not completely solved in my mind. Fortunately, they can all be added later without having to re-rip.

  1. I need to establish a set of rules and write a script to convert FLAC into MP3 or AAC that can be used in my car/smartphone.
  2. Artwork besides front cover. Having the entire artwork from the entire CD would be awesome. I wonder what software takes advantage of this?
  3. Song lyrics. This is tough because there is no great source for lyrics, but it could be done. Then again, all software that supports lyrics automatically downloads them anyway.
  4. Rating. As mentioned above it would be nice to preserve track rating. I will have to write a script that reads the Amarok database and adds a metadata tag to each file.
  5. Also mentioned above, the directory and file naming isn't perfect. The names don't match the metadata and they don't match what is online. And if you rename anything it breaks morituri.
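
For the first item, a starting point might look like the Python sketch below. It only builds an ffmpeg command line and does not run it. The helper name `mp3_command` and the default VBR quality of 2 are assumptions, not rules I have settled on, and how cover art is carried over is not handled here.

```python
from pathlib import Path

def mp3_command(flac_path, out_dir, quality=2):
    # Keep the original file name, changing only the extension.
    src = Path(flac_path)
    dst = Path(out_dir) / src.with_suffix(".mp3").name
    return [
        "ffmpeg",
        "-i", str(src),
        "-codec:a", "libmp3lame",  # encode audio with LAME
        "-q:a", str(quality),      # VBR quality (lower is better)
        "-map_metadata", "0",      # copy tags from the FLAC file
        str(dst),
    ]

cmd = mp3_command("Collection/Artist/01 - Artist - Song.flac", "Portable")
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the settings are decided. Because the FLAC files stay untouched, the lossy copies can be regenerated at any time with different settings.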

So I've just started applying this method to my CD collection to replace the lossy rips in my existing backup. (And FWIW I've eliminated 99% of the downloads I didn't pay for. The rest I can't find to buy.) It's extremely time consuming, but I'm satisfied with the results, knowing the files carry high-quality information and are preserved for all of eternity.