I remember years ago working on the Wii, and there was a restriction on how often you could read the flash to avoid premature wearing. Not sure if that was just the specific type of storage, as googling suggests that NAND is subject to this and NOR isn't. I think pretty much all USB drives now use NOR flash, so maybe this isn't actually an issue any more.
DRAM works that way but flash doesn't. Read disturb is a different issue.
pretty much all USB drives now use NOR flash
Nope, NOR flash is much more expensive than NAND so NOR is only used for firmware and everything else is NAND.
This only happens very rarely, though more frequently as NAND flash goes QLC and beyond.
Besides, other experiments have shown that data remanence is way more of an issue with drives that are almost completely worn out (way beyond their specified TBW) and about to croak. Even then you only get rare bitrot that can be checked for and compensated quite cheaply in most cases.
If you take fresh media, write it just once or a few times at most, use substantial overprovisioning to keep the drive in its fast pseudo-SLC mode, and reread the media periodically, NAND can be a good enough storage system for most casual needs.
While I've had generally solid experience with SanDisk for almost 20 years, and had a few old drives (which I hear are SLC-based, so it's not surprising) hold files for over 5 years with no issue, I recently almost lost over 4 years of photos.
I had purchased some Lexar drives from Costco about 2 years ago, since they were dual-interface (USB-A / USB-C), and it was useful for just getting pictures off my phone. I usually don't rely on such a setup for long-term storage, but as with all things, I was delayed in tending to it. Since there were 2 per box, I just copied the photos onto both drives, and diffed them several times to make sure they were exact copies.
After 24 months, one of the drives had a 95% loss: almost every picture was corrupted, with the bottom half or so cut off. The other drive surprisingly seemed fine, though it had been plugged in every 6-9 months, as I recall, because I wanted to browse it a few times; it seems that this action saved the volume. Upon further inspection, the good drive had still lost 10 pictures out of about 5,000, so it wasn't perfect.
Lexar.
https://www.ebay.com/itm/176810492981?chn=ps&_trkparms=ispr%...
If these are JPEGs with a grey or green lower half, it's likely only a few 16x16 macroblocks are corrupted and you can recover the rest.
This cannot be done fully programmatically, because you have to guess what the average colour of each damaged block was, but it can be worth it for precious pictures.
With JPEG one of the big problems is that the data is Huffman-encoded without any inter-block markers (except maybe Restart Markers, if you're lucky). This means that a single bitflip can result in a code changing length, causing frameshifts in many subsequent blocks and rendering them all undecodable. If you have a large block of missing data (e.g. a 4k-byte sector of zeros), then you have to guess where in the image the bitstream resumes, in addition to guessing the value of the running DC components.
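For what it's worth, you can at least check whether a damaged JPEG contains restart markers to re-sync on before attempting a manual repair. A minimal sketch (marker codes are from the JPEG standard; the parsing is deliberately naive and ignores segment boundaries and byte-stuffing subtleties):

```python
# Sketch: look for DRI (0xFFDD, defines the restart interval in MCUs) and
# RSTn (0xFFD0-0xFFD7) markers in a JPEG byte stream. If RSTn markers are
# present, a bitstream error only corrupts blocks up to the next marker.
def scan_restart_markers(data: bytes):
    dri_interval = None
    rst_count = 0
    i = 0
    while i < len(data) - 1:
        if data[i] == 0xFF:
            marker = data[i + 1]
            if marker == 0xDD and i + 5 < len(data):
                # DRI payload: 2-byte segment length, then 2-byte interval
                dri_interval = int.from_bytes(data[i + 4:i + 6], "big")
            elif 0xD0 <= marker <= 0xD7:
                rst_count += 1
        i += 1
    return dri_interval, rst_count
```

If this reports no DRI segment and zero RSTn markers, a single frameshift really can take out everything after the first bad byte.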
https://www.tomshardware.com/pc-components/ssds/fake-samsung...
https://news.ycombinator.com/item?id=45274277 (Apple Photos corrupts images on import - images truncated)
I either copied them with `cp -ar` or `rsync -a`,
then distinctly remember diffing each drive against the other with `diff -qr <drive1> <drive2>`
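In case it's useful: the same check can be done with content hashes instead of a byte-level diff, and the manifest can be kept around and re-checked later to catch slow bit rot. A rough sketch (assumes both trees are mounted and readable; paths are whatever your drives mount as):

```python
# Sketch: compare two directory trees by SHA-256 of file contents.
# Like `diff -qr`, but the per-file hashes can also be saved as a manifest
# and re-verified on a future reread of the media.
import hashlib
from pathlib import Path

def tree_manifest(root: str) -> dict:
    """Map relative file path -> SHA-256 hex digest for every file under root."""
    root_path = Path(root)
    manifest = {}
    for p in sorted(root_path.rglob("*")):
        if p.is_file():
            digest = hashlib.sha256(p.read_bytes()).hexdigest()
            manifest[str(p.relative_to(root_path))] = digest
    return manifest

def drives_match(a: str, b: str) -> bool:
    return tree_manifest(a) == tree_manifest(b)
```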
For the rest of us, a USB spinning rust hard drive formatted as exFAT is going to be hard to beat. You'll be able to plug this into virtually any computer made in the next few decades (modulo a USB adapter or two) and just read it. They are cheap (even allowing for the rising cost of storage), fast, and most importantly, they are easy. The data is stored magnetically, so is not susceptible to degradation just from sitting like SSDs or flash drives are.
Of course, you should not store any important data on only ONE drive. The 3-2-1 backup rule applies to archives as well: 3 copies, 2 different media, 1 off-site.
(Though probably not appropriate if you’re primarily not a mac user, or won’t be in the future.)
The Linux HFS+ driver is basically unmaintained, and cannot write to journaled disks. On Windows, the only choice is a paid driver. I guess it's fine if you're strictly a Mac user, but it's a real problem if you need to access the disk on another machine. Even if you don't, I still wouldn't trust Apple for long-term support of anything.
Meanwhile exFAT has native support on Windows, Mac, and Linux, and there are drivers for BSDs and others.
So 20 years down the line, you'll certainly have something that can read an exFAT drive without much if any pain, regardless of which platform you're using at the time. HFS+? Who knows.
That said, I'd consider ZFS or btrfs for HDD archival. Granted broad (Mac/Windows) support is weaker than FAT, but at least the filesystems are completely open source. But what really makes them interesting is their automatic data checksumming to detect (and possibly repair) bitrot, which is particularly useful for archival.
Yes, journaling. Power cuts or unclean unmounts are enough of a risk for me that I don't see any reason to use a file system without journaling.
> The Linux HFS+ driver is basically unmaintained, and cannot write to journaled disks. On Windows, the only choice a paid driver. I guess it's fine if you're strictly a Mac user, but it's a real problem if you need to access the disk on another machine. Even if you don't, I still wouldn't trust Apple for long-term support of anything.
I just don't expect Linux or Windows support to be relevant to me or my family's use, or the cost of the Windows driver to be a problem if it ever came up.
If in a decade Apple drops HFS+, it's not something they're going to do without notice, it's something where I'll have plenty of notice to take the relatively small required effort to migrate my archives to a different file system.
> That said, I'd consider ZFS or btrfs for HDD archival. Granted broad (Mac/Windows) support is weaker than FAT, but at least the filesystems are completely open source. But what really makes them interesting is their automatic data checksumming to detect (and possibly repair) bitrot, which is particularly useful for archival.
I use btrfs for non-archival storage, but don't really see it as useful for archival storage - it's effectively unusable for my wife if I get hit by a bus.
> So 20 years down the line, you'll certainly have something that can read an exFAT drive without much if any pain, regardless of which platform you're using at the time. HFS+? Who knows.
You're optimizing for a problem that isn't in my risk assessment - i.e. I don't care if I can shelve a drive and easily read from it in 20 years; I just want to maximize reliability over a 20-year timespan where I'm willing to take maintenance action if required. (And I think you're overly negative on Apple's support of old tech. e.g. Apple didn't drop software FireWire support for a decade after they stopped selling their last FireWire device - that's plenty of time for a migration if my archival drives were using a FireWire connection. HFS+ is Apple's currently-supported file system for non-SSD storage, and I don't see a medium-term path where they extend APFS support to HDDs or drop HDD support entirely.)
In any case, if the situation changes, I expect there'll be enough lead time for me to adjust my strategy -- the failure scenario is completely different than rotting physical media.
If we’re talking the average tech-illiterate (or literate but cost- and space-constrained) person, probably Blu-ray. A burner+reader combo with a stack of dual-layer discs is cost-effective. High-capacity HDDs would be equally effective if you can guarantee they’re stored away from accidents and mishandling, but if the setup requires assembling a SATA-to-USB adapter it might be out of reach for some consumers, and any risk of damage from movement could rule it out entirely.
If we’re talking tech-savvy consumers who don’t have the IT budget of a corporation, maybe LTO-5 or LTO-6 tapes could work. Tapes themselves are very affordable and have a good shelf lifespan. Used drives can be had for under $600. The primary issues would be finding one with an interface that works with your existing equipment, and software that supports reading and writing tapes.
Not even kidding.
With any other media, you have to hope that the drives are still available. Paper routinely lasts hundreds of years and we all have readers built right in.
But I keep hoping someone will finish the 'use lasers to burn 5d storage into glass chips' project silica concept and bring it to market so I can have isolinear star trek chips.
Go with inorganic Blu-ray media if you want longevity. Most HTL Blu-rays made currently will last around 100 years if properly stored. If you need longer, there are M-Discs, but they are expensive, and rumor has it that ALL Verbatim 100 GB Blu-rays are essentially M-Discs with different labels these days.
For all practical purposes, any Blu-ray larger than 25 GB is probably inorganic HTL, but if you worry a lot you can also buy more expensive "archival grade" discs from Japan that have been vetted and tested.
Tape.
Obviously extreme prosumer, but for long-term archival of lots of data, LTO tape wins in several ways:
- Discs just aren't actually that high capacity relative to modern HDD capacities. BD-XL maxes out at 128 GB, while there are now 30 TB HDDs readily available. That's roughly 235 discs per HDD. Modern LTO tapes store 12-18 TB, or 2-3 tapes per HDD.
- Anything flash-based is a bad choice for long-term storage. SSDs are very fast, but also (relatively) expensive at 15-20¢/GB. Reputable SD cards are in the same neighborhood. Despite the OP redditor's results here, flash is only expected to retain data for 5-10 years.
- Tape is the absolute lowest cost-per-GB you can find of any storage medium. At the moment, LTO 8/9 tape can be had on Amazon for ½¢/GB. Compare with BD-R at 2¢/GB, or BD-R XL M-disc at 15¢/GB. HDDs (spinning rust) are 2-3¢/GB.
- Consider also write speed. LTO can write 300+ MB/s. BD 16x maxes out around 68 MB/s.
- Manufacturers rate tapes for 30 years sitting on a shelf, and it wouldn't be surprising if they still read after 50 years¹. Plain BD-R lasts 5-20 years. M-disc is the interesting outlier, rated 100-1000 years.
Of course, the biggest problem with tape is the drives. While the media is dirt cheap, the drives are crazy expensive. It looks like you can pick up a used LTO-6 drive (2.5 TB tapes) on ebay for around $500. A brand new LTO-9 drive (18 TB tapes) will be $4000-5000.
In terms of breakeven points, a used LTO-6 drive + tapes beats plain BD after about 25 TB. Because of the cost of M-discs, they stop making sense after 1-2 TB. Purely on cost, a brand new LTO-9 drive + tapes doesn't beat used LTO-6 + tapes until about 800 TB (LTO-9 tape is ½¢/GB while LTO-6 tape is 1¢/GB), but there's definitely a point in there where the larger capacity of LTO-9 makes dealing with the physical media a whole lot easier.
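The breakeven arithmetic is simple enough to sketch. All prices below are the ballpark street figures quoted in this thread and will drift; the $250 burner figure is my own assumption, roughly consistent with the discs-plus-burner estimate mentioned below:

```python
# Sketch: GB at which option A (big fixed cost, cheap media) undercuts
# option B. Prices are rough figures from this thread, not gospel.
def breakeven_gb(fixed_a: float, per_gb_a: float,
                 fixed_b: float, per_gb_b: float) -> float:
    """Solve fixed_a + per_gb_a*x == fixed_b + per_gb_b*x for x (in GB)."""
    if per_gb_b <= per_gb_a:
        raise ValueError("option A never catches up on media cost alone")
    return (fixed_a - fixed_b) / (per_gb_b - per_gb_a)

# Used LTO-6 drive ($500, ~1 cent/GB tape) vs. BD-R
# (burner assumed ~$250, ~2 cents/GB discs):
crossover = breakeven_gb(500, 0.01, 250, 0.02)  # 25,000 GB = 25 TB
```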
So if you're looking for long-term storage for your photo album, a M-disc BD XL is probably a good choice. If you only have a few hundred GB of data, a couple discs + burner can be had for $300 or so, and you can be pretty sure your mom could manage to read the disc if necessary.
But if you're looking to back up your 100 TB homelab NAS, discs are not really feasible. You'll have to spend the next month swapping discs every 25 minutes², and then deal with your new thousand disc collection. Here's where a used LTO-6 drive makes a lot of sense. This is a real sweet spot if you can find a decent drive; all-in you'd spend about $1500 to back up your 100 TB.
This is what I do to back up my NAS — found an old LTO-6 drive and got a bunch of tapes. The drive plugs into a SAS port (you might need an HBA PCI card, $50), and that's pretty much it. Linux has the drivers built in; it will show up as /dev/st0 and you can just point tar³ at it.
Finally, just to compare with cloud options, storing that 100 TB in AWS Glacier Deep Archive would run you slightly over $100/mo, so you're ahead with your own tapes after a little over a year. Oh, and don't forget to set aside an extra $8000 for data transfer fees should you ever actually want to retrieve your data lol.
---
¹ eg the Unix v4 tape that was recently found and successfully read after 52 years — https://news.ycombinator.com/item?id=45840321
² Or get a disc-swapping robot, but those run $4000-5000, at which point… you're better off with a brand new tape drive.
³ Thus using the Tape ARchiver program for its original purpose. Use -M to span tapes, tar will prompt you to swap.
For the kids, I'd rather make physical photo albums.
All I can say for sure is that you should not trust any flash for long-term storage, thumb drive or otherwise. In serious enough high-usage, high-heat environments, where everything working without problems or delay is part of what they are paying us to be responsible for, it is standard practice to clone fresh images to NVMe drives every time, with multiple spares that can be swapped out in minutes when they inevitably fail anyway.
Flash media relies on recharging, which may or may not happen often enough.
"I filled 10 32-GB Kingston flash drives with pseudo-random data."
"The years where I'll first touch a new drive (assuming no errors) are: 1, 2, 3, 4, 6, 8, 11, 15, 20, 27"
And from the blog: "Q: You know you powered the drive by reading it, right? A: Yes, that’s why I wrote 10 drives to begin with. We want to see how something works if left unpowered for 1 year, 2 years, etc."
I needed one last week, and had to throw most of them away; they had all died, presumably from dormancy, even the ones new in the package.
First, the elephant in the room: why solid state? Because the drives needed to read the media are often the weak link. When the drives are no longer being manufactured, how hard is it to make one? Reading solid-state storage is a relatively low-precision electrical process compared to the high-precision mechanical process needed for most media.
First on the chopping block was bulk storage. It tends to be delicate, hard to read, and short-lived. But if I limited myself to small storage, there are some interesting options. Fusible PROMs were promising but top out at a few megabytes. Mask ROMs? Does anyone offer a mask ROM service anymore?
Put a mask ROM into an SD card... no, SD cards are too physically small. For a song album we want something bigger to put album art on. A thing the size of an original Game Boy cartridge, with a USB interface and a mask ROM?
My conclusion, for that specific goal of indefinite future storage of a song album: vinyl records. Low-tech enough that it is easy to make a player for them.
It should be mentioned that the phone board often gets warm during operation or battery charging, and temperature is stated to be an important harmful factor in a different comment.
So if you have some old files on an old device, and assume that they are still there because their records in the file system still look fine, you might be surprised.
Apparently, multi-year storage tests are still valuable for validating whether those estimates match reality. Who knew...
My guess is: regular graphite pencil on porous paper is best. Any ideas about further things I have to take into account?
Pencil definitely lasts if the paper is undisturbed. I have some paperwork that's 100+ years old and with legible pencil text. On the flip side, if the paper is handled a lot, the writing will gradually fade because graphite particles just sit on the surface and can flake off.
On some level, the medium is your main problem. Low-grade paper, especially if stored in suboptimal conditions (hot attic, moist crawlspace, etc), may start falling apart in 20 years or less. Thick, acid-free stock stored under controlled conditions can survive hundreds of years.
Acid-free paper sounds like the way to go. Do you have experience with this? Or is it common knowledge? Just curious!
I also read letters from my grandparents, stored by my parents in a simple shoe box. No special conditions, just light-free and inside the home for decades. They were still very much readable. I did not pay enough attention, but I guess it was blue ink from back in the day that they used.
I collect vintage stuff that sometimes comes with paperwork, usually after spending a decade or two stashed away in the attic.
It is easy to find either roller pens, or ink for fountain pens, that are pigment-based, lightfast and waterproof. It is very easy to test yourself that with such ink, water has no effect whatsoever on the written text (after a minute or so of drying). For example, there are many kinds of Uni-Ball roller pens with waterproof ink, but there are also many other brands with similar products. (Only fountain pens, roller pens and gel pens may have pigment-based ink; the paste ink used by ball-point pens is easily washed away by alcohol or other organic solvents.)
In my experience with graphite on paper, it is a much worse choice, because the writing will fade over the years, due to the rubbing of the paper sheets from each other.
The pigment-based black inks also use carbon, like graphite pencils, but any ink contains not only a pigment, but also a glue that binds the pigment to the paper, so it will not be rubbed out by touching it.
Do you just use regular graphite pencils, like with the HB scale or something?
There are at least 4 dangers for a written text: mechanical rubbing, fading due to light, water and organic solvents (e.g. alcohol).
There are many pigment-based inks that are specified to be lightfast and resistant to water and organic solvents, according to various archiving standards. Such inks are available for fountain pens or they are used in certain kinds of roller pens.
If you use such inks on paper that is somewhat porous, they will also be resistant to rubbing. There are certain kinds of "permanent pens", which have excellent resistance to rubbing even when you write on surfaces like plastic, glass or metal, not only on glossy paper, and which may also be lightfast and waterproof, but the text written with such permanent pens is easily washed with alcohol or other organic solvents (like also for text written with ball-point pens).
So the answer depends on your goal, but usually what you want is either a roller pen or ink for a fountain pen that are clearly specified as being pigment-based, lightfast and waterproof, together with paper on which you have checked that rubbing does not remove the written text. When using fountain pens, one must check that the archival pigment-based ink is known to be compatible with the model of fountain pen, otherwise clogging may occur. (For example, I use pigment-based ink cartridges from Sailor Japan, seiboku or souboku, with Sailor fountain pens, so compatibility is guaranteed.)
While graphite-based pencils produce writing that is lightfast and resistant to solvents, in my experience the inherent rubbing of the sheets of paper when you handle the notebook, or whatever you had used for writing, leads over the years to a fading of the text, so I do not like this method.
No no no no no!
Archival data should never be made on intrinsically valuable material; doing so makes it subject to theft or re-use for something "better".
Example: There is a reason why more marble statues survive from antiquity than bronze statues.... Bronze has an intrinsic value (theft) and future artists would also melt down existing bronze statues to make something "better".
The engraving pen on glass is a good one. Any experience with it?
https://www.diamineinks.co.uk/products/diamine-30ml-archival...
> Waterproof archival quality fountain pen ink in Blue-Black. Initially writes Blue, then oxidises to Black over time as it bonds to the paper. Traditionally used to record births, deaths & marriages.
And from another source :-
> Permanent archival blue-black ink based on an iron-gall formulation, as used by registrars and the clergy for official documents.
> Iron gall ink formulations have been used for around 1,500 years, and many of the world's most historic documents have been written using it. This ink will remain legible for hundreds of years.
> Please Note: This is an iron-gall ink, which contains particles that can clog fountain pen feeds. It's also acidic, which can damage steel nibs. Use with caution, and at your own risk. Not for use in valuable pens.
Definitely not a medium to passively store anything long term without power! Use hard drives or Blu-ray instead.
> they might not last one year.
> Definitely not a medium to passively store anything long term without power!
Do you have any evidence to back up this claim? I'm much more interested in data than fear mongering.
[1] https://www.ni.com/en/support/documentation/supplemental/12/...
I'll probably get a spinner and a flash drive and hope one of them survives the years.
I suppose your `dd` implementation itself could do so, but I don't know why it would.
I will never forget when I mixed up `if` and `of` during a routine backup.
`cat /dev/sda > /mnt/myDisk2` is so much safer, explicit, and in unix norms. It's also faster because you don't have to tune block size parameters.
Plus you can also do `pv /dev/sda > /mnt/myDisk2` to get transfer speed details.
Friends don't let friends use `dd` where `cat` can do the same job.
> Friends don't let friends use `dd` where `cat` can do the same job.
Technically yes... but I like being able to explicitly set block sizes and force sync writes.
Let's say someone made an expansion board with a cool feature: there are 5 documented I/O addresses, but accessing any other address fries the stored firmware. What would you do? No, not leaving a lot of comments in code in CAPS LOCK. No, not printing the correct hexadecimal values in red to put the message on the wall. You make a driver that only allows access to the correct addresses, and configure the rest of the system to make sure that it can only work through that driver.
Let's say there's a loading bay at the chemical plant with multiple flanges. If strong acid from the tanker is pumped into the main acid tank, everything is fine. If it is pumped into any other tank, the whole plant may explode and burn. What should be done? No, not promising that drivers will be fired, then shot by the firing squad if they make a mistake. Each connection is independently locked, and the driver only gets a single matching key.
You have wonderful programmable devices that allow you to solve non-standard problems with non-standard tools. What you should do is make a wrapper for dd that simply does not allow anything you don't want to happen. Even the most basic script with checks and a confirmation prompt is enough.
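A sketch of that wrapper idea: validate the arguments before anything touches dd, so an if=/of= swap fails loudly instead of destroying a disk. The device paths and the approved-target list here are hypothetical examples:

```python
# Sketch: only build a dd command line if the destination is on an explicit
# allow-list, and flag a likely if=/of= swap. Paths below are examples only.
import shlex

def build_dd_command(src: str, dst: str, allowed_targets: set) -> str:
    if dst not in allowed_targets:
        raise ValueError(f"refusing to write to {dst}: not an approved target")
    if src in allowed_targets:
        raise ValueError(f"{src} is an approved *target*; are if= and of= swapped?")
    return (f"dd if={shlex.quote(src)} of={shlex.quote(dst)} "
            f"bs=4M conv=fsync status=progress")
```

In real use you would run the returned command only after an interactive confirmation, which is exactly the "basic script with checks" the comment describes.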