tlb · 2022-09-24 · Original thread
These days, most people need the opposite. They're probably using systems optimized for spinning disks but running them on flash, and all the layers of complexity added over the years to hide disk latency are just slowing things down.

Like, this is the kind of bullshit people used to do research on: https://tlb.org/docs/usenixw95.pdf (I'm an author). This paper (and 100s of others) exist only because sequential reads & writes on disks were much faster than random access, because you had to (a) move the head, and (b) wait for the sector you're interested in to come around. At 7200 RPM that wait averages about 4.2 milliseconds (up to 8.3 ms for a full rotation), so it was worth burning tens of thousands of CPU instructions to sort read requests into some order that might avoid waiting for another rotation. Many popular disk controllers couldn't read back-to-back sectors, so it was better to interleave requests so they hit every second sector or something. Madness.
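To make the arithmetic concrete, here's a minimal Python sketch of a rotational-position sort. It's not from the paper; the sectors-per-track geometry and the greedy ordering are assumptions, just to show why schedulers cared about where the head was in the rotation.

```python
RPM = 7200
SECTORS_PER_TRACK = 64                       # hypothetical geometry

ROTATION_MS = 60_000 / RPM                   # ~8.33 ms per full rotation
AVG_ROTATIONAL_LATENCY_MS = ROTATION_MS / 2  # ~4.17 ms on average

def rotational_wait_ms(current_sector: int, target_sector: int) -> float:
    """Time spent waiting for target_sector to rotate under the head."""
    sectors_ahead = (target_sector - current_sector) % SECTORS_PER_TRACK
    return sectors_ahead / SECTORS_PER_TRACK * ROTATION_MS

def order_by_rotation(current_sector: int, pending: list[int]) -> list[int]:
    """Greedy sort: service whichever pending request passes under the head
    soonest. Real schedulers also weighed seek distance across tracks."""
    return sorted(pending, key=lambda s: rotational_wait_ms(current_sector, s))

if __name__ == "__main__":
    print(f"rotation: {ROTATION_MS:.2f} ms, avg wait: {AVG_ROTATIONAL_LATENCY_MS:.2f} ms")
    print(order_by_rotation(current_sector=10, pending=[3, 12, 40, 9]))  # -> [12, 40, 3, 9]
```

On flash there is no head and no rotation, so all of this bookkeeping buys nothing.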

Anyway, today there are only 3 kinds of secondary storage worth caring about: flash (sector granularity, but no seeks), cloud (like S3), and archive (like Glacier).

But if you're working in some old-timey environment and really need to know about optimizing for spinning disks, most of the ideas were published in the late 80s and 90s. This book (McKusick et al.) https://www.amazon.co.uk/Design-Implementation-Operating-Add... is excellent and describes a filesystem that works well on a wide variety of workloads. Or see the references in the Blackwell paper above.
