Is it possible to verify data of multiple sequential writes just by checking the last n bytes written?


Just to be clear, my questions are language/OS agnostic (independent).

I am working on a program (it supports many OSs and is currently written in Go) that receives a stream of data chunks and writes them sequentially to a file, starting at a pre-specified position (pos >= 0). Only one process with one thread accesses the file. I use the regular write function, which internally uses the write system call (which one depends on the OS it runs on), not buffered I/O.

Suppose that while my program is writing, the system suddenly crashes (the most severe kind of crash: a power failure).

When the system is turned back on, I need to verify how many chunks were completely written to the HDD. (*)

The HDD that my program writes to is an ordinary present-day desktop or laptop HDD (not a fancy battery-backed one found in some high-end servers).

Assume that bit corruption during transfer to and reading from the HDD is highly unlikely and can be neglected.

My questions are:

  1. Do I need to checksum all of the written chunks to verify (*)?
  2. Or do I just need to check and confirm that the nth chunk is correct, and assume that all the chunks before it (0 -> n-1) are correct too?

    2.1 If 2. is enough, does that mean the order of sequential writes is guaranteed to be preserved by the FS of any OS (random writes can still be reordered, though)?

  3. Does my way of doing recovery rely on the same principle as the append-only log files seen in many crash-proof databases?

There is 1 answer below


I suspect your best bet is to study up on Cyclic Redundancy Checks (CRCs).
As I understand it, a CRC stored alongside the data would allow you to verify that what was intended to be written actually was written.

I also suggest that worrying about the cause of any errors (transmission errors vs. errors for any other reason, such as a power failure) is not very worthwhile.

Hope this helps.