Appropriate way to cancel saving file via file stream?


A tool I'm writing is responsible for downloading thousands of image files over the course of many hours. Originally, using TIdHTTP, I would Get each file into a TMemoryStream and then save that to a file, so long as there were no exceptions. To improve speed, I changed the TMemoryStream to a TFileStream.

However, now if the resource is not found, or any other exception occurs that leaves no actual content, an empty file still gets saved.

Completely understandable, since I simply create a file stream just prior to the download...

FileStream := TFileStream.Create(FileName, fmCreate);  // the file is created here, even if the download later fails
try
  Web.Get(AURL, FileStream);  // TIdHTTP writes the response body straight to the stream
finally
  FileStream.Free;
end;

I know I could simply delete the file if there was an exception. But that seems far too sloppy, and I'm sure there's a more appropriate method of aborting in such a situation.
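For reference, the delete-on-exception version I have in mind would look something like this (the stream has to be freed first, so the file handle is closed before DeleteFile runs):

FileStream := TFileStream.Create(FileName, fmCreate);
try
  try
    Web.Get(AURL, FileStream);
  except
    FreeAndNil(FileStream);  // close the handle so the file can be deleted
    DeleteFile(FileName);    // remove the empty or partial file
    raise;                   // re-raise so the caller still sees the failure
  end;
finally
  FileStream.Free;  // safe: Free on a nil reference does nothing
end;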

How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?


David Heffernan (accepted answer)

How should I make this to not save a file if there was an exception, while not altering the performance (if at all possible)?

This isn't possible in general. Errors and failures can happen at any step of the way, including part way through the download. Once you understand this point, you must accept that the file can be partially downloaded and then abandoned. At which point, where do you store the partial data?

The obvious choices are memory and file. You don't want to store it in memory, which leaves a file.

This takes you back to your current solution.

I know I could simply delete the file if there was an exception.

This is the correct approach. There are a few variants on it. For instance, you might download to a temporary file created with flags that arrange for its deletion when the handle is closed, and only if the download completes do you copy it to the true destination; a sketch follows below. This is the approach a browser takes. But the basic idea is to download to a file and deal with any failure by tidying up.
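A minimal sketch of that delete-on-close variant on Windows, using TIdHTTP as in the question. TempName is a hypothetical temporary path (e.g. obtained from GetTempFileName); Web, AURL and FileName are as in the question:

// uses Winapi.Windows, System.Classes, System.SysUtils, IdHTTP
procedure DownloadToFile(Web: TIdHTTP; const AURL, FileName, TempName: string);
var
  TempHandle: THandle;
  TempStream: THandleStream;
  Dest: TFileStream;
begin
  // FILE_FLAG_DELETE_ON_CLOSE makes Windows remove the file automatically
  // when the last handle to it is closed, so a failed download cleans
  // itself up without any explicit DeleteFile call.
  TempHandle := CreateFile(PChar(TempName), GENERIC_READ or GENERIC_WRITE,
    0, nil, CREATE_ALWAYS,
    FILE_ATTRIBUTE_TEMPORARY or FILE_FLAG_DELETE_ON_CLOSE, 0);
  if TempHandle = INVALID_HANDLE_VALUE then
    RaiseLastOSError;
  try
    TempStream := THandleStream.Create(TempHandle);
    try
      Web.Get(AURL, TempStream);
      // Only a completed download reaches this point; copy it to the
      // real destination before the temporary file vanishes.
      Dest := TFileStream.Create(FileName, fmCreate);
      try
        Dest.CopyFrom(TempStream, 0);  // Count = 0 copies the whole stream
      finally
        Dest.Free;
      end;
    finally
      TempStream.Free;
    end;
  finally
    CloseHandle(TempHandle);  // the temporary file disappears here
  end;
end;

A simpler variant of the same idea is to download to something like FileName + '.part' and only RenameFile it to FileName once the Get has succeeded.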

Jim McKeeth

Instead of downloading the entire image in one go, you could consider using HTTP range requests, if the server supports them. Then you could split the file into smaller parts, requesting the next part after the first finishes (or even requesting multiple parts at the same time to increase performance). If there is an exception, you can abort the future requests, so they never start in the first place; see the sketch below.
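A rough sketch of the idea with TIdHTTP, assuming the server honours the Range header (it should reply 206 Partial Content rather than 200). ChunkSize and the loop structure are illustrative, not prescriptive:

// uses System.Classes, System.SysUtils, IdHTTP
procedure DownloadInChunks(Web: TIdHTTP; const AURL: string);
const
  ChunkSize = 256 * 1024;  // arbitrary 256 KB chunks
var
  Offset: Int64;
  Chunk: TMemoryStream;
begin
  Offset := 0;
  Chunk := TMemoryStream.Create;
  try
    repeat
      Chunk.Clear;
      // Ask for just the next ChunkSize bytes of the resource.
      Web.Request.CustomHeaders.Values['Range'] :=
        Format('bytes=%d-%d', [Offset, Offset + ChunkSize - 1]);
      Web.Get(AURL, Chunk);
      // Append Chunk to the destination file (or keep it in memory) here.
      // If Get raises, the loop exits and no further requests are issued.
      Inc(Offset, Chunk.Size);
    until (Web.ResponseCode <> 206) or (Chunk.Size < ChunkSize);
  finally
    Chunk.Free;
  end;
end;

If the server ignores Range and sends the whole body with a 200 response, the first iteration simply downloads the entire file and the loop stops.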

YouTube and a number of streaming media sites started doing this a while ago. It used to be that if you started playing a video and then paused it, the player would eventually cache the entire video. Now it only caches a little ahead of the current position, which saves a ton of bandwidth given how often videos are abandoned.

You could write the partial file to disk or keep it in memory.