I am trying to move some log files from an external web server to an Amazon S3 bucket. This should happen every 7 days without anyone triggering it manually. I'd also like it to be "failsafe", so it would probably be best if the copy operation ran in the Amazon cloud. I have read a bit about AWS Data Pipeline, but I couldn't find anything on how to get it to work with an external data source (that is, one not hosted by Amazon), let alone on downloading a file from a web server and then processing it. Does anybody have experience with a similar problem, or any advice on where to start?
Thank you!
I don't believe any of the existing Data Pipeline components will do what you want out of the box, but you can always run a script as part of a data pipeline. I've used Data Pipeline that way to run a script that grabs files from an external FTP server and loads them into an S3 bucket every hour.
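To give you an idea of what such a script could look like for your web server case, here is a minimal sketch in Python. It is not the script I actually run. The URL, bucket name, and key prefix are placeholders I made up, `build_key` and `copy_url_to_s3` are hypothetical helper names, and I'm assuming boto3 is installed and that AWS credentials are available on the machine the pipeline launches.

```python
import datetime
import shutil
import tempfile
import urllib.request


def build_key(prefix, filename, day):
    """Return a date-stamped S3 key, e.g. '<prefix>/<YYYY-MM-DD>/<filename>'."""
    return f"{prefix.rstrip('/')}/{day.isoformat()}/{filename}"


def copy_url_to_s3(url, bucket, key, s3_client):
    """Download `url` into a temporary file, then upload it to the bucket.

    `s3_client` is assumed to be a boto3 S3 client (or anything else that
    offers a compatible `upload_fileobj` method).
    """
    with tempfile.TemporaryFile() as tmp:
        # Stream the download to disk so large logs are not held in memory.
        with urllib.request.urlopen(url, timeout=60) as response:
            shutil.copyfileobj(response, tmp)
        tmp.seek(0)  # rewind before handing the file to the uploader
        s3_client.upload_fileobj(tmp, bucket, key)


def main():
    # Third-party dependency, imported here so the helpers above can be
    # exercised without boto3 installed.
    import boto3

    # Placeholder values (assumptions): replace with your own.
    url = "https://example.com/logs/access.log"
    bucket = "my-log-archive"
    key = build_key("weblogs", "access.log", datetime.date.today())
    copy_url_to_s3(url, bucket, key, boto3.client("s3"))
```

The script the pipeline runs would then just call `main()`, and the 7-day interval would come from the pipeline's schedule rather than from the script. Stamping the date into the key keeps each weekly copy from overwriting the previous one.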