Sharing files between microservices


I'm trying to move a project from its current monolithic state to microservices architecture. The project is in Node.js, so I've started looking into Seneca.js, especially with its seneca-mesh module. Moving image manipulation (crop, resize, etc.) into a microservice seemed the most sensible first step, since it drastically slows down my application now.

When the application is monolithic, there is no problem passing files into the file-manipulation logic: just read them from the local disk. With microservices, however, if we keep scalability in mind, it becomes more difficult. Of course, I could build an image manipulation microservice, scale it up within the same host machine, and share the directories it needs, so its instances, too, can read from the local disk.

What if I want a truly scalable microservice that can be run and scaled on different machines with different IP addresses that don't share the same filesystem? I thought that maybe I could take advantage of Node's streaming API and send these files back and forth via HTTP, TCP, sockets, or you name it.

As far as I've learned, Seneca.js cannot do this cleanly out of the box. Of course, I could send a file from the main app to the image manipulation service via Seneca.js like so:

fs.createReadStream('/files/hello.jpg')
  .on('data', function(data) {
    // Forward each chunk to the file service
    seneca.act({ role: 'file', cmd: 'chunk', data: data }, cb);
  })
  .on('end', function() {
    seneca.act({ role: 'file', cmd: 'end' });
  })
  .on('error', function(err) {
    seneca.act({ role: 'file', cmd: 'error', error: err });
  });

And receive it in chunks:

seneca.add({ role: 'file', cmd: 'chunk' }, writeToFileCb);
seneca.add({ role: 'file', cmd: 'end' }, endFileWriteCb);

But this approach seems ugly and like reinventing the wheel.

Another way would be to stand up a separate HTTP server and send files as multipart/form-data or application/octet-stream, like so:

fs.createReadStream('/files/hello.jpg')
  .pipe(request.post('http://image-manipulator'))

But this means building a microservice communication layer myself, alongside Seneca. All in all, I'm asking for advice on sharing files between distributed microservices, and for frameworks that support this.


There are 2 answers below.


If you are approaching a microservice architecture, you should think of a microservice for managing files! Don't stream files around in a microservice environment. For example, you might create a FileManagerService that exposes a CRUD API for the files themselves, and use Seneca act/add only to pass the important metadata: file URL, size, etc.
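To illustrate what such a FileManagerService might track, here is a hedged sketch: the record shape and function names are my own assumptions, not part of the answer, and an in-memory Map stands in for a real database. The point is that messages between services carry only small metadata records, never binary data:

```javascript
// Minimal in-memory stand-in for a FileManagerService's metadata store.
// A real service would back this with a database and expose it as an API.
const files = new Map();
let nextId = 1;

// Create: register a file's metadata (URL, size, MIME type) and return it.
function createFile(meta) {
  const id = String(nextId++);
  const record = { id: id, url: meta.url, size: meta.size, type: meta.type };
  files.set(id, record);
  return record;
}

// Read: look up metadata by id.
function readFile(id) {
  return files.get(id) || null;
}

// Delete: remove the metadata record.
function deleteFile(id) {
  return files.delete(id);
}
```

A Seneca action like `seneca.act({ role: 'file', cmd: 'get', id: id }, cb)` would then return one of these small records instead of image bytes.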


If you use Seneca, I strongly recommend reading The Tao of Microservices by Richard Rodger, the author of Seneca himself.

He addresses your question directly (Chapter 3, Section 15):

Bandwidth matters.

The networked nature of microservices systems means that they are very vulnerable to bandwidth limitations. Even if you start out with a plentiful supply, you must adopt a mentality of scarcity. Misbehaving microservices can easily cause an internally generated denial-of-service attack. Keep your messages small and lean. Do not use them to send much actual data, send references to bulk data storage instead. [...]

To send images between services, do not send the image binary data, send a URL pointing to the image.

Back to your specific case: use a service that allows you to store and retrieve files, and pass only the URLs of the files in the messages between your Seneca services. Building such a system yourself in a truly distributed fashion is not trivial, so I would rather use AWS S3 or an equivalent.