Java Design: dealing with data objects used by many classes

288 Views Asked by At

First, a brief description of the library that brought this up:

I have a library that listens continuously on a provided serial port, reading in blocks of bytes and passing them along to be processed in some meaningful way (details aren't important to the question). To make the library a bit more reusable, processing these bytes was abstracted out to an interface (FrameProcessor). A few default implementations exist in the library itself to handle processing that's always going to occur regardless of the application using it. However, there's support for adding in custom processors to do things the application specifically cares about.

Aside from the bytes being passed to these processors, there's a data object (ReceiverData) that contains information that most (but not guaranteed to be all) of the processors might find interesting. It's maintained entirely by the library itself (i.e. it isn't the responsibility of the application to setup/maintain any instances of ReceiverData. They shouldn't have to care how the data is made available, just that it is available).

Right now, ReceiverData is being passed as a parameter to each processor:

public interface FrameProcessor {

    public boolean process(byte[] frame, ReceiverData receiverData);
}

However, I really don't like this approach, as it's requiring data to be passed to something that might not necessarily care about it. Also, for the processors that do care about ReceiverData, they've got to pass the object reference around in whatever other method calls they make (provided those method calls need to access that data).

I've considered changing the FrameProcessor to an abstract class and then defining a setter for a protected ReceiverData member. But that seems kind of gross as well - having to iterate over the list of all FrameProcessors and setting the ReceiverData instance.

I've also thought about some kind of static threaded context object(necessarily threaded since the library supports listening on multiple ports at once). Essentially, you'd have something like the following:

public class ThreadedContext {

    private static Map<Long, ReceiverData> receiverData;

    static {
        receiverData = new HashMap<Long, ReceiverData>();
    }

    public static ReceiverData get() {
        return receiverData.get(Thread.currentThread().getId());
    }

    public static void put(ReceiverData data) {
        receiverData.put(Thread.currentThread().getId(), data);
    }
}

That way, when each thread in the library started, it could just add a reference to its ReceiverData to ThreadedContext, which would then be available as needed to the processors without needing to pass it around.

This is certainly a pedantic question, as I've already got a solution that works fine. It just bothered me. Thoughts? Better approaches?

3

There are 3 best solutions below

3
On BEST ANSWER

I like your current approach the best. It's inherently thread-safe (because stateless). It allows for the same processor to be used by multiple threads. It's easy to understand and use. It's very similar to, for example, the way a servlet works: the request and the response objects are passed to the servlet, even if they don't care about them. And it's also very easy to unit test, because you don't have to setup a thread-local context to be able to test a processor. You just pass a ReceiverData (real or fake), and that's it.

You could, instead of passing a byte array and a ReceiverData, mix both in a single argument.

1
On

Encapsulate the byte[] and ReceiverData into a new class and pass that to the frame processors. Not only does it mean they can pass the same, single object around to their own methods, but it allows for future expansion if necessary.

public class Frame {
    private byte[] rawBytes;
    private ReceiverData receiverData;

    public ReceiverData getReceiverData() { return receiverData; }
    public byte[] getRawBytes() { return frame; }
}

public interface FrameProcessor {
    public boolean process(Frame frame);
}

While this may seem like overkill and require the processors to make unnecessary method calls, you may find that you don't want to provide access to the raw byte array. Perhaps you want to use a ByteChannel instead and provide read-only access. It depends on your library and how it's intended to be used, but you may find you can provide a nicer API inside Frame than a simple byte array.

0
On

As the OP states, the problem with process(byte[] frame, ReceiverData data) is that ReceiverData may or may not be used by the implementation. Thus, it is "wrong" that process() has a dependency on ReceiverData. Instead, a FrameProcessor implementation should use a Provider that can provide the instance of ReceiverData for the current frame on-demand.

The example below illustrates this. I've used dependency injection for clarity, but you can just as well pass those objects in the constructors. The FrameContext would use ThreadLocal<T>s, much like suggested in the OP. See this link for implementation hints. A DIY Provider<T> implementation would likely depend on FrameContext directly.

If you want to go this route, consider using a DI-framework such as Google Guice or CDI. Guice is probably easier when working with custom scopes.

public class MyProcessor implements FrameProcessor {

    @Inject
    private Provider<ReceiverData> dataProvider;

    public boolean process(byte[] frame) {
        ...
        ReceiverData data = dataProvider.get();
        ...
    }
}

public class Main {

    @Inject
    private FrameContext context;

    public void receiveFrame(byte[] frame, ... ) {

        context.begin();
        ...
        context.setReceiverData(...); // receiver data is thread-local
        ...

        for (FrameProcessor processor : processors)
            processor.process(frame);

        context.end();
    }
}

This approach is very extensible; future needed objects can be added to the context/scope object and corresponding providers injected into processors:

public class MyProcessor ... {

    @Inject private Provider<FrameMetaData>;
    @Inject private Provider<FrameSource>;
    ...
}

As you can see from this example, this approach will also allow you to avoid future situations where you would add "sub-objects" to ReceiverData, leading to a kitchen-sink object situation (for example, ReceiverData.metaData, ReceiverData.frameSource, ... ).

Note: Ideally, you'd have processing objects with a lifetime equal to a single frame. Then you could just declare (and inject!) the dependencies for processing a single frame in the constructor and create a new processor for each frame. But I assume you are processing a lot of frames and therefore want to stick with the current approach for performance reasons.