Ignite PeerClassLoading for Transitive Dependencies


I am using an Ignite compute task to retrieve data from BigQuery. Here is the configuration on both the thick client and the server side:

<property name="peerClassLoadingEnabled" value="true"/>
<property name="deploymentMode" value="CONTINUOUS"/>

The compute task looks roughly as follows:

public class BigQueryStorageReadTask implements IgniteClosure<LoadingRequest, Long> {
  @IgniteInstanceResource
  private Ignite ignite;

  @Override
  public Long apply(LoadingRequest query) {
    return loadToCache(query);
  }

  private void readFromBigQuery(LoadingRequest query) {

    try (BigQueryReadClient client = BigQueryReadClient.create()) {
      // Read data from BigQuery using storage read api
      .....
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }
}

The dependency issue occurs when I read data from BigQuery using the Storage Read API, which needs these dependencies:

implementation platform('com.google.cloud:libraries-bom:26.22.0')
implementation 'com.google.cloud:google-cloud-bigquery:2.23.2'

If I use the simple SQL API to read data from BigQuery, which only needs 'com.google.cloud:google-cloud-bigquery:2.23.2', there is no issue and the compute task works fine. But with the BOM dependency

com.google.cloud:libraries-bom

the remote compute task fails with various errors due to missing dependencies. Since my Ignite server nodes are started from an XML configuration rather than a Spring application, I tried adding the library jars manually to user_libs, but some low-level dependencies are still missing.

Is there a way to handle dependency management for remote compute tasks? I suspect that peerClassLoading is not working properly.


Updates:

After some further investigation, I found that the issue is not related to whether the BOM is included, since the BOM only pins versions for the other imported libraries.

I guess the root cause is whether Ignite peerClassLoading can fetch all dependencies, including transitive ones.

I tested locally by turning off peerClassLoading and manually providing all jars as follows:

docker run -v /local_path/to/dir_with_libs/:/opt/ignite/apache-ignite/libs/user_libs apacheignite/ignite

which works well.

So I think I should ask: how do I properly configure peerClassLoading so that it is aware of the full set of dependencies?


Answer by Stanislav Lukyanov:

AFAIU you're having problems when you specify only the BOM dependency.

This looks like an issue with your Gradle usage. The statement platform('com.google.cloud:libraries-bom:26.22.0') doesn't itself declare any dependencies. BOM files are used to specify sets of dependencies of specific versions that are supposed to work together, and then you don't need to specify versions for these included dependencies. E.g.

implementation platform('com.google.cloud:libraries-bom:26.22.0')
implementation 'com.google.cloud:google-cloud-bigquery'
implementation 'com.google.cloud:google-cloud-storage'

If you only declare dependency on the BOM, or only copy the BOM file, you won't get the actual code you need.


Peer class loading is a viable option to load the libraries if you want to run the tasks from an app. In that case, your app is supposed to have the code of the task + all of the dependencies. The server isn't supposed to have ANY of these dependencies. Then, when the server tries to execute the task, it'll ask the client to share all of the required classes.
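If it's unclear which classes a given JVM can actually see locally (for example, on a server node with jars copied into user_libs), a small probe along these lines might help narrow down what's missing. This is only a sketch using plain JDK calls, not an Ignite feature; `ClassOriginCheck` and `originOf` are hypothetical names, and the class names in `main` are just illustrative picks from the BigQuery Storage, gRPC and protobuf libraries that you'd replace with whatever your stack traces complain about. It also assumes the thread context class loader is the one of interest, which may not hold inside a peer-deployed task.

```java
import java.security.CodeSource;

public class ClassOriginCheck {

    /** Returns where a class was loaded from, or a "MISSING: ..." marker if it cannot be loaded. */
    static String originOf(String className) {
        // Assumption: the context class loader is the relevant one for this check.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "JDK / bootstrap class loader";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "MISSING: " + e;
        }
    }

    public static void main(String[] args) {
        // Illustrative class names only; substitute the ones from your own errors.
        String[] probes = {
            "com.google.cloud.bigquery.storage.v1.BigQueryReadClient",
            "io.grpc.ManagedChannel",
            "com.google.protobuf.Message"
        };
        for (String name : probes) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

Run with the same classpath as the node you're checking: a jar path means the class is available locally, while a MISSING line points at a dependency (possibly a transitive one) that still has to be supplied.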


Another good option for deploying Maven dependencies is the Code Deployment feature of the GridGain Control Center. You can connect your cluster to the Control Center and use its UI to manage the deployed code: you specify the Maven artifacts or upload the files you want to deploy, and the system takes care of the dependencies, versioning, etc. Again, specifying just the BOM won't work; you need the actual libraries your application uses (e.g. com.google.cloud:google-cloud-bigquery:2.23.2).