OOM error when running Gremlin queries asynchronously with Java


We have created a REST API that executes a Gremlin query on JanusGraph and returns the result in JSON format. The API works fine for small result sets, but for large result sets, when we hit the API asynchronously, it gives the following error (max heap size -Xmx4g):

java.lang.OutOfMemoryError: GC overhead limit exceeded

I am using curl with & to hit the API asynchronously:

curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &
curl --location --request GET 'http://HOST:PORT/graph/search?gremlin=query' &

Code to connect to JanusGraph:

Cluster cluster = Cluster.open(config);
Client connect = cluster.connect();

// submit the Gremlin query and iterate over the streamed results
ResultSet submit = connect.submit(gremlin);
Iterator<Result> resultIterator = submit.iterator();
int count = 0;
while (resultIterator.hasNext()) {
    resultIterator.next();   // consume the result
    count++;
    // add to list, commented out to check OOM error
}

Configuration:

config.setProperty("connectionPool.maxContentLength", "50000000");
config.setProperty("connectionPool.maxInProcessPerConnection", "30");
config.setProperty("connectionPool.maxInProcessPerConnection", "30");
config.setProperty("connectionPool.maxSize", "30");
config.setProperty("connectionPool.minSize", "1");
config.setProperty("connectionPool.resultIterationBatchSize", "200");

Gremlin driver:

org.apache.tinkerpop.gremlin-driver:3.4.6

How can I handle a large result set like a cursor, so that not all the data is loaded into memory?
Is there any configuration that I am missing? Any help is highly appreciated.

Gremlin query:

g.withSack(0).V().hasLabel(%27material%27).has(%27dim_batchid%27,within(5028245,5080395,5366265,5159380,4872924,5093856,5216023,5068771,5093820,5154387,4703406,4872835,5214752,4893085,4866319,4556751,5342365,5075448,5074467,4835525,4987972,5347712,4986643,5204689,4755232,5076490,5028246,4922387,4659627,4597456,4743346,5080956,5370167,5260125,5134845,4613324,4720631,4937766,5356972,5148510,5210986,4930135,4984021,4720172,5028031,4836893,5068621,5333830,5020806,5081693,4988567,4869467,4709219,4958246,5021639,4607913,4923487,4614485,5066054,4869093,5339365,5204715,4980349,5215913,5342616,4959705,4959549,4929369,5022805,4920163,5204563,5027627,5208788,4712451,4862298,5019103,4982159,4727160,5395618,4924536,5390450,4943986,5071744,5208844,4898192,5347546,5204875,4710474,4794222,4962808,5269053,4836267,4602886,5359126,5393203,4780380,5148475,5092749,5351705,5339311,4601782,4869039,5366475,4959070,4963475,5346888,4923494,5279816,5297980,5154181,5030501,5142954,5392329,4839306,4890656,5134911,4893104,4989444,5069672,4961009,5027559,5029007,5285813,4820025,5287707,4959634,5148474,5362926,5362211,4557278,5353486,4933573,4785560,4890658,4930937,4553089,5030503,5341503,4783801,5068529,4821152,5208845,4766406,5043752,4770709,4733416,5204713,4815450,4981053,4963427,4980830,5340154,4771353,5204561,4920161,4794149,5275867,5021788,5364102,5205411,5356459,4794233,4923438,4610509,5392350,4746342,5022804,4936411,5361555,4890888,4980829,4959869,4869092,4891157,4815449,5267434,4836975,4684010,5281322,5071746,4711290,5289333,5021638,5299283,5210803,5348731,5068491,4776862,5196532,4766677,4930133,5210984,4608878,5261295,4826630,4786051,4779996,4930134,5020804,4766678,4869064,5286802,4545299,4693065,4930844,4816538,4888415,4711706,4923002,4780402,5044968,5148437,4753993,5074466,4890805,5074558,5076491,4547035,5092021,5262308,5205445,5213382,5159381,5263280,5351407,4890706,4659738,5344469,5075928,4613336,5065866,4863764,5217111,4792255,5210914,5204691,4890806,5148438,4986897,4817686,4712337,5196528,5280266,4929327,5134843,5393007,5019151,4923482,4763007,4929395)).emit().repeat(sack(sum).by(constant(1)).inE().outV()).project(%27level%27,%27properties%27).by(sack()).by(tree().by(valueMap().by(fold().unfold())).by(valueMap().by(fold())))

From profiling, it is clear that the Gremlin driver is causing the issue, but I am not sure how to fix it and release the memory.

Also, the threads go into a frozen state for more than 5 minutes.



There is 1 best solution below.

BEST ANSWER

I think it is possible that you are running into TINKERPOP-2424. Basically, the queue that holds the incoming results fills faster than you can consume them, and you blow the heap. There is a patch in that issue which seems to solve the problem, but I'm not convinced it's the best solution just yet, so it hasn't been implemented. If you have suggestions for how to resolve the problem, please feel free to comment on the ticket. If that is not the issue you are facing, I think you'd have to provide a way to replicate your problem or do some profiling to isolate it further. Perhaps it would be good to do some profiling anyway, as that should let you prove whether TINKERPOP-2424 is your problem. If you have a look at the mailing list link in that issue, you should see the approach taken to verify the problem.
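
One possible workaround, sketched below purely as an assumption (not the TinkerPop fix, and you'd have to verify whether paging even fits your sack()/tree() traversal), is to pull results in bounded slices with range() so the driver never has to buffer the whole result set at once. The client variable, the page size and the simplified traversal are placeholders, and exception handling around all().get() is omitted:

// Sketch: page the traversal on the server so each request returns a bounded slice.
int pageSize = 10000;                     // placeholder page size
long offset = 0;
boolean more = true;
while (more) {
    String paged = "g.V().hasLabel('material').range(" + offset + ", " + (offset + pageSize) + ").valueMap()";
    ResultSet page = client.submit(paged);
    List<Result> results = page.all().get();   // blocks until this page is fully read
    more = results.size() == pageSize;          // stop once a short page comes back
    offset += pageSize;
    // process 'results' here, then let the list go out of scope so it can be GC'd
}

This trades one large response for several smaller round trips, which keeps the driver's incoming result queue bounded at the cost of extra requests.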