Making direct gremlin queries from gremlin-python

I've encountered several issues with gremlin-python that don't occur in pure Gremlin:

  • I can't directly select a given vertex type (g.V('customer')) without iterating over all vertices (g.V().hasLabel('customer'))
  • I get "Maximum Recursion reached" errors from Python. The same query in gremlin works smooth and fast
  • The ".next()" command works really slow in gremlin-python while in gremlin takes 1 sec

So, from Python/gremlin-python, I would like to be able to make a pure gremlin query to the server and directly store its result in a Python variable. Is that possible?

(I'm using gremlin-python on Apache Zeppelin if that matters)

Best answer

I can't directly select a given vertex type (g.V('customer')) without iterating over all vertices (g.V().hasLabel('customer'))

g.V('customer') in Gremlin means "find a vertex with the id 'customer'", not "find vertices with the label 'customer'". For the latter you need what you wrote: g.V().hasLabel('customer'). Those rules are the same in every variation of Gremlin, including Python. And you are correct that a query like g.V().hasLabel('customer') will be expensive, as there aren't many graphs that optimize this type of operation. On large graphs this would typically be considered an OLAP query that you would run with Gremlin Spark.
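To make the id-versus-label distinction concrete, here is a minimal toy model in plain Python. It is only a sketch of the semantics described above, not how TinkerPop or any graph database is implemented; `Vertex`, `ToyGraph`, and `has_label` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    id: str
    label: str


class ToyGraph:
    """Hypothetical in-memory stand-in for a graph, indexed by vertex id only."""

    def __init__(self, vertices):
        self._by_id = {v.id: v for v in vertices}

    def V(self, *ids):
        # With no arguments, return every vertex; with arguments, look up by id.
        if not ids:
            return list(self._by_id.values())
        return [self._by_id[i] for i in ids if i in self._by_id]

    def has_label(self, label):
        # No label index in this toy model, so every vertex is inspected.
        return [v for v in self.V() if v.label == label]


graph = ToyGraph([
    Vertex("c1", "customer"),
    Vertex("c2", "customer"),
    Vertex("o1", "order"),
])

by_id = graph.V("customer")             # id lookup: no vertex has the id "customer"
by_label = graph.has_label("customer")  # label filter: scans all vertices
```

In this model `by_id` comes back empty because "customer" is treated as an id, while `by_label` finds both customer vertices at the cost of visiting every vertex.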

I get "Maximum Recursion reached" errors from Python. The same query in gremlin works smooth and fast

That was a bug. It is resolved now, but the fix has not yet been released to PyPI. A release is currently being prepared, so you will see the fix in 3.2.10 and 3.3.4. If you need an immediate patch, you can see that the fix was fairly trivial.

The ".next()" command works really slow in gremlin-python while in gremlin takes 1 sec

I'm not sure what you're seeing exactly. I think you might want to detail more about your environment with specifics as to how to recreate the difference. Perhaps you should bring that question to the gremlin-users mailing list.

So, from Python/gremlin-python, I would like to be able to make a pure gremlin query to the server and directly store its result in a Python variable. Is that possible?

That's perfectly possible and is exactly what gremlin-python is meant to do. It enables you to write Gremlin in Python and get results back from the server to do with as you need on the client side.
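If you specifically want to send a Gremlin script as a string, the driver bundled with gremlin-python has a `Client` class that can submit it. The following is a minimal sketch, assuming a Gremlin Server that accepts script requests; the URL `ws://localhost:8182/gremlin`, the traversal source name `g`, and the helper names `count_by_label_script` and `submit_script` are assumptions for illustration, not part of the original question.

```python
def count_by_label_script(label):
    """Return a Gremlin script and its bindings (hypothetical helper).

    Passing the label as a binding keeps the script text constant
    instead of concatenating values into the query string.
    """
    return "g.V().hasLabel(lbl).count()", {"lbl": label}


def submit_script(url, script, bindings=None, traversal_source="g"):
    """Send a Gremlin script to the server and return the results as a list."""
    # Imported here so the helper above works even without the driver installed.
    from gremlin_python.driver import client

    gremlin_client = client.Client(url, traversal_source)
    try:
        # submit() returns a ResultSet; all() gives a future resolving to a list.
        return gremlin_client.submit(script, bindings).all().result()
    finally:
        gremlin_client.close()


# Example usage (requires a running Gremlin Server at the assumed URL):
# script, bindings = count_by_label_script("customer")
# result = submit_script("ws://localhost:8182/gremlin", script, bindings)
```

The returned value is an ordinary Python list, so it can be stored in a variable and processed on the client side like any other data.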