Project by, with optional properties


I believe this question applies to TinkerPop in general rather than to the Cosmos DB implementation specifically, although some Cosmos DB semantics might be baked into my query examples.

I've developed a data layer that creates queries based on some metadata information. Currently, my data layer only persists non-null data values to the graph vertex, and this is causing trouble with my retrieval mechanism.

Consider the following data model, where the field "HomeRoute" may or may not exist on the actual vertex, depending on whether it was populated.

{
  "ApplicationModule": string,
  "Title": string,
  "HomeRoute": string?
}

My initial query structure is as follows, which does not support the optional properties (discussed later).

g.V()
.has('ApplicationsTest', 'partitionId', '')
.project('ApplicationModule','Title','HomeRoute')
.by('ApplicationModule')
.by('Title')
.by('HomeRoute');

To simulate, we can insert a vertex:

g.addV('ApplicationsTest')
.property('partitionId', '')
.property('ApplicationModule', 'TestApp')
.property('Title', 'Test App')
.property('HomeRoute', 'testapphome');

And we can successfully query it using my base query noted above, which returns it in my desired JSON format.

[
  {
    "ApplicationModule": "TestApp",
    "Title": "Test App",
    "HomeRoute": "testapphome"
  }
]

If we now insert a vertex without the HomeRoute property (since it was null within the application layer), my base query will fail.

g.addV('ApplicationsTest')
.property('partitionId', '')
.property('ApplicationModule', 'TestApp')
.property('Title', 'Test App');

Executing my base query now results in an error:

Gremlin Query Execution Error: Project By: Next: The provided traverser of key "HomeRoute" maps to nothing.

I can apply a coalesce operation against "optional" fields; my current understanding has allowed me to return a constant value in the case of undefined properties. Updating my base query as follows will return "!dbnull" when a property does not exist on the vertex:

g.V()
.has('ApplicationsTest', 'partitionId', '')
.project('ApplicationModule','Title','HomeRoute')
.by('ApplicationModule')
.by('Title')
.by(values('HomeRoute')
    .fold()
    .coalesce(unfold(), constant('!dbnull')));

When executed, this query returns the values as expected, again in JSON format.

[
  {
    "ApplicationModule": "TestApp",
    "Title": "Test App",
    "HomeRoute": "testapphome"
  },
  {
    "ApplicationModule": "TestApp",
    "Title": "Test App",
    "HomeRoute": "!dbnull"
  }
]
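
To make the fold/coalesce behaviour concrete, here is a minimal Python sketch that imitates it on plain dictionaries instead of a real graph. The function name `project_with_default` and the sample data are illustrative assumptions; nothing here calls a Gremlin driver.

```python
# Plain-Python imitation of
#   project(...).by(values(k).fold().coalesce(unfold(), constant(default)))
# Vertices are modeled as dicts; this does not talk to a graph database.
SENTINEL = "!dbnull"

def project_with_default(vertex, keys, default=SENTINEL):
    """Return one entry per requested key, substituting `default` for absent keys."""
    return {key: vertex.get(key, default) for key in keys}

vertices = [
    {"ApplicationModule": "TestApp", "Title": "Test App", "HomeRoute": "testapphome"},
    {"ApplicationModule": "TestApp", "Title": "Test App"},  # HomeRoute was never written
]
keys = ["ApplicationModule", "Title", "HomeRoute"]

projected = [project_with_default(v, keys) for v in vertices]
print(projected)
```

Every projected map always has all three keys, which is why the placeholder value shows up for the second vertex.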

My question (I am still new to Gremlin / TinkerPop queries): is there any way to get this result with only the properties that are present on the respective vertices?

My desired output from this example is below, which would allow my data layer to only unbundle the values present on the graph vertex and not have to consider string "!dbnull" values.

[
  {
    "ApplicationModule": "TestApp",
    "Title": "Test App",
    "HomeRoute": "testapphome"
  },
  {
    "ApplicationModule": "TestApp",
    "Title": "Test App"
  }
]

There are 2 best solutions below


I've found a way to achieve what I'm looking for. I would still love input from the community, though, if there are optimizations or other considerations.

g.V()
.has('ApplicationsTest', 'partitionId', '')
.project('ApplicationModule','Title','HomeRoute')
.by('ApplicationModule')
.by('Title')
.by(values('HomeRoute')
    .fold()
    .coalesce(unfold(), constant('!dbnull')))
.local(unfold()
    .where(select(values).is(without('!dbnull')))
    .group().by(select(keys)).by(select(values)))
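
The extra local(...) step amounts to dropping every map entry whose value is the placeholder. A rough Python equivalent on plain dictionaries, assuming the rows have the shape shown in the question (`drop_sentinels` is a hypothetical helper name), looks like this:

```python
# Imitates the post-filter: unfold each projected map into entries,
# keep entries whose value is not the placeholder, regroup into a map.
SENTINEL = "!dbnull"

def drop_sentinels(row, sentinel=SENTINEL):
    """Return a copy of `row` without the entries equal to `sentinel`."""
    return {key: value for key, value in row.items() if value != sentinel}

rows = [
    {"ApplicationModule": "TestApp", "Title": "Test App", "HomeRoute": "testapphome"},
    {"ApplicationModule": "TestApp", "Title": "Test App", "HomeRoute": "!dbnull"},
]

cleaned = [drop_sentinels(row) for row in rows]
print(cleaned)
```

One consideration with this approach: a property whose real stored value happens to equal the placeholder string would be dropped as well, so the placeholder has to be something that can never occur as data.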

If you only need specific keys that already exist on the vertex, you can use valueMap; there is no need to use project:

g.V()
.has('ApplicationsTest', 'partitionId', '')
.valueMap("ApplicationModule", "Title", "HomeRoute").by(unfold())

example: https://gremlify.com/9fua9jsu0dh
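
As a rough illustration of why this works, the sketch below imitates the two pieces in plain Python: valueMap keeps only the requested keys that actually exist (with each value as a list), and by(unfold()) replaces each list with its first element. The helper names `value_map` and `unfold_first` and the dict-of-lists vertex model are assumptions made for the example, not driver APIs.

```python
# Vertices modeled as dicts mapping property key -> list of values,
# mirroring how valueMap reports properties before by(unfold()) is applied.
def value_map(properties, *keys):
    """Keep only the requested keys that exist; values stay as lists."""
    return {key: properties[key] for key in keys if key in properties}

def unfold_first(value_lists):
    """Replace each value list with its first element, like by(unfold())."""
    return {key: values[0] for key, values in value_lists.items()}

vertices = [
    {"partitionId": [""], "ApplicationModule": ["TestApp"],
     "Title": ["Test App"], "HomeRoute": ["testapphome"]},
    {"partitionId": [""], "ApplicationModule": ["TestApp"],
     "Title": ["Test App"]},  # no HomeRoute property on this vertex
]
wanted = ("ApplicationModule", "Title", "HomeRoute")

result = [unfold_first(value_map(v, *wanted)) for v in vertices]
print(result)
```

The second map simply has no "HomeRoute" entry, so no placeholder value and no post-filter are needed.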