I am looking at the LDBC benchmark which has contributions from Neo4j and TigerGraph. I want to understand how entries are ingested to measure performance.
Here are two example entries from "Person_likes_Post".
{"creationDate":1296583977045,"deletionDate":1577664000000,"explicitlyDeleted":false,"PersonId":13194139533355,"PostId":412316861128}
{"creationDate":1296750065049,"deletionDate":1296750075058,"explicitlyDeleted":true,"PersonId":13194139533355,"PostId":412316861129}
Does "explicitlyDeleted":true mean that only the edge is deleted?
When "explicitlyDeleted":false, does it mean the src node is deleted, dst node is deleted or both?
Link to the benchmark doc:
https://ldbcouncil.org/ldbc_snb_docs/ldbc-snb-specification.pdf
Download link to the example LDBC dataset containing these entries:
https://ldbcouncil.org/ldbc_snb_datagen_spark/social-network-sf0.003-bi-composite-merged-fk.zip
(I wanted to tag LDBC but there is no such option.)
The explicitlyDeleted attribute indicates whether there is a delete operation that specifically targets the given entity (i.e. a node or edge in the graph). This distinction is needed because the LDBC SNB workloads have cascading deletes, where the deletion of an entity may trigger the deletion of other entities.

For example, a Person_likes_Post edge can be deleted due to various explicit delete operations, targeting:

1. the Person_likes_Post edge itself
2. the Person
3. the Post
4. the Forum that contains its target Post
5. the Person whose Album/Wall (which are Forum subtypes) contains its target Post

For the Person_likes_Post edge, the explicitlyDeleted attribute is true in case 1 and false in the other cases.

Note that this attribute is only part of the raw data set. The data sets used for the actual workload executions (Interactive, BI) only contain explicit delete operations, hence they omit this attribute.
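As a rough illustration of how the flag can be read, here is a minimal Python sketch that splits the two raw entries from the question by explicitlyDeleted. The function name split_by_explicit_delete is hypothetical and not part of any LDBC tooling, and the sketch assumes the timestamps are epoch milliseconds, as the values suggest. Note that a false flag only says that no delete operation targeted the edge itself; it does not reveal which of the other cases, if any, applied.

```python
import json

# The two raw Person_likes_Post entries quoted in the question.
RAW_LINES = [
    '{"creationDate":1296583977045,"deletionDate":1577664000000,'
    '"explicitlyDeleted":false,"PersonId":13194139533355,"PostId":412316861128}',
    '{"creationDate":1296750065049,"deletionDate":1296750075058,'
    '"explicitlyDeleted":true,"PersonId":13194139533355,"PostId":412316861129}',
]


def split_by_explicit_delete(lines):
    """Hypothetical helper: separate entries targeted by their own delete
    operation from entries whose flag is false."""
    explicit, not_explicit = [], []
    for line in lines:
        entry = json.loads(line)
        if entry["explicitlyDeleted"]:
            explicit.append(entry)
        else:
            not_explicit.append(entry)
    return explicit, not_explicit


explicit, not_explicit = split_by_explicit_delete(RAW_LINES)

for entry in explicit:
    # Assumption: both dates are epoch milliseconds.
    lifetime_ms = entry["deletionDate"] - entry["creationDate"]
    print(f"edge to Post {entry['PostId']}: explicit delete after {lifetime_ms} ms")

for entry in not_explicit:
    print(f"edge to Post {entry['PostId']}: no delete operation targets this edge")
```

Only the entries in the first group would correspond to a delete operation on the edge itself; the rest would be removed, if at all, as a side effect of deleting something else.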