Mongo - Find all duplicate docs based on criteria

31 Views Asked by Sandeep Nair At 16 June 2023 at 08:37

I have the following documents:

{
  "url": "/some/listing/url",
  "title": "HOTELS WITH RIVER VIEW IN SALZBURG",
  "pageType": "LISTING_PAGE",
  "pageMetaData": {
    "$placeId": 123,
    "hotels" : "1,2,3",
    "priority": 1
  }
},
{
  "url": "/some/listing/url",
  "title": "HOTELS FOR FAMILY IN SALZBURG",
  "pageType": "LISTING_PAGE",
  "pageMetaData": {
    "$placeId": 123,
    "hotels" : "1,2",
    "priority": 2
  }
},
{
  "url": "/some/listing/url",
  "title": "HOTELS in AUSTRIA",
  "pageType": "LISTING_PAGE",
  "pageMetaData": {
    "$placeId": 1,
    "hotels" : "1,2,3",
    "priority": 1
  }
},
{
  "url": "/some/listing/url",
  "title": "HOTELS with MOUNTAIN VIEW IN SALZBURG",
  "pageType": "LISTING_PAGE",
  "pageMetaData": {
    "$placeId": 123,
    "hotels" : "1,2,3",
    "priority": 2
  }
},
{
  "url": "/some/listing/url",
  "title": "HOTELS WITH RIVER VIEW IN HALLSTATT",
  "pageType": "LISTING_PAGE",
  "pageMetaData": {
    "$placeId": 34,
    "hotels" : "5,6",
    "priority": 1
  }
},
{
  "url": "/some/listing/url",
  "title": "HOTELS FOR FAMILY IN HALLSTATT",
  "pageType": "LISTING_PAGE",
  "pageMetaData": {
    "$placeId": 34,
    "hotels" : "5,6",
    "priority": 2
  }
},
{
  "url": "/some/external/url",
  "title": "HOTELS FOR FAMILY IN HALLSTATT",
  "pageType": "EXTERNAL_PAGE",
  "pageMetaData": {
    "$placeId": 34,
    "hotels" : "5,6",
    "priority": 3
  }
}

Note: hotels is a string and is already sorted, so two pages with the same set of hotels always have the same string.

I am trying to find all documents of a given page type that have duplicate hotels for the same place, sorted by priority.

For example, for the LISTING_PAGE type the result, sorted by ascending priority, would be:

"HOTELS WITH RIVER VIEW IN SALZBURG"  //has prio 1 and has duplicate hotels
"HOTELS WITH RIVER VIEW IN HALLSTATT" //has prio 1 and has duplicate hotels
"HOTELS with MOUNTAIN VIEW IN SALZBURG" //has prio 2 and has duplicate hotels
"HOTELS FOR FAMILY IN HALLSTATT" //has prio 2 and has duplicate hotels

These pages are returned because, within each place, they list the same hotels as another page. Note that even though the Austria page has the same hotels, it belongs to a different place, so it is not considered a duplicate.

I know how to write a SQL query that uses "not exists" to get the duplicates, but MongoDB does not seem to have anything like "not exists". I also don't want to use a group by, because that would lose the individual pages; I want all pages with duplicate hotels, sorted by priority.
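One possible approach, sketched under assumptions rather than as a definitive answer: a $group stage does not have to lose the pages if each group collects its original documents with $push and "$$ROOT". Groups with more than one page are the duplicates, and $unwind plus $replaceRoot turn them back into one document per page. The sketch assumes the place id is stored as pageMetaData.placeId without the leading "$" shown in the sample (a field name that really starts with "$" cannot be addressed with an ordinary "$pageMetaData.$placeId" path), and the collection name pages is hypothetical.

```javascript
// Assumptions (not from the question): the place id lives in
// pageMetaData.placeId (no leading "$"), and the collection is named "pages".
const pageType = "LISTING_PAGE";

const pipeline = [
  // 1. Restrict to the page type of interest.
  { $match: { pageType: pageType } },

  // 2. Group by (place, hotels string) but keep every original document.
  {
    $group: {
      _id: {
        placeId: "$pageMetaData.placeId",
        hotels: "$pageMetaData.hotels"
      },
      pages: { $push: "$$ROOT" },
      count: { $sum: 1 }
    }
  },

  // 3. Keep only groups that contain more than one page (the duplicates).
  { $match: { count: { $gt: 1 } } },

  // 4. Flatten each group back into one document per page.
  { $unwind: "$pages" },
  { $replaceRoot: { newRoot: "$pages" } },

  // 5. Sort the surviving pages by ascending priority.
  { $sort: { "pageMetaData.priority": 1 } }
];
```

In mongosh this would be run as db.pages.aggregate(pipeline). With the sample data it should keep the two Salzburg pages with hotels "1,2,3" and the two Hallstatt listing pages, and drop the Austria page, the Salzburg page with hotels "1,2", and the external page. The order among pages with the same priority is not defined unless a second sort key is added. Because each group holds its documents in a single array, very large groups can run into the 16 MB document size limit.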

