Data Model for Blog Posts with Comments in Elasticsearch

What's the best way to structure a post/comment system in Elasticsearch? I'm using Elasticsearch as a secondary database.

Each post has a threaded comment system, maybe two levels deep, and can have up to 500-1000 comments. Every post and every comment carries like and comment counters that are incremented frequently, which means a lot of reindexing. I would also like to fetch blog posts together with their comments when applying filters.

Right now, my structure looks like this. The blog post and the user details would be edited very rarely, but tags and comments would be added frequently.

{
"_index": "brainstormer_ideas_with_comments",
"_type": "_doc",
"_id": "1",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"found": true,
"_source": {
    "id": 1,
    "brainstormer_id": 1,
    "idea": "cCZhvy",
    "description": "2jJPo3hYbqeh2VBnDJeGtylVu7qfe_MRp77hTK6t7SN57GzeQG8c",
    "user": {
        "id": "user-1",
        "login": "pO2DSqIS--"
    },
    "created_at": "2020-08-13T20:35:17+00:00",
    "like_count": 41,
    "comment_count": 45,
    "tags": [
        "bU37X",
        "a_Rl5b",
        "vxD.ZMo",
        "AmvtHVuQ",
        "yx9oSx-_D"
    ],
    "comments": [
        {
            "id": "comment-1",
            "comment": "7ewh-Cqf4gQqmIK53jXbR7",
            "tags": [
                "mJN",
                "jFm-",
                "hV0pi",
                "ONGNOw",
                "HtzmDfO",
                "dawVLk09"
            ],
            "created_at": "2020-08-08T20:35:17+00:00",
            "user": {
                "id": "user-1",
                "login": "Tl6CDNawUh"
            }
        },
        {
            "id": "comment-1",
            "comment": "BKj8sAcbJJXWxAPk3HQFTZWtvQm",
            "tags": [
                "sYj",
                "XRLw",
                "xtAeH",
                "Oq6dBR",
                "lj4_hOI",
                "n3lhc2ig"
            ],
            "created_at": "2020-09-21T20:35:17+00:00",
            "user": {
                "id": "user-2",
                "login": "AF3KT415uf"
            }
        },
        {
            "id": "comment-1",
            "comment": "vzt7XEe2WIP3OszpLmcF8J",
            "tags": [
                "YCH",
                "kodm",
                "RGv2B",
                "Qk5R1D",
                "ICrDjmz",
                "4mmfLK16"
            ],
            "created_at": "2020-07-08T20:35:17+00:00",
            "user": {
                "id": "user-3",
                "login": "7xTLOuCeWD"
            }
        },
        {
            "id": "comment-1",
            "comment": "Jm6E3PrlOI",
            "tags": [
                "IrZ",
                "TJlf",
                "__HQy",
                "5VH2Vs",
                "btvxG51",
                "5iRoVR_k"
            ],
            "created_at": "2020-07-19T20:35:17+00:00",
            "user": {
                "id": "user-4",
                "login": "zr32RlxNak"
            }
        },
        {
            "id": "comment-1",
            "comment": "jKGzoZhCpUv4DrvoebamXLnmvyX_CK0",
            "tags": [
                "Osa",
                "OKlQ",
                "cBcjt",
                "2BcQD7",
                "K7lLhS7",
                "ZK1t_GXl"
            ],
            "created_at": "2020-07-14T20:35:17+00:00",
            "user": {
                "id": "user-5",
                "login": "B8LGMpPWwv"
            }
        },
        {
            "id": "comment-1",
            "comment": "L-PryTXsa1FbEnIJdH_5vlsdpfnckB1kmMJI4EVwszhc45qlW6e",
            "tags": [
                "kRJ",
                "Mkka",
                "ari.I",
                "pgWcUk",
                "w78vFir",
                "eOx.zRx9"
            ],
            "created_at": "2020-08-07T20:35:17+00:00",
            "user": {
                "id": "user-6",
                "login": "IG1Oo_fOcr"
            }
        }
    ]
}

}

Is it better to go with nested objects, parent/child, or something else? Any advice on the structure and on how often to update Elasticsearch would be most appreciated.

Thanks,

There is 1 answer below

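Both nested objects and parent-child relationships are costly here. With nested objects, adding or changing a single comment reindexes the whole post document, including all of its existing comments. With parent-child (the join field), parent and child documents must live on the same shard, and join queries are noticeably slower than ordinary ones.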
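For reference, the two options being compared could be sketched as the following index mappings, written as plain Python dicts. The field names `comments`, `comment`, `idea` and `tags` come from the question; the join field name `relation` and the relation names `post` and `comment` are assumptions made for illustration.

```python
# Option A: comments embedded in the post document as nested objects.
# Adding or editing one comment rewrites the whole post document.
nested_mapping = {
    "mappings": {
        "properties": {
            "idea": {"type": "text"},
            "comments": {
                "type": "nested",
                "properties": {
                    "id": {"type": "keyword"},
                    "comment": {"type": "text"},
                    "tags": {"type": "keyword"},
                },
            },
        }
    }
}

# Option B: posts and comments as separate documents in ONE index,
# linked by a join field. Each child must be routed to its parent's shard.
# "relation", "post" and "comment" are hypothetical names.
join_mapping = {
    "mappings": {
        "properties": {
            "idea": {"type": "text"},
            "comment": {"type": "text"},
            "relation": {
                "type": "join",
                "relations": {"post": "comment"},
            },
        }
    }
}
```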

One alternative is to index each comment or reply as its own Elasticsearch document and, instead of a strict parent-child relationship, store a field that identifies the parent post, i.e. keep only a loose coupling between your documents.
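A minimal sketch of that loose coupling, under some assumptions: the index name `brainstormer_comments` and the fields `post_id` and `parent_comment_id` are hypothetical, and `tags` and `post_id` are assumed to be indexed so that `term` filters match exactly (for example, `tags` mapped as `keyword`). The code only builds the documents and the search body as dicts; sending them with a client is left out.

```python
import json

COMMENTS_INDEX = "brainstormer_comments"  # hypothetical index name


def flatten_comments(post):
    """Turn a post's embedded comments into standalone comment documents."""
    docs = []
    for comment in post.get("comments", []):
        doc = dict(comment)
        doc["post_id"] = post["id"]  # loose link back to the post
        # Replies carry the id of the comment they answer; top-level
        # comments get None. The field name is an assumption.
        doc.setdefault("parent_comment_id", None)
        docs.append(doc)
    return docs


def comments_for_post_query(post_id, tag=None, size=50):
    """Search body: comments of one post, optionally filtered by tag."""
    filters = [{"term": {"post_id": post_id}}]
    if tag is not None:
        filters.append({"term": {"tags": tag}})
    return {
        "size": size,
        "query": {"bool": {"filter": filters}},
        "sort": [{"created_at": "asc"}],
    }


# Trimmed-down sample; the second entry is a made-up reply.
sample_post = {
    "id": 1,
    "idea": "cCZhvy",
    "comments": [
        {"id": "comment-1", "comment": "7ewh-Cqf4gQqmIK53jXbR7",
         "tags": ["mJN", "jFm-"], "created_at": "2020-08-08T20:35:17+00:00"},
        {"id": "comment-2", "comment": "Jm6E3PrlOI",
         "tags": ["IrZ"], "created_at": "2020-08-09T20:35:17+00:00",
         "parent_comment_id": "comment-1"},
    ],
}

comment_docs = flatten_comments(sample_post)
print(json.dumps(comment_docs, indent=2))
print(json.dumps(comments_for_post_query(1, tag="mJN"), indent=2))
```

With this layout, adding a comment or bumping its like counter touches only that small comment document; the post itself is fetched separately and joined with its comments in the application.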

Elasticsearch's default refresh interval is 1 second, which is what makes search near real-time (NRT). You can keep this default or tune it to your use case and performance requirements.
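As a rough illustration of tuning it, the helper below builds the body for the index settings API (`PUT /<index>/_settings`). The helper name and the 30-second value are assumptions chosen for the example, not a recommendation.

```python
import json


def refresh_interval_settings(seconds):
    """Body for PUT /<index>/_settings.

    Pass None to disable periodic refresh ("-1"), e.g. during a bulk load.
    """
    value = "-1" if seconds is None else f"{seconds}s"
    return {"index": {"refresh_interval": value}}


# Trade freshness for indexing throughput: new comments become
# searchable up to 30 seconds after they are indexed.
print(json.dumps(refresh_interval_settings(30)))
```

A longer interval means fewer, larger segments are created under heavy write load, at the price of comments and counters showing up in search results later.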