Why does Ceph calculate the PG ID from an object hash rather than with the CRUSH algorithm?

310 Views Asked by Wizmann At 28 July 2025 at 00:32

Ceph uses the CRUSH algorithm for PG->OSD mapping, and it copes well with adding or removing OSD nodes.

But for obj->PG mapping, Ceph still uses a traditional hash: pgid = hash(obj_name) % pg_num. This approach may lead to massive data migration if we change the number of PGs, and may even reduce the availability of the system.
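To put a rough number on that concern, here is a minimal sketch that applies the formula above to a set of made-up object names and counts how many of them change PG when pg_num changes. The SHA-256 based `stable_hash` is only a stand-in for whatever hash Ceph really uses, and the object names and PG counts are assumptions chosen for illustration.

```python
import hashlib

def stable_hash(name: str) -> int:
    """Deterministic 32-bit hash of an object name (a stand-in, not Ceph's hash)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def pg_for(name: str, pg_num: int) -> int:
    """The formula from the question: pgid = hash(obj_name) % pg_num."""
    return stable_hash(name) % pg_num

def moved_fraction(names, old_pg_num: int, new_pg_num: int) -> float:
    """Fraction of objects whose PG ID differs between the two PG counts."""
    moved = sum(1 for n in names if pg_for(n, old_pg_num) != pg_for(n, new_pg_num))
    return moved / len(names)

names = [f"obj-{i}" for i in range(10_000)]  # hypothetical object names

print("128 -> 129:", moved_fraction(names, 128, 129))
print("128 -> 256:", moved_fraction(names, 128, 256))
```

With a plain modulo, adding a single PG should remap nearly every object, while doubling pg_num should remap roughly half of them, because only one extra bit of the hash is consulted.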

Why doesn't Ceph use a CRUSH algorithm (say, straw2) for obj->PG mapping, which could give an optimal amount of data migration when the number of PGs changes?
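For comparison, the idea in this question can be sketched as a straw2-style draw with equal weights: every PG gets a pseudo-random score for the object and the highest score wins. This is a simplified highest-random-weight selection, not Ceph's actual straw2 code; the `draw` and `pick_pg` helpers, the SHA-256 hash, and the PG counts are all assumptions for illustration.

```python
import hashlib

def draw(obj_name: str, pg: int) -> int:
    """Pseudo-random 64-bit score for an (object, PG) pair."""
    digest = hashlib.sha256(f"{obj_name}/{pg}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def pick_pg(obj_name: str, pg_num: int) -> int:
    """Pick the PG with the highest draw; the PG ID breaks any tie."""
    return max(range(pg_num), key=lambda pg: (draw(obj_name, pg), pg))

names = [f"obj-{i}" for i in range(2_000)]  # hypothetical object names

before = {n: pick_pg(n, 32) for n in names}
after = {n: pick_pg(n, 33) for n in names}
moved = [n for n in names if before[n] != after[n]]

print("moved fraction:", len(moved) / len(names))
print("all moved objects landed on the new PG:", all(after[n] == 32 for n in moved))
```

In this sketch an object only moves if the new PG wins its draw, so about 1/33 of the objects should move and all of them go to the new PG. The cost is that every lookup scans all PGs, whereas the modulo formula is a single operation.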


There is 1 best solution below

uncleDuo On 12 December 2020 at 06:00

The two scenarios are different, and I don't think CRUSH is a silver bullet.

PG->OSD is a one-to-many function (each PG maps to several OSDs), while obj->PG is a one-to-one function (each object maps to exactly one PG).
OSDs are added and removed fairly frequently, while the number of PGs is considered fairly stable.
A group of OSDs can be partially unavailable, while a PG cannot.
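The difference between the two mappings can be sketched as a two-stage lookup. The PG->OSD stage below ranks OSDs by a per-PG score and keeps the top three, which is only a stand-in for CRUSH; the OSD names, the replica count, and the `score`, `obj_to_pg`, and `pg_to_osds` helpers are assumptions for illustration.

```python
import hashlib

def score(*parts) -> int:
    """Deterministic 64-bit pseudo-random score for a tuple of values."""
    key = "/".join(str(p) for p in parts)
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")

def obj_to_pg(obj_name: str, pg_num: int) -> int:
    """One-to-one: each object maps to exactly one PG."""
    return score(obj_name) % pg_num

def pg_to_osds(pg: int, osds, replicas: int = 3):
    """One-to-many: each PG maps to several OSDs (top-scoring ones, not real CRUSH)."""
    ranked = sorted(osds, key=lambda osd: (score(pg, osd), osd), reverse=True)
    return ranked[:replicas]

osds = [f"osd.{i}" for i in range(8)]  # hypothetical cluster of 8 OSDs
pg_num = 16

pg = obj_to_pg("my-object", pg_num)
print("my-object -> PG", pg, "->", pg_to_osds(pg, osds))

# Simulate one OSD becoming unavailable: only PGs that used it need a new member.
surviving = [o for o in osds if o != "osd.3"]
changed = [p for p in range(pg_num) if pg_to_osds(p, osds) != pg_to_osds(p, surviving)]
print("PGs affected by losing osd.3:", changed)
```

Losing an OSD here only touches the PGs that included it, and the object->PG stage is not involved at all, which is the kind of churn the second stage has to handle and the first stage does not.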

This is just my understanding; criticism and discussion are welcome.
