One of the attributes of CUDA memory pools is CU_MEMPOOL_ATTR_REUSE_ALLOW_OPPORTUNISTIC, which the driver API's Doxygen documentation describes as follows:
Allow reuse of already completed frees when there is no dependency between the free and allocation.
If a free (a cuMemFreeAsync() call, I presume) depends on an allocation, how can that free have completed while the allocation still needs to happen? Or am I misunderstanding what this attribute allows?
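This flag is explained in the CUDA Programming Guide, in the section on the stream-ordered memory allocator. The "dependency" in the description is between a free and a later, separate allocation request, not between a free and the allocation it releases. With the attribute enabled, the allocator may check, when servicing an allocation, whether a previously submitted free has already been reached in its stream's execution. If it has, that memory can be reused for the new allocation even though nothing orders the allocating stream after the freeing stream. With the attribute disabled, the allocator still reuses memory that becomes available when a stream is synchronized with the CPU.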
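To make the scenario concrete, here is a minimal sketch using the driver API, modeled on the kind of example the programming guide gives. It assumes CUDA 11.2 or later and a device that supports memory pools. The stream names, the buffer size, the 100 ms host-side sleep, and the CHECK macro are all illustrative choices of mine, not anything the documentation prescribes. The policy only permits reuse, so the two addresses may or may not match on a given run.

```cuda
#include <cuda.h>

#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <thread>

// Hypothetical helper: abort with a message if a driver API call fails.
#define CHECK(call)                                                  \
    do {                                                             \
        CUresult rc_ = (call);                                       \
        if (rc_ != CUDA_SUCCESS) {                                   \
            const char* msg_ = nullptr;                              \
            cuGetErrorString(rc_, &msg_);                            \
            std::fprintf(stderr, "%s failed: %s\n", #call,           \
                         msg_ ? msg_ : "unknown error");             \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

int main() {
    CHECK(cuInit(0));
    CUdevice dev;
    CHECK(cuDeviceGet(&dev, 0));
    CUcontext ctx;
    CHECK(cuDevicePrimaryCtxRetain(&ctx, dev));
    CHECK(cuCtxSetCurrent(ctx));

    // Explicitly enable opportunistic reuse on the device's default pool.
    CUmemoryPool pool;
    CHECK(cuDeviceGetDefaultMemPool(&pool, dev));
    int enable = 1;  // the reuse attributes take an int value
    CHECK(cuMemPoolSetAttribute(
        pool, CU_MEMPOOL_ATTR_REUSE_ALLOW_OPPORTUNISTIC, &enable));

    // Two streams with no event or other ordering between them.
    CUstream streamA, streamB;
    CHECK(cuStreamCreate(&streamA, CU_STREAM_NON_BLOCKING));
    CHECK(cuStreamCreate(&streamB, CU_STREAM_NON_BLOCKING));

    const size_t size = 1 << 20;
    CUdeviceptr p1 = 0, p2 = 0;

    // Stream A: allocate, use, and free. The free is stream-ordered.
    CHECK(cuMemAllocFromPoolAsync(&p1, size, pool, streamA));
    CHECK(cuMemsetD8Async(p1, 0, size, streamA));
    CHECK(cuMemFreeAsync(p1, streamA));

    // Give stream A time to run past the free, without synchronizing it
    // and without making stream B depend on it.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));

    // Stream B: this request has no dependency on the free in stream A.
    // With opportunistic reuse enabled, the allocator is allowed to hand
    // back the same memory if it observes that the free has completed.
    CHECK(cuMemAllocFromPoolAsync(&p2, size, pool, streamB));
    std::printf("first:  0x%llx\nsecond: 0x%llx\n",
                (unsigned long long)p1, (unsigned long long)p2);

    CHECK(cuMemFreeAsync(p2, streamB));
    CHECK(cuStreamSynchronize(streamA));
    CHECK(cuStreamSynchronize(streamB));
    CHECK(cuStreamDestroy(streamA));
    CHECK(cuStreamDestroy(streamB));
    CHECK(cuDevicePrimaryCtxRelease(dev));
    return 0;
}
```

The program needs to be linked against the driver library (for example with -lcuda). Setting the attribute value to 0 instead should make the allocator stop checking stream progress on its own, while reuse after an explicit stream synchronization remains possible.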