Can I get a POPCNT on a YMM register?


I'm vectorizing some image processing code using hand-written 32-bit assembly to access AVX2 instructions. However, I've run into a roadblock. The results of the vector operations end up in a YMM register, and I need a population count (POPCNT) of that register. I can't find any instruction or trick that would quickly produce a population count of a YMM register.

My only recourse for the moment is to copy the contents of the YMM register to memory and run the normal 32-bit POPCNT over it. That takes eight POPCNT instructions plus seven additions to sum the results. It would be nice if there were a way to get the population count of the YMM register using fewer instructions.
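
A minimal sketch of that store-and-count fallback, written in C rather than assembly to keep it short. The function names are made up for illustration, `__builtin_popcount` assumes GCC or Clang, and whether it compiles down to a real POPCNT instruction depends on the target flags (for example `-mpopcnt`). The YMM variant is only built when the compiler has AVX2 enabled.

```c
#include <stdint.h>
#include <string.h>
#ifdef __AVX2__
#include <immintrin.h>
#endif

/* Count the set bits in a 32-byte buffer, e.g. a YMM register that has
 * been spilled to memory. popcount256_mem is a hypothetical helper name.
 * __builtin_popcount is a GCC/Clang builtin (assumption: one of those
 * compilers is in use). */
unsigned popcount256_mem(const void *buf32)
{
    uint32_t w[8];
    memcpy(w, buf32, sizeof w);              /* view the 256 bits as eight 32-bit words */

    unsigned total = (unsigned)__builtin_popcount(w[0]);
    for (int i = 1; i < 8; ++i)              /* seven additions */
        total += (unsigned)__builtin_popcount(w[i]);
    return total;
}

#ifdef __AVX2__
/* Spill the register with an unaligned store, then count from memory. */
unsigned popcount256_ymm(__m256i v)
{
    uint8_t spill[32];
    _mm256_storeu_si256((__m256i *)spill, v);
    return popcount256_mem(spill);
}
#endif
```

This is exactly the eight-popcount, seven-add sequence described above, so it serves as a baseline to compare any faster approach against rather than as a solution.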

It would have been perfect if AVX2 allowed me to do something like:

POPCNT [EBP - 4], YMM1