I am working on building an ALS model in PySpark using implicit feedback data (retail transactional data, taking the number of units bought as the implicit signal).
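For context, this is roughly how I am fitting the model (the DataFrame and column names here are placeholders for my actual data):

```python
from pyspark.ml.recommendation import ALS

# transactions: DataFrame with (user_id, item_id, units) columns,
# where units is the raw count of units bought (the implicit signal)
als = ALS(
    userCol="user_id",
    itemCol="item_id",
    ratingCol="units",
    implicitPrefs=True,   # treat units as confidence, not as explicit ratings
    rank=10,
    regParam=0.1,
    alpha=1.0,            # confidence scaling factor for implicit feedback
    coldStartStrategy="drop",
)
model = als.fit(transactions)
```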
Before feeding the data into the model, do we need to do some kind of standardization/normalization? If not, how does the model handle cases where an item is overbought or a user is an overbuyer? E.g., milk is usually bought in larger quantities than TVs, and User1 usually buys less than User2.
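For example, one thing I have considered is dampening the raw counts with a log transform before fitting, but I am not sure whether that is necessary or counterproductive. A minimal sketch of what I mean (using the `transactions` DataFrame from above):

```python
from pyspark.sql import functions as F

# Hypothetical pre-processing step: log-scale the raw unit counts
# so that frequently bought items / heavy buyers don't dominate.
scaled = transactions.withColumn("units_scaled", F.log1p(F.col("units")))
```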
Any pointers would be helpful. Thanks