I'm computing basic descriptive statistics for a household survey data frame. I have a column that reports the number of times an event occurred in a certain period of time. The survey also comes with a factor column that holds the weight of each observation.
So, when I use this code:
times_theater <- descr(data17$s08a_02, report.nas = FALSE, stats = "all")
times_theater
I get this:
Descriptive Statistics
data17$s08a_02
N: 38201

                    s08a_02
----------------- ---------
             Mean      2.58
          Std.Dev      2.41
              Min      1.00
               Q1      1.00
           Median      2.00
               Q3      3.00
              Max     40.00
              MAD      1.48
              IQR      2.00
               CV      0.93
         Skewness      5.80
      SE.Skewness      0.08
         Kurtosis     64.28
          N.Valid   1027.00
        Pct.Valid      2.69
These are the raw (unweighted) values, so I need to apply the weights:
times_theater <- descr(data17$s08a_02, report.nas = FALSE, weights = data17$factor, stats = "all")
times_theater
And the output is this:
Weighted Descriptive Statistics
data17$s08a_02
Weights: factor
N: 38201

                    s08a_02
--------------- -----------
           Mean        2.55
        Std.Dev        2.31
            Min        1.00
         Median        2.00
            Max       40.00
        N.Valid   288118.00
      Pct.Valid        2.57
As you can see, I lost the quartile information (Q1, Q3, IQR), and I would really like it to show up in the same output.
Any ideas on how to solve this?
P.S.: I know that in this case the differences are almost non-existent, but there are some spending and income variables for which I will really need the quartiles later on.
Edit2: I know the documentation says descr() quartiles won't work with weights. I want a way to calculate them and insert them into the previous output.
The Hmisc package contains a bunch of weighted functions, including wtd.quantile. Call it with the variable and its weights (here data17$s08a_02 and data17$factor); it defaults to the quartile values. Of course, one can specify any other quantile values through the probs argument. Cf. ?wtd.quantile. There is also survey::svyquantile, in case you have a complex sampling design.
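As a minimal sketch of the wtd.quantile approach, assuming Hmisc is installed: the vectors x and w below are made-up toy data standing in for data17$s08a_02 and data17$factor, and the object names q, wq and w_iqr are only illustrative.

```r
library(Hmisc)  # provides wtd.quantile()

# Hypothetical toy data (assumption): x plays the role of data17$s08a_02,
# w plays the role of data17$factor.
x <- c(1, 1, 2, 2, 3, 5, NA, 40)
w <- c(250, 310, 180, 400, 220, 150, 300, 90)

# Keep only the observed values so x and w stay aligned
ok <- !is.na(x)

# With the default probs this returns the 0%, 25%, 50%, 75% and 100% points
q <- wtd.quantile(x[ok], weights = w[ok])
print(q)

# Weighted Q1 and Q3 only, plus the weighted IQR derived from them
wq <- wtd.quantile(x[ok], weights = w[ok], probs = c(0.25, 0.75))
w_iqr <- unname(wq[2] - wq[1])
print(wq)
print(w_iqr)
```

With the real data, the same two calls would take data17$s08a_02 and data17$factor directly. The resulting values can then be reported next to the weighted descr() table.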