How to transform a data frame into an spc object with the "zipfR" package?


I have a data.frame that holds the frequencies of frequencies of RTs (retweets). It has this structure:

'data.frame': 368 obs. of 2 variables:
 $ Var1: Factor w/ 368 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Freq: int 71482 16111 7720 4555 2949 2053 1620 1210 978 775 ...

I want to use the following command from the "zipfR" package:

gigp_pos <- lnre("gigp",cost="chisq",method="NLM",rt_pos.spc)

so I first have to transform this data frame into an spc object. This type of object needs some variables: m, V, N and Vm.

I put:

Vm <- frq_frq_pos$Freq
m <- frq_frq_pos$Var1

but I don't understand the difference between the variable V and the variable N. Can you help me?


There are 3 best solutions below


I'm a new user of zipfR as well but I believe you can use

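spc(Vm=frq_frq_pos$Freq, m=as.integer(as.character(frq_frq_pos$Var1)))

Var1 is a factor here (why is it a factor anyway?), so it has to be converted to numbers first. Go through as.character(), because as.integer() applied directly to a factor returns the internal level codes rather than the labels.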

  • V = number of unique terms (called 'types' in the package); V = sum(Vm)
  • N = total number of observations/occurrences (called 'tokens' in the package); N = sum(Vm*m)
  • 'spc' stands for (frequency) spectrum.
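
As a small worked example of these two definitions, here is a sketch in base R on a made-up spectrum. The data frame toy and its values are invented for illustration and are not the asker's data. It also shows why the factor column should go through as.character() before becoming a number.

```r
# Toy frequency spectrum, shaped like the data frame in the question.
# All values are made up for illustration.
toy <- data.frame(
  Var1 = factor(c("1", "2", "3", "5")),  # frequency class m, stored as a factor
  Freq = c(10L, 4L, 2L, 1L)              # Vm: number of types occurring m times
)

# Convert the factor labels to integers. as.integer() alone would return
# the internal level codes (1, 2, 3, 4) instead of the labels (1, 2, 3, 5).
m  <- as.integer(as.character(toy$Var1))
Vm <- toy$Freq

V <- sum(Vm)      # number of types
N <- sum(Vm * m)  # number of tokens

print(c(V = V, N = N))
```

Here V counts each distinct item once, while N weights every frequency class by how often its items occur.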
Assuming rt_pos is your data frame and its rows correspond to m = 1, 2, 3, ... with no gaps:

Vm = rt_pos$Freq
m = seq_along(Vm)
rt_pos.spc = spc(Vm, m)

You can use ?spc to see the details.
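
Taking m from the row position (1, 2, 3, ...) is only safe when the rows cover every frequency class from 1 upward without gaps. A minimal check in base R, on invented values (the data frame rt_toy is hypothetical):

```r
# Hypothetical spectrum with a gap: there is no row for m = 3.
rt_toy <- data.frame(
  Var1 = factor(c("1", "2", "4")),
  Freq = c(5L, 2L, 1L)
)

m_from_labels   <- as.integer(as.character(rt_toy$Var1))  # 1, 2, 4
m_from_position <- seq_along(rt_toy$Freq)                 # 1, 2, 3

# TRUE only if row position and frequency class coincide for every row.
consecutive <- identical(m_from_labels, m_from_position)
print(consecutive)

# With a gap, the two choices of m give different token counts.
N_labels   <- sum(rt_toy$Freq * m_from_labels)    # 5*1 + 2*2 + 1*4
N_position <- sum(rt_toy$Freq * m_from_position)  # 5*1 + 2*2 + 1*3
print(c(N_labels = N_labels, N_position = N_position))
```

If the check is FALSE, it is safer to take m from the labels in Var1 than from the row index.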


N is the sample size (number of tokens) and V is the vocabulary size (number of types). If for some reason you want to avoid creating an spc object (see below), you can compute N and V directly. Var1 is a factor, so it is converted to integers first:

N <- sum(frq_frq_pos$Freq * as.integer(as.character(frq_frq_pos$Var1)))
V <- sum(frq_frq_pos$Freq)

A better way is to use the spc function:

your.spc <- spc(Vm=frq_frq_pos$Freq, m=as.integer(as.character(frq_frq_pos$Var1)))

Then you don't have to calculate N and V yourself, because they are already stored in the spc object:

 N(your.spc)
 V(your.spc)
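
To see that the accessors agree with the manual sums, here is a small sketch on invented numbers. The zipfR part only runs if the package is installed; the object name toy.spc and the values are assumptions made for illustration.

```r
# Invented spectrum: m is the frequency class, Vm the number of types in it.
m  <- c(1L, 2L, 3L, 5L)
Vm <- c(10L, 4L, 2L, 1L)

# Manual computation, following the definitions of N and V.
N_manual <- sum(Vm * m)
V_manual <- sum(Vm)

# Cross-check against zipfR, but only if the package is available.
if (requireNamespace("zipfR", quietly = TRUE)) {
  toy.spc <- zipfR::spc(Vm = Vm, m = m)
  stopifnot(zipfR::N(toy.spc) == N_manual,
            zipfR::V(toy.spc) == V_manual)
}

print(c(N = N_manual, V = V_manual))
```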

BUT, if you have access to the raw data (I guess it is some text?), the easiest way to obtain an spc object is the text2spc.fnc function (from the languageR package):

 your.spc <- text2spc.fnc(your.text) 

Then you can call:

 your.spc$Vm
 your.spc$m
 N(your.spc)
 V(your.spc)
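
For intuition about what goes into such a spectrum, the frequencies of frequencies can also be tabulated by hand in base R. This is only a sketch on an invented token vector, not languageR's implementation:

```r
# Invented raw data: one element per token.
tokens <- c("a", "b", "a", "c", "a", "b", "d")

# Step 1: how often does each type occur? (a: 3, b: 2, c: 1, d: 1)
type_freq <- as.vector(table(tokens))

# Step 2: how many types occur once, twice, ...? This is the spectrum.
counts <- table(type_freq)
m  <- as.integer(names(counts))  # frequency classes
Vm <- as.integer(counts)         # number of types per class

print(data.frame(m = m, Vm = Vm))

# Sanity checks: N equals the number of tokens, V the number of distinct types.
N <- sum(Vm * m)
V <- sum(Vm)
```

The resulting m and Vm vectors have the same shape as the Var1 and Freq columns in the question, with m already numeric.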