kernel density estimator in R

362 Views Asked by At

I'm using the last column from the following data, Data

And I'm trying to apply the idea of a kernel density estimator to this dataset which is represented by Equation setup for kernal density

where k is some kernal, normally a normal distribution though not necessarily., h is the bandwidth, n is the length of the data set, X_i is each data point and x is a fitted value. So using this equation I have the following code,

AstroData=read.table(paste0("http://www.stat.cmu.edu/%7Elarry",
                   "/all-of-nonpar/=data/galaxy.dat"),
            header=FALSE)
x=AstroData$V3
xsorted=sort(x)
x_i=xsorted[1:1266]
hist(x_i, nclass=308)
n=length(x_i)
h1=.002
t=seq(min(x_i),max(x_i),0.01)
M=length(t)
fhat1=rep(0,M)
for (i in 1:M){
 fhat1[i]=sum(dnorm((t[i]-x_i)/h1))/(n*h1)}
lines(t, fhat1, lwd=2, col="red")

Which produces a the following plot, Resulting plot

which is actually close to what I want as the final result should appear as this once I remove the histograms, Final plot

Which if you noticed is finer tuned and the red lines which should represent the density are rather rough and are not scaled as high. The final plot that you see is run using the density function in R,

plot(density(x=y, bw=.002))

Which is what I want to get to without having to use any additional packages.

Thank you

1

There are 1 best solutions below

2
On BEST ANSWER

After some talk with my roommate he gave me the idea to go ahead and decrease the interval of the t-values (x). In doing some I changed it from 0.01 to 0.001. So the final code for this plot is as appears,

AstroData=read.table(paste0("http://www.stat.cmu.edu/%7Elarry",
               "/all-of-nonpar/=data/galaxy.dat"),
        header=FALSE)
x=AstroData$V3
xsorted=sort(x)
x_i=xsorted[1:1266]
hist(x_i, nclass=308)
n=length(x_i)
h1=.002
t=seq(min(x_i),max(x_i),0.001)
M=length(t)
fhat1=rep(0,M)
for (i in 1:M){
fhat1[i]=sum(dnorm((t[i]-x_i)/h1))/(n*h1)}
lines(t, fhat1, lwd=2, col="blue")

Which in terms gives the following plot, which is the one that I wanted,

Correct plot