I've used ggbiplot to produce a biplot. I want points of different colours for different factor levels, and the size of the points to represent sample size. The problem that's driving me nuts is that part of the size vector, c(1,2,2,1,3,2,2,1,3,3,2,1,1,1,2,3,2,2,3,1, is printed in the legend:
I used:
library(ggbiplot)
pplot <- function(pcaMatrix, cultures, sample.size = NA, FIRST = 1, SECOND = 2,
                  ELLIPSE.PROB = 0.68, SCALE = FALSE, LABELS = NULL) {
  # PCA on complete rows only
  PCA <- prcomp(na.omit(pcaMatrix), scale. = SCALE)
  # Draw ellipses only when a positive probability is requested
  plot <- ggbiplot(PCA, obs.scale = 1, var.scale = 1, labels = LABELS,
                   ellipse = ELLIPSE.PROB > 0, groups = cultures, circle = FALSE,
                   choices = c(FIRST, SECOND), ellipse.prob = ELLIPSE.PROB)
  plot <- plot + geom_point(aes_string(colour = cultures, size = sample.size)) #+ scale_size_identity()
  plot <- plot + scale_color_discrete(name = '')
  plot <- plot + scale_size_continuous(labels = c("200+", "500+", "1000+", "2000+", "4000+"))
  plot <- plot + theme(legend.direction = 'horizontal', legend.position = 'top')
  plot <- plot + theme(aspect.ratio = 1)
  print(plot)
}
pplot(grav, grav.cultures, grav.sample, 1,2, ELLIPSE.PROB=0)
where
> head(grav)
Microlithisme Substrat BpBG Crans Gm PD F
1 25.0 29.9 61.94226 0.5 0.3 27.0 0.3
2 44.8 29.3 20.93023 2.1 0.0 47.9 0.0
3 7.0 66.2 57.58929 0.5 0.2 5.8 0.2
4 13.2 12.9 55.41126 6.6 0.0 40.0 0.2
5 2.0 9.0 75.03045 0.1 0.0 6.9 0.0
6 16.4 17.3 51.36519 0.9 0.2 20.9 0.1
> head(grav.cultures)
[1] GR GR GR GR GR GR
Levels: AUR AZI EG EGL GR GR.Italy MAG SOL
The actual sample sizes are recoded into size classes so that I can control the size of the points manually:
> head(grav.sample)
[1] 360 986 599 457 1222 899
grav.sample[grav.sample>=200 & grav.sample < 500] <- 1
grav.sample[grav.sample>=500 & grav.sample<1000] <- 2
grav.sample[grav.sample>=1000 & grav.sample<2000] <- 3
grav.sample[grav.sample>=2000 & grav.sample<4000] <- 4
grav.sample[grav.sample>=4000 & grav.sample<8000] <- 5
> head(grav.sample)
[1] 1 2 2 1 3 2
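For reference, the same recoding could probably be written in one step with cut(). This is only a minimal sketch: sample.raw is a hypothetical name holding the six raw counts shown above, and it assumes the classes are the half-open intervals [200, 500), [500, 1000), and so on, as in my assignments.

```r
# Hypothetical stand-in for the raw counts shown by head(grav.sample) above
sample.raw <- c(360, 986, 599, 457, 1222, 899)

# Bin into size classes 1..5. right = FALSE closes each interval on the left,
# matching the >= / < comparisons used in the manual recoding.
size.class <- cut(sample.raw,
                  breaks = c(200, 500, 1000, 2000, 4000, 8000),
                  labels = FALSE, right = FALSE)

print(size.class)
```

One difference from the manual version: values below 200 or at 8000 and above would become NA here instead of being left unchanged.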
I really just can't figure this one out. Why is part of the size vector printing in my legend?