Tuesday, January 24, 2012

Winsorisation in R

I wrote a function to hard-winsorise each column of a data.frame: any value more than 3 standard deviations from its column mean is clipped to that limit. I couldn't find anything that did this neatly.
It doesn't do any checking of the data; you need to do that yourself (for example, that every column is numeric). This is licensed under GPL3 or later. Please link back here if cross-posting it elsewhere.

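winsor_clip_data <- function (x, std = 3, na.rm = TRUE)
{
  clip_vec <- function(dat, min, max){
    # hard clip dat to the range [min, max]; NA values stay NA
    dat[dat > max] <- max
    dat[dat < min] <- min
    return(dat)
  }

  # per-column standard deviation and mean
  sds <- apply(x, 2, sd, na.rm = na.rm)
  means <- apply(x, 2, mean, na.rm = na.rm)
  # clipping limits: std standard deviations either side of the mean
  mins <- means - std * sds
  maxs <- means + std * sds
  # clip each column to its own limits; note that mapply returns a matrix
  output <- mapply(clip_vec, x, mins, maxs)

  return(output)
}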
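
As a rough illustration of the same idea, here is a minimal sketch of a variant that returns a data.frame rather than a matrix. The name winsor_clip_df and the toy data are made up for this example and are not part of the function above; it assumes every column is numeric and, like the original, does no checking.

```r
# Hypothetical variant (not the function above): clips each column with
# pmin/pmax and keeps the result as a data.frame.
winsor_clip_df <- function(x, std = 3, na.rm = TRUE) {
  x[] <- lapply(x, function(col) {
    m <- mean(col, na.rm = na.rm)
    s <- sd(col, na.rm = na.rm)
    # cap at the upper limit, then at the lower limit; NA values stay NA
    pmax(pmin(col, m + std * s), m - std * s)
  })
  x
}

# Toy data: column a has one obvious outlier, column b has a missing value
df <- data.frame(a = c(rep(c(-1, 1), 10), 100),
                 b = c(1:20, NA))
clipped <- winsor_clip_df(df)

# The outlier in a is pulled in to mean + 3 sd; everything else is unchanged
print(tail(clipped, 3))
```

Because the outlier itself inflates the mean and standard deviation, the clipped value is still far from the rest of the data. That is a property of clipping at mean plus or minus 3 standard deviations, not of this particular sketch.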


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.