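I've been looking for an easy way to create a data.frame of summary statistics in R, and haven't been able to find anything. For a data.frame, the
summary() function returns a table of formatted strings, which isn't easily converted into a data.frame. This makes it hard to add other statistics to it, or to query it from other functions. I've written a simple function that uses boxplot() plus a few other bits to make a nice data.frame. Because the quartiles and extremes come from boxplot(), the minimum and maximum are the whisker ends (outliers excluded) and the quartiles are the box hinges, which can differ slightly from what quantile() reports.
It doesn't do any checking of the data, so you need to do that yourself; in particular, every column must be numeric. This is licensed under GPLv3 or later. Please link back here if cross-posting it elsewhere.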
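As a quick check of what summary() returns for a data frame, here is a minimal sketch using the built-in airquality dataset (chosen only for illustration; it is not part of the original post):

```r
# airquality ships with base R; it is used here only as an example dataset
s <- summary(airquality)

class(s)          # the result is a "table", not a data.frame
is.character(s)   # TRUE: the cells are formatted strings, not numbers

# Converting it gives a long-format data.frame of strings,
# which is awkward to query or extend
s_df <- as.data.frame(s)
names(s_df)       # Var1, Var2, Freq
```

The function below avoids this by building a numeric data.frame directly.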
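summary_stats <- function(these_data, output_dir) {
  # Note: output_dir is accepted but not used anywhere in the function.
  # Number of missing values in each column
  num_NAs <- as.data.frame(t(colSums(is.na(these_data))))
  rownames(num_NAs) <- "NA count"
  # Column means, ignoring missing values
  means <- as.data.frame(t(colMeans(these_data, na.rm = TRUE)))
  rownames(means) <- "means"
  # Number of rows (including rows with NAs), repeated for every column
  num_dat <- as.data.frame(t(rep(nrow(these_data), ncol(these_data))))
  rownames(num_dat) <- "num data"
  names(num_dat) <- names(these_data)
  # boxplot() computes the five box-and-whisker statistics without plotting
  stats <- boxplot(these_data, plot = FALSE)
  stats <- as.data.frame(stats$stats[1:5, ])
  names(stats) <- names(these_data)
  rownames(stats) <- c("minimum (excl outliers)", "lower quartile", "median",
                       "upper quartile", "maximum (excl outliers)")
  # Stack the one-row pieces and the five boxplot rows into one data.frame
  output <- rbind(num_NAs, num_dat, means, stats)
  return(output)
}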
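To show how a result of this shape can be queried and extended, here is a rough, self-contained sketch. It uses a compact variant of the function above, built on sapply() and boxplot.stats() rather than boxplot(); the name summary_stats_compact and the choice of the built-in airquality dataset are assumptions made for this example only.

```r
# Hypothetical compact variant; the name summary_stats_compact is made up
# for this sketch and is not the function from the post.
summary_stats_compact <- function(these_data) {
  per_column <- sapply(these_data, function(x) {
    c("NA count" = sum(is.na(x)),
      "num data" = length(x),
      "means"    = mean(x, na.rm = TRUE),
      # boxplot.stats() gives the same whisker ends and hinges boxplot() uses
      setNames(boxplot.stats(x)$stats,
               c("minimum (excl outliers)", "lower quartile", "median",
                 "upper quartile", "maximum (excl outliers)")))
  })
  # sapply() returns a matrix: statistics in rows, variables in columns
  as.data.frame(per_column)
}

result <- summary_stats_compact(airquality)

# Query a single value by row name and column name
result["median", "Ozone"]

# Append a further statistic as a new row
result["std dev", ] <- sapply(airquality, sd, na.rm = TRUE)
```

Because every cell is numeric and the rows are named, other functions can look values up by name, and extra statistics are one assignment away.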