Data Analysis

Sorting Data in R

Experimental RLab #2

Sorting data is one of the fundamental techniques of any data analysis. In R, sorting is done in memory with functions such as `sort` and `order`. SQL-like queries are also possible through the `sqldf` library.


#------------------------------------------------------------------------------
# SORT() and ORDER()

## SORT()
# a random, normal vector
set.seed(1)
x <- rnorm(10)
x

# sort x using sort()
sort(x)

# retrieve the sorted order of the values in x
# (with index.return=TRUE, sort() returns a list; the indices are in $ix)
# and apply to x
idx_x <- sort(x, index.return=TRUE)$ix
x[idx_x]

## ORDER()
# like sort(x, index.return=TRUE)$ix, this returns an index
# of the ordered elements
idx_x2 <- order(x)
identical(x[idx_x], x[idx_x2])
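
A short sketch of two details related to the section above: what `sort()` returns when `index.return=TRUE` is set, and how to sort in descending order. The names `res` and `idx_desc` are illustrative only and not part of the original lab.

```r
# same random vector as above
set.seed(1)
x <- rnorm(10)

# with index.return=TRUE, sort() returns a list with two components:
# $x (the sorted values) and $ix (the indices that sort the input)
res <- sort(x, index.return = TRUE)
names(res)

# descending order: sort() and order() both accept decreasing=TRUE
sort(x, decreasing = TRUE)
idx_desc <- order(x, decreasing = TRUE)
identical(x[idx_desc], sort(x, decreasing = TRUE))
```

In practice `order()` is usually the more convenient of the two when the index is what you need, since it returns the index vector directly.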

#------------------------------------------------------------------------------
# PRACTICE: the mtcars dataset
library(datasets)
data(mtcars)

# sort mtcars by horsepower (ascending), returning the first 5 cars and their horsepower
idx_hp <- order(mtcars$hp, na.last=NA)
head(mtcars[idx_hp, "hp", drop=FALSE], 5)

# deterministic sort, breaking 'ties' where cars have equal horsepower:
# sort on hp, then on mpg (pass the columns as vectors, not one-column data frames)
idx_hp_mpg <- order(mtcars$hp, mtcars$mpg, na.last=NA)
mtcars[idx_hp_mpg, ]

# let's make the sort a little clearer, and take only the subset we are interested in
selected <- subset(mtcars[idx_hp_mpg, ], select=c(mpg, hp))
selected
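
If "top" is meant as highest horsepower, one possible variation is sketched below: negate the numeric key to sort it in descending order, keep `mpg` ascending as the tie-breaker, and use `head()` to keep five rows. The names `idx_desc` and `top5` are illustrative assumptions.

```r
library(datasets)
data(mtcars)

# negate hp to sort it descending; mpg stays ascending to break ties
idx_desc <- order(-mtcars$hp, mtcars$mpg)

# keep only the columns of interest and the first five rows
top5 <- head(mtcars[idx_desc, c("mpg", "hp")], 5)
top5
```

Negating the key only works for numeric columns; for mixed directions on non-numeric keys, `order()` also accepts a vector for `decreasing` when `method = "radix"` is used.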

#------------------------------------------------------------------------------
# SQLDF

## the sqldf library provides SQL-like queries over data frames
# so we can do the same:

library(sqldf)
sqlResults <- sqldf("SELECT row_names, mpg, hp
                     FROM mtcars
                     ORDER BY hp, mpg", row.names=TRUE)

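# compare with the base R result; if identical() proves too strict
# (e.g. attribute differences), all.equal() is a more tolerant check
identical(selected, sqlResults)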
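
The same query style extends to descending sorts and row limits using standard SQL `DESC` and `LIMIT`. The following is a sketch only, guarded so that it is skipped when `sqldf` is not installed; `top5_base` and `top5_sql` are illustrative names.

```r
library(datasets)
data(mtcars)

# base R reference: highest horsepower first, ties broken by mpg
top5_base <- head(mtcars[order(-mtcars$hp, mtcars$mpg), c("mpg", "hp")], 5)

# SQL version, run only if the sqldf package is available
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  top5_sql <- sqldf("SELECT row_names, mpg, hp
                     FROM mtcars
                     ORDER BY hp DESC, mpg
                     LIMIT 5", row.names = TRUE)
  print(top5_sql)
}
```

Note that SQL does not guarantee a particular order for rows that tie on every `ORDER BY` column, so results can differ from `order()` when complete ties exist.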

The official CRAN package site for sqldf is: http://cran.r-project.org/web/packages/sqldf/index.html
Google Code Page: https://code.google.com/p/sqldf/
Google Group for sqldf: https://groups.google.com/forum/#!forum/sqldf
GitHub repo: https://github.com/cran/sqldf
