The goal of quickinspectR is to make it easier (and quicker) for
beginner R programmers to graphically inspect their data. The package is
centered around an expanding collection of simple inspect
functions
that help users visualize their data.
Currently, this package supports:
inspect_normality
: graphically inspect the distribution of numeric variables in your data.inspect_balance
: graphically inspect the class balance (or lack thereof) in your data.inspect_missing
: graphically inspect missingness in your data.
You can install the development version of quickinspectR from GitHub with:
# install.packages("devtools")
devtools::install_github("andrewfullerton/quickinspectR")
By design, all quickinspectR functions require only one argument: a
data frame or a tibble (data
). Unless otherwise specified, functions
will display all the relevant variables contained in the data frame and
default to easy-to-read plot styling.
To get started, load the package.
library(quickinspectR)
If you want to see how the numeric variables in your data are
distributed (e.g. check if they are skewed), you can use
inspect_normality
.
inspect_normality(data = iris)
If you want to check for class imbalance (e.g. unequal distribution of
classes within categorical variables), you can use inspect_balance
.
Tip: if you’re only interested in a few key variables in your data,
then you can use the vars
argument to manually specify which variables
will be displayed.
inspect_balance(data = palmerpenguins::penguins, vars = c("species", "island"))
If you want to quickly see if you’re missing any data (and more
importantly: where you’re missing data), then you can use
inspect_missing
.
inspect_missing(data = palmerpenguins::penguins)
Even though all inspect
functions will run with data
as their sole
argument, there may be times when you want to stylize your plots a bit
more. To make this more accessible, each inspect
function contains
several additional (but completely optional) arguments to enable basic
plot customization. Here’s an example using inspect_missing
:
inspect_missing(data = palmerpenguins::penguins,
na_colour = "purple", # Change colour used to represent missing values
fill_colour = "darkgreen", # Change colour used to represent non-missing values
title = "Missing values in penguins dataset") # Add title
And while advanced plot customization is not the goal of quickinspectR, all its functions are built on top of ggplot2 and support more advanced customization via additional arguments. To learn more about this, you can read more in the “Advanced usage” section of this vignette.
Statement of data usage: This package and its documentation make use of
the iris
and airquality
datasets from the datasets
package as well
as the penguins
dataset from the palmerpenguins
package