How to use subset in R

Published February 22, 2026

Introduction

The subset() function in base R filters the rows of a data frame by a logical condition and, optionally, keeps only the columns you name. It returns a new data frame and leaves the original untouched, which makes it well suited to data exploration and to creating focused views of your data.
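As a rough sketch of the call shape, the example below passes both arguments by name. The variable name heavy_cars and the 5 (thousand lbs) cutoff are illustrative choices, not part of the tutorial's examples.

```r
# subset(x, subset, select): x is the data frame, subset is the row
# condition, and select names the columns to keep.
heavy_cars <- subset(mtcars, subset = wt > 5, select = c(wt, cyl))
print(heavy_cars)

# The original data frame is left untouched: still 32 rows, 11 columns.
dim(mtcars)
```

Column names such as wt and cyl are written without quotes because subset() evaluates them inside the data frame. R's own help page describes subset() as a convenience function for interactive use, so for code inside functions or packages the standard [ indexing is usually the safer choice.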

Getting Started

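# subset() is part of base R and needs no packages.
# tidyverse is loaded only for rownames_to_column() and arrange() in Example 2.
library(tidyverse)
data(mtcars)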

Example 1: Basic Usage

The Problem

We need to filter the mtcars dataset to find cars with good fuel efficiency. Let’s extract cars that get more than 20 miles per gallon.

Step 1: Filter rows with simple condition

We’ll use subset() to filter cars based on mpg values.

# Filter cars with mpg > 20
efficient_cars <- subset(mtcars, mpg > 20)
head(efficient_cars, 3)

This creates a new data frame containing only cars that exceed 20 mpg.
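One detail worth knowing is how subset() treats missing values in the condition. The sketch below uses a small made-up data frame (scores, with hypothetical columns id and score) to compare subset() with plain bracket indexing: subset() drops rows where the condition is NA, while [ keeps them as all-NA rows.

```r
# A tiny, made-up data frame with one missing score.
scores <- data.frame(id = 1:4, score = c(10, NA, 30, 5))

# subset() treats an NA condition as FALSE and drops that row.
by_subset <- subset(scores, score > 8)
print(by_subset)

# Bracket indexing keeps the NA condition and returns a row of NAs for it.
by_bracket <- scores[scores$score > 8, ]
print(by_bracket)
```

mtcars has no missing values, so the two approaches return the same rows there, but the difference matters on real-world data.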

Step 2: Add multiple conditions

Now we’ll combine multiple criteria to be more specific about our selection.

# Filter cars with mpg > 20 AND automatic transmission
# (in mtcars, am == 0 means automatic and am == 1 means manual)
efficient_auto <- subset(mtcars, mpg > 20 & am == 0)
nrow(efficient_auto)

We now have cars that are both fuel-efficient and have an automatic transmission.
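Conditions do not have to be joined with &. As a hedged illustration, the sketch below shows three other common forms: | for "either condition", %in% for matching a set of values, and ! for negation. The variable names and thresholds are arbitrary choices for this example.

```r
# Either very efficient OR very light (wt is in units of 1,000 lbs).
efficient_or_light <- subset(mtcars, mpg > 30 | wt < 2)

# Cylinder count matches any value in a set.
small_engines <- subset(mtcars, cyl %in% c(4, 6))

# Negation: everything that is NOT an automatic.
manual_cars <- subset(mtcars, !(am == 0))

# Compare how many rows each filter keeps.
c(or = nrow(efficient_or_light),
  set = nrow(small_engines),
  not = nrow(manual_cars))
```

Use the vectorized operators & and | inside subset(); the scalar forms && and || are meant for single TRUE/FALSE values in if() statements, not for row-by-row filtering.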

Step 3: Select specific columns

We can also choose which columns to include in our filtered results.

# Filter rows and select specific columns
car_basics <- subset(mtcars, mpg > 20, 
                    select = c(mpg, hp, wt))
head(car_basics)

This gives us a cleaner view with only the variables we’re interested in analyzing.
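The select argument accepts more than a c() list of names. The sketch below, with illustrative variable names, shows two shorthand forms: a range of adjacent columns written with a colon, and a negative selection that drops columns instead of keeping them.

```r
# Keep a run of adjacent columns, from mpg through hp.
first_four <- subset(mtcars, mpg > 20, select = mpg:hp)
names(first_four)

# Drop columns by negating them; every other column is kept.
without_flags <- subset(mtcars, mpg > 20, select = -c(vs, am))
names(without_flags)
```

The range form depends on column order in the data frame, so it can silently pick up different columns if the data is rearranged upstream.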

Example 2: Practical Application

The Problem

Imagine you’re a car dealer looking for vehicles to recommend to customers who want powerful yet reasonably efficient cars. You need cars with horsepower above 100 and fuel economy above 15 mpg, and you want to exclude the heaviest vehicles.

Step 1: Define the target criteria

We’ll start by filtering based on horsepower and fuel efficiency requirements.

# Find moderately powerful and efficient cars
target_cars <- subset(mtcars, hp > 100 & mpg > 15)
# cat() does not add a line break on its own, so end the message with "\n"
cat("Found", nrow(target_cars), "cars meeting basic criteria\n")

This initial filter gives us a good starting point for our search.

Step 2: Refine with weight restrictions

Now we’ll add weight considerations to avoid recommending overly heavy vehicles.

# Add weight restriction (wt is in units of 1,000 lbs, so wt < 3.5 means under 3,500 lbs)
ideal_cars <- subset(mtcars, 
                    hp > 100 & mpg > 15 & wt < 3.5,
                    select = c(mpg, hp, wt, qsec))
print(ideal_cars)

We now have cars that balance power, efficiency, and reasonable weight.
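If the cutoffs are likely to change, one option is to store them in variables first. This is a sketch under the assumption that the threshold names (min_hp, min_mpg, max_wt, all hypothetical) do not clash with any column name in the data frame; subset() looks in the data frame first and only then in the calling environment.

```r
# Hypothetical threshold variables; none of these names is a column of mtcars.
min_hp  <- 100
min_mpg <- 15
max_wt  <- 3.5   # in units of 1,000 lbs

# hp, mpg, and wt are resolved as columns; the thresholds are resolved
# from the surrounding environment.
ideal_by_threshold <- subset(mtcars,
                             hp > min_hp & mpg > min_mpg & wt < max_wt,
                             select = c(mpg, hp, wt, qsec))
print(ideal_by_threshold)
```

If a threshold variable shared a name with a column (for example, a variable called wt), the column would win and the filter would quietly compare the column to itself.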

Step 3: Create a summary view

Let’s organize our results to make them more presentable for customers.

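# Move the row names into a car_model column and sort by mpg, highest first
# (rownames_to_column() and arrange() come from the tidyverse; |> needs R 4.1 or later)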
final_recommendations <- ideal_cars |>
  rownames_to_column("car_model") |>
  arrange(desc(mpg))
print(final_recommendations)

This creates a customer-friendly list ranked by fuel efficiency.
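For readers who prefer to stay in base R, a roughly equivalent result can be built without the tidyverse. This is a sketch rather than a drop-in replacement: the name recommendations_base is illustrative, and the result is a plain data frame.

```r
# Recreate the filtered data so this block runs on its own.
ideal_cars <- subset(mtcars,
                     hp > 100 & mpg > 15 & wt < 3.5,
                     select = c(mpg, hp, wt, qsec))

# Turn the row names into a regular first column.
recommendations_base <- data.frame(car_model = rownames(ideal_cars),
                                   ideal_cars,
                                   row.names = NULL)

# Sort by mpg, highest first, then renumber the rows.
recommendations_base <-
  recommendations_base[order(recommendations_base$mpg, decreasing = TRUE), ]
rownames(recommendations_base) <- NULL
print(recommendations_base)
```

Cars with identical mpg values may appear in a different relative order than in other sorting approaches, since only mpg is used as the sort key here.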

Step 4: Validate the selection

Finally, we’ll verify our selection makes sense by checking some basic statistics.

# Check the range of values in our selection
summary(final_recommendations[, c("mpg", "hp", "wt")])

If the filter worked, the minimum mpg in the summary is above 15, the minimum hp is above 100, and the maximum wt is below 3.5. The summary also shows the range of values customers can expect.
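Reading a summary by eye works for a quick check, but the same criteria can also be asserted in code. The sketch below uses stopifnot(), which raises an error if any check fails and does nothing otherwise.

```r
# Recreate the filtered data so this block runs on its own.
ideal_cars <- subset(mtcars,
                     hp > 100 & mpg > 15 & wt < 3.5,
                     select = c(mpg, hp, wt, qsec))

# Each check must hold for every row; a failed check stops with an error.
stopifnot(
  all(ideal_cars$hp > 100),
  all(ideal_cars$mpg > 15),
  all(ideal_cars$wt < 3.5),
  nrow(ideal_cars) > 0      # guard against a filter that matched nothing
)

# range() gives the min and max of each column for a quick visual check.
sapply(ideal_cars[, c("mpg", "hp", "wt")], range)
```

The nrow() check is worth keeping: all() returns TRUE on an empty vector, so a filter that matched no rows would otherwise pass every other test.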

Summary

  • subset() provides an intuitive way to filter data frames using logical conditions with simple syntax
  • You can combine multiple conditions using logical operators like & (and) and | (or) for complex filtering
  • The select argument lets you keep only the columns you need in the same call that filters the rows
  • subset() returns a new data frame and leaves the original data unchanged
  • Always verify your subset results to ensure the filtering logic worked as expected