How to Use keep() and discard() in R

Filter list elements with purrr’s keep() and discard(). Learn to select or remove elements based on predicate functions.

Published

April 3, 2026

Introduction

The keep() and discard() functions filter the elements of a list using a predicate, a function that returns a single TRUE or FALSE for each element. They do for list elements what dplyr's filter() does for data frame rows.

keep() - keeps elements where the predicate is TRUE
discard() - removes elements where the predicate is TRUE (keeps FALSE)
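
To make the mirror-image relationship concrete, here is a minimal sketch that splits one list with the same predicate. The list `scores` and the helper `is_passing` are made-up names for illustration only.

```r
library(purrr)

# Hypothetical data: a list of test scores
scores <- list(55, 82, 47, 91, 68)

# Hypothetical predicate: TRUE when a score is 60 or higher
is_passing <- \(x) x >= 60

passed <- keep(scores, is_passing)     # elements where the predicate is TRUE
failed <- discard(scores, is_passing)  # elements where the predicate is FALSE

# Together the two results account for every element of the original list
length(passed) + length(failed) == length(scores)  # TRUE
```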

Getting Started

library(tidyverse)
library(palmerpenguins)

keep(): Select Elements

Basic usage

Keep elements that match a condition:

# Keep only even numbers
numbers <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
keep(numbers, \(x) x %% 2 == 0)

Keep by type

# Mixed list
mixed <- list(
  a = 1:3,
  b = "hello",
  c = 4:6,
  d = "world"
)

# Keep only numeric elements
keep(mixed, is.numeric)

Keep by length

# Lists of different lengths
data <- list(
  short = 1:2,
  medium = 1:5,
  long = 1:10,
  tiny = 1
)

# Keep elements with more than 3 items
keep(data, \(x) length(x) > 3)

discard(): Remove Elements

Remove elements matching a condition

numbers <- list(1, 2, 3, 4, 5, NA, 6, NA, 7)

# Remove NA values
discard(numbers, is.na)

Remove empty elements

data <- list(
  a = 1:3,
  b = character(0),  # empty
  c = 4:6,
  d = NULL,          # NULL
  e = integer(0)     # empty
)

# Discard empty vectors
discard(data, \(x) length(x) == 0)

# Or use is_empty helper
discard(data, is_empty)

Practical Example: Data Frame Columns

Keep numeric columns

penguins |>
  keep(is.numeric)

Discard columns with missing values

# Remove columns that have ANY missing values
penguins |>
  discard(\(x) any(is.na(x)))

Keep columns above a threshold

# Keep numeric columns with mean > 100
penguins |>
  keep(is.numeric) |>
  keep(\(x) mean(x, na.rm = TRUE) > 100)
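
These calls work because a data frame is a list of columns, so the predicate receives one whole column at a time. The sketch below shows the same idea with the built-in `iris` data set (chosen here only so the example runs without extra packages).

```r
library(purrr)

# iris ships with R: four numeric columns and one factor column
numeric_cols <- keep(iris, is.numeric)
factor_cols  <- keep(iris, is.factor)

names(numeric_cols)  # "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
names(factor_cols)   # "Species"

# Only columns are filtered; the rows are untouched
nrow(numeric_cols) == nrow(iris)  # TRUE
```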

Working with Nested Lists

Filter nested data

# Nested list structure
nested_data <- list(
  group1 = list(n = 100, values = 1:100),
  group2 = list(n = 5, values = 1:5),
  group3 = list(n = 50, values = 1:50)
)

# Keep groups with n > 20
keep(nested_data, \(x) x$n > 20)

Filter model results

# Fit multiple models
models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars),
  m3 = lm(mpg ~ cyl, data = mtcars)
)

# Keep models with R-squared > 0.8
keep(models, \(m) summary(m)$r.squared > 0.8)
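
Before filtering, it can help to look at the statistic the predicate relies on. The following sketch, which reuses the same three models, pulls out each R-squared with map_dbl() and then checks which model survives the filter; the variable names `r_squared` and `best` are illustrative.

```r
library(purrr)

models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars),
  m3 = lm(mpg ~ cyl, data = mtcars)
)

# Inspect R-squared for every model as a named numeric vector
r_squared <- map_dbl(models, \(m) summary(m)$r.squared)
round(r_squared, 2)

# Apply the same threshold as a predicate and see which names remain
best <- keep(models, \(m) summary(m)$r.squared > 0.8)
names(best)  # "m2"
```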

compact(): Remove NULL and Empty Elements

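A shortcut for a common operation, dropping elements that are NULL or have length zero:

# List with NULLs
data <- list(a = 1, b = NULL, c = 3, d = NULL, e = 5)

# Remove NULLs with compact()
compact(data)

# For this list, the same result as:
discard(data, is.null)
# In general, compact() also drops zero-length elements such as character(0)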
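
One detail worth checking: compact() drops zero-length vectors as well as NULLs, while discard() with is.null removes only the NULLs. A small sketch, using an invented list `mixed_empty`, shows where the two diverge.

```r
library(purrr)

# Hypothetical list mixing NULLs with zero-length vectors
mixed_empty <- list(a = 1, b = NULL, c = character(0), d = "x", e = integer(0))

compacted <- compact(mixed_empty)            # drops NULL and length-0 elements
nulls_out <- discard(mixed_empty, is.null)   # drops NULL only

names(compacted)  # "a" "d"
names(nulls_out)  # "a" "c" "d" "e"
```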

Combining with Other purrr Functions

Filter then transform

# Keep numeric, then calculate stats
penguins |>
  keep(is.numeric) |>
  map(\(x) list(
    mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE)
  ))

Conditional processing pipeline

# Process only valid elements
results <- list(
  success = list(status = "ok", value = 42),
  failure = list(status = "error", value = NULL),
  success2 = list(status = "ok", value = 100)
)

# Keep successful, extract values
results |>
  keep(\(x) x$status == "ok") |>
  map_dbl(\(x) x$value)

detect() and detect_index(): Find First Match

Find the first element matching a condition:

x <- c(3, 5, 8, 2, 9, 1)

# Find first element > 6
detect(x, \(i) i > 6)        # 8

# Find its position
detect_index(x, \(i) i > 6)  # 3

Find from the right

# Find last element > 6
detect(x, \(i) i > 6, .dir = "backward")  # 9

detect_index(x, \(i) i > 6, .dir = "backward")  # 5

Practical use: find first valid result

# Find first non-empty result
# (NULL has length 0, so one length check covers both NULL and empty vectors)
results <- list(NULL, character(0), "found it!", "also valid")

detect(results, \(x) length(x) > 0)
# "found it!"
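
It is also worth knowing what happens when nothing matches. The sketch below reuses the vector `x` from above with a threshold no element meets; the `.default` argument of detect() supplies a fallback value.

```r
library(purrr)

x <- c(3, 5, 8, 2, 9, 1)

# No element exceeds 100, so detect() falls back to its default, NULL
no_match <- detect(x, \(i) i > 100)

# detect_index() signals "not found" with position 0
no_index <- detect_index(x, \(i) i > 100)

# .default supplies a different fallback value
fallback <- detect(x, \(i) i > 100, .default = NA)

is.null(no_match)  # TRUE
no_index           # 0
fallback           # NA
```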

head_while() and tail_while()

Keep elements from start/end while condition is TRUE:

x <- c(1, 2, 3, 10, 11, 12, 4, 5)

# Keep from start while < 10
head_while(x, \(i) i < 10)   # 1, 2, 3

# Keep from end while < 10
tail_while(x, \(i) i < 10)   # 4, 5

Useful for sorted data

# Dates in order
dates <- as.Date(c("2024-01-01", "2024-01-15", "2024-02-01", "2024-03-01"))

# Get dates before February
head_while(dates, \(d) d < as.Date("2024-02-01"))
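
The difference from keep() is easy to miss: head_while() stops at the first element that fails the predicate, whereas keep() tests every element. A short comparison on the same vector as above:

```r
library(purrr)

x <- c(1, 2, 3, 10, 11, 12, 4, 5)

leading <- head_while(x, \(i) i < 10)  # stops at the first value that fails
all_low <- keep(x, \(i) i < 10)        # tests every element

leading  # 1 2 3
all_low  # 1 2 3 4 5
```

When the input is sorted on the quantity being tested, the two give the same answer; otherwise head_while() returns only the leading run.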

Base R Comparison

# Base R Filter()
Filter(is.numeric, penguins)

# purrr keep() - equivalent
keep(penguins, is.numeric)

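# purrr also accepts formula shorthand for the predicate
# (check is.numeric() first: mean() on a factor column returns NA with a warning)
keep(penguins, ~ is.numeric(.x) && mean(.x, na.rm = TRUE) > 100)

# Base R equivalent with an anonymous function
Filter(\(x) is.numeric(x) && mean(x, na.rm = TRUE) > 100, penguins)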
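
As a quick check of the equivalence on a plain list, the sketch below compares the two approaches directly. The list `items` is made up for the example, and since base R has no direct counterpart to discard(), Negate() is used to flip the predicate.

```r
library(purrr)

# Hypothetical named list with numeric and character elements
items <- list(a = 1:3, b = "hello", c = 4:6, d = "world")

base_kept  <- Filter(is.numeric, items)
purrr_kept <- keep(items, is.numeric)
identical(base_kept, purrr_kept)  # TRUE

# Negate() inverts the predicate so Filter() can mimic discard()
base_dropped  <- Filter(Negate(is.numeric), items)
purrr_dropped <- discard(items, is.numeric)
identical(base_dropped, purrr_dropped)  # TRUE
```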

keep_at() and discard_at(): By Position or Name

Filter by position or name instead of predicate:

data <- list(a = 1, b = 2, c = 3, d = 4, e = 5)

# Keep by name
keep_at(data, c("a", "c", "e"))

# Discard by position
discard_at(data, c(2, 4))

# Keep by name pattern (pass a function that receives the vector of names)
keep_at(data, \(nm) startsWith(nm, "a"))

Common Mistakes

1. Confusing keep/discard with filter

# filter() is for data frames
penguins |> filter(species == "Adelie")

# keep() is for lists
list(1, 2, 3, 4) |> keep(\(x) x > 2)

2. Predicate must return single TRUE/FALSE

# This doesn't work - returns vector
# keep(list(1:3, 4:6), \(x) x > 2)

# This works - returns single logical
keep(list(1:3, 4:6), \(x) all(x > 2))
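
Which reducer to use depends on the question being asked. The sketch below contrasts all() and any() on an invented list `ranges`; both collapse the element-wise comparison to a single logical.

```r
library(purrr)

# Hypothetical list of integer ranges
ranges <- list(1:3, 4:6, 2:4)

# all(): every value in the element must exceed 2
every_above <- keep(ranges, \(x) all(x > 2))

# any(): at least one value in the element must exceed 2
some_above <- keep(ranges, \(x) any(x > 2))

length(every_above)  # 1 (only 4:6)
length(some_above)   # 3
```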

3. Not handling NA in predicates

data <- list(a = 1, b = NA, c = 3)

# NA > 2 is NA, not TRUE or FALSE, so this either drops b silently
# or raises an error, depending on the purrr version
# keep(data, \(x) x > 2)

# Handle NAs explicitly
keep(data, \(x) !is.na(x) && x > 2)
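
An alternative, offered here as one possible pattern rather than the only fix, is to wrap the comparison in base R's isTRUE(), which returns FALSE for anything that is not a single TRUE, including NA.

```r
library(purrr)

data <- list(a = 1, b = NA, c = 3)

# isTRUE(NA) is FALSE, so the NA element is treated as a non-match
above_two <- keep(data, \(x) isTRUE(x > 2))

names(above_two)  # "c"
```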

Summary

Function	Keeps Elements Where	Use Case
`keep()`	predicate is TRUE	Select matching elements
`discard()`	predicate is FALSE	Remove matching elements
`compact()`	element is not NULL or empty	Remove NULLs and empty elements
`keep_at()`	name/position matches	Select by name/position
`discard_at()`	name/position doesn’t match	Remove by name/position

keep() and discard() are opposites
Use compact() as a shortcut to remove NULLs and empty elements
Predicates must return a single TRUE or FALSE
Chain with map() for filter-then-transform workflows
