How to Use keep() and discard() in R

Filter list elements with purrr’s keep() and discard(). Learn to select or remove elements based on predicate functions.
Published: April 3, 2026

Introduction

The keep() and discard() functions filter list elements based on a predicate (a function that returns a single TRUE or FALSE for each element). They’re purrr’s counterpart to dplyr’s filter(), applied to list elements instead of data frame rows.

  • keep() - keeps elements where the predicate is TRUE
  • discard() - removes elements where the predicate is TRUE (keeps FALSE)
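
As a quick illustration of how the two functions mirror each other, the sketch below splits one list with a single predicate. The names nums and is_even are made up for this example.

```r
library(purrr)

nums <- list(1, 2, 3, 4, 5, 6)

# A predicate: takes one element, returns a single TRUE or FALSE
is_even <- \(x) x %% 2 == 0

evens <- keep(nums, is_even)     # elements where is_even() is TRUE
odds  <- discard(nums, is_even)  # elements where is_even() is FALSE

# Every element lands in exactly one of the two results
length(evens) + length(odds) == length(nums)
```

Put differently, discard(nums, is_even) should give the same result as keep(nums, negate(is_even)).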

Getting Started

library(tidyverse)
library(palmerpenguins)

keep(): Select Elements

Basic usage

Keep elements that match a condition:

# Keep only even numbers
numbers <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
keep(numbers, \(x) x %% 2 == 0)

Keep by type

# Mixed list
mixed <- list(
  a = 1:3,
  b = "hello",
  c = 4:6,
  d = "world"
)

# Keep only numeric elements
keep(mixed, is.numeric)

Keep by length

# Lists of different lengths
data <- list(
  short = 1:2,
  medium = 1:5,
  long = 1:10,
  tiny = 1
)

# Keep elements with more than 3 items
keep(data, \(x) length(x) > 3)

discard(): Remove Elements

Remove elements matching a condition

numbers <- list(1, 2, 3, 4, 5, NA, 6, NA, 7)

# Remove NA values
discard(numbers, is.na)

Remove empty elements

data <- list(
  a = 1:3,
  b = character(0),  # empty
  c = 4:6,
  d = NULL,          # NULL
  e = integer(0)     # empty
)

# Discard empty vectors
discard(data, \(x) length(x) == 0)

# Or use is_empty helper
discard(data, is_empty)

Practical Example: Data Frame Columns

Keep numeric columns

penguins |>
  keep(is.numeric)

Discard columns with missing values

# Remove columns that have ANY missing values
penguins |>
  discard(\(x) any(is.na(x)))
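
To see which columns such a call would drop before dropping them, it can help to count the missing values per column first. The sketch below uses the built-in airquality dataset as a stand-in for penguins (an assumption made so the example runs without extra packages); na_counts and complete_cols are illustrative names.

```r
library(purrr)

# airquality ships with base R; its Ozone and Solar.R columns contain NAs
na_counts <- map_int(airquality, \(x) sum(is.na(x)))
na_counts

# Drop every column that has at least one missing value
complete_cols <- discard(airquality, \(x) anyNA(x))
names(complete_cols)
```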

Keep columns above a threshold

# Keep numeric columns with mean > 100
penguins |>
  keep(is.numeric) |>
  keep(\(x) mean(x, na.rm = TRUE) > 100)
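
The two chained keep() calls can also be folded into one predicate, as long as the type check comes first so that `&&` short-circuits before mean() ever sees a non-numeric column. This is a sketch on the built-in iris dataset (a stand-in chosen for portability, with a threshold of 3.5 picked only to suit its scale).

```r
library(purrr)

# iris has four numeric columns and one factor column (Species)
big_means <- keep(
  iris,
  \(x) is.numeric(x) && mean(x, na.rm = TRUE) > 3.5
)

names(big_means)
```

Reversing the order of the two conditions would call mean() on the factor column, which returns NA with a warning and no longer gives the predicate a clean TRUE or FALSE.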

Working with Nested Lists

Filter nested data

# Nested list structure
nested_data <- list(
  group1 = list(n = 100, values = 1:100),
  group2 = list(n = 5, values = 1:5),
  group3 = list(n = 50, values = 1:50)
)

# Keep groups with n > 20
keep(nested_data, \(x) x$n > 20)

Filter model results

# Fit multiple models
models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars),
  m3 = lm(mpg ~ cyl, data = mtcars)
)

# Keep models with R-squared > 0.8
keep(models, \(m) summary(m)$r.squared > 0.8)
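
When a threshold filter like this returns fewer models than expected, inspecting the statistic for every model is a useful first step. A minimal sketch, where r2 and good are illustrative names:

```r
library(purrr)

models <- list(
  m1 = lm(mpg ~ wt, data = mtcars),
  m2 = lm(mpg ~ wt + hp, data = mtcars),
  m3 = lm(mpg ~ cyl, data = mtcars)
)

# Named numeric vector with one R-squared value per model
r2 <- map_dbl(models, \(m) summary(m)$r.squared)
round(r2, 3)

# Only models above the threshold survive; names are preserved
good <- keep(models, \(m) summary(m)$r.squared > 0.8)
names(good)
```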

compact(): Remove NULL and Empty Elements

A shortcut for a common operation, dropping NULL and zero-length elements:

# List with NULLs
data <- list(a = 1, b = NULL, c = 3, d = NULL, e = 5)

# Remove NULLs with compact()
compact(data)

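# For this list, equivalent to the call below. In general compact() also
# drops zero-length vectors such as character(0), which is.null() does not match.
discard(data, is.null)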
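
The difference between compact() and discard(x, is.null) only shows up once a list contains zero-length vectors. A small sketch, with mixed_empty as a made-up name:

```r
library(purrr)

mixed_empty <- list(a = 1, b = NULL, c = character(0), d = 3)

# compact() drops both the NULL and the zero-length character vector
names(compact(mixed_empty))

# discard(is.null) drops only the NULL; the empty vector "c" stays
names(discard(mixed_empty, is.null))
```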

Combining with Other purrr Functions

Filter then transform

# Keep numeric, then calculate stats
penguins |>
  keep(is.numeric) |>
  map(\(x) list(
    mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE)
  ))

Conditional processing pipeline

# Process only valid elements
results <- list(
  success = list(status = "ok", value = 42),
  failure = list(status = "error", value = NULL),
  success2 = list(status = "ok", value = 100)
)

# Keep successful, extract values
results |>
  keep(\(x) x$status == "ok") |>
  map_dbl(\(x) x$value)

detect() and detect_index(): Find First Match

Find the first element matching a condition:

x <- c(3, 5, 8, 2, 9, 1)

# Find first element > 6
detect(x, \(i) i > 6)        # 8

# Find its position
detect_index(x, \(i) i > 6)  # 3

Find from the right

# Find last element > 6
detect(x, \(i) i > 6, .dir = "backward")  # 9

detect_index(x, \(i) i > 6, .dir = "backward")  # 5

Practical use: find first valid result

# Find first non-empty result
results <- list(NULL, character(0), "found it!", "also valid")

# length(NULL) is 0, so one length check covers both NULL and empty vectors
detect(results, \(x) length(x) > 0)
# "found it!"
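
It is also worth knowing what happens when nothing matches. The sketch below shows the no-match results and the .default argument of detect(); the variable names are illustrative.

```r
library(purrr)

x <- c(3, 5, 8, 2, 9, 1)

# No element exceeds 100, so detect() falls back to its default, NULL
no_match <- detect(x, \(i) i > 100)

# detect_index() signals "not found" with 0
no_index <- detect_index(x, \(i) i > 100)

# .default supplies an explicit fallback value instead of NULL
fallback <- detect(x, \(i) i > 100, .default = NA_real_)
```

Checking for NULL or 0 before using the result avoids confusing errors further down a pipeline.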

head_while() and tail_while()

Keep elements from the start (head_while()) or the end (tail_while()) for as long as the condition is TRUE, stopping at the first element that fails:

x <- c(1, 2, 3, 10, 11, 12, 4, 5)

# Keep from start while < 10
head_while(x, \(i) i < 10)   # 1, 2, 3

# Keep from end while < 10
tail_while(x, \(i) i < 10)   # 4, 5
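
To make the contrast with keep() concrete, the sketch below applies the same predicate both ways to the vector from the example above. from_start and everywhere are illustrative names.

```r
library(purrr)

x <- c(1, 2, 3, 10, 11, 12, 4, 5)

# head_while() stops at the first element that fails the predicate
from_start <- head_while(x, \(i) i < 10)

# keep() tests every element, so the trailing 4 and 5 are included too
everywhere <- keep(x, \(i) i < 10)
```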

Useful for sorted data

# Dates in order
dates <- as.Date(c("2024-01-01", "2024-01-15", "2024-02-01", "2024-03-01"))

# Get dates before February
head_while(dates, \(d) d < as.Date("2024-02-01"))

Base R Comparison

# Base R Filter()
Filter(is.numeric, penguins)

# purrr keep() - equivalent
keep(penguins, is.numeric)

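# purrr also accepts formula shorthand. Guard with is.numeric() because
# mean() returns NA (with a warning) for factor columns such as species.
keep(penguins, ~ is.numeric(.x) && mean(.x, na.rm = TRUE) > 100)

# Base R equivalent with an anonymous function
Filter(\(x) is.numeric(x) && mean(x, na.rm = TRUE) > 100, penguins)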

keep_at() and discard_at(): By Position or Name

Filter by position or name instead of by predicate (available in purrr 1.0.0 and later):

data <- list(a = 1, b = 2, c = 3, d = 4, e = 5)

# Keep by name
keep_at(data, c("a", "c", "e"))

# Discard by position
discard_at(data, c(2, 4))

# Keep by name pattern: pass a function that receives the vector of names
keep_at(data, \(nm) startsWith(nm, "a"))
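
A function of the names can also be used to split a list into two groups by prefix. This is a sketch under the assumption of a hypothetical settings list; settings, is_db, db, and rest are names invented for the example.

```r
library(purrr)

# Hypothetical configuration list, used only for illustration
settings <- list(
  db_host = "localhost",
  db_port = 5432,
  api_key = "secret",
  api_url = "https://example.com"
)

# Receives the full vector of names, returns one logical per name
is_db <- \(nm) startsWith(nm, "db_")

db   <- keep_at(settings, is_db)     # elements whose names start with "db_"
rest <- discard_at(settings, is_db)  # everything else
```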

Common Mistakes

1. Confusing keep/discard with filter

# filter() selects rows of a data frame
penguins |> filter(species == "Adelie")

# keep() selects elements of a list (for a data frame, its columns)
list(1, 2, 3, 4) |> keep(\(x) x > 2)

2. Predicate must return single TRUE/FALSE

# This fails: x > 2 returns a length-3 logical vector, not a single value
# keep(list(1:3, 4:6), \(x) x > 2)

# This works - returns single logical
keep(list(1:3, 4:6), \(x) all(x > 2))
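
Which reducer to wrap around the comparison depends on the question being asked. The sketch below contrasts all() and any() on the same input; chunks, all_big, and any_big are illustrative names.

```r
library(purrr)

chunks <- list(1:3, 4:6)

# all(): keep a chunk only if every value exceeds 2
all_big <- keep(chunks, \(x) all(x > 2))

# any(): keep a chunk if at least one value exceeds 2
any_big <- keep(chunks, \(x) any(x > 2))
```

Here all_big holds only 4:6, while any_big holds both chunks because 1:3 contains a 3.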

3. Not handling NA in predicates

data <- list(a = 1, b = NA, c = 3)

# For element b, x > 2 evaluates to NA, which is neither TRUE nor FALSE.
# Depending on the purrr version, this either errors or drops the element.
# keep(data, \(x) x > 2)

# Handle NAs explicitly
keep(data, \(x) !is.na(x) && x > 2)
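
Another way to make a predicate NA-safe is to wrap the comparison in base R's isTRUE(), which returns FALSE for NA. A minimal sketch, reusing the same list under the illustrative name na_data:

```r
library(purrr)

na_data <- list(a = 1, b = NA, c = 3)

# isTRUE(NA) is FALSE, so the NA element is dropped rather than causing trouble
kept <- keep(na_data, \(x) isTRUE(x > 2))
```

This variant treats a missing value as "does not match", which suits keep() but may not be the right reading for every dataset.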

Summary

Function      Keeps Elements Where                Use Case
keep()        predicate is TRUE                   Select matching elements
discard()     predicate is FALSE                  Remove matching elements
compact()     element is not NULL or zero-length  Remove NULLs and empty elements
keep_at()     name/position matches               Select by name/position
discard_at()  name/position doesn’t match         Remove by name/position

  • keep() and discard() are opposites
  • Use compact() as a shortcut to remove NULLs and zero-length elements
  • Predicates must return a single TRUE or FALSE
  • Chain with map() for filter-then-transform workflows