How to use lapply in R

base-r

lapply

Master lapply in R programming with clear examples. Complete tutorial covering syntax, use cases, and best practices.

Published

February 21, 2026

Introduction

The lapply() function applies a function to each element of a list or vector and returns the results as a list. It is one of base R's standard tools for replacing an explicit loop with a single, readable call. Use lapply() when you need to perform the same operation on multiple elements and want the output to always be a list, with one result per input element.
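To illustrate that the input can be a plain vector while the output is still a list, here is a minimal sketch. The vector `x` and the squaring function are made up for this illustration and are not part of the examples that follow.

```r
# Hypothetical input: an ordinary numeric vector, not a list
x <- c(1, 2, 3)

# lapply() calls the function once per element of x
squares <- lapply(x, function(v) v^2)

is.list(squares)   # TRUE: the result is a list even though x is a vector
length(squares)    # 3: one result per input element
squares[[2]]       # 4
```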

Getting Started

library(tidyverse)  # only dplyr's filter() is needed below
data(mtcars)        # built-in dataset
data(penguins, package = "palmerpenguins")  # requires the palmerpenguins package

Example 1: Basic Usage

The Problem

We want to calculate summary statistics for several numeric vectors, such as the columns of the mtcars dataset, without writing separate code for each one. A data frame is itself a list of columns, so the pattern shown here on a small hand-made list carries over directly.

Step 1: Create a simple list

First, let’s create a basic list to understand how lapply() works.

# Create a simple numeric list
numbers <- list(
  group1 = c(1, 2, 3, 4, 5),
  group2 = c(10, 20, 30),
  group3 = c(100, 200, 300, 400)
)

This creates a list with three named elements, each a numeric vector of a different length.

Step 2: Apply a function to each element

Now we’ll use lapply() to calculate the mean of each list element.

# Calculate mean for each group
result <- lapply(numbers, mean)
print(result)

The lapply() function applies the mean() function to each element and returns a list containing the three calculated means.
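For reference, the sketch below repeats the call on the same `numbers` list and shows how to pull out individual results. Using `unlist()` to flatten the list is an optional extra, not something the steps above require.

```r
numbers <- list(
  group1 = c(1, 2, 3, 4, 5),
  group2 = c(10, 20, 30),
  group3 = c(100, 200, 300, 400)
)

result <- lapply(numbers, mean)

# Each element keeps the name of the input element it came from
result$group1   # 3
result$group2   # 20
result$group3   # 250

# Optional: flatten the list into a named numeric vector
means_vec <- unlist(result)
```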

Step 3: Apply built-in functions

Let’s apply different summary functions to see the versatility of lapply().

# Apply multiple functions
sum_result <- lapply(numbers, sum)
length_result <- lapply(numbers, length)
max_result <- lapply(numbers, max)

Each lapply() call returns a list with the same length and names as our input, but containing the calculated values instead of the original data.
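The same pattern applies to the mtcars columns mentioned in the problem statement, because a data frame is a list of columns. The sketch below picks three columns (`mpg`, `hp`, `wt`) purely as an illustration; any numeric columns would work.

```r
# mtcars ships with base R (datasets package), so no extra packages are needed
cols <- mtcars[c("mpg", "hp", "wt")]

# lapply() treats the data frame as a list and visits one column at a time
col_means <- lapply(cols, mean)
col_ranges <- lapply(cols, range)

names(col_means)   # "mpg" "hp" "wt"
```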

Example 2: Practical Application

The Problem

We want to analyze the penguins dataset by calculating summary statistics for bill length, grouped by species. This is a common data analysis task: apply the same functions to each subset of the data.

Step 1: Prepare the data

First, we’ll split the penguins data by species to create separate datasets.

# Split penguins by species
penguins_clean <- penguins |> 
  filter(!is.na(bill_length_mm))

penguin_groups <- split(penguins_clean, penguins_clean$species)

This creates a list where each element contains all the data for one penguin species.
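To see what split() produces without needing the penguins data, here is a small sketch on a made-up data frame. The name `toy` and its columns are hypothetical and chosen only for this illustration.

```r
# Hypothetical stand-in for the penguins data
toy <- data.frame(
  species = c("A", "B", "A", "B", "C"),
  bill = c(10, 20, 30, 40, 50)
)

# split() returns a named list with one data frame per group
groups <- split(toy, toy$species)

names(groups)     # "A" "B" "C"
nrow(groups$A)    # 2
groups$C$bill     # 50
```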

Step 2: Extract numeric columns

We’ll create a function to extract bill length measurements from each species group.

# Extract bill lengths for each species
get_bill_length <- function(df) {
  return(df$bill_length_mm)
}

bill_lengths <- lapply(penguin_groups, get_bill_length)

Now we have a list containing bill length vectors for each species, ready for further analysis.
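A named helper is not strictly required for a one-line extraction. The sketch below shows two equivalent alternatives, an anonymous function and passing the `[[` operator directly, on a made-up list of data frames (`groups` and its `bill` column are hypothetical).

```r
# Hypothetical list of data frames, one per group
groups <- list(
  A = data.frame(bill = c(10, 30)),
  B = data.frame(bill = c(20, 40))
)

# 1. Named helper function, as in the step above
get_bill <- function(df) df$bill
v1 <- lapply(groups, get_bill)

# 2. Anonymous function defined inline
v2 <- lapply(groups, function(df) df$bill)

# 3. The extraction operator itself; "bill" is passed on as its second argument
v3 <- lapply(groups, `[[`, "bill")

identical(v1, v2) && identical(v2, v3)   # TRUE
```

Which form to use is mostly a readability choice; a named function tends to pay off once the body grows beyond a single expression.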

Step 3: Calculate statistics

Let’s apply multiple statistical functions to analyze the bill length data.

# Calculate various statistics
mean_bills <- lapply(bill_lengths, mean, na.rm = TRUE)
median_bills <- lapply(bill_lengths, median, na.rm = TRUE)
sd_bills <- lapply(bill_lengths, sd, na.rm = TRUE)

Each call returns a list with one value per species. Arguments placed after the function name, such as na.rm = TRUE, are passed on to that function for every element.
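The effect of the extra `na.rm = TRUE` argument is easiest to see on data that actually contains a missing value. The list `vals` below is made up for this illustration.

```r
# Hypothetical list where one element contains a missing value
vals <- list(a = c(1, NA, 3), b = c(4, 5, 6))

# Without na.rm, mean() returns NA for any element containing NA
with_na <- lapply(vals, mean)

# Extra arguments after the function are forwarded to every mean() call
without_na <- lapply(vals, mean, na.rm = TRUE)

with_na$a      # NA
without_na$a   # 2
without_na$b   # 5
```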

Step 4: Create custom analysis function

We can write a custom function to get comprehensive statistics in one step.

# Custom summary function
bill_summary <- function(x) {
  c(mean = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE))
}

comprehensive_stats <- lapply(bill_lengths, bill_summary)

This approach returns a list where each element is a named numeric vector holding the mean, median, and standard deviation for one species.
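A list of equally shaped named vectors can be stacked into a table for easier reading. The sketch below uses `do.call(rbind, ...)` for this, which is one common base R option rather than the only one; the input vectors are made up so the block runs without the penguins data.

```r
# Hypothetical stand-in for the per-species bill length vectors
bill_lengths <- list(
  A = c(10, 20, 30),
  B = c(40, 50, 60, 70)
)

bill_summary <- function(x) {
  c(mean = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE))
}

stats <- lapply(bill_lengths, bill_summary)

# Stack the named vectors row-wise: one row per group, one column per statistic
stats_table <- do.call(rbind, stats)

dim(stats_table)            # 2 3
stats_table["A", "mean"]    # 20
```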

Summary

lapply() applies functions to list or vector elements and always returns a list
It is more concise than writing an explicit loop and produces cleaner, more readable code
You can use built-in functions like mean(), sum(), or create custom functions for complex operations
The function is particularly useful for grouped analysis and repetitive calculations across datasets
Remember to handle missing values with parameters like na.rm = TRUE when working with real data
