How to get the last row of each group in R

dplyr

dplyr last

Learn get the last row of each group in r with clear examples and explanations.

Published

March 26, 2026

Introduction

The dplyr package provides powerful functions for extracting specific rows and performing grouped operations on data frames. These functions help you access the first, last, or nth observations in your data, calculate proportions within groups, and handle consecutive runs of data. These techniques are essential for data exploration, summary statistics, and data cleaning tasks.

Setup

Let’s start by loading the necessary packages and preparing our data:

library(tidyverse)
library(palmerpenguins)

We’ll clean the penguins dataset by removing any rows with missing values:

penguins <- penguins |>
  drop_na()

This gives us a complete dataset with 333 penguin observations to work with.

Extracting Specific Rows

Getting the Last Row

The last() function extracts the final row from your dataset:

penguins |>
  last()

This returns the complete last observation, showing all variables for the final penguin in the dataset.

Getting the First Row

Similarly, first() extracts the initial row:

penguins |>
  first()

You’ll see the first penguin observation with all its measurements and characteristics.

Getting the Nth Row

Use nth() to extract any specific row by position:

penguins |>
  nth(n = 10)

This retrieves the 10th row in the dataset, useful when you need a specific observation by its position.

Using Position Functions in Summaries

Summarizing with Last Values

You can use position functions within summarize() to extract specific values:

penguins |>
  summarize(last_species = last(species))

This creates a summary showing only the species of the last penguin in the dataset.

Summarizing with First Values

The same approach works for first values:

penguins |>
  summarize(first_species = first(species))

This tells you which species appears first in your ordered dataset.

Slicing Rows by Groups

Getting First Row per Group

Combine group_by() with slice_head() to get the first observation from each group:

penguins |>
  group_by(species) |>
  slice_head(n = 1)

This returns one penguin from each species - specifically the first one that appears for each species in the data.

Getting Single First Row

Without grouping, slice_head() returns the first row(s) from the entire dataset:

penguins |>
  slice_head(n = 1)

This gives you just the very first penguin observation.

Getting Last Row per Group

Use slice_tail() with grouping to get the final observation from each group:

penguins |>
  group_by(species) |>
  slice_tail(n = 1)

This returns the last penguin observation for each of the three species.

Calculating Proportions

Basic Proportions

Calculate what proportion each group represents of the total:

penguins |>
  count(species) |>
  mutate(prop = n / sum(n))

This shows both the count and proportion of each penguin species in the dataset.

Alternative Proportion Calculation

You can also use prop.table() for the same result:

penguins |>
  count(species) |>
  mutate(freq = prop.table(n))

Both approaches give you the relative frequency of each species as decimals that sum to 1.

Proportions with Multiple Variables

Calculate proportions across combinations of variables:

penguins |>
  count(species, sex) |>
  mutate(prop = n / sum(n))

This shows what proportion each species-sex combination represents of all penguins.

Working with Consecutive Values

Identifying Consecutive Groups

The consecutive_id() function helps identify runs of consecutive identical values:

df <- data.frame(x = c(0, 0, 1, 0), y = c(2, 2, 2, 2))
df

Let’s create groups based on consecutive runs of the same x and y values:

df |>
  group_by(id = consecutive_id(x, y), x, y)

This assigns a unique ID to each consecutive run of identical x,y combinations.

Summarizing Consecutive Groups

Count how many observations are in each consecutive group:

df |>
  group_by(id = consecutive_id(x, y), x, y) |>
  summarise(n = n())

This reveals that we have three distinct consecutive groups: two single observations and one pair of consecutive identical values.

Summary

These dplyr functions provide flexible ways to extract specific rows, work with grouped data, and calculate proportions. The first(), last(), and nth() functions are perfect for accessing specific observations, while slice_head() and slice_tail() work excellently with grouped data. Proportion calculations help you understand the relative frequency of different categories, and consecutive_id() is invaluable for identifying patterns in sequential data. Master these functions to make your data exploration and summarization tasks more efficient and insightful.

--- title: "How to get the last row of each group in R" description: "Learn get the last row of each group in r with clear examples and explanations." date: 2026-03-26 categories: ['dplyr', 'dplyr last'] format: html: code-fold: false code-tools: true --- ## Introduction The dplyr package provides powerful functions for extracting specific rows and performing grouped operations on data frames. These functions help you access the first, last, or nth observations in your data, calculate proportions within groups, and handle consecutive runs of data. These techniques are essential for data exploration, summary statistics, and data cleaning tasks. ## Setup Let's start by loading the necessary packages and preparing our data: ```r library(tidyverse) library(palmerpenguins) ``` We'll clean the penguins dataset by removing any rows with missing values: ```r penguins <- penguins |> drop_na() ``` This gives us a complete dataset with 333 penguin observations to work with. ## Extracting Specific Rows ### Getting the Last Row The `last()` function extracts the final row from your dataset: ```r penguins |> last() ``` This returns the complete last observation, showing all variables for the final penguin in the dataset. ### Getting the First Row Similarly, `first()` extracts the initial row: ```r penguins |> first() ``` You'll see the first penguin observation with all its measurements and characteristics. ### Getting the Nth Row Use `nth()` to extract any specific row by position: ```r penguins |> nth(n = 10) ``` This retrieves the 10th row in the dataset, useful when you need a specific observation by its position. ## Using Position Functions in Summaries ### Summarizing with Last Values You can use position functions within `summarize()` to extract specific values: ```r penguins |> summarize(last_species = last(species)) ``` This creates a summary showing only the species of the last penguin in the dataset. ### Summarizing with First Values The same approach works for first values: ```r penguins |> summarize(first_species = first(species)) ``` This tells you which species appears first in your ordered dataset. ## Slicing Rows by Groups ### Getting First Row per Group Combine `group_by()` with `slice_head()` to get the first observation from each group: ```r penguins |> group_by(species) |> slice_head(n = 1) ``` This returns one penguin from each species - specifically the first one that appears for each species in the data. ### Getting Single First Row Without grouping, `slice_head()` returns the first row(s) from the entire dataset: ```r penguins |> slice_head(n = 1) ``` This gives you just the very first penguin observation. ### Getting Last Row per Group Use `slice_tail()` with grouping to get the final observation from each group: ```r penguins |> group_by(species) |> slice_tail(n = 1) ``` This returns the last penguin observation for each of the three species. ## Calculating Proportions ### Basic Proportions Calculate what proportion each group represents of the total: ```r penguins |> count(species) |> mutate(prop = n / sum(n)) ``` This shows both the count and proportion of each penguin species in the dataset. ### Alternative Proportion Calculation You can also use `prop.table()` for the same result: ```r penguins |> count(species) |> mutate(freq = prop.table(n)) ``` Both approaches give you the relative frequency of each species as decimals that sum to 1. ### Proportions with Multiple Variables Calculate proportions across combinations of variables: ```r penguins |> count(species, sex) |> mutate(prop = n / sum(n)) ``` This shows what proportion each species-sex combination represents of all penguins. ## Working with Consecutive Values ### Identifying Consecutive Groups The `consecutive_id()` function helps identify runs of consecutive identical values: ```r df <- data.frame(x = c(0, 0, 1, 0), y = c(2, 2, 2, 2)) df ``` Let's create groups based on consecutive runs of the same x and y values: ```r df |> group_by(id = consecutive_id(x, y), x, y) ``` This assigns a unique ID to each consecutive run of identical x,y combinations. ### Summarizing Consecutive Groups Count how many observations are in each consecutive group: ```r df |> group_by(id = consecutive_id(x, y), x, y) |> summarise(n = n()) ``` This reveals that we have three distinct consecutive groups: two single observations and one pair of consecutive identical values. ## Summary These dplyr functions provide flexible ways to extract specific rows, work with grouped data, and calculate proportions. The `first()`, `last()`, and `nth()` functions are perfect for accessing specific observations, while `slice_head()` and `slice_tail()` work excellently with grouped data. Proportion calculations help you understand the relative frequency of different categories, and `consecutive_id()` is invaluable for identifying patterns in sequential data. Master these functions to make your data exploration and summarization tasks more efficient and insightful.