How to perform a t-test in R

t.test()

Learn how to perform a t-test in R. Step-by-step statistical tutorial with examples.

Published

August 27, 2024

Introduction

A t-test is a statistical test used to compare means between groups or against a known value. It helps assess whether an observed difference is larger than would be expected from random sampling variation alone.

Getting Started

library(tidyverse)
library(palmerpenguins)

Example 1: Basic Usage

The Problem

We want to test if the average body mass of Adelie penguins differs significantly from 4000 grams. This is a one-sample t-test comparing our sample mean to a known value.
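For reference, the one-sample t statistic can be written as follows. The symbols are notation chosen here, not taken from the original post: \bar{x} is the sample mean, s the sample standard deviation, n the sample size, and \mu_0 the hypothesised value (4000 grams in this example).

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{with } n - 1 \text{ degrees of freedom}
```

The further the sample mean lies from \mu_0 relative to its standard error s / \sqrt{n}, the larger |t| becomes and the smaller the p-value.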

Step 1: Prepare the data

First, we’ll filter our dataset to focus on Adelie penguins only.

adelie_data <- penguins |>
  filter(species == "Adelie") |>
  drop_na(body_mass_g)

head(adelie_data)

This creates a clean dataset with 151 Adelie penguins, after removing the one row with a missing body mass value.
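One way to check how many rows the filtering step keeps is to count the missing values directly. The sketch below uses base R subsetting instead of the pipeline above, and the variable names (adelie_all, n_total, n_missing, n_clean) are illustrative only.

```r
library(palmerpenguins)

# All Adelie rows, before any missing values are dropped
adelie_all <- penguins[penguins$species == "Adelie", ]

n_total   <- nrow(adelie_all)                     # all Adelie rows
n_missing <- sum(is.na(adelie_all$body_mass_g))   # rows with no body mass recorded
n_clean   <- sum(!is.na(adelie_all$body_mass_g))  # rows kept after dropping NAs

c(total = n_total, missing = n_missing, kept = n_clean)
```

The kept count should match nrow(adelie_data) from the step above.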

Step 2: Explore the data

Let’s calculate basic summary statistics: the mean, the standard deviation, and the sample size.

adelie_data |>
  summarise(
    mean_mass = mean(body_mass_g),
    sd_mass = sd(body_mass_g),
    n = n()
  )

The average body mass is approximately 3701 grams, which is about 300 grams below our test value of 4000 grams.

Step 3: Perform one-sample t-test

Now we’ll conduct the statistical test to determine if this difference is significant.

t_result <- t.test(adelie_data$body_mass_g, 
                   mu = 4000)
print(t_result)

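The p-value is much less than 0.05, so we reject the null hypothesis that the mean body mass of Adelie penguins is 4000 grams. The output also reports the t statistic, the degrees of freedom, and a 95% confidence interval for the mean.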
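To see what t.test() computes in the one-sample case, the statistic and the two-sided p-value can be reproduced by hand. This is a minimal sketch that rebuilds the data with base R so it runs on its own; the names mass, mu0, t_manual and p_manual are illustrative.

```r
library(palmerpenguins)

# Body mass of Adelie penguins, with missing values removed
mass <- penguins$body_mass_g[penguins$species == "Adelie"]
mass <- mass[!is.na(mass)]

mu0 <- 4000                        # hypothesised mean
n   <- length(mass)
se  <- sd(mass) / sqrt(n)          # standard error of the mean

t_manual <- (mean(mass) - mu0) / se
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)   # two-sided p-value

t_builtin <- t.test(mass, mu = mu0)

# Manual and built-in values side by side
c(manual = t_manual, builtin = unname(t_builtin$statistic))
c(manual = p_manual, builtin = t_builtin$p.value)
```

Both pairs should agree up to floating-point error, which confirms that the test is the sample mean's distance from 4000 measured in standard errors.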

Example 2: Practical Application

The Problem

A researcher wants to compare flipper lengths between male and female Adelie penguins. This requires a two-sample t-test to determine if there’s a significant difference between the two groups.

Step 1: Prepare comparison data

We’ll filter for Adelie penguins and remove any missing values for sex and flipper length.

adelie_comparison <- penguins |>
  filter(species == "Adelie") |>
  drop_na(sex, flipper_length_mm)

head(adelie_comparison)

This gives us a clean dataset ready for comparing flipper lengths between sexes.

Step 2: Visualize the differences

Before testing, let’s visualize the data to understand the distributions.

adelie_comparison |>
  ggplot(aes(x = sex, y = flipper_length_mm, fill = sex)) +
  geom_boxplot(alpha = 0.7, outlier.shape = NA) +
  geom_jitter(width = 0.15, alpha = 0.6) +
  labs(title = "Flipper Length by Sex in Adelie Penguins",
       subtitle = "Two-sample t-test of flipper length by sex",
       x = "Sex", y = "Flipper Length (mm)") +
  theme_minimal() +
  theme(legend.position = "none")

[Figure: boxplot with jittered points comparing flipper length of male and female Adelie penguins]

The boxplot suggests male penguins have longer flippers than females, but we need statistical confirmation.

Step 3: Calculate group statistics

Let’s examine the summary statistics for each group.

adelie_comparison |>
  group_by(sex) |>
  summarise(
    mean_flipper = mean(flipper_length_mm),
    sd_flipper = sd(flipper_length_mm),
    n = n()
  )

Males show a higher average flipper length (approximately 192 mm) than females (approximately 188 mm).

Step 4: Perform two-sample t-test

Now we’ll test if this observed difference is statistically significant.

t_test_two_sample <- t.test(flipper_length_mm ~ sex, 
                           data = adelie_comparison)
print(t_test_two_sample)

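Because t.test() defaults to var.equal = FALSE, this is Welch’s t-test, which does not assume equal variances in the two groups. A p-value below 0.05 indicates that the difference in mean flipper length between the sexes is statistically significant at the 5% level.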
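As a point of comparison, the sketch below runs the same test twice: once with the default settings and once with var.equal = TRUE, which requests the classic pooled-variance (Student) test. The data are rebuilt with base R so the block is self-contained, and the object names (adelie, welch, pooled) are illustrative.

```r
library(palmerpenguins)

# Adelie penguins with known sex and flipper length
adelie <- subset(penguins,
                 species == "Adelie" & !is.na(sex) & !is.na(flipper_length_mm))

welch  <- t.test(flipper_length_mm ~ sex, data = adelie)                    # default
pooled <- t.test(flipper_length_mm ~ sex, data = adelie, var.equal = TRUE)  # pooled variance

# Compare test statistic, degrees of freedom and p-value
data.frame(
  test    = c("Welch", "Student (pooled)"),
  t       = unname(c(welch$statistic, pooled$statistic)),
  df      = unname(c(welch$parameter, pooled$parameter)),
  p_value = c(welch$p.value, pooled$p.value)
)
```

The pooled test uses n1 + n2 - 2 degrees of freedom, while Welch’s test adjusts the degrees of freedom downward when the group variances differ. When in doubt, the default is usually the safer choice.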

Step 5: Interpret results

Let’s extract key information from our test results.

# Extract confidence interval and p-value
cat("P-value:", t_test_two_sample$p.value, "\n")
cat("95% Confidence Interval:", 
    t_test_two_sample$conf.int[1], "to", 
    t_test_two_sample$conf.int[2])

The confidence interval is for the difference in means, taken as female minus male because the levels of sex are ordered alphabetically. An interval that excludes zero agrees with a p-value below 0.05.
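The object returned by t.test() holds more than the p-value and confidence interval. As a rough sketch, the pieces most often reported can be collected into a one-row data frame. The column names below (mean_diff, conf_low, conf_high and so on) are arbitrary choices, and the data are rebuilt with base R so the block runs on its own.

```r
library(palmerpenguins)

adelie <- subset(penguins,
                 species == "Adelie" & !is.na(sex) & !is.na(flipper_length_mm))
res <- t.test(flipper_length_mm ~ sex, data = adelie)

group_means <- res$estimate   # named vector: one mean per group
mean_diff   <- unname(group_means[1] - group_means[2])  # first level minus second

summary_row <- data.frame(
  mean_diff = mean_diff,
  conf_low  = res$conf.int[1],
  conf_high = res$conf.int[2],
  t         = unname(res$statistic),
  df        = unname(res$parameter),
  p_value   = res$p.value
)
summary_row
```

Reporting the estimated difference alongside its interval tells the reader how large the effect is, which a p-value alone does not.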

Summary

One-sample t-tests compare a sample mean against a known value using t.test(data, mu = value)
Two-sample t-tests compare means between two groups using t.test(variable ~ group, data = dataset)
P-values less than 0.05 typically indicate statistically significant differences
Always explore your data visually before conducting statistical tests
The t.test() function provides confidence intervals, test statistics, and p-values for interpretation

