How to Use Vectors in R
Introduction
Vectors are the fundamental building blocks of data in R: ordered collections of elements that all share one type (such as numeric, character, or logical). They’re essential for storing and manipulating data. Each column of a data frame is a vector, and most R functions operate on vectors.
Getting Started
library(tidyverse)
library(palmerpenguins)
Example 1: Basic Vector Operations
The Problem
We need to understand how to create different types of vectors and perform basic operations on them. This foundation is crucial for all data manipulation tasks in R.
Step 1: Create numeric vectors
We’ll start by creating vectors using different methods.
# Create vectors using c() function
numbers <- c(10, 20, 30, 40, 50)
sequence <- 1:10
repeated <- rep(5, times = 4)
print(numbers)
print(sequence)
This creates three numeric vectors: one from manually listed values, one from a sequence, and one from a repeated value.
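The problem statement mentions basic operations on vectors. As a minimal sketch (reusing the values of numbers from above; doubled, shifted, and steps are illustrative names not in the original), arithmetic is applied to every element at once, and seq() gives finer control over sequences than the : operator:

```r
# Same values as the numbers vector created above
numbers <- c(10, 20, 30, 40, 50)

# Arithmetic is applied to every element at once
doubled <- numbers * 2      # 20 40 60 80 100
shifted <- numbers + 1:5    # 11 22 33 44 55 (element-wise addition)

# seq() builds a sequence with a custom step size
steps <- seq(from = 0, to = 1, by = 0.25)   # 0.00 0.25 0.50 0.75 1.00

print(doubled)
print(shifted)
print(steps)
```

Because both operands of numbers + 1:5 have five elements, each element is paired with the one in the same position.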
Step 2: Create character and logical vectors
Now we’ll create vectors with different data types.
# Character vector
species <- c("Adelie", "Chinstrap", "Gentoo")
colors <- c("red", "blue", "green", "yellow")
# Logical vector
conditions <- c(TRUE, FALSE, TRUE, FALSE)
print(species)
Character vectors store text data, while logical vectors contain TRUE/FALSE values for conditional operations.
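Since every element of a vector must share one type, it can help to see what c() does when types are mixed. The following is a small sketch (mixed and flags_as_numbers are illustrative names, not from the original) showing that R silently converts the elements to a common type:

```r
# Mixing types in c() coerces everything to the most flexible type
mixed <- c(1, "two", TRUE)
print(class(mixed))              # "character"
print(mixed)                     # "1" "two" "TRUE"

# Logical values become 1/0 when combined with numbers
flags_as_numbers <- c(TRUE, FALSE, 3)
print(class(flags_as_numbers))   # "numeric"
print(flags_as_numbers)          # 1 0 3

# typeof() distinguishes integer from double storage
print(typeof(1:3))               # "integer"
print(typeof(c(1, 2, 3)))        # "double"
```

Checking class() or typeof() after building a vector is a cheap way to catch accidental coercion.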
Step 3: Access vector elements
We can extract specific elements using indexing.
# Access elements by position
first_number <- numbers[1]
last_species <- species[length(species)]
multiple_elements <- numbers[c(1, 3, 5)]
print(first_number)
print(multiple_elements)
Indexing allows us to retrieve specific values or subsets from our vectors using square brackets.
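Square brackets accept more than positive positions. As a brief sketch (without_first, above_25, and out_of_range are illustrative names), negative indices drop elements, a logical condition keeps matching elements, and an index past the end yields NA:

```r
numbers <- c(10, 20, 30, 40, 50)

# Negative indices drop elements instead of selecting them
without_first <- numbers[-1]         # 20 30 40 50

# A logical condition keeps only the elements where it is TRUE
above_25 <- numbers[numbers > 25]    # 30 40 50

# Indexing past the end returns NA rather than raising an error
out_of_range <- numbers[6]           # NA

print(without_first)
print(above_25)
print(out_of_range)
```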
Example 2: Practical Application with Palmer Penguins
The Problem
Let’s analyze penguin data by extracting and working with vectors from a real dataset. We’ll calculate summary statistics and filter data based on conditions using vector operations.
Step 1: Extract vectors from data frame
We’ll pull specific columns as vectors for analysis.
# Extract vectors from penguins dataset
penguin_species <- penguins$species
bill_lengths <- penguins$bill_length_mm
body_mass <- penguins$body_mass_g
# Remove NA values
bill_lengths <- bill_lengths[!is.na(bill_lengths)]
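head(bill_lengths)
This extracts columns as vectors and removes missing values from bill_lengths. Note that bill_lengths is now shorter than body_mass and penguin_species, so its positions no longer line up with theirs.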
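When several vectors need to stay aligned after dropping missing values, one option is a single shared logical mask. A minimal sketch with hypothetical measurements (bill, mass, and keep are illustrative names, not columns of the dataset):

```r
# Hypothetical measurements; NA marks a missing value
bill <- c(39.1, NA, 40.3, 46.5)
mass <- c(3750, 3800, NA, 4500)

# Filtering one vector alone changes only that vector's length
bill_only <- bill[!is.na(bill)]
print(length(bill_only))   # 3, while length(mass) is still 4

# One shared mask keeps positions aligned across both vectors
keep <- !is.na(bill) & !is.na(mass)
bill_clean <- bill[keep]   # 39.1 46.5
mass_clean <- mass[keep]   # 3750 4500

print(bill_clean)
print(mass_clean)
```

With the shared mask, element i of bill_clean and element i of mass_clean still describe the same observation.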
Step 2: Perform vector calculations
Now we’ll calculate summary statistics using vector operations.
# Calculate statistics
mean_bill_length <- mean(bill_lengths)
max_body_mass <- max(body_mass, na.rm = TRUE)
total_penguins <- length(penguin_species)
print(paste("Average bill length:", round(mean_bill_length, 2)))
print(paste("Max body mass:", max_body_mass))
Vector functions like mean() and max() operate on entire vectors at once, making calculations efficient.
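The calls above pass na.rm = TRUE for a reason. As a small sketch with hypothetical values (mass, mean_with_na, mean_observed, and centered are illustrative names), a summary function returns NA as soon as one element is missing, and its single result can be fed straight back into element-wise arithmetic:

```r
# Hypothetical body masses with one missing value
mass <- c(3750, 3800, NA, 4500)

# Without na.rm, a single NA makes the whole result NA
mean_with_na <- mean(mass)
print(mean_with_na)          # NA

# na.rm = TRUE summarises only the observed values
mean_observed <- mean(mass, na.rm = TRUE)
print(mean_observed)

# Summary value combined with element-wise subtraction;
# the missing element stays NA in the result
centered <- mass - mean_observed
print(centered)
```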
Step 3: Create conditional vectors
We’ll use logical operations to create filtering conditions.
# Create logical vectors for filtering
large_penguins <- body_mass > 4500
adelie_penguins <- penguin_species == "Adelie"
long_bills <- bill_lengths > 45
# Count TRUE values
count_large <- sum(large_penguins, na.rm = TRUE)
print(paste("Large penguins:", count_large))
Logical vectors act as filters, helping us identify subsets of data that meet specific criteria.
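The step above only counts TRUE values. To illustrate the filtering itself, here is a sketch on hypothetical, position-aligned data (mass, species, and the derived names are illustrative) that subsets with a logical vector, combines two conditions, and computes a proportion:

```r
# Hypothetical body masses (g) and species labels, position-aligned
mass    <- c(3750, 5200, NA, 4700, 3400)
species <- c("Adelie", "Gentoo", "Adelie", "Gentoo", "Chinstrap")

large <- mass > 4500                  # FALSE TRUE NA TRUE FALSE

# which() returns the TRUE positions and skips NA
large_idx <- which(large)             # 2 4
large_masses <- mass[large_idx]       # 5200 4700

# Combine conditions element-wise with & (and) or | (or)
large_gentoo <- large & species == "Gentoo"
n_large_gentoo <- sum(large_gentoo, na.rm = TRUE)   # 2

# The mean of a logical vector is the share of TRUE values
share_large <- mean(large, na.rm = TRUE)            # 0.5

print(large_masses)
print(n_large_gentoo)
print(share_large)
```

Subsetting with mass[large] directly would also return an NA element for the missing measurement, which is why which() is used here.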
Step 4: Combine vectors for analysis
Finally, we’ll count the penguins of each species and store the counts in a named vector.
# Create named vector for species counts
species_counts <- table(penguin_species)
species_vector <- as.vector(species_counts)
names(species_vector) <- names(species_counts)
print(species_vector)
Named vectors provide a convenient way to store related values with descriptive labels for easy reference.
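Once a vector has names, its elements can be looked up by label instead of position. A minimal sketch on hypothetical labels standing in for penguins$species (species_labels, counts_vec, and the other names are illustrative), using setNames() as a one-step alternative to assigning names() afterwards:

```r
# Hypothetical species labels standing in for penguins$species
species_labels <- c("Adelie", "Gentoo", "Adelie",
                    "Chinstrap", "Gentoo", "Adelie")

# Build the named vector in one step with setNames()
counts <- table(species_labels)
counts_vec <- setNames(as.vector(counts), names(counts))
print(counts_vec)

# Look up a value by name; unname() drops the label
adelie_n <- unname(counts_vec["Adelie"])   # 3

# Names can be filtered like any other vector
most_common <- names(counts_vec)[counts_vec == max(counts_vec)]   # "Adelie"

print(adelie_n)
print(most_common)
```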
Summary
- Vectors are R’s fundamental data structure, storing elements of the same type (numeric, character, or logical)
- Create vectors using c(), : for sequences, or rep() for repeated values
- Access elements using square bracket notation [index] or logical conditions
- Vector operations work element-wise, making calculations efficient across entire datasets
- Extract vectors from data frames using $ notation for focused analysis and calculations