String vs str in Rust: Understanding Rust's two string types
Rust's two string types might seem like overkill at first, but they're actually a superpower in disguise. Learn why String and &str exist, when to use each, and how mastering this distinction will level up your Rust game.
If you're coming from languages like TypeScript or Python, Rust's two string types might seem puzzling. Why does Rust need both String and str? Isn't one string type enough? Let's clear up this confusion and explore why this design choice actually gives you superpowers.
The Quick Summary
Before diving deep, here's the core distinction: String and str are both UTF-8 encoded text, but with different ownership models and use cases.
- String: An owned, growable, heap-allocated string. You have full control to modify it.
- str: A string slice, typically seen as &str. It's a view into string data owned by someone else.
The Easy-to-Understand Overview
What is str?
The str type is Rust's most primitive string type. Here's the key insight: you almost never see str by itself. Why? Because str is what Rust calls an "unsized type": the compiler doesn't know its size at compile time.
// This won't compile!
// let my_str: str = "hello"; // ❌ Error: the size of `str` cannot be known at compilation time
// This is what you actually use:
let my_str: &str = "hello"; // ✅ A reference to a string slice
String literals in your code have type &'static str. They are references to string data that lives for the entire program:
let greeting = "Hello, world!"; // Type: &'static str
// This string is compiled into your binary
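To make "unsized" a little more concrete, here is a minimal sketch. The function names default_greeting and handle_sizes are made up for illustration. It shows that a literal can be returned as &'static str without allocating, and that although str itself has no fixed size, every handle to it (a reference or a Box) has one: a pointer plus a byte length.

```rust
use std::mem::size_of;

/// A literal is stored in the compiled binary, so a function can hand it
/// out with the `'static` lifetime and no allocation.
fn default_greeting() -> &'static str {
    "Hello, world!"
}

/// `str` has no compile-time size, but handles to it do:
/// both `&str` and `Box<str>` store a pointer and a byte length.
fn handle_sizes() -> (usize, usize) {
    (size_of::<&str>(), size_of::<Box<str>>())
}
```

On a 64-bit target both sizes should come out as 16 bytes, which matches the "fat pointer" layout discussed later in this article.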
What is String?
String is the owned version: it owns its data, which lives on the heap. It's growable, and you can modify it when the binding is declared with mut:
let mut message = String::from("Hello");
message.push_str(", world"); // Now it's "Hello, world"
message.push('!'); // Now it's "Hello, world!"
The Key Differences
Aspect | String | &str |
---|---|---|
Ownership | Owns its data | Borrows data |
Mutability | Can be mutable | Immutable (the slice itself) |
Memory | Heap-allocated buffer (an empty String allocates nothing) | Points to data anywhere (heap, stack, or static) |
Size | Can grow/shrink | Fixed view |
Use Case | When you need to build/modify strings | When you need to read/borrow strings |
Common Conversions
Moving between these types is straightforward:
// &str to String (creates a new allocation)
let slice: &str = "hello";
let owned: String = slice.to_string(); // or String::from(slice)
// String to &str (just borrowing)
let owned = String::from("world");
let borrowed: &str = &owned; // Deref coercion
let borrowed_explicit: &str = owned.as_str(); // Explicit conversion
// String literals are &'static str
let literal: &str = "static string";
// Note: String::from() and .to_string() are functionally equivalent
// but have different idiomatic uses:
let from_literal = String::from("hello"); // Common for literals
let from_slice = slice.to_string(); // Common for &str variables
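You will also run into a few other spellings of the same conversions in real code. The sketch below gathers them in one place; the function names to_owned_variants and borrowed_variants_agree are hypothetical and exist only for this illustration.

```rust
/// Four spellings of "&str -> String". Each one allocates a new buffer
/// and copies the bytes into it.
fn to_owned_variants(slice: &str) -> [String; 4] {
    [
        slice.to_string(),
        String::from(slice),
        slice.to_owned(),
        slice.into(),
    ]
}

/// Three spellings of "String -> &str". None of them allocate; each is
/// just a borrowed view of the same buffer.
fn borrowed_variants_agree(text: &str) -> bool {
    let owned: String = text.to_owned();
    let via_coercion: &str = &owned; // deref coercion
    let via_method: &str = owned.as_str(); // explicit method
    let via_range: &str = &owned[..]; // full-range slice
    via_coercion == text && via_method == text && via_range == text
}
```

Which spelling to pick is mostly a matter of taste; the resulting values are the same.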
The Advanced Deep Dive
Now let's explore what's really happening under the hood and why these design choices matter.
Memory Layout and Performance
Understanding the memory layout is crucial for writing performant Rust code:
use std::mem;
fn main() {
// &str is a fat pointer: pointer + length
let string_slice: &str = "Hello";
println!("Size of &str: {} bytes", mem::size_of_val(&string_slice));
// Output: Size of &str: 16 bytes (on 64-bit systems)
// 8 bytes for the pointer + 8 bytes for length
// String has three components
let owned = String::from("Hello");
println!("Size of String: {} bytes", mem::size_of_val(&owned));
// Output: Size of String: 24 bytes (on 64-bit systems)
// 8 bytes pointer + 8 bytes length + 8 bytes capacity
// Examining the internals
println!("Pointer: {:p}", owned.as_ptr());
println!("Length: {}", owned.len());
println!("Capacity: {}", owned.capacity());
}
The String type consists of three components:
- Pointer: Points to the heap-allocated buffer
- Length: Current number of bytes used
- Capacity: Total size of the allocated buffer
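A small sketch of how length and capacity relate, and of the fact that length is counted in bytes rather than characters. The helper names len_and_capacity and bytes_and_chars are assumptions made for this example.

```rust
/// Returns (len, capacity) before and after releasing spare capacity.
fn len_and_capacity() -> ((usize, usize), (usize, usize)) {
    let mut s = String::with_capacity(32); // room for at least 32 bytes
    s.push_str("Hello"); // uses 5 of them
    let before = (s.len(), s.capacity());
    s.shrink_to_fit(); // ask the allocator to give back unused space
    let after = (s.len(), s.capacity());
    (before, after)
}

/// `len()` counts UTF-8 bytes; `chars().count()` counts Unicode scalar values.
fn bytes_and_chars(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}
```

For example, bytes_and_chars("h\u{e9}llo") reports 6 bytes but 5 characters, because the accented letter takes two bytes in UTF-8.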
Zero-Copy Operations
One of &str's superpowers is enabling zero-copy operations. Creating a substring doesn't copy data; it just produces a new pointer and length into the same bytes, like slicing bread without leaving crumbs:
fn get_first_word(text: &str) -> &str {
// No allocation, no copying - just returning a different view!
match text.find(' ') {
Some(pos) => &text[0..pos],
None => text,
}
}
// Compare with String (which allocates)
fn get_first_word_owned(text: &str) -> String {
// This allocates and copies!
match text.find(' ') {
Some(pos) => text[0..pos].to_string(),
None => text.to_string(),
}
}
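If you want to convince yourself that slices really do point into the original buffer, one rough way is to compare addresses. This is a sketch for exploration, not something you would normally ship; offset_in and words are hypothetical helper names.

```rust
/// Byte offset of `part` inside `whole`, or None if `part` is not a view
/// into `whole`'s buffer (for example, because it is a separate copy).
fn offset_in(whole: &str, part: &str) -> Option<usize> {
    let whole_start = whole.as_ptr() as usize;
    let part_start = part.as_ptr() as usize;
    if part_start >= whole_start && part_start + part.len() <= whole_start + whole.len() {
        Some(part_start - whole_start)
    } else {
        None
    }
}

/// Splits on whitespace. Every returned item borrows from `text`;
/// only the Vec of (pointer, length) pairs is allocated.
fn words(text: &str) -> Vec<&str> {
    text.split_whitespace().collect()
}
```

For the input "alpha beta gamma", the second word sits at offset 6 of the original string, while a to_string() copy of that word reports None because it lives in its own allocation.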
UTF-8 Validation and Safety
Both String and str maintain a critical invariant: they're always valid UTF-8. The safe constructors check this, so the rest of the API can rely on it:
// String validates UTF-8 when created from bytes
let valid = String::from_utf8(vec![72, 101, 108, 108, 111]); // "Hello"
assert!(valid.is_ok());
let invalid = String::from_utf8(vec![0xFF, 0xFE, 0xFD]);
assert!(invalid.is_err()); // Invalid UTF-8!
// For trusted data where you're certain about UTF-8 validity:
let trusted = unsafe {
String::from_utf8_unchecked(vec![72, 101, 108, 108, 111])
};
// Use sparingly - invalid UTF-8 causes undefined behavior!
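The same invariant applies to slicing: a range like &text[0..n] must land on character boundaries, otherwise it panics. The sketch below shows one way to cut a string safely at a byte budget, plus the borrowed counterpart of String::from_utf8. The names truncate_at_boundary and decode are made up for this example.

```rust
/// Returns at most `max_bytes` bytes of `text`, backing up to the nearest
/// character boundary so the result is still valid UTF-8.
fn truncate_at_boundary(text: &str, max_bytes: usize) -> &str {
    if max_bytes >= text.len() {
        return text;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}

/// Validates bytes as UTF-8 without copying them.
fn decode(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(bytes)
}
```

If you would rather get None than a panic on a bad range, str::get does that: "h\u{e9}llo".get(0..2) returns None because byte 2 falls in the middle of the accented letter.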
Function Parameters: A Critical Pattern
Here's a crucial best practice: prefer &str over &String for function parameters:
// ✅ Good - accepts &str directly and &String via deref coercion
fn process_text(text: &str) {
println!("Processing: {}", text);
}
// ❌ Restrictive - only accepts &String
fn process_text_rigid(text: &String) {
println!("Processing: {}", text);
}
fn main() {
let owned = String::from("example");
let borrowed = "literal";
process_text(&owned); // Works - &String coerces to &str
process_text(borrowed); // Works - already &str
process_text_rigid(&owned); // Works
// process_text_rigid(borrowed); // ❌ Compilation error!
}
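The flexibility of a &str parameter goes beyond String. Any type that dereferences to str can be passed by reference, as this sketch suggests. The functions char_count and count_all are hypothetical names used only here.

```rust
use std::borrow::Cow;
use std::rc::Rc;

/// Takes the most general borrowed form.
fn char_count(text: &str) -> usize {
    text.chars().count()
}

/// Calls `char_count` with four different owners of string data.
/// Each `&value` is coerced to `&str` through the type's Deref impl.
fn count_all() -> [usize; 4] {
    let owned = String::from("owned");
    let boxed: Box<str> = "boxed!".into();
    let shared: Rc<str> = Rc::from("shared!");
    let cow: Cow<'static, str> = Cow::Borrowed("cow");
    [
        char_count(&owned),
        char_count(&boxed),
        char_count(&shared),
        char_count(&cow),
    ]
}
```

Had char_count taken &String instead, only the first of those four calls would compile.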
Performance Optimization: Pre-allocation
When building strings incrementally, pre-allocating capacity can significantly reduce allocations:
// Inefficient - multiple reallocations as the string grows
fn build_csv_naive(values: &[&str]) -> String {
let mut result = String::new();
for (i, value) in values.iter().enumerate() {
if i > 0 {
result.push(',');
}
result.push_str(value);
}
result
}
// Efficient - single allocation upfront
fn build_csv_optimized(values: &[&str]) -> String {
// Calculate total capacity needed
let capacity = values.iter().map(|s| s.len()).sum::<usize>()
+ values.len().saturating_sub(1); // commas
let mut result = String::with_capacity(capacity);
for (i, value) in values.iter().enumerate() {
if i > 0 {
result.push(',');
}
result.push_str(value);
}
result
}
// Building a CSV with 1000 items:
// Naive: several reallocations as the buffer grows (the exact count
// depends on the standard library's growth strategy)
// Optimized: 1 allocation
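If you'd like to see the effect yourself rather than take the comments on faith, one rough approach is to watch the string's capacity while appending. A change in capacity is only a proxy for a reallocation, and the exact numbers depend on the standard library version, so treat this as a sketch. The names join_counting_growth and note_growth are invented for this example.

```rust
/// Records a capacity change, if there was one.
fn note_growth(capacity: usize, last: &mut usize, growths: &mut usize) {
    if capacity != *last {
        *growths += 1;
        *last = capacity;
    }
}

/// Joins `values` with commas into `result` and counts how many times
/// the capacity of `result` changed along the way.
fn join_counting_growth(mut result: String, values: &[&str]) -> (String, usize) {
    let mut growths = 0;
    let mut last = result.capacity();
    for (i, value) in values.iter().enumerate() {
        if i > 0 {
            result.push(',');
            note_growth(result.capacity(), &mut last, &mut growths);
        }
        result.push_str(value);
        note_growth(result.capacity(), &mut last, &mut growths);
    }
    (result, growths)
}
```

Passing String::new() as the starting buffer should report at least one growth for any non-empty input, while passing String::with_capacity(n) with n at least as large as the final length should report zero.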
When to Use Which?
Use String when:
- You need to own the string data
- Building or modifying strings at runtime
- Storing strings in structs that outlive their source
- Returning newly created strings from functions
struct User {
username: String, // Owns its data
email: String, // Owns its data
}
impl User {
fn new(username: &str, email: &str) -> Self {
User {
username: username.to_string(), // Create owned copy
email: email.to_string(),
}
}
fn set_email(&mut self, email: &str) {
self.email = email.to_string(); // Replace with new owned string
}
}
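One possible variation on the constructor above, when callers often already hold a String: accept impl Into<String> so an owned value is moved in rather than copied. This is a sketch of an alternative, not a claim that it is always better; the generic signature is slightly harder to read.

```rust
struct User {
    username: String,
    email: String,
}

impl User {
    /// Accepts `&str`, `String`, or anything else that converts into a String.
    /// A `String` argument is moved in as-is; a `&str` is copied once.
    fn new(username: impl Into<String>, email: impl Into<String>) -> Self {
        User {
            username: username.into(),
            email: email.into(),
        }
    }
}
```

With this signature, User::new(name, "ferris@example.com") works whether name is a String or a &str, and in the String case the existing heap buffer is reused.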
Use &str when:
- You only need to read string data
- Function parameters (unless you need ownership)
- Avoiding unnecessary allocations
- Working with string literals
- Creating views into existing strings
struct Config<'a> {
hostname: &'a str, // Just borrowing
port: u16,
}
impl<'a> Config<'a> {
fn format_url(&self) -> String {
// Only allocate when we need to create something new
format!("http://{}:{}", self.hostname, self.port)
}
fn is_localhost(&self) -> bool {
// No allocation needed for comparison
self.hostname == "localhost" || self.hostname == "127.0.0.1"
}
}
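To show where a borrowing struct like this might come from, here is a sketch of a parser that fills it in without copying the hostname. The function parse_config and the "host:port" input format are assumptions for illustration; the struct is repeated so the snippet stands alone.

```rust
struct Config<'a> {
    hostname: &'a str, // borrowed from the input text
    port: u16,
}

/// Parses "host:port". The returned Config borrows `hostname` from `input`,
/// so it cannot outlive the string it was parsed from.
fn parse_config(input: &str) -> Option<Config<'_>> {
    let (host, port) = input.rsplit_once(':')?;
    Some(Config {
        hostname: host,
        port: port.parse().ok()?,
    })
}
```

The trade-off is the lifetime parameter: the compiler will reject any attempt to keep the Config around after the input string is dropped. If that becomes a burden, switching the field to String is the usual way out.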
The Lifetime Connection
Every &str carries a lifetime, which the compiler usually lets you leave out (elide):
// These function signatures are equivalent:
fn get_extension(filename: &str) -> Option<&str> {
filename.rfind('.').map(|i| &filename[i + 1..])
}
fn get_extension_explicit<'a>(filename: &'a str) -> Option<&'a str> {
filename.rfind('.').map(|i| &filename[i + 1..])
}
// The lifetime ensures the returned &str doesn't outlive its source
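Elision stops working once a function takes more than one reference and returns one, because the compiler can no longer guess which input the result borrows from. A minimal sketch, with hypothetical function names strip_prefix_or and version_number:

```rust
/// Two reference inputs, one reference output: the lifetime must be named.
/// Here the result borrows only from `text`, never from `prefix`.
fn strip_prefix_or<'a>(text: &'a str, prefix: &str) -> &'a str {
    text.strip_prefix(prefix).unwrap_or(text)
}

/// Because the result is tied to `text` alone, `prefix` can be a
/// short-lived String while the result still has the `'static` lifetime.
fn version_number() -> &'static str {
    let prefix = String::from("v"); // dropped at the end of this function
    strip_prefix_or("v1.2", &prefix)
}
```

Had the signature tied the result to both inputs, version_number would not compile, since prefix does not live long enough.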
Real-World Patterns
Let's look at practical patterns you'll use daily:
// Pattern 1: Efficient string building with iterators
// AsRef<str> allows accepting both String and &str types
fn format_list(items: &[impl AsRef<str>]) -> String {
items.iter()
.map(|s| s.as_ref())
.collect::<Vec<_>>()
.join(", ")
}
// Pattern 2: Parse without unnecessary allocations
fn parse_key_value(line: &str) -> Option<(&str, &str)> {
let mut parts = line.splitn(2, '=');
let key = parts.next()?.trim();
let value = parts.next()?.trim();
Some((key, value))
}
// Pattern 3: Conditional string building
fn build_query_string(base: &str, params: &[(&str, &str)]) -> String {
if params.is_empty() {
return base.to_string();
}
// Pre-calculate capacity for efficiency
let param_len: usize = params.iter()
.map(|(k, v)| k.len() + v.len() + 2) // key + "=" + value + "&"
.sum();
let mut result = String::with_capacity(base.len() + 1 + param_len);
result.push_str(base);
result.push('?');
for (i, (key, value)) in params.iter().enumerate() {
if i > 0 {
result.push('&');
}
result.push_str(key);
result.push('=');
result.push_str(value);
}
result
}
// Pattern 4: Working with paths efficiently
use std::path::Path;
fn extract_filename(path: &str) -> &str {
Path::new(path)
.file_name()
.and_then(|name| name.to_str())
.unwrap_or("")
}
// Pattern 5: String interning for repeated strings
// Useful when you have many duplicate strings and want to save memory
use std::collections::HashSet;
struct StringInterner {
strings: HashSet<String>,
}
impl StringInterner {
fn get_or_insert(&mut self, s: &str) -> &str {
if !self.strings.contains(s) {
self.strings.insert(s.to_string());
}
self.strings.get(s).unwrap().as_str()
}
}
// Pattern 6: Cow (Clone on Write) for conditional allocation
use std::borrow::Cow;
fn normalize_text(input: &str) -> Cow<'_, str> {
if input.contains('\t') || input.contains('\r') {
// Only allocate when we need to modify
Cow::Owned(input.replace('\t', " ").replace('\r', ""))
} else {
// No allocation needed - just borrow!
Cow::Borrowed(input)
}
}
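On the caller's side, a Cow<str> can be used like a &str most of the time, and you can inspect which variant you got when it matters. The sketch below uses a different, hypothetical helper (with_trailing_slash) to show the same idea, along with an is_borrowed check that is also made up for this example.

```rust
use std::borrow::Cow;

/// Guarantees a trailing slash, allocating only when one is missing.
fn with_trailing_slash(path: &str) -> Cow<'_, str> {
    if path.ends_with('/') {
        Cow::Borrowed(path) // already fine: hand back the input
    } else {
        Cow::Owned(format!("{}/", path)) // build a new String
    }
}

/// True when no allocation happened.
fn is_borrowed(value: &Cow<'_, str>) -> bool {
    matches!(value, Cow::Borrowed(_))
}
```

Call into_owned() on the result when you need a String; it copies only if the value was still borrowed.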
The Takeaway
Understanding the distinction between String and str is fundamental to writing efficient, idiomatic Rust. The key is recognizing that this isn't complexity for complexity's sake: it's a design that gives you precise control over memory allocation and ownership.
Remember these core principles:
- Default to &str for function parameters: it's more flexible and avoids unnecessary allocations
- Use String when you need ownership: when building strings dynamically or storing them
- Leverage zero-copy operations: slicing and splitting &str is essentially free
- Pre-allocate when possible: if you know the final size, use String::with_capacity
- Both enforce UTF-8: this invariant is what makes Rust strings memory-safe
Master these two types, and you'll find that Rust's approach to strings, while initially more complex than other languages, gives you fine-grained control over performance that's hard to achieve elsewhere. This is the kind of control that lets you write systems-level code that's both safe and blazingly fast – now that's a recipe for success!