pathlib vs os.path: A Modern Way to Handle File Paths
For many years, os.path was the standard way to manipulate file paths in Python. It works reliably, but it treats paths as plain strings, which often leads to verbose and less expressive code.
With the introduction of pathlib, Python adopted a more modern, object-oriented approach. Instead of operating on raw strings, you work with Path objects that encapsulate file system behavior directly.
The Core Difference
The main distinction lies in representation:
os.pathworks with stringspathlibusesPathobjects
Compare:
# os.path
import os
path = os.path.join("data", "input", "file.csv")
# pathlib
from pathlib import Path
path = Path("data") / "input" / "file.csv"
The Path version is not only shorter but also clearer. The division operator represents path composition naturally.
Working with Files in Practice
Consider a simple ETL script that merges CSV files.
Using os.path
import os
import pandas as pd
input_dir = "data/input"
output_dir = "data/output"
csv_files = [f for f in os.listdir(input_dir) if f.endswith(".csv")]
dfs = [pd.read_csv(os.path.join(input_dir, f)) for f in csv_files]
result = pd.concat(dfs)
result.to_csv(os.path.join(output_dir, "merged.csv"), index=False)
Using pathlib
from pathlib import Path
import pandas as pd
input_dir = Path("data/input")
output_dir = Path("data/output")
dfs = [pd.read_csv(file) for file in input_dir.glob("*.csv")]
result = pd.concat(dfs)
result.to_csv(output_dir / "merged.csv", index=False)
The pathlib version reads closer to intent. It avoids string manipulation and reduces boilerplate.
Useful Path Methods
pathlib provides many convenience methods:
file = Path("data/input/report.csv")
file.exists()
file.stem
file.suffix
file.with_suffix(".json")
file.read_text()
file.write_text("content")
These operations feel natural because the path itself knows how to behave.
Path vs PurePath
Path interacts with the actual file system.
PurePath handles path logic without accessing the filesystem.
In most applications, Path is what you want. PurePath is useful when you need purely computational path manipulation.
Why pathlib Matters
- Object-oriented API
- Cross-platform consistency
- Cleaner composition with
/ - Built-in file operations
- Better integration with modern libraries
In data pipelines and automation scripts, pathlib makes code feel more declarative and less fragile.
Best Practices
- Use
Pathobjects consistently across your project - Avoid mixing strings and
Pathinstances unnecessarily - Prefer
.glob()and.rglob()for discovery - Use
.resolve()when working with absolute paths
Final Take
os.path still works and remains fully valid. However, pathlib provides a clearer and more expressive model for working with file systems.
Adopting it is not about replacing older code blindly. It is about choosing a more readable and maintainable abstraction for modern Python projects.