

dplyr: A grammar of data manipulation
dplyr is an R package that provides a grammar of data manipulation: a consistent set of verbs for common data tasks. filter() picks rows, select() picks columns, mutate() creates new variables, arrange() sorts rows, and summarise() computes summaries. Each verb combines with group_by() to perform the same calculation separately for each category.
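As a minimal sketch of how these verbs chain together, the example below uses R's built-in mtcars dataset. The dataset, the hp > 100 threshold, and the derived column name hp_per_wt are illustrative choices, not taken from the text above. It assumes R 4.1 or later for the native pipe.

```r
library(dplyr)

# Illustrative pipeline on the built-in mtcars data (an assumption for this sketch)
result <- mtcars |>
  filter(hp > 100) |>                      # filter rows: keep cars above 100 horsepower
  select(mpg, cyl, hp, wt) |>              # select columns
  mutate(hp_per_wt = hp / wt) |>           # create a new variable
  group_by(cyl) |>                         # group by number of cylinders
  summarise(
    mean_mpg = mean(mpg),                  # compute a summary per group
    mean_hp_per_wt = mean(hp_per_wt),
    n = n()
  ) |>
  arrange(desc(mean_mpg))                  # sort groups by average fuel economy

print(result)
```

Each step takes a data frame and returns a data frame, which is what allows the verbs to be composed in any order that makes sense for the analysis.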
dplyr code is not limited to standard in-memory data frames. Companion backend packages translate the same verbs for other systems: dbplyr for relational databases (by generating SQL), dtplyr for large in-memory datasets (via data.table), duckplyr for DuckDB, arrow for larger-than-memory datasets, including files in cloud storage (via Apache Arrow), and sparklyr for distributed systems (via Apache Spark). This backend flexibility lets you use the same dplyr syntax whether your data fits in memory or requires specialized storage systems. The package also integrates with other tidyverse tools for end-to-end data analysis workflows.
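The SQL translation can be sketched with a throwaway database. This example assumes the DBI, RSQLite, and dbplyr packages are installed, and uses an in-memory SQLite database as a stand-in for a real database server. The table name and the query itself are illustrative.

```r
library(dplyr)

# Assumption: DBI, RSQLite, and dbplyr are installed.
# An in-memory SQLite database stands in for a real database server.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, mtcars, "mtcars")             # upload the built-in data as a table

mtcars_db <- tbl(con, "mtcars")            # a lazy reference to the remote table

# The same verbs as for a local data frame; nothing runs in the database yet
query <- mtcars_db |>
  filter(hp > 100) |>
  group_by(cyl) |>
  summarise(mean_mpg = mean(mpg, na.rm = TRUE), n = n())

show_query(query)                          # print the SQL generated for this pipeline
local_result <- collect(query)             # run the query and fetch the rows into R

DBI::dbDisconnect(con)
```

The pipeline is only a description of the work until collect() is called, so the filtering and aggregation happen inside the database rather than in R.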
Contributors

Hadley Wickham
Chief Scientific Officer

Lionel Henry
Senior Software Engineer

Davis Vaughan
Principal Software Engineer

Mine Çetinkaya-Rundel
Senior Developer Advocate

Jenny Bryan
Senior Software Engineer

Christophe Dervieux
Senior Software Engineer

Gábor Csárdi
Senior Software Engineer

Simon Couch

Neal Richardson

Tomasz Kalinowski
Engineering Manager

Charlotte Wickham
Senior Developer Advocate

Carson Sievert
Principal Software Engineer

Barret Schloerke
Senior Software Engineer

Jeroen Janssens
Head of Developer Relations


