Skip to content

eitsupi/neo-r-polars

Repository files navigation

polars (R Polars)

R-multiverse status R-universe status badge CRAN status Docs dev version

Polars is a blazingly fast DataFrame library, written in Rust.

The goal of Polars is to deliver fast, efficient data processing that:

  • Utilizes all available cores on your machine.
  • Optimizes queries to reduce unneeded work/memory allocations.
  • Handles datasets much larger than your available RAM.
  • Follows a consistent and predictable API.
  • Adheres to a strict schema (data-types should be known before running the query).

This polars R package provides the R bindings for Polars. It can be used to process Polars DataFrames and other data structures, convert objects between Polars and R, and can be integrated with other common R packages.

To learn more, read the online documentation for this R package, and the user guide for Python/Rust Polars.

Install

The recommended way to install this package is from the R-multiverse community repository:

Sys.setenv(NOT_CRAN = "true")
install.packages("polars", repos = "https://community.r-multiverse.org")

More recent (development) version may be installed from the rpolars R-universe repository:

Sys.setenv(NOT_CRAN = "true")
install.packages('polars', repos = c("https://rpolars.r-universe.dev", "https://cloud.r-project.org"))

Usage

To avoid conflicts with other packages and base R function names, many of polars’ functions are hosted in the pl environment, and accessible via the pl$ prefix. Most of the functions for polars objects should be chained with the $ operator.

Additionally, the majority of the functions are intended to match the Python Polars API.

These mean that Polars queries written in Python and in R are very similar.

For example, writing the example from the user guide of Python/Rust in R:

library(neopolars) # TODO: rename to polars

# Prepare a CSV file
csv_file <- tempfile(fileext = ".csv")
write.csv(iris, csv_file, row.names = FALSE)

# Create a query plan (LazyFrame) with filtering and group aggregation
q <- pl$scan_csv(csv_file)$filter(pl$col("Sepal.Length") > 5)$group_by(
  "Species",
  .maintain_order = TRUE
)$agg(pl$all()$sum())

# Execute the query plan and collect the result as a Polars DataFrame
df <- q$collect()

df
#> shape: (3, 5)
#> ┌────────────┬──────────────┬─────────────┬──────────────┬─────────────┐
#> │ Species    ┆ Sepal.Length ┆ Sepal.Width ┆ Petal.Length ┆ Petal.Width │
#> │ ---        ┆ ---          ┆ ---         ┆ ---          ┆ ---         │
#> │ str        ┆ f64          ┆ f64         ┆ f64          ┆ f64         │
#> ╞════════════╪══════════════╪═════════════╪══════════════╪═════════════╡
#> │ setosa     ┆ 116.9        ┆ 81.7        ┆ 33.2         ┆ 6.1         │
#> │ versicolor ┆ 281.9        ┆ 131.8       ┆ 202.9        ┆ 63.3        │
#> │ virginica  ┆ 324.5        ┆ 146.2       ┆ 273.1        ┆ 99.6        │
#> └────────────┴──────────────┴─────────────┴──────────────┴─────────────┘

There are also some functions to manipulate polars objects using base R and some popular other packages.

# Subset a Polars DataFrame using the `[` operator
df[1:2, 1:2]
#> shape: (2, 2)
#> ┌────────────┬──────────────┐
#> │ Species    ┆ Sepal.Length │
#> │ ---        ┆ ---          │
#> │ str        ┆ f64          │
#> ╞════════════╪══════════════╡
#> │ setosa     ┆ 116.9        │
#> │ versicolor ┆ 281.9        │
#> └────────────┴──────────────┘

# Execute a query plan and collect the result as a tibble data frame
tibble::as_tibble(q)
#> # A tibble: 3 × 5
#>   Species    Sepal.Length Sepal.Width Petal.Length Petal.Width
#>   <chr>             <dbl>       <dbl>        <dbl>       <dbl>
#> 1 setosa             117.        81.7         33.2         6.1
#> 2 versicolor         282.       132.         203.         63.3
#> 3 virginica          324.       146.         273.         99.6

The Get Started vignette (vignette("polars")) provides a more detailed introduction.

Extensions

While one can use this package as-is, other packages build on it to provide different APIs:

Maintainers

Version 0 of R Polars was previously maintained by Søren Havelund Welling.

Acknowledgements

This package is based on the Polars open source project, originally founded by Ritchie Vink and developed by many contributors.

License

MIT @ polars authors

About

Next generation of Polars R API

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Contributors 4

  •  
  •  
  •  
  •