DEV Community

Cover image for From Idea to CRAN: My Journey Building the `splitr` R Package
Akan
Akan

Posted on

From Idea to CRAN: My Journey Building the `splitr` R Package

If you've ever thought, "I wish R could do X automatically," I have a story for you. Recently, I embarked on a journey to create my first R package β€” and let me tell you, it was one of the most fun and educational experiences in my coding career.

The goal? Build splitr, a package that splits an Excel sheet into multiple sheets effi ciently, using data.table for speed and openxlsx for Excel magic.


Step 1: The Idea and the Blueprint

It all started with a common pain point: dealing with massive Excel files. Manually splitting data into chunks for reporting or analysis was tedious.

I sketched out the plan:

  • Read a source Excel sheet
  • Split rows into n chunks
  • Write each chunk into a separate sheet in a single workbook
  • Apply styles and optionally save to disk

Simple, right? But turning that plan into a robust, reusable R function is where the fun begins.


πŸ›  Step 2: Structuring the Package

Using RStudio, I created a new package project called splitr. The structure looked like this:

splitr/
β”œβ”€β”€ R/
β”‚   └── split_excel_to_sheets.R
β”œβ”€β”€ man/
β”œβ”€β”€ DESCRIPTION
β”œβ”€β”€ NAMESPACE
└── splitr.Rproj
Enter fullscreen mode Exit fullscreen mode
  • DESCRIPTION holds metadata about the package.
  • R/ is where the actual function lives.
  • man/ will eventually contain documentation generated with roxygen2.

Step 3: Writing the Function

Here’s the core of splitr:

split_excel_to_sheets <- function(file_path, n, sheet = 1, output_path = NULL) {
  dt <- data.table::setDT(openxlsx::read.xlsx(file_path, sheet = sheet))
  chunks <- split(dt, cut(seq_len(nrow(dt)), breaks = n, labels = FALSE))
  wb <- openxlsx::createWorkbook()
  for (i in seq_len(n)) {
    openxlsx::addWorksheet(wb, paste0("Part_", i))
    openxlsx::writeData(wb, sheet = i, chunks[[i]])
  }
  if (!is.null(output_path)) openxlsx::saveWorkbook(wb, output_path, overwrite = TRUE)
  invisible(wb)
}
Enter fullscreen mode Exit fullscreen mode

I loved how data.table made splitting huge datasets lightning-fast, and openxlsx let me handle all Excel styling without breaking a sweat.


Step 4: Documenting Like a Pro

Documentation is key. Using roxygen2, I added clear parameter descriptions, return values, and examples.

#' Split Excel Sheet into Multiple Sheets
#'
#' @param file_path Path to source .xlsx file
#' @param n Number of chunks
#' @param sheet Sheet name or index
#' @param output_path Optional path to save workbook
#' @return An openxlsx workbook object
#' @export
Enter fullscreen mode Exit fullscreen mode

This not only helps others understand the function but also automatically generates Rd files for CRAN.


Step 5: Overcoming Challenges

The journey wasn’t all smooth sailing:

  • Non-ASCII Characters: My code had fancy dashes and ellipses (β€” and …) that CRAN hates. tools::showNonASCIIfile() helped me locate and replace them with plain ASCII.
  • Package Dependencies: CRAN checks flagged undefined global functions from openxlsx and data.table. Fully qualifying functions like openxlsx::writeData() and data.table::setDT() solved the problem.
  • Examples & Tests: I had leftover demo functions (hello()) that weren’t defined. Removing them stopped example errors.

Every challenge was a learning moment β€” and now I feel like I truly understand what CRAN expects.


βœ… Step 6: Passing CRAN Checks

After careful fixes and multiple iterations, the ultimate moment came:

0 errors βœ” | 0 warnings βœ” | 0 notes βœ”
Enter fullscreen mode Exit fullscreen mode

CRAN checks passed perfectly. No warnings, no errors, no notes. Pure joy.


Step 7: Building & Submitting

With devtools::build(), I created the tarball for submission:

devtools::build()
Enter fullscreen mode Exit fullscreen mode

Next step: submit to CRAN. And there it is β€” my first R package, ready for the world.


Reflections

Building splitr taught me:

  • How R packages are structured
  • Why CRAN is strict (for good reason!)
  • The importance of documentation and reproducible examples
  • That challenges like encoding issues or namespace warnings are normal and solvable

The thrill of turning an idea into a fully functional CRAN-ready package is unmatched.


If you’ve ever thought about building an R package β€” just start small, document thoroughly, and run checks constantly. The learning experience is incredible.


Next steps for me:

  • Submit to CRAN βœ…
  • Share the package on GitHub for quick installation via remotes::install_github() βœ…
  • Explore more advanced Excel manipulations in R

πŸƒβ€β™‚οΈ Try It Out Yourself

You don’t have to wait for CRAN β€” you can install and try splitr directly from GitHub today:

# Install devtools if you don't have it
install.packages("devtools")

# Install splitr from GitHub
devtools::install_github("AkanimohOD19A/splitr")

# Load the package
library(splitr)

# Try splitting an Excel file into 3 sheets
wb <- split_excel_to_sheets(
  file_path   = "data/example.xlsx",
  n           = 3,
  output_path = "data/example_split.xlsx"
)
Enter fullscreen mode Exit fullscreen mode

Check out the GitHub repository:
https://github.com/AkanimohOD19A/splitr?tab=readme-ov-file

Note: CRAN submission is still in progress, so GitHub is the best way to try it right now.

Have you ever submitted a package to CRAN? What was your experience like? Drop a comment below β€” I’d love to hear your stories!

Top comments (0)