Developing an R package

Developing an R package

Introduction

R package development is no longer as it was before 2010 because now most of the work can be done by just a simple mouse-click or with the use of a function. My intention of writing this blog post is not to give a thorough demonstration of how to develop your own R package. But it will briefly explain the process with the most important steps, and will include valuable blog posts and websites which helped me to develop my own R package fitODBOD.

Credit to People of R Community

I would definitely recommend to read all of these books and start your package development. Or at least make step by step progress in your work while reading them. If you have a basic knowledge regarding R, R studio, CRAN and writing programs, they are more than enough for you to start.

This website contains everything that is in the book. The basic things related to an R package development process are structured properly here. It is very useful to read this book/website. The package structure, the type of files necessary, how should the writing be in these files and further, what kind of ways can we use to achieve the final outputs. The book is mainly focused on producing an R package which can be updated with the highest standards using reproducing ability. Such as CRAN standards and GitHub releases.

Everything related to package development is included here with clear instructions. This document is provided by the CRAN project to make package development more friendlier. It includes the official standards for files and naming conventions related to R package development.

A brief article explaining about writing functions, classes and methods for R package development. This is very abstract and useful in package development specially as it is focusing on object oriented programming and S formulas.

A book explaining R and its ability in detail, for example regarding functions, data, abilities and limitations of R. There is also a section for R packages, which has valuable information. Writing functions is very crucial in R package development therefore going through this document is worth. Several packages also include data-sets in them. While you develop functions similarly we can develop data-sets as well. There are sections which includes information regarding data-sets as well.

Another book that will be useful in understanding how R functions and data-sets can be used in R package development.

Useful book related to object oriented programming, functions and data objects related to R. Very thorough and scrutinized information with valuable explanation which makes things more clearer.

Easy implementation of packages mentioned in these cheatsheets. Very essential for someone who is interested in doing R related stuff efficient and eloquent.

Notes related to RMarkdown, very useful for vignette building.

Using GitHub for package development is very useful, specially when it comes to sharing and version control. This book explains it all with simplicity.

1) Coding Standards (Coding to Understand)

  • Focus on naming conventions.
  • Focus on input parameters and outputs.
  • Focus on indentation.
  • Comment regularly to make sense of the functions.
Sample Code

Sample Code

2) Package Structure

  • Very Important.
  • Initially few files will be originated in the designated project folder.
  • Over time we might add folders or create files manually.
  • Example - tests directory, README.Rmd, …

Package structure inside your project folder.

3) DESCRIPTION file

  • File explaining basic things related to your package.
  • Example - package name, other packages needed, authors name, …
  • Can edit manually or use specific R package.

After changes the DESCRIPTION file

4) README file

  • Very much optional.
  • Only used in related to GitHub submission.
  • Using Rmarkdown to generate a GitHub output document.

Rmarkdown document

GitHub document

Preview of GitHub document

5) /R directory

  • Most important directory.
  • The place where all your R code is written by you.
  • Best to have separate R script files for each function.
  • Need to have a R script file for Data as well.
  • R scripts can be modified further in order to create RDocumentation files(Rd files).
  • These RDocumentation files will explain about the function.
  • Processed R script files will automatically generate Rd files in the man directory.

R script file with necessary roxygen tags to develop RDocumentation files.

6) /data directory

  • Not compulsory.
  • Easy to use your own data therefore its worth it.
  • This directory will include the data-sets.
  • R directory can have an R script to generate Rd files for these data-sets.

data directory which includes data-sets.

Rscript file which includes necessary roxygen tags to generate Rd files.

7) /tests directory

  • If your package is going to be in CRAN or going to be in a platform with large range of users this would be useful.
  • Unit tests to check if functions are working properly.
  • Testing if the data sets are in proper form.

tests directory and files in side that directory.

sub directory testthat which includes test R scripts for all functions and data sets.

R script to test a function.

R script to test a data set.

8) /man directory

  • This directory will include the Rd files for all functions and data sets.
  • If you use roxygen tags there is no need to manually type them.

With the help of R script files these RDocumentation files will be generated for each function and will be in the man directory.

The RDocumentation files can be processed into html outputs or into a pdf manual.

Rd file of a data-set which is created with the help of data R script.

Html file which is generated with the help of Rd file for the data-set.

9) NAMSESPACE file

  • A file which will have all the functions that you created for your package.
  • If a function is exported then it will be in this file.

NAMESPACE file and its components.

10) .Rbuildignore file

  • A document which includes what kind of files should not be used when building the package.
  • Extensions of a file or partial or full name of the file can be added into this document.

.Rbuildignore file of fitODBOD package.

11) .gitignore file

  • A document which includes what kind of files should not be pushed to the GitHub repository.
  • Extensions of a file or partial or full name of the file can be added into this document.

.gitignore file of fitODBOD package.

Building the Package

All the files should be in their respective directories and names should not be changed for files or folders manually if they are created automatically. After checking all of this we can proceed to building the package. This process has 9 steps and below is a diagram to show how it works.

This process is explained through a small presentation here. This presentation also can be used when you need to update your package version.

Distributing the Package

There are several ways of distributing your package. They are mainly

  1. tar.qz version.
  2. zip version.
  3. GitHub Repository.
  4. The project folder which includes the package.
  5. Submit to CRAN or Bioconductor.

I have written two posts related to R packages as well. One is How to find your R package and Second is Using R packages to develop your own package. These two posts will be very useful as well.

THANK YOU

Related