The first thing I had to face was debugging the application. RStudio lets you select individual sections of code and run them. This helps a lot when working with R Markdown, since you cannot set a breakpoint there in Debug mode, and you can select and run lines anywhere in the file.
Moreover, R takes the variables referenced in those lines from the global environment. So, to check how a particular function works, it is enough to create global variables, run the code of the function line by line, and watch what happens. I work in RStudio, where you can see the values of these variables in the Global Environment tab.
You can create or change the values of these variables via the Console.
But besides variables, there is another problem. When you execute commands line by line, R may not find the code of the functions called in the lines you run. Those functions also need to be loaded into memory first. To load one, open the file that contains the function's code and click the Source button.
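The line-by-line approach described above can be sketched roughly as follows. The function `scale_values()` and its arguments are made up for illustration; the point is only that the global variables carry the same names as the function's arguments.

```r
# Hypothetical function we want to inspect (name and logic are invented).
scale_values <- function(x, multiplier) {
  centered <- x - mean(x)
  scaled <- centered * multiplier
  round(scaled, 2)
}

# Step 1: create global variables named like the function's arguments
# (in RStudio you would typically type these in the Console).
x <- c(1, 2, 3, 4)
multiplier <- 10

# Step 2: select the body lines and run them one at a time,
# checking each new variable in the Global Environment tab.
centered <- x - mean(x)
scaled <- centered * multiplier
result <- round(scaled, 2)
print(result)
```

Because the body now runs at the top level, every intermediate value (`centered`, `scaled`) stays visible in the global environment after the line is executed.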
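Clicking Source is, as far as I can tell, the same as calling `source()` on the file. A minimal sketch, assuming a helper file `helpers.R` with a made-up function `add_tax()` (the file is written to a temporary folder here only so that the example is self-contained):

```r
# Stand-in for a file that would already exist in a real project.
helper_file <- file.path(tempdir(), "helpers.R")
writeLines(
  "add_tax <- function(price, rate = 0.2) price * (1 + rate)",
  helper_file
)

# Before sourcing, the function is not defined in the global environment.
known_before <- exists("add_tax")

# Equivalent to opening helpers.R in RStudio and clicking Source.
source(helper_file)
known_after <- exists("add_tax")

# Now lines that call add_tax() can be run one by one.
total <- add_tax(100)
print(total)
```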
There is also a classic Debug mode in RStudio. It lets you set breakpoints and run the code in debug mode, and it supports the browser() function: when R encounters it, code execution is interrupted so that you can debug the application. But in our project this was not widely used, because we worked mostly with R Markdown.
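A small sketch of how `browser()` can be placed inside a function. The function `average_positive()` is hypothetical, and the `interactive()` guard is my own addition so that the same file still runs unattended.

```r
# Hypothetical function used only for illustration.
average_positive <- function(values) {
  positives <- values[values > 0]
  # Pause here only in an interactive session (for example in RStudio),
  # so the script does not stop when it is run without a user.
  if (interactive()) browser()
  mean(positives)
}

result <- average_positive(c(-2, 4, 6))
print(result)
```

At the browser prompt you can inspect `positives`, type `n` to execute the next line, `c` to continue, or `Q` to quit debugging.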
The next thing I encountered in R is that there are two types of projects: a regular project (New Project) and a package project (R Package). When I joined the team, we had a mix of the two. It looked like a package, but it was never built, and we launched it through Rscript. Now, thanks to the efforts of my colleagues, we have a working R package.
A regular project (New Project) consists of R script files, where one file is connected to another through the source() function. Thus, when you run the script, you effectively get "one very large file" that includes all the files of the project. This is not always convenient and not very flexible.
Unlike a regular project, an R Package project invites us to write a library of R functions, which can then be installed on any machine and called from our own R script. There is one caveat: the functions are only available from an R script. Therefore, before you start working with them, you need to create such a script and write the calls to these functions in it. The script is launched from the console with the Rscript command. For this to work, the path to the Rscript.exe file must be registered in the environment variables (PATH). On my machine, this path looks like this: C:\Program Files\R\R-4.0.3\bin...
When developing your functions in a package project, you should use the load_all() function, which pulls all changes into memory. If you do not use it, then every time you change the code in the project you have to run the installation process for the changes to take effect, and R does not do that quickly.
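As a rough sketch of such a launch script: the file name `run_analysis.R`, the package name `mypackage` and the function `build_report()` are placeholders I made up, not names from our project. The script uses `devtools::load_all()` when it is started from the package folder and falls back to the installed package otherwise.

```r
# run_analysis.R: hypothetical entry script, started from a terminal with
#   Rscript run_analysis.R
# "mypackage" and build_report() are placeholder names.

package_name <- "mypackage"
in_package_dir <- file.exists("DESCRIPTION")

if (in_package_dir && requireNamespace("devtools", quietly = TRUE)) {
  # Development mode: load the current sources into memory, no installation.
  devtools::load_all(".")
  mode <- "development"
} else if (requireNamespace(package_name, quietly = TRUE)) {
  # Normal mode: attach the installed copy of the package.
  library(package_name, character.only = TRUE)
  mode <- "installed"
} else {
  mode <- "unavailable"
  message("Package '", package_name, "' is not available; nothing to run.")
}

# Call the package function only if it was actually loaded.
if (mode != "unavailable" && exists("build_report")) {
  build_report()
}
```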
Now about the R Package project. Unlike a simple project, it has a required structure and a set of special files:
- a DESCRIPTION file with the package description,
- a man folder for function descriptions,
- a NAMESPACE file with the list of functions made available by the package being created,
- a folder called R, which should contain your R code,
- .Rbuildignore,
- .Rhistory,
- .RData (the saved environment).
.Rhistory and .RData are present in both a package project and a regular project. You can also create an inst folder for additional resources and a tests folder for unit tests. For more details on how an R package works and why it works this way, see the link.
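To make the layout more tangible, here is a sketch that builds a bare skeleton by hand in a temporary folder and lists it. The package name `mypackage`, the function `hello()` and the file contents are invented; in practice RStudio or a helper package creates these files for you.

```r
# Build a minimal, hypothetical package skeleton in a temporary folder.
pkg_dir <- file.path(tempdir(), "mypackage")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "man"), showWarnings = FALSE)  # filled later by roxygen2

writeLines(
  c("Package: mypackage",
    "Title: What the Package Does",
    "Version: 0.0.1"),
  file.path(pkg_dir, "DESCRIPTION")
)
writeLines("export(hello)", file.path(pkg_dir, "NAMESPACE"))
writeLines("hello <- function() \"Hello\"", file.path(pkg_dir, "R", "hello.R"))
writeLines("^.*\\.Rproj$", file.path(pkg_dir, ".Rbuildignore"))

# List the files, including hidden ones such as .Rbuildignore.
files <- list.files(pkg_dir, all.files = TRUE, recursive = TRUE)
print(files)
```

The empty man folder does not appear in the listing, because `list.files()` with `recursive = TRUE` reports files only.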
When creating an R Package project, the roxygen2 utility is used. It helps you create documentation for your package. The idea is that you describe each function right in the code, and the utility transfers this description to the man folder, converting it into the required format, and adds information about the functions to the NAMESPACE file. Read more about roxygen2 here.
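A roxygen2 description looks roughly like this. The function `celsius_to_fahrenheit()` is just an illustrative example of my own; the `#'` comment lines and the `@param`, `@return`, `@examples` and `@export` tags are what roxygen2 reads.

```r
#' Convert temperature from Celsius to Fahrenheit
#'
#' @param celsius Numeric vector of temperatures in degrees Celsius.
#' @return Numeric vector of temperatures in degrees Fahrenheit.
#' @examples
#' celsius_to_fahrenheit(100)
#' @export
celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

# In a package project, the documentation files in man/ and the NAMESPACE
# entries are regenerated from these comments with:
# devtools::document()
```

The `@export` tag is what causes the function to be listed in NAMESPACE, so that it becomes available to users of the package.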
There are also useful packages for setting up a project. They are used in almost all instructions for creating one, at least in all those that I found on the Internet:
- devtools - the main package, which contains most of the commands for working with the project in a simplified form,
- usethis - a helper package that simplifies many routine operations,
- testthat - a package for writing unit tests,
- covr - a package for measuring how much of the code is covered by unit tests.
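For a taste of what a testthat unit test looks like, here is a minimal sketch. The function `word_count()` is invented for the example, and the `requireNamespace()` guard is only there so the snippet does not fail on a machine without testthat.

```r
# Hypothetical function under test: counts whitespace-separated words.
word_count <- function(text) {
  lengths(strsplit(trimws(text), "\\s+"))
}

if (requireNamespace("testthat", quietly = TRUE)) {
  library(testthat)

  test_that("word_count counts whitespace-separated words", {
    expect_equal(word_count("getting started with R"), 4)
    expect_equal(word_count("  padded   text "), 2)
  })
}

# In a package project, tests like this usually live in tests/testthat/,
# and coverage can then be measured with covr::package_coverage().
```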
All public packages for the R language, with their descriptions and documentation, are stored in CRAN - The Comprehensive R Archive Network (https://cran.r-project.org/).
For convenience, RStudio already has built-in tools for checking a newly created project (Check Package) and testing it (Test Package).
It seems I have told you everything I wanted to, but it is better to see once than to read a hundred times. Below are videos on how to get started with R:
- Getting started with R on RStudio. R project or R package
- Creating an R package project along with Unit tests and documentation
- Running and Debugging R Code for an R Package Project
In my experience, when you find yourself on a project with a new programming language, the biggest problem is most often not the language itself but the tools for working with it: configuration tools and environment settings. Hopefully, after reading this article, it will be much easier for many of you to get started with R.