Thoughts on how best to use R package data with students.
Data
Author
Derek H. Ogle
Published
Dec 19, 2022
Modified
Dec 20, 2022
Note
No special packages are loaded for use in this post.
Introduction
A large number and variety of data sets are provided in the FSA and FSAdata packages, which I collectively call here the “fishR data.” Lists of these data sets are available in alphabetical order or arranged by fisheries topic.1 Each item on those lists links to a documentation file that describes the origin of the data, its variables, and other details. This is a rich source of open-source data that can be used for teaching purposes.
1 A more comprehensive list of fisheries data in all CRAN packages is here.
As with data in any package, data sets in FSA and FSAdata can be accessed with data() by giving the name of the data set as the first argument and the package name in package=. For example, the WalleyeErie2 data from FSAdata are loaded below.
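The call that produces the output below is not shown; a minimal sketch of what it would look like, assuming the FSAdata package is already installed, is:

```r
# Assumes FSAdata is installed, e.g., via install.packages("FSAdata")
data(WalleyeErie2, package = "FSAdata")  # load the data set into the workspace
head(WalleyeErie2)                       # show the first six rows
```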
#R|     setID loc grid year  tl   w  sex    mat age
#R| 1 2003001   1  940 2003 360 460 male mature   2
#R| 2 2003001   1  940 2003 371 571 male mature   2
#R| 3 2003001   1  940 2003 375 507 male mature   2
#R| 4 2003001   1  940 2003 375 584 male mature   2
#R| 5 2003001   1  940 2003 375 537 male mature   2
#R| 6 2003001   1  940 2003 376 553 male mature   2
While this method for accessing these data is efficient, I don’t like to use it with students because in the “real world” they will not access data from an R package; rather, they will use their own data stored in some other format. With students just learning R, I usually have them work with CSV files that I either provide or that they produce themselves.2 They then load the data with read.csv() from base R.3
2 We also discuss the advantages of CSV files – lightweight, not proprietary, etc.
3 For more advanced students I will use read_csv() from readr.
Using CSV Files
To make it easier to use the fishR data as CSV files, we have provided links to the raw CSV files in the R documentation,4 in the on-line documentation, and in the PDF documentation on CRAN. In all instances you will see a highlighted “CSV file” link at the end of the description in the “source” section of the documentation (Figure 1). Clicking this link will bring up the raw CSV file, which can then be saved to your personal computer.5
4 For example, try ?FSAdata::WalleyeErie2.
5 Alternatively, right-click on the CSV link to save the file.
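Once the file is saved, loading it takes a single call to read.csv(). The sketch below runs on its own only because it first writes the six rows shown earlier in this post to a temporary folder as a stand-in for the downloaded file. The file name, its location, and the comma-separated layout are assumptions; students would instead point to wherever they saved the real file.

```r
# Stand-in for the file saved from the "CSV file" link (an assumption so that
# this sketch is self-contained): the six rows shown earlier in this post
csv_path <- file.path(tempdir(), "WalleyeErie2.csv")
writeLines(c(
  "setID,loc,grid,year,tl,w,sex,mat,age",
  "2003001,1,940,2003,360,460,male,mature,2",
  "2003001,1,940,2003,371,571,male,mature,2",
  "2003001,1,940,2003,375,507,male,mature,2",
  "2003001,1,940,2003,375,584,male,mature,2",
  "2003001,1,940,2003,375,537,male,mature,2",
  "2003001,1,940,2003,376,553,male,mature,2"
), csv_path)

# Read the locally stored CSV file into a data frame with base R
WalleyeErie2 <- read.csv(csv_path)
str(WalleyeErie2)  # check variable names and types
```

In a class setting, csv_path would simply be the path to the saved file, for example "WalleyeErie2.csv" if it sits in the student's working directory.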
Students can link directly to the URL for the CSV file, but this requires an internet connection each time the data are loaded and does not help students learn how to organize data on their own computers.
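A minimal sketch of reading directly from a URL follows. The address used here is an assumption about where the raw file is hosted; copy the actual address from the “CSV file” link in the documentation.

```r
# ASSUMED address of the raw CSV file; replace it with the address behind the
# "CSV file" link in the WalleyeErie2 documentation if it differs
csv_url <- "https://raw.githubusercontent.com/fishR-Core-Team/FSAdata/main/data-raw/WalleyeErie2.csv"

# read.csv() accepts a URL in place of a local file name (needs internet access)
WalleyeErie2 <- read.csv(csv_url)
head(WalleyeErie2)
```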
#R|     setID loc grid year  tl   w  sex    mat age
#R| 1 2003001   1  940 2003 360 460 male mature   2
#R| 2 2003001   1  940 2003 371 571 male mature   2
#R| 3 2003001   1  940 2003 375 507 male mature   2
#R| 4 2003001   1  940 2003 375 584 male mature   2
#R| 5 2003001   1  940 2003 375 537 male mature   2
#R| 6 2003001   1  940 2003 376 553 male mature   2
However, this last example also demonstrates how an instructor could link directly to the CSV file in the resources they provide to students.
Conclusion
In summary, we hope you will take advantage of the data resources provided in the FSA and FSAdata packages. However, we encourage you not to have students access the data through data() but instead to use the CSV files linked to in the documentation as described above.