Introduction

The size of fish at a previous time in their life is often estimated through “back-calculation.” Back-calculation of previous lengths requires accurate measurements of annual growth on calcified structures from individual fish and a suitable model that relates growth on the structure to growth of the fish.

The FishBC software is commonly used to measure lengths on a calcified structure and apply a back-calculation model to estimate length at previous ages. However, FishBC only works on outdated computers, there are no plans to update it, and it is not open source. The functionality in the RFishBC package is meant to replace FishBC. Methods for making measurements on images of calcified structures are demonstrated in this vignette. Using those measurements to back-calculate fish length at a previous age is demonstrated in the Compute Back-Calculated Lengths vignette.

 

 


Vignette Assumptions

Understand Back-Calculation

This vignette assumes that you have a basic understanding of how to back-calculate fish lengths at previous ages as described in the Short Introduction to Back-Calculation vignette. At the very least, you should be aware of what calcified structures and radial measurements are.

Static Structure Images

This vignette also assumes that you have static digital images of structures. The images must be of jpeg (.jpg), portable network graphics (.png), bitmap (.bmp), or TIFF format. Images will usually be obtained from a camera mounted on a microscope and connected to a computer. Below is an image of a Kiyi (Coregonus kiyi) scale.

 

Ideally, but not necessarily, the image will also contain an object of known length (e.g., a “scale-bar”) so that actual lengths on the structure can be found. If a scale-bar is absent, then the measured lengths will be on an arbitrary scale (i.e., the actual values will be meaningless but the proportion of the total structure radius to each annulus will be meaningful). An image of a Kiyi otolith with a 1-mm scale-bar is shown in later sections of this vignette.
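
The point about arbitrary scales can be checked with a few lines of base R. This is only an illustrative sketch: the radii are borrowed from the example output shown later in this vignette, and the rescaling factor k is a made-up value standing in for an unknown image scale.

```r
# Radii to five annuli and to the structure margin, in arbitrary image units
# (values borrowed from the example output later in this vignette).
rad    <- c(0.2208691, 0.2893299, 0.3259383, 0.4626601, 0.5017862)
radcap <- 0.5163737

# Proportion of the total structure radius reached at each annulus
prop <- rad / radcap

# Multiplying every measurement by the same (hypothetical) factor k leaves
# the proportions unchanged, which is why unscaled images are still usable.
k <- 3.7
prop_rescaled <- (k * rad) / (k * radcap)
all.equal(prop, prop_rescaled)
```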

The process described herein requires that all images to be processed be in the same directory or folder. To be most efficient (and simple), this directory should contain ONLY image files related to a particular project (e.g., one species for one water body for one year) and all image files should be of the same type (e.g., png or jpg). Additionally, as shown in the Processing Multiple Images section below, it may be more efficient if the image file names end with an underscore (i.e., a “_”) followed by the fish’s unique identification number (and then, of course, the file extension).
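
To make the naming convention concrete, the sketch below pulls the ID that follows the last underscore from a few file names using only base R. The helper id_from_filename() is hypothetical and is not part of RFishBC; the package's own getID(), shown later, is what you would normally use.

```r
# Hypothetical helper (NOT part of RFishBC): return the text that follows the
# last underscore in a file name, ignoring the directory and the extension.
id_from_filename <- function(fn) {
  base <- tools::file_path_sans_ext(basename(fn))
  sub("^.*_", "", base)
}

# File names that follow the "<anything>_<ID>.<ext>" convention
id_from_filename(c("Scale_1.jpg", "Scale_2.jpg", "Scale_1_D.JPG"))
```

A file name without an underscore is returned whole (minus the extension), which is a quick way to spot files that do not follow the convention.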

The working directory of R should be set with setwd() to the directory that contains the images. The following is an example of setting a working directory in R.

setwd("c:/work/aging/Kiyi2014")

The working directory may also be set interactively through a dialog box using the following code.[1]
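
A minimal sketch of one way to do this is given here. It assumes base R's choose.dir() on Windows and tk_choose.dir() from the tcltk package elsewhere; the wrapper name choose_wd() is hypothetical and only for illustration.

```r
# Hypothetical wrapper (not from RFishBC): choose the working directory in a
# dialog box. utils::choose.dir() exists only on Windows, so tcltk is used
# as the fallback on other systems.
choose_wd <- function() {
  # Outside an interactive session there is no dialog to show; do nothing.
  if (!interactive()) return(invisible(getwd()))
  dir <- if (.Platform$OS.type == "windows") {
    utils::choose.dir(caption = "Select the folder with the structure images")
  } else {
    tcltk::tk_choose.dir(caption = "Select the folder with the structure images")
  }
  # Both dialogs return NA when cancelled; only change directory on a real choice.
  if (!is.na(dir) && nzchar(dir)) setwd(dir)
  invisible(getwd())
}

choose_wd()
```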

Finally, note that the process described herein will produce one R data object file (.rds files; hereafter called “R data file”) for each set of measurements made on a structure image. These R data files will be saved in the current working directory (likely the directory with the structure image files). I suggest keeping the R data files and corresponding structure image files together at all times, as the R data files serve as archives of the data collected from the structure image files.

R Packages

The methods described in this vignette require the following packages.

library(RFishBC)
library(dplyr)    # for mutate(), inner_join(), %>%

 

 


Measure Radii

The digitizeRadii() function is used to

  1. Load a structure image.
  2. Optionally provide a scale for the image.
  3. Optionally interactively select the structure focus and margin (to form a linear transect on which annuli will be marked).
  4. Interactively select annuli on the structure image.
  5. Create an R data file that contains the radial measurements to selected annuli and other information about the structure and the data extraction process.

For example (but described more thoroughly further below), the line below identifies “Scale_1.jpg” as the structure image, gives this fish an identification number of “1”, labels this reading of the image as “DHO”,[2] and indicates that the structure margin or edge should not be considered an annulus. When this line is run, it will open the image,[3] allow the user to select points that represent a linear transect and annuli on that transect, and save information about this image and the results of this process to “Scale_1_DHO.rds”.[4]

digitizeRadii("Scale_1.jpg",id="1",reading="DHO",edgeIsAnnulus=FALSE)

Further specifics of digitizeRadii() and many of its arguments[5] are described below.

Basics

The digitizeRadii() function requires only three arguments.

  • img: File name (or names; see the Processing Multiple Images section) for the structure image (or images), which must be in the current working directory. A dialog box will be provided from which the image file (or files) can be selected if this argument is not given and you are using a Windows machine.
  • id=: The unique identifier (or identifiers if more than one image is provided in img) for the fish/structure(s). If this argument is missing, then the ID can be entered in a dialog box (if using Windows) or in the console. By default, the Windows dialog will be populated with the fish ID if that ID follows an underscore at the end of the filename (sans the extension).
  • edgeIsAnnulus=: A logical that indicates whether the point selected at the structure margin should be considered an annulus or not. If the fish was captured at a time when the margin shows growth, but not a complete year’s worth of growth, then use edgeIsAnnulus=FALSE. However, if the fish was captured before the current year’s growth commenced or after it has completed then use edgeIsAnnulus=TRUE. Identifying whether the structure margin is an annulus or not is critical to properly recording radial measurements. As such, there is no default for this argument (i.e., it MUST be set by the user).

Other optional arguments that are likely to be commonly used are:

  • reading=: Label for the reading. The reading= argument is primarily used when the structure is read more than once. However, I suggest giving a descriptive label to reading= even if there are no plans to read the structure again.[6]
  • windowSize=: A value that sets the size of the separate window in which the image will appear. This value will become the larger of the two dimensions, with the other value proportionate so that the original aspect ratio of the image is maintained. Note that the default is windowSize=7; so values larger than 7 represent a “zooming in.”
  • device=: The image will be opened in a separate window. This will happen seamlessly with many operating systems (especially if using a Windows machine). However, device="X11" may be needed with some versions of Mac OS.[7]

Finally, note that the R data file that will be created after the annuli have been selected will have the same name as the image file[8] with the suffix optionally provided in suffix= appended. If nothing is given in suffix=, then a suffix will be created from reading= (if it exists). For example, if the structure image file was named “Scale_1.jpg”, then the resultant R data file will be named “Scale_1_DHO.rds” if reading="DHO" and suffix= was not set or “Scale_1_TESTING.rds” if reading="DHO" and suffix="TESTING".
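
The naming rule just described can be sketched in base R as follows. The function rds_name() is hypothetical and is not how RFishBC builds the name internally; in particular, the name returned when neither suffix= nor reading= is given is an assumption made only for this illustration.

```r
# Hypothetical illustration (NOT the RFishBC internals) of how the R data
# file name follows from the image name, reading=, and suffix=.
rds_name <- function(img, reading = NULL, suffix = NULL) {
  base <- tools::file_path_sans_ext(basename(img))  # drop the image extension
  if (is.null(suffix)) suffix <- reading            # fall back to reading=
  if (is.null(suffix)) {
    paste0(base, ".rds")                            # assumed: no suffix at all
  } else {
    paste0(base, "_", suffix, ".rds")
  }
}

rds_name("Scale_1.jpg", reading = "DHO")
rds_name("Scale_1.jpg", reading = "DHO", suffix = "TESTING")
```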

By default, the fish ID will be shown in the top-left corner of the image. This may be moved by giving a different location to pos.info=. For example, pos.info="bottomright" would move this information to the bottom-right corner of the image. The color of this information may be changed with col.info= and the relative size may be changed with cex.info=.

At this point, the image may look something like that below.

 

Other arguments to digitizeRadii() are described in the specific sections below.

Setting the Scale

A scaling factor to convert measurements on the image to actual measurements on the structure is required if actual lengths, rather than arbitrary (but proportional) lengths, are needed.[9] This scaling factor may be calculated from a scale-bar found on the image or provided by the user.

Scale-bar On Image: If a scale-bar of known length exists on the image, then use scaleBar=TRUE with the actual length of the scale bar given in scaleBarLength=. You will then select the two end points of the scale-bar on the structure image prior to selecting points that represent annuli. Press the ‘f’ key (for “finished”) after selecting the end points of the scale-bar.[10] An appropriate scaling factor will be computed from your selections and the radial measurements on the image will be converted to actual lengths on the structure.

Separately Defined Scaling Factor: In applications where a scale-bar does not exist on the image, the user can provide a value to scalingFactor=, which will be multiplied by lengths on the structure image to derive actual lengths. One way to derive this scaling factor is to capture an image of the structure at a specific magnification on the microscope and then capture a separate image of an object of known length at that same magnification. Note that these captured images must be of the same size so that the aspect ratio is consistent. A scaling factor may then be computed from the image with the object of known length and applied to the structure image. This scaling factor can be found by giving findScalingFactor() the file name with the object of known length and that known length in knownLength=. The value returned from findScalingFactor() can then be given to scalingFactor= in digitizeRadii().[11]
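
The arithmetic behind a scaling factor is simple enough to sketch in base R. The function below is hypothetical and only mirrors the description above (a known length divided by the distance between two selected points on the image); it is not the code used by findScalingFactor(), and the point coordinates are made up.

```r
# Hypothetical illustration: a scaling factor is the known length of an
# object divided by its length as measured on the image.
scaling_factor <- function(p1, p2, knownLength) {
  image_length <- sqrt(sum((p2 - p1)^2))  # Euclidean distance between the points
  knownLength / image_length
}

# Two made-up end points of a 1-mm scale-bar, in image units
sf <- scaling_factor(c(0.1, 0.2), c(0.4, 0.6), knownLength = 1)

# A length measured on the image is multiplied by the factor to get mm
image_rad  <- 0.25
actual_rad <- sf * image_rad
c(scalingFactor = sf, actual = actual_rad)
```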

If a scaling factor is neither derived from a scale-bar nor provided in scalingFactor=, then the radial measurements returned by digitizeRadii() are simply proportional to the unknown actual lengths on the structure.

At this point (i.e., after having selected the scale-bar endpoints), the image may look something like that below.

 

Selecting a Transect

After the scaling factor has been determined (or provided) and if makeTransect=TRUE (the default), then you will select a transect on the structure image on which annular marks will be selected. This transect is selected by first selecting the structure focus and then selecting the structure margin/edge and pressing the ‘f’ key.[12] The color of the transect may be changed with col.transect=. The width of the transect may be increased by including a number greater than 1 in lwd.transect=.

At this point (i.e., after having selected the transect endpoints), the image may look something like that below.

 

If you prefer not to use a transect, for example if the “growth trajectory” is primarily curved, then set makeTransect=FALSE (which will then change snap2Transect= to FALSE). If you choose not to define a transect, then, in contrast to what is described in the next section, you will be prompted to select the structure center and successive annuli out to the margin.

 

Selecting Annuli

Once a transect has been identified on the structure (assuming that you are using a transect), then you can select points on the structure that represent annuli. Points are selected by clicking with the first (left) mouse button at a point on the image. The most recently selected point can be removed by pressing the ‘d’ key (for “delete”). When the last point has been selected, press the ‘f’ (for “finished”) key.

Selected points will be marked with the plotting character given in pch.sel= (defaults to a filled circle) with a color given in col.sel= (defaults to yellow). Any deleted points will be marked with the color and character in col.del= and pch.del= (defaults to a red circle with an “x” in it).

If using a linear transect, many users prefer that all selected points fall exactly on the transect. In practice, some points may be selected that are slightly off the transect. Selected points will be “moved” perpendicularly to fall exactly on the transect when snap2Transect=TRUE (the default).
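
Geometrically, moving a point perpendicularly onto the transect is an orthogonal projection onto the line through the focus and the margin. The base R sketch below shows that calculation under this assumption; snap_to_transect() is a hypothetical name and is not the function RFishBC uses internally.

```r
# Hypothetical illustration: orthogonally project a selected point onto the
# straight line through the structure focus and the structure margin.
snap_to_transect <- function(pt, focus, margin) {
  d <- margin - focus                       # direction of the transect
  t <- sum((pt - focus) * d) / sum(d * d)   # relative position along the transect
  focus + t * d                             # closest point on the transect
}

# A point selected slightly above a horizontal transect is moved straight down
snap_to_transect(pt = c(1, 0.3), focus = c(0, 0), margin = c(4, 0))

# The same idea on a diagonal transect
snap_to_transect(pt = c(2, 0), focus = c(0, 0), margin = c(2, 2))
```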

At this point, the image may look something like that below (note that two points were deleted in the selection process for this image).

 

When you have finished selecting points, information about your selections, including the calculated radial measurements, is saved to the R data file. The contents of this file are described further here.
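
For points that lie on a linear transect, a radial measurement is the straight-line distance from the focus to the selected point, multiplied by the scaling factor if there is one. The sketch below illustrates only that case with made-up coordinates; radii_from_points() is a hypothetical helper and not the calculation stored in the package.

```r
# Hypothetical illustration for the linear-transect case: distance from the
# focus to each selected point, optionally rescaled to actual units.
radii_from_points <- function(focus, annuli, scalingFactor = 1) {
  # annuli: two-column matrix (x, y) with one row per selected point
  d <- sweep(annuli, 2, focus)            # offsets of each point from the focus
  scalingFactor * sqrt(rowSums(d^2))      # Euclidean distances, rescaled
}

focus  <- c(0, 0)
annuli <- rbind(c(3, 4), c(6, 8), c(9, 12))   # made-up image coordinates

radii_from_points(focus, annuli)                        # arbitrary image units
radii_from_points(focus, annuli, scalingFactor = 0.01)  # made-up scaling factor
```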

 

Processing Multiple Images

It will be common to process a number of image files, one after another. The selection of the images can be made more efficient by supplying digitizeRadii() a vector of image file names and corresponding fish IDs. These vectors can be constructed in a variety of ways before calling digitizeRadii(). First, the user can simply type the image names and fish IDs into vectors; e.g.,

imgs <- c("Scale_1.jpg","Scale_2.jpg","Scale_3.jpg")
ids <- c("1","2","3")

However, if the image file names follow a general pattern, for example always being JPG files and always containing the word “Scale” or “Bass” or something similar, then a list of all image file names in the current working directory can be obtained with listFiles(). This function takes the common extension as its first argument and the common “other” words in other=. For example, the following code finds all JPG files that also contain the word “Scale” in the current working directory.

( imgs <- listFiles("jpg",other="Scale") )
#> [1] "Scale_1.jpg"   "Scale_1_A.jpg" "Scale_1_B.jpg" "Scale_1_C.jpg"
#> [5] "Scale_1_D.JPG" "Scale_2.jpg"   "Scale_3.jpg"

The user would still need to manually create a vector of corresponding fish IDs. However, if the image file names follow the idiom of having the fish ID after an underscore (i.e., “_”) at the end of the file name (not including the extension) then getID() can be used to efficiently extract a vector of fish IDs from the vector of file names. For example,

( ids <- getID(imgs) )
#> [1] "1" "A" "B" "C" "D" "2" "3"

With these vectors, the following call to digitizeRadii() will bring up the first image on which you can mark annuli as described above. When you have finished with the first image then the second image will automatically appear on which you can then mark annuli. This process will be repeated until the last image in the vector of images has been completed.

digitizeRadii(imgs,id=ids,reading="DHO",edgeIsAnnulus=FALSE)

Some directories may have such a large number of images that the user will not want to process them all at one time as the code above would do. In this case, the user could select a certain number, say 10, of the images to process at any one time. For example, the call to digitizeRadii() below would be used to process the first 10 images from the vector of image names.

digitizeRadii(imgs[1:10],id=ids[1:10],reading="DHO",edgeIsAnnulus=FALSE)

The second 10 images could then be processed with the following.

digitizeRadii(imgs[11:20],id=ids[11:20],reading="DHO",edgeIsAnnulus=FALSE)
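
Rather than typing index ranges by hand, the vectors can be split into batches ahead of time with base R. This is a sketch that uses made-up file names and a batch size of 10; the digitizeRadii() call is left as a comment because it needs RFishBC, real image files, and an interactive session.

```r
# Made-up file names and IDs for 23 hypothetical images
imgs <- sprintf("Scale_%d.jpg", 1:23)
ids  <- as.character(1:23)

# Assign each image to a batch of (at most) 10 and split both vectors the same way
batch       <- ceiling(seq_along(imgs) / 10)
img_batches <- split(imgs, batch)
id_batches  <- split(ids, batch)
lengths(img_batches)

# The first batch would then be processed with (not run here):
# digitizeRadii(img_batches[[1]], id = id_batches[[1]],
#               reading = "DHO", edgeIsAnnulus = FALSE)
```

Splitting this way also avoids missing values in the last batch, which indexing such as imgs[21:30] would produce when fewer than 30 images exist.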

If one does not include an image file name (or vector of image file names) in digitizeRadii() then a dialog box will appear in which a file name or names can be selected. If multiple file names are selected then digitizeRadii() will assume that the file names use the convention of having the fish ID after the last underscore. In this way, multiple files can be chosen from a dialog box, rather than by creating the vectors described above. In this case, the call to digitizeRadii() would look like the following.

digitizeRadii(reading="DHO",edgeIsAnnulus=FALSE)

Note that when processing multiple image files as described in this section, all of the options must be the same across all of the images. In the example above, all image files must use reading="DHO" and edgeIsAnnulus=FALSE.

 

Starting Over or Skipping an Image

The user can “start over” the processing of any image by pressing the “z” key at any time during the processing. Note however that this is a “hard reset” in the sense that all points selected prior to pressing the “z” key will be lost, the original unmarked image will be reloaded, and you will need to start over processing the image (i.e., marking the scale-bar, transect, and annuli again).

The user may also abort or skip processing an image that has been loaded by pressing the “q” key at any time during the processing. This will most likely be useful when processing multiple images at one time as described previously. For example, an image may appear that is unreadable such that annuli cannot be reliably marked on the image. Note that aborting processing an image will result in no R data file being created for that image.

If the user is processing multiple images as shown in the Processing Multiple Images section, then “q” will abort the current image and move to the next image. However, pressing “k” (i.e., “kill” the process) will abort the current image and NOT move on to any other images.

 

 


Setting Argument Defaults for a Session

As described above, digitizeRadii() has several arguments that provide flexibility when measuring radii on images. The default values for all of these arguments can be seen with RFBCoptions() (i.e., without any arguments). The value for any argument can be seen by appending the argument name to RFBCoptions() with a $. For example, the current setting for the makeTransect argument is TRUE as shown below.

RFBCoptions()$makeTransect
#> [1] TRUE

Default values for these arguments may be changed within digitizeRadii(). For example, the code below sets the “reading” label to “DHO”, sets the edge to not be considered an annulus, makes the transect line thicker, and indicates that a scale-bar with a known length of 0.6 mm is present.

digitizeRadii("Scale_1.jpg",id="1",reading="DHO",edgeIsAnnulus=FALSE,
              lwd.transect=3,scaleBar=TRUE,scaleBarLength=0.6)

However, changing the arguments within digitizeRadii() is inefficient if you will be processing many images with the same arguments. Thus, the default values for these arguments can be set for the entire session (i.e., until you change them or close R and open it again) by including the argument name set equal to the desired default value within RFBCoptions(). For example, if the code below is run at the beginning of a session (i.e., early in the script), then every call to digitizeRadii() after that will default to using “DHO” as the reading label, not treating the edge as an annulus, using a thicker line for the transect, and expecting a scale-bar with a known length of 0.6 mm on the image.

RFBCoptions(reading="DHO",edgeIsAnnulus=FALSE,lwd.transect=3,
            scaleBar=TRUE,scaleBarLength=0.6)

With these changes to the default settings, the last call to digitizeRadii() above could be simplified as shown below.

digitizeRadii("Scale_1.jpg",id="1")

Argument values can still be changed from the default values for a particular call to digitizeRadii() by including that argument in the specific call. For example, if the edge was an annulus for the structure on only one of the images, then include edgeIsAnnulus=TRUE in the call to digitizeRadii() for that image.

 

 


Data from One Structure

The radial measurements recorded from one structure may be seen by submitting the R data file name to combineData().[13] By default, the radial measurement that includes the “plus-growth” will be omitted (as this radial measurement is equal to the radial measurement at capture and is thus redundant with the value in the radcap column).[14]

combineData("Scale_1_DHO.rds")
#>   id reading agecap ann       rad    radcap
#> 1  1     DHO      5   1 0.2208691 0.5163737
#> 2  1     DHO      5   2 0.2893299 0.5163737
#> 3  1     DHO      5   3 0.3259383 0.5163737
#> 4  1     DHO      5   4 0.4626601 0.5163737
#> 5  1     DHO      5   5 0.5017862 0.5163737

By default the data are shown in “long” format where each row consists of one radial measurement with all radial measurements for an individual fish distributed across several rows. The radial data can be shown in “wide” format where each row consists of all the radial measurements (in separate columns) for an individual fish by including formatOut="wide".

combineData("Scale_1_DHO.rds",formatOut="wide")
#>   id reading agecap    radcap      rad1      rad2      rad3      rad4      rad5
#> 1  1     DHO      5 0.5163737 0.2208691 0.2893299 0.3259383 0.4626601 0.5017862
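
To see how the two formats relate, the sketch below rebuilds a small “long” data.frame by hand (values copied from the example output in this vignette for fish 1 and 2) and converts it to “wide” with base R's reshape(). This is only an illustration of the formats; combineData() does the conversion for you with formatOut="wide".

```r
# "Long" format typed in by hand: one row per radial measurement
long <- data.frame(
  id      = c(rep("1", 5), rep("2", 4)),
  reading = "DHO",
  agecap  = c(rep(5, 5), rep(4, 4)),
  ann     = c(1:5, 1:4),
  rad     = c(0.2208691, 0.2893299, 0.3259383, 0.4626601, 0.5017862,
              0.1377625, 0.2236611, 0.3026716, 0.3530492),
  radcap  = c(rep(0.5163737, 5), rep(0.3908662, 4)),
  stringsAsFactors = FALSE
)

# "Wide" format: one row per fish, one rad column per annulus
wide <- reshape(long, idvar = c("id", "reading", "agecap", "radcap"),
                timevar = "ann", v.names = "rad", direction = "wide", sep = "")
rownames(wide) <- NULL
wide
```

Fish with fewer annuli than the oldest fish get NA in the columns for the annuli they do not have.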

 

 


Combine Data from Multiple Structures

Of course, most analyses will consist of collecting radial measurements from structures from many fish. For example, suppose that “Scale_1.jpg” and “Scale_2.jpg” were both read by “DHO” using the following code. Following this, “Scale_1_DHO.rds” and “Scale_2_DHO.rds” would both exist in the current working directory.

RFBCoptions(reading="DHO",edgeIsAnnulus=FALSE)
digitizeRadii()  # Select both images in a dialog box

Radial measurements from multiple structures can be combined into one data.frame with combineData() if the appropriate R data file names are listed in a vector. The listFiles() function may be used to identify all filenames in the current working directory that have the file extension given in the first argument. For example, all files in the current working directory with the “rds” extension are identified below.

listFiles("rds")
#> [1] "DWS_Oto_89765_DHO.rds"  "Oto140306_DHO.rds"      "Oto140306_OHD.rds"     
#> [4] "Scale_1_DHO.rds"        "Scale_1_ODH.rds"        "Scale_1_OHD.rds"       
#> [7] "Scale_2_DHO.rds"        "Scale_2_OLDwNoNote.rds" "Scale_3_DHO.rds"

This list of names can be further filtered by including other key words for the filenames in other=. In this case, the list is limited to those files with both “Scale” and “DHO” in the name.

( fns <- listFiles("rds",other=c("Scale","DHO")) )
#> [1] "Scale_1_DHO.rds" "Scale_2_DHO.rds" "Scale_3_DHO.rds"

The listFiles() result should be saved to an object so that the names can be given to combineData() as shown below.[15]

( dfrad <- combineData(fns) )
#>    id reading agecap ann       rad    radcap
#> 1   1     DHO      5   1 0.2208691 0.5163737
#> 2   1     DHO      5   2 0.2893299 0.5163737
#> 3   1     DHO      5   3 0.3259383 0.5163737
#> 4   1     DHO      5   4 0.4626601 0.5163737
#> 5   1     DHO      5   5 0.5017862 0.5163737
#> 6   2     DHO      4   1 0.1377625 0.3908662
#> 7   2     DHO      4   2 0.2236611 0.3908662
#> 8   2     DHO      4   3 0.3026716 0.3908662
#> 9   2     DHO      4   4 0.3530492 0.3908662
#> 10  3     DHO      1   1 0.5202232 0.5202232

Again, the data can be shown in “wide” format by including formatOut="wide".

( dfrad2 <- combineData(fns,formatOut="wide") )
#>   id reading agecap    radcap      rad1      rad2      rad3      rad4      rad5
#> 1  1     DHO      5 0.5163737 0.2208691 0.2893299 0.3259383 0.4626601 0.5017862
#> 2  2     DHO      4 0.3908662 0.1377625 0.2236611 0.3026716 0.3530492        NA
#> 3  3     DHO      1 0.5202232 0.5202232        NA        NA        NA        NA

 

 


Output Data File

Other information about the fish (e.g., location of capture, length, sex) is likely held in a separate file. Below, example “other” data are loaded into the dffish data.frame. Note that the id variable created from processing the structure images above is of character type. In this case, read.csv() reads the id variable from the external data file as numeric because the unique IDs were simple numbers. The second line of code below converts these numeric IDs to characters so that this data.frame can be joined with the radial measurements data.frame from above.[16]

dffish <- read.csv("FishData.csv",stringsAsFactors=FALSE) %>%
  mutate(id=as.character(id))

The data in the dffish and dfrad data.frames are then joined by the common id variable using inner_join().

fishdat <- dffish %>%
  inner_join(dfrad,by="id")
fishdat
#>    id  loc sex len reading agecap ann       rad    radcap
#> 1   1 MI-5   M 189     DHO      5   1 0.2208691 0.5163737
#> 2   1 MI-5   M 189     DHO      5   2 0.2893299 0.5163737
#> 3   1 MI-5   M 189     DHO      5   3 0.3259383 0.5163737
#> 4   1 MI-5   M 189     DHO      5   4 0.4626601 0.5163737
#> 5   1 MI-5   M 189     DHO      5   5 0.5017862 0.5163737
#> 6   2 MI-6   F 210     DHO      4   1 0.1377625 0.3908662
#> 7   2 MI-6   F 210     DHO      4   2 0.2236611 0.3908662
#> 8   2 MI-6   F 210     DHO      4   3 0.3026716 0.3908662
#> 9   2 MI-6   F 210     DHO      4   4 0.3530492 0.3908662
#> 10  3 MI-5   M 145     DHO      1   1 0.5202232 0.5202232
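
Because an inner join silently drops rows without a match, it can be worth checking for unmatched IDs before joining. The sketch below uses base R only (merge() in place of inner_join()) and small made-up data.frames; the ID “4” is a hypothetical fish with radii but no record in the fish data.

```r
# Made-up stand-ins for the fish data and the radial measurements
dffish <- data.frame(id = c(1, 2, 3), len = c(189, 210, 145))
dfrad  <- data.frame(id = c("1", "1", "2", "4"),
                     ann = c(1, 2, 1, 1),
                     stringsAsFactors = FALSE)

# Make the key the same type in both data.frames before joining
dffish$id <- as.character(dffish$id)

# IDs present in one data.frame but not the other
no_fish_record <- setdiff(dfrad$id, dffish$id)
no_radii       <- setdiff(dffish$id, dfrad$id)
no_fish_record
no_radii

# Base R equivalent of an inner join on id
joined <- merge(dffish, dfrad, by = "id")
joined
```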

One could also join with the “wide” data in dfrad2.

fishdat2 <- dffish %>%
  inner_join(dfrad2,by="id")
fishdat2
#>   id  loc sex len reading agecap    radcap      rad1      rad2      rad3
#> 1  1 MI-5   M 189     DHO      5 0.5163737 0.2208691 0.2893299 0.3259383
#> 2  2 MI-6   F 210     DHO      4 0.3908662 0.1377625 0.2236611 0.3026716
#> 3  3 MI-5   M 145     DHO      1 0.5202232 0.5202232        NA        NA
#>        rad4      rad5
#> 1 0.4626601 0.5017862
#> 2 0.3530492        NA
#> 3        NA        NA

Either data.frame can be written to a “comma-separated values” (CSV) file[17] with write.csv() using the R object name (e.g., fishdat or fishdat2) as the first argument and a name for the file in file=. Additionally, I prefer to have non-quoted values by using quote=FALSE and no row names by using row.names=FALSE. For example, the “one-measurement-per-line” data can be output to “Kiyi2014_BCs.csv” as follows.

write.csv(fishdat,file="Kiyi2014_BCs.csv",quote=FALSE,row.names=FALSE)
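
One thing to keep in mind when the CSV file is read back into R is that the id column may again be read as numeric. The sketch below writes a small made-up data.frame to a temporary file and reads it back both ways; the IDs with leading zeros are hypothetical and chosen only to make the difference visible.

```r
# Made-up data with IDs that only survive intact as character strings
fishdat_demo <- data.frame(id = c("001", "002"), len = c(189, 210),
                           stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".csv")
write.csv(fishdat_demo, file = tmp, quote = FALSE, row.names = FALSE)

# Default: read.csv() guesses the type, so the IDs become numbers
back_default <- read.csv(tmp)
class(back_default$id)

# Declaring the column type keeps the IDs as they were written
back_char <- read.csv(tmp, colClasses = c(id = "character"))
back_char$id

unlink(tmp)  # remove the temporary file
```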

 

 

Footnotes


  1. In RStudio, the working directory can be set with any of the options under the Session … Set Working Directory menu. My preference is to start a script that will contain all of the code described later in this vignette. If this script is saved to the same directory with the structure images then the working directory can be set in RStudio with the Session … Set Working Directory … to Source File Location menu items. I then copy the resultant setwd() code to my script so that I do not have to use the menu items when I run this script again.

  2. I use my initials (“DHO”) here for reading= simply as an example. You will likely want to use something else.

  3. This assumes that the Scale_1.jpg file is in the current working directory. Use getwd() to see the current working directory.

  4. Actually, much more than the radial measurements is recorded in the R data file (see here). Also note that the radii are on an arbitrary scale in this case because no scale-bar was available on the image.

  5. The many specific arguments to digitizeRadii() are controlled with RFBCoptions() (described later) and described in detail here.

  6. A more detailed description about the structure can be given to description= for saving in the R data file for future use. For example, one may use description="Kiyi scale read once by Ogle on 22-Apr-18" to provide more information about the structure reading.

  7. You may find this resource useful with respect to the X11 device and Mac OS use.

  8. The extension (e.g., “png” or “jpg”) will not be included in the resultant R data object file.

  9. Some back-calculation methods require knowing the relationship between actual scale length and fish length. See the Short Introduction to Back-Calculation vignette for more details.

  10. The last selected point can be deleted by pressing the ‘d’ key. This can be done multiple times so that several of the most recent selections can be deleted. The deleted points will be marked with the plotting character and color in pch.del= and col.del= (defaults to a red circle with an ‘x’ in it).

  11. Thus, findScalingFactor() would be run prior to digitizeRadii().

  12. The last selected point can be deleted by pressing the ‘d’ key. This can be done multiple times so that several of the most recent selections can be deleted. The deleted points will be marked with the plotting character and color in pch.del= and col.del= (defaults to a red circle with an ‘x’ in it).

  13. If no file name is given, then a dialog box will appear from which the data file can be selected. It is also assumed that the file is in the current working directory.

  14. To include the radial measurement with “plus-growth”, use deletePlusGrowth=FALSE.

  15. If no file names are given, then a dialog box will appear from which multiple data files can be selected.

  16. The two files cannot be joined by the “id” variable if the “id” variables are of different types (e.g., character and numeric) in the two files. Thus, as shown here, they must be coerced to be the same type. This line of code would not be needed if the “id” variables were of the same type in the two files, as would likely occur if the “id” variable was not simply numbers.

  17. CSV files are small, portable, and can be opened directly in most spreadsheet software.