Replace compSlopes() with emtrends()

Using emtrends() from emmeans as an alternative to compSlopes() which was removed from FSA.
FSA
emmeans
Author

Derek H. Ogle

Published

May 11, 2021

Modified

Jan 1, 2023

Note

The following packages are loaded for use below.

library(dplyr)    ## for filter(), mutate()
library(emmeans)  ## for emtrends()
Warning

Some functions illustrated below were in the FSA package but have now been removed and put into the non-released FSAmisc package that I maintain. These functions are used below only to show what could be done in older versions of FSA but should now be done as described in this post. DO NOT USE any of the functions below that begin with FSAmisc::.

 

Introduction

compSlopes() in FSA prior to v0.9.0 was used to statistically compare slopes for all pairs of groups in an indicator/dummy variable regression (IVR). However, the excellent emtrends() in emmmeans is a more general and strongly principled function for this purpose. As such, compSlopes() was removed from FSA in early 2022. The purpose of this post is to demonstrate how to use emtrends() for the same purpose for which compSlopes() was used.

Important

The results from compSlopes() and emtrends() will not be identical because they use different methods to correct for multiple comparisons when comparing pairs of slopes.

 

Example Data

Examples below use the Mirex data set from FSA, which contains the concentration of mirex in the tissue and the body weight of two species of salmon (chinook and coho) captured in six years. The year variable is converted to a factor for modeling purposes. To keep the presentation simple, data from only three years will be used here.

data(Mirex,package="FSA")
Mirex <- Mirex |>
  filter(year>1990) |>
  mutate(year=factor(year))
head(Mirex)
#R|    year weight mirex species
#R|  1 1992    1.9  0.10    coho
#R|  2 1992    2.0  0.09    coho
#R|  3 1992    2.4  0.12 chinook
#R|  4 1992    2.6  0.15    coho
#R|  5 1992    7.5  0.13 chinook
#R|  6 1992    7.9  0.18    coho

The lm() below fits the IVR to determine if the relationship between mirex concentration and weight of the salmon differs by year.1

  • 1 The three terms on the left side of the formula are the covariate (i.e., weight), main factor (i.e., year), and the interaction between the two (defined with the :).

  • lm1 <- lm(mirex~weight+year+weight:year,data=Mirex)

    The weight:year interaction term p-value suggests that the slopes (i.e., relationship between mirex concentration and salmon weight) differs among some pair(s) of the three years.

    anova(lm1)
    #R|  Analysis of Variance Table
    #R|  
    #R|  Response: mirex
    #R|              Df   Sum Sq  Mean Sq F value    Pr(>F)    
    #R|  weight       1 0.115886 0.115886 30.6459 1.615e-06 ***
    #R|  year         2 0.205825 0.102912 27.2149 2.028e-08 ***
    #R|  weight:year  2 0.042176 0.021088  5.5767   0.00694 ** 
    #R|  Residuals   44 0.166385 0.003781                      
    #R|  ---
    #R|  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    The next step is to determine which pair(s) of slopes differ significantly, which was the purpose of compSlopes() and is the purpose of emtrends().

     

    What compSlopes() Did

    compSlopes() was simple in that it only required the saved lm() object as an argument. Its returned results should be assigned to an object for further examination.2

  • 2 compSlopes() had a print() function for nicely printing the results. However, here we will look at each component separately to ease comparison with the emtrends() results.

  • csfsa <- FSAmisc::compSlopes(lm1)

    The $comparisons component in the compSlopes() object contained the results from comparing all pairs of slopes. Each paired comparison was a row in these results with the groups compared under comparison, the differences in sample slopes under diff, 95% confidence intervals for the difference in slopes under 95% LCI and 95% UCI, and unadjusted and adjusted (for multiple comparisons) p-values for the hypothesis test comparing the slopes under p.unadj and p.adj, respectively.

    csfsa$comparisons
    #R|    comparison     diff  95% LCI  95% UCI p.unadj   p.adj
    #R|  1  1996-1992 -0.01428 -0.02581 -0.00275 0.01638 0.03276
    #R|  2  1999-1992 -0.02267 -0.03668 -0.00867 0.00214 0.00642
    #R|  3  1999-1996 -0.00839 -0.02020  0.00341 0.15895 0.15895

    For example, these results suggest that the slopes for 1996 and 1992 ARE statistically different (first row), but the slopes for 1999 and 1996 are NOT statistically different (last row).

    The $slope component in the compSoloes() object contained results specific to each slope. The groups were under level, sample slopes under slopes, 95% confidence intervals for the slopes under 95% LCI and 95% UCI, and unadjusted and adjusted p-values for the test if the slope is different from 0 under p.unadj and p.adj, respectively.

    csfsa$slope
    #R|    level  slopes  95% LCI 95% UCI p.unadj   p.adj
    #R|  3  1999 0.00386 -0.00620 0.01393 0.44342 0.44342
    #R|  2  1996 0.01225  0.00609 0.01842 0.00024 0.00048
    #R|  1  1992 0.02653  0.01679 0.03628 0.00000 0.00000

    For example, the slope for 1992 (last row) appears to be significantly different from 0 and may be between 0.01679 and 0.03628.

     

    What emtrends() Does

    Similar results can be obtained with emtrends() from emmeans using the fitted lm() object as the first argument, a specs= argument with pairwise~ followed by the name of the factor variable from the lm() model (year in this case), and var= followed by the name of the covariate from the lm() model (weight in this case), which must be in quotes. The results should be assigned to an object so that specific results can be extracted.

    cs <- emtrends(lm1,specs=pairwise~year,var="weight")

    The object saved from emtrends() is then given as the first argument to summary(), which also requires infer=TRUE if you would like p-values to be calculated.3

  • 3 emmeas does not compute p-values by default.

  • css <- summary(cs,infer=TRUE)

    The $contrasts component in this saved object contains the results for comparing all pairs of slopes. Each paired comparison is a row with the groups compared under contrasts, the difference in sample slopes under diff, the standard error of the difference in sample slopes under SE, the degrees-of-freedom under df, a 95% confidence interval for the difference in slopes under lower.CL and upper.CL, and the t test statistic and p-value adjusted for multiple comparisons for testing a difference in slopes under t.ratio and p.value, respectively.

    css$contrasts
    #R|   contrast            estimate      SE df  lower.CL upper.CL t.ratio p.value
    #R|   year1992 - year1996  0.01428 0.00572 44  0.000403   0.0282   2.496  0.0425
    #R|   year1992 - year1999  0.02267 0.00695 44  0.005815   0.0395   3.262  0.0059
    #R|   year1996 - year1999  0.00839 0.00586 44 -0.005813   0.0226   1.433  0.3331
    #R|  
    #R|  Confidence level used: 0.95 
    #R|  Conf-level adjustment: tukey method for comparing a family of 3 estimates 
    #R|  P value adjustment: tukey method for comparing a family of 3 estimates

    Comparing these results to the $comparison component from compSlopes() shows that the difference in sample slopes are the same, but that the confidence interval values and p-values are slightly different. Again, this is due to emtrends() and compSlopes() using different methods of adjusting for multiple comparisons. These differences did not result in different conclusions in this case, but they could, especially if the p-values are near the rejection criterion.

    The $emtrends component contains results for each slope with the groups under the name of the factor variable (year in this example), the sample slopes under xxx.trend (where xxx is replaced with the name of the covariate variable, weight in this example), standard errors of the sample slopes under SE, degrees-of-freedom under df, 95% confidence intervals for the slope under lower.CL and upper.CL, and t test statistics and p-values adjusted for multiple comparisons for testing that the slope is not equal to zero under t.ratio and p.adj, respectively.

    css$emtrends
    #R|   year weight.trend      SE df lower.CL upper.CL t.ratio p.value
    #R|   1992      0.02653 0.00483 44  0.01679   0.0363   5.489  <.0001
    #R|   1996      0.01225 0.00306 44  0.00609   0.0184   4.004  0.0002
    #R|   1999      0.00386 0.00499 44 -0.00620   0.0139   0.773  0.4434
    #R|  
    #R|  Confidence level used: 0.95

    Here the results match exactly with those in the $slopes component of compSlopes().

     

    Conclusion

    emtrends() in emmeans provides a more general solution to comparing multiple slopes than what was used in compSlopes() in FSA prior to v0.9.0. As compSlopes() was removed from FSA in 2022, you should now use emtrends() for this purpose.

    emmeans has extensive vignettes that further explain its use. Please see this discussion for the use case described in this post. Their “Basics” vignette is also useful.

    In the next post I will demonstrate how to use emmeans() from the emmeans package to replace compIntercepts(), which was also removed from FSA.

    Note

    This change to FSA does not affect anything in Ogle (2016).

     

    References

    Ogle, D. H. 2016. Introductory Fisheries Analyses with R. CRC Press, Boca Raton, FL.

    Reuse

    Citation

    BibTeX citation:
    @online{h. ogle2021,
      author = {H. Ogle, Derek},
      title = {Replace {compSlopes()} with Emtrends()},
      date = {2021-05-11},
      url = {https://fishr-core-team.github.io/fishR//blog/posts/2021-5-11_compSlopes-replacement},
      langid = {en}
    }
    
    For attribution, please cite this work as:
    H. Ogle, D. 2021, May 11. Replace compSlopes() with emtrends(). https://fishr-core-team.github.io/fishR//blog/posts/2021-5-11_compSlopes-replacement.