Getting started

Here we will show a very basic example how to use grandR to perform kinetic modeling. For much more vignettes, see the grandR website.

We will use data from [1]. These are SLAM-seq data from multiple time points (1h,2h,3h,4h) after infecting Calu-3 cells with SARS-CoV-2 (or mock as control).

We first load the grandR package and the read the GRAND-SLAM that is part of the grandR package:

suppressPackageStartupMessages({
    library(grandR)
    library(ggplot2)
    library(patchwork)
})

sars <- ReadGRAND(system.file("extdata", "sars.tsv.gz", package = "grandR"),
                 design=c(Design$Condition,Design$dur.4sU,Design$Replicate))
sars <- Normalize(sars)
sars
grandR:
Read from /tmp/RtmpLq3QCv/Rinst2b0432634b38/grandR/extdata/sars
1045 genes, 12 samples/cells
Available data slots: count,ntr,alpha,beta,norm
Available analyses: 
Available plots: 
Default data slot: norm

The GRAND-SLAM output normally contains any gene with at least 1 read, i.e. >30k genes. The data set that is part of grandR has been prefiltered and only consists of 1045 genes. For a complete workflow including filter see the full vignette. Note that we also normalized the read counts (by using size factors), which added an additional data “slot”.

We start by creating a plot showing the kinetics for a gene:

PlotGeneProgressiveTimecourse(sars,"SRSF6",steady.state=list(Mock=TRUE,SARS=FALSE))

Progressive labeling plot for SRSF6

Note that this automatically fit the kinetic model for this gene, separately for the two conditions. Modeling used the default data slot, which are the size-factor normalized values, as indicated above. By using the steady.state parameter, we defined the mock infected control samples to be in steady state, whereas the virus infected samples should not be assumed to be in steady state.

We now fit the kinetic model for all genes:

SetParallel(cores = 2)  # increase on your system, or omit the cores = 2 for automatic detection
NULL
sars<-FitKinetics(sars,"kinetics",steady.state=list(Mock=TRUE,SARS=FALSE))

Modeling results are stored in two “analysis tables”:

Analyses(sars)
[1] "kinetics.Mock" "kinetics.SARS"

We can retrieve this table (for more information, see the data handling vignette:

head(GetAnalysisTable(sars))
                   Gene  Symbol Length     Type kinetics.Mock.Synthesis
UHMK1   ENSG00000152332   UHMK1   8478 Cellular               175.36782
ATF3    ENSG00000162772    ATF3   2103 Cellular                33.97185
PABPC4  ENSG00000090621  PABPC4   3592 Cellular               213.42534
ROR1    ENSG00000185483    ROR1   5832 Cellular               193.67200
ZC3H11A ENSG00000058673 ZC3H11A  11825 Cellular               251.54548
ZBED6   ENSG00000257315   ZBED6  12481 Cellular               251.27907
        kinetics.Mock.Half-life kinetics.SARS.Synthesis kinetics.SARS.Half-life
UHMK1                  7.506985                309.9754               3.0485254
ATF3                   0.944292                992.6315               0.2736086
PABPC4                 6.532308                516.2158               2.7149065
ROR1                   3.556425                471.7370               1.2872501
ZC3H11A                2.376063                781.5054               1.0152370
ZBED6                  2.354266                779.3275               1.0252531

We can also easily plot the RNA half-lives of mock infected cells against virus infected cells:

PlotScatter(sars,`kinetics.Mock.Half-life`,`kinetics.SARS.Half-life`,
            lim=c(0,24),correlation = FormatCorrelation())+
  geom_abline()

Half-lives scatter plot mock vs SARS