Help | Translatome workbench

1 Upload data and set condition information

First, upload the data and set up the information about the analysis. As mentioned above, translatome workbench can analyse both Ribo-Seq and matched RNA-Seq data. We therefore use mouse retinal data as an example (Ribo-Seq and matched RNA-Seq data, divided into two groups, E15 and P42, each with two replicates) to show step by step how to use translatome workbench.

(1) Click on Ribo-Seq to select the Ribo-Seq data to be analysed. In this example: Mouse_E15_Retina_Ribo_rep1, Mouse_E15_Retina_Ribo_rep2, Mouse_P42_Retina_Ribo_rep1, Mouse_P42_Retina_Ribo_rep2

(2) Click on RNA-Seq to select the RNA-Seq data to be analysed. In this example: Mouse_E15_Retina_mRNA_rep1, Mouse_E15_Retina_mRNA_rep2, Mouse_P42_Retina_mRNA_rep1, Mouse_P42_Retina_mRNA_rep2

(3) After confirming that the selection is correct, click on Upload. The upload progress is displayed with a blue progress bar. If the selection is incorrect, click on Refresh to reselect the data.

(4) Click on the Species drop-down field to select the appropriate genome (mm10 in this case). Several genome indexes are built into the platform, such as hg19, hg38 and mm10.

(5) Experimental design: choose Case-control, as this example contains two groups of data. A form then pops up that asks for information about the uploaded data. If the uploaded data cover only one condition, click on Case/control-only instead.

In Group 1, enter the names of the first group of sequencing data in the second column of the table (here the E15 data, framed by the red background in the figure below), and set a custom name for each of them in the first column. Click on the 'plus' or 'minus' buttons on the right to add or remove rows.
In Group 2, set up the second group of data in the same way, that is, the data with the blue background shown below.
Each name entered in the second column of the table must have the same prefix as the name of the corresponding uploaded data.
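
The prefix rule above can be checked before filling in the form. The following is a minimal sketch, not the platform's own validation: `find_unmatched_entries` is a hypothetical helper, and the `.fastq.gz` extension on the uploaded file names is an assumption made for illustration.

```python
def find_unmatched_entries(entries, uploaded_files):
    """Return the table entries that are not a prefix of any uploaded file name."""
    return [
        entry
        for entry in entries
        if not any(name.startswith(entry) for name in uploaded_files)
    ]


# Hypothetical uploaded file names (the extension is assumed).
uploaded_files = [
    "Mouse_E15_Retina_Ribo_rep1.fastq.gz",
    "Mouse_E15_Retina_Ribo_rep2.fastq.gz",
]

# Names typed into the second column; the last one has no matching upload.
entries = [
    "Mouse_E15_Retina_Ribo_rep1",
    "Mouse_E15_Retina_Ribo_rep2",
    "Mouse_E15_Retina_Ribo_rep3",
]

unmatched = find_unmatched_entries(entries, uploaded_files)
print(unmatched)
```

Any entry reported as unmatched should be corrected in the table, or the missing file uploaded, before executing the analysis.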

Note: Steps 2 and 3 are preconfigured with default processes and parameters, which the user can change as required.

2 Building a flexible analytical pipeline (Optional)

After the data have been uploaded and the condition information has been set, the next step is to choose how the data are processed. This step automatically sets up a suitable analysis process based on the type of data uploaded and the experimental design, so in general no additional settings are required. The default options can still be changed by clicking on them.

3 Configuring additional parameters (Optional)

filtering and trimming

adapter: the adapter sequence used for sequencing
quality_phred: minimum sequencing quality of the retained reads
length_required: the minimum length of the reads after trimming adapter and low quality bases

alignment

MismatchNmax: the maximum number of base mismatches per read
MultimapNmax: the maximum number of positions to which a read may align

footprint length

min_length: the minimum length of Ribo-Seq reads to retain
max_length: the maximum length of Ribo-Seq reads to retain
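
As an illustration of how these two parameters act together, here is a minimal sketch of a footprint length filter. It is not the platform's implementation: `filter_by_length` is a hypothetical helper, reads are represented as plain sequence strings, and the defaults of 25 and 35 are taken from the typical footprint range given in section 5.2.1, not from the platform's actual default values. Treating both bounds as inclusive is also an assumption.

```python
def filter_by_length(reads, min_length=25, max_length=35):
    """Keep reads whose length lies within [min_length, max_length] (inclusive)."""
    return [read for read in reads if min_length <= len(read) <= max_length]


# Toy reads of length 20, 28 and 40 nt.
reads = ["A" * 20, "C" * 28, "G" * 40]
kept = filter_by_length(reads)
print([len(read) for read in kept])
```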

differential cutoff

adjusted_pvalue: the significance threshold for adjusted p-values in differential translational efficiency analysis
log2FoldChange: the threshold for the log2-transformed fold change in differential translational efficiency analysis
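
The two cutoffs are usually applied together. The sketch below shows one common way to do this and is not the platform's code: `classify_gene` is a hypothetical helper, the default values 0.05 and 1.0 are assumptions, and whether the platform treats the thresholds as strict or inclusive is not stated in this guide.

```python
def classify_gene(padj, log2_fc, adjusted_pvalue=0.05, log2FoldChange=1.0):
    """Label a gene as 'up', 'down' or 'not significant' from the two cutoffs."""
    # A missing adjusted p-value (None) is treated as not significant.
    if padj is None or padj >= adjusted_pvalue:
        return "not significant"
    if log2_fc >= log2FoldChange:
        return "up"
    if log2_fc <= -log2FoldChange:
        return "down"
    return "not significant"


# Toy results: (gene, adjusted p-value, log2 fold change of TE).
results = [
    ("geneA", 0.001, 2.3),
    ("geneB", 0.020, -1.5),
    ("geneC", 0.300, 3.0),
    ("geneD", 0.010, 0.4),
]
labels = {gene: classify_gene(padj, lfc) for gene, padj, lfc in results}
print(labels)
```

A gene must pass both cutoffs to be reported as changed: geneC has a large fold change but fails the p-value cutoff, and geneD is significant but its change is too small.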

4 Execute and get the results

Finally, there are two options for executing the analysis and accessing the results: Plan A, if no suitable email address is available for receiving results, and Plan B, to receive the results report via email.

Plan A

NOTE! After you click on the Execute button, the unique identifier of the process (the job id) will appear. Please keep it safe, as it is needed to download the result data!
Data processing progress will be displayed in real time at the bottom of the page.
Once the analysis has been completed, the job can be searched for and the report retrieved from the search box at the top of the analysis page using the job id.

Plan B

Enter the email address in the box next to the Execute button. Then click on the Execute button, and the progress of execution will be displayed in real time at the bottom of the analysis page.
Once the process has finished, the report is automatically sent to the email address you entered, and you can download it by clicking on the link in the email. If the analysis page shows that the execution is complete but you have not received the email, please check your spam folder.

5 About the report

The execution report generated in the previous step collects all the results of the analysis process in a single HTML file for easy viewing. In addition, the results of each step are provided separately. Below we present and explain the results of each section.

5.1 Assessing data quality

Compared with RNA-Seq data, Ribo-Seq data are more complex, both in the experimental procedure and in data analysis. A crucial step in Ribo-Seq data analysis is quality control during pre-processing, which shows whether the sequencing is of high quality and lays the foundation for accurate subsequent analysis.

5.2.1 Read filtering and trimming

Ribo-Seq reads are generally 25nt to 35nt long, because they are the fragments enclosed by ribosomes. Therefore, only trimmed reads in this interval are retained for subsequent analysis, which excludes reads that stem from contamination. In addition, rRNA and tRNA reads need to be removed during pre-processing to improve the mapping ratio and speed up subsequent analysis. Two tools, fastp and trim_galore, are provided for this step.

5.2.2 Alignment

Tools used for mapping RNA-Seq data, such as HISAT2, STAR and TopHat2, are also suitable for aligning Ribo-Seq data. By mapping reads against the reference sequences, we can learn which genes are expressed and at what level. Subsequent gene functional analysis can then reveal more about the effects of the experimental treatment.

5.3 Post-mapping quality inspection of Ribo-Seq data

A few unique features of Ribo-Seq data, such as the read length distribution and triplet nucleotide periodicity, can be detected from the BAM files generated by alignment. First, Ribo-Seq reads come from ribosome-enclosed fragments, so the read length distribution is concentrated around a specific length. Second, the most significant feature of Ribo-Seq data is triplet nucleotide periodicity, which is also the criterion for judging the quality of Ribo-Seq data. If this feature cannot be observed, the possible causes should be examined, for example an error during library preparation. If triplet nucleotide periodicity still cannot be observed after all likely sources of error have been ruled out, the data should be considered for exclusion. Third, Ribo-Seq reads are usually located in the translated genomic regions. However, reads located in UTRs or intronic regions also deserve attention because of their potential regulatory functions. For example, it is likely that the expression of the 5'UTR of the gene displayed in the figure below inhibited the expression of the CDS, whereas this phenomenon is not observed in the RNA-Seq data. Hence, the distribution of reads across genomic features provides new insight into the regulatory mechanisms of gene expression.
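
Triplet nucleotide periodicity can be summarised as the fraction of reads assigned to each of the three reading frames. The sketch below is an illustration under stated assumptions, not the platform's method: `frame_fractions` is a hypothetical helper, and its input is assumed to be a list of P-site positions already expressed relative to the first nucleotide of the annotated start codon (position 0). Deriving these positions from a BAM file is not shown.

```python
from collections import Counter


def frame_fractions(psite_positions):
    """Return the fraction of P-sites falling in reading frames 0, 1 and 2."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so positions upstream of the start codon (negative values) also work.
    counts = Counter(position % 3 for position in psite_positions)
    total = sum(counts.values())
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [counts.get(frame, 0) / total for frame in range(3)]


# Toy positions: four P-sites in frame 0, one in frame 1, one in frame 2.
positions = [0, 3, 6, 9, 1, 2]
fractions = frame_fractions(positions)
print(fractions)
```

In data with clear periodicity, one frame holds a clearly larger fraction than the other two. Fractions close to one third each indicate that no periodicity is visible.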

5.4 Normalization

After pre-processing is completed, BAM or SAM files are obtained for follow-up analysis, from which the read counts for each gene can be obtained with tools such as featureCounts. However, library size must be taken into account when comparing the expression of genes of interest between samples. In other words, gene counts must be normalized before differential expression analysis. We therefore provide two options, RPKM and TPM, for normalizing the read counts.
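
The two normalization options can be sketched with their standard definitions. This is a minimal illustration, not the platform's code: the function names, gene names, counts and lengths are all made up, and gene length is assumed to be given in base pairs.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of gene length per million mapped reads."""
    total_reads = sum(counts.values())
    return {
        gene: counts[gene] * 1e9 / (lengths_bp[gene] * total_reads)
        for gene in counts
    }


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    scale = sum(rates.values())
    return {gene: rates[gene] / scale * 1e6 for gene in counts}


# Toy read counts and gene lengths for a single sample.
counts = {"geneA": 100, "geneB": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000}

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
print(rpkm_values)
print(tpm_values)
```

The difference between the two is the order of operations: RPKM divides by the total read count of the library, while TPM divides by the sum of length-normalized counts, so TPM values always sum to one million within a sample.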

5.5 Detecting actively translated ORFs

Ribosome profiling provides an unprecedented opportunity to detect actively translated ORFs more accurately. Previous studies have used ribosome profiling to identify a plethora of ORFs, such as small open reading frames (smORFs) and upstream open reading frames (uORFs). Detection of ORFs is a characteristic and important part of translatome analysis. Given that Ribo-Seq reads are mainly derived from ribosome-protected regions, actively translated regions can be inferred from the mapping profile with tools such as Ribo-TISH and RiboCode.

5.6 Differential translational efficiency analysis

Previous studies have shown that transcript levels do not correlate well with protein levels, suggesting that gene expression is further regulated during translation. Ribosome profiling provides an opportunity to form and test hypotheses about the mechanisms of translational regulation. Comparing the abundance of a gene at the translational level with its abundance at the transcriptional level gives its translational efficiency (TE), and TE can be divided into three categories: forward, reinforce, and buffer. We provide two tools, Xtail and DESeq2, to implement this analysis.
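
The arithmetic behind TE can be sketched as the ratio of normalized Ribo-Seq abundance to normalized RNA-Seq abundance, compared between two conditions on the log2 scale. This is only an illustration of the quantity being compared: Xtail and DESeq2 fit statistical models to read counts and do not simply divide values as done here. The function names and the optional `pseudocount` parameter are assumptions.

```python
import math


def translational_efficiency(ribo, rna, pseudocount=0.0):
    """TE as the ratio of Ribo-Seq abundance to RNA-Seq abundance."""
    return (ribo + pseudocount) / (rna + pseudocount)


def log2_te_change(ribo_case, rna_case, ribo_control, rna_control, pseudocount=0.0):
    """log2 ratio of TE in the case condition to TE in the control condition."""
    te_case = translational_efficiency(ribo_case, rna_case, pseudocount)
    te_control = translational_efficiency(ribo_control, rna_control, pseudocount)
    return math.log2(te_case / te_control)


# Toy normalized abundances: the mRNA level is unchanged between conditions,
# but ribosome occupancy is four times higher in the case condition.
change = log2_te_change(ribo_case=40.0, rna_case=10.0,
                        ribo_control=10.0, rna_control=10.0)
print(change)
```

With zero abundances the ratio is undefined, so a small positive `pseudocount` can be passed to keep the calculation defined for lowly expressed genes.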
