The time-course transcriptomic responses of THP-1 human macrophage-like cells to W-Beijing Mycobacterium tuberculosis strains of different sublineages
Source: NCBI BioProject (ID PRJNA141255)

Project name: Homo sapiens
Description: The W-Beijing family of Mycobacterium tuberculosis (Mtb) strains is known for its high prevalence and virulence, as well as for its genetic diversity, as recently reported by our laboratories and others. However, little is known about how the immune system responds to these strains. To explore this issue, we used reverse engineering and genome-wide expression profiling of human macrophage-like THP-1 cells infected by different Mtb strains of the W-Beijing family, as well as by the reference laboratory strain H37Rv. Detailed data mining revealed that host cell transcriptome responses to H37Rv and to different strains of the W-Beijing family are similar, with genes overwhelmingly induced during Mtb infection, collectively typifying a robust gene expression signature ("THP1r2Mtb-induced signature"). Analysis of the putative transcription factor binding sites in promoter regions of genes in this signature identified several key regulators, namely STATs, IRF-1, IRF-7, and Oct-1, commonly involved in interferon-related immune responses. The THP1r2Mtb-induced signature appeared to be highly relevant to the interferon-inducible signature recently reported in active pulmonary tuberculosis patients, as revealed by cross-signature and cross-module comparisons. Further analysis of the publicly available transcriptome data from human patients showed that the signature appears to be relevant to active pulmonary tuberculosis patients and their clinical therapy, and to be tuberculosis specific. Thus, our results provide an additional layer of information at the transcriptome level on mechanisms involved in the host macrophage response to Mtb, which may also point to the robustness of a cellular defense system that can effectively fight against genetic heterogeneity in this pathogen.
Overall design: THP-1 cells were infected by the reference laboratory strain H37Rv, and the infected cells were collected at 4, 18 and 48 h (in duplicate).
A total of 40 samples (33 W-Beijing-infected samples, 6 H37Rv-infected samples, and an uninfected sample at 0h as a control) were subjected to RNA extraction for microarray hybridization.
RNA was prepared for hybridization to the HG-U133 Plus 2.0 Array according to the Affymetrix protocol. Arrays were then scanned by a high-resolution scanner, and the scanned images were converted to cell intensity files (.CEL files) using GeneChip Operating Software (GCOS). These raw CEL expression data were first normalized using Robust Multi-array Average (RMA) with quantile normalization in R (Bioconductor). A detection call-based filter was then applied to remove all probesets whose expression values were consistently below an empirically determined value of minimum sensitivity. This value was calculated as the 95th percentile of all the ‘Absent’ call-flagged signals in the entire dataset. Any expression value below this value is considered technically unreliable. Since the uninfected sample served as a common control for the comparison, probesets with expression values below this empirical value in the uninfected control were also removed. Finally, the expression values of probesets in the 39 infected samples were centered by subtracting their cognate values in the uninfected control (0h).
After the normalization and pre-filtering above, a gene expression matrix was constructed, containing 18,541 probesets (representing 10,315 unique Entrez Genes) across 39 samples, that is, time-courses (4h, 18h and 48h) of THP-1 cells infected by each of 11 Mtb W-Beijing strains as well as H37Rv (in duplicate). This matrix was first subjected to Cluster3.0/TreeView-1.0.8 with Euclidean distance for unsupervised sample classification. Pair-wise Pearson’s correlation coefficients were also calculated, showing a high degree of similarity among different strains within the same time point.
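The detection call-based filter and the control subtraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used for this dataset: the function names (percentile, prefilter_and_center), the dictionary layout, and the 'A' label for Absent calls are hypothetical, and "consistently below" is read here as "below the threshold in every sample".

```python
from typing import Dict, List, Tuple

Matrix = Dict[str, Dict[str, float]]  # probeset -> {sample: log2 expression}


def percentile(values: List[float], q: float) -> float:
    """Return the q-th percentile (0-100) using linear interpolation."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def prefilter_and_center(
    expr: Matrix,
    calls: Dict[str, Dict[str, str]],  # probeset -> {sample: 'P', 'M' or 'A'}
    control: str,                      # name of the uninfected 0h sample
    q: float = 95.0,
) -> Tuple[float, Matrix]:
    """Filter unreliable probesets, then subtract the control from each sample."""
    # Minimum-sensitivity threshold: q-th percentile of all Absent-flagged signals.
    absent = [expr[p][s] for p in expr for s in expr[p] if calls[p][s] == "A"]
    threshold = percentile(absent, q)

    centered: Matrix = {}
    for probeset, row in expr.items():
        # Assumption: "consistently below" means below threshold in every sample.
        if all(value < threshold for value in row.values()):
            continue
        # The common control itself must be reliably measured.
        if row[control] < threshold:
            continue
        # Express each infected sample relative to the uninfected control.
        centered[probeset] = {
            sample: value - row[control]
            for sample, value in row.items()
            if sample != control
        }
    return threshold, centered
```

Because a probeset that is below the threshold in every sample is also below it in the control, the second rule subsumes the first in this sketch; both are kept to mirror the two steps in the text.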
For gene clustering and visualization, the gene expression matrix was also subjected to component plane presentation integrated self-organizing map (CPP-SOM), a component of the topology-preserving selection and clustering (TPSC) package (see http://www.cs.bris.ac.uk/~hfang/TPSC/). Specifically, the input data were first trained using the SOM algorithm with the Epanechnikov neighborhood kernel. The trained map was then visualized by CPP to display sample-specific transcriptome changes.
Linear Models for Microarray Data (LIMMA) was applied to identify genes differentially expressed between any two successive time points. LIMMA uses linear models and empirical Bayes methods (the moderated t-statistic) to assess differential expression. The criterion for identifying the top significant probesets for the designed contrast was an adjusted P-value < 0.01, corrected using the Benjamini-Hochberg procedure. Under this criterion, all genes identified as differentially expressed from 4h to 18h changed by 2-fold or more. These highly modulated genes were further divided into two groups: the induced group (cTHPr2Mtb-induced) and the repressed group (cTHPr2Mtb-repressed). The former overwhelmingly dominates the latter; the genes in the cTHPr2Mtb-induced group collectively typify a molecular signature of common host transcriptome responses to W-Beijing family strains.
Functional enrichment analysis using Gene Ontology (GO) and KEGG Pathways was conducted to interpret the gene set of interest (e.g., the genes in the cTHPr2Mtb-induced signature). This analysis was implemented with the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 to identify enriched GO terms and KEGG pathways, based on a Benjamini-Hochberg-derived FDR (<0.01).
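To make the selection criterion concrete, the following Python sketch applies the Benjamini-Hochberg adjustment to a set of P-values and splits the significant probesets into induced and repressed groups by the sign of the log2 fold change. It assumes the P-values and fold changes have already been computed elsewhere (for example by LIMMA); it does not reproduce the moderated t-statistic. The function names and the dictionary inputs are hypothetical.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def split_signature(
    log2fc: Dict[str, float],   # probeset -> log2 fold change (e.g. 18h vs 4h)
    pvalues: Dict[str, float],  # probeset -> raw P-value for the same contrast
    alpha: float = 0.01,
) -> Tuple[List[str], List[str]]:
    """Return (induced, repressed) probesets with adjusted P-value < alpha."""
    probesets = list(pvalues)
    adjusted = benjamini_hochberg([pvalues[p] for p in probesets])
    induced: List[str] = []
    repressed: List[str] = []
    for probeset, adj_p in zip(probesets, adjusted):
        if adj_p >= alpha:
            continue
        if log2fc[probeset] > 0:
            induced.append(probeset)
        else:
            repressed.append(probeset)
    return induced, repressed
```

No explicit fold-change cutoff is applied here, since the text reports the 2-fold change as a consequence of the adjusted P-value criterion rather than as a separate filter.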
For regulatory enrichment analysis, PRomoter Integration in Microarray Analysis (PRIMA) was used to identify putative transcriptional regulators for a given gene set compared with the background (all Entrez Genes) (Bonferroni-corrected P-value < 0.01). Based on known transcription factor binding sites (as represented by position weight matrices, PWMs), PRIMA scanned promoter sequences spanning 2,000 bp upstream to 200 bp downstream of the transcription start site for putative binding sites.
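The idea of scanning a promoter window with a position weight matrix can be illustrated with a short Python sketch. This is a simplified stand-in and not PRIMA's actual algorithm: it scores the forward strand only, uses a uniform background, and takes a user-chosen score threshold. The function names, the pseudocount, and the coordinate convention (index `upstream` of the sequence is reported as offset 0 relative to the transcription start site) are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Column = Dict[str, float]  # base -> value for one motif position


def log_odds_pwm(
    counts: List[Column], background: float = 0.25, pseudocount: float = 1.0
) -> List[Column]:
    """Convert per-position base counts into log2-odds scores."""
    pwm: List[Column] = []
    for column in counts:
        total = sum(column[base] for base in "ACGT") + 4 * pseudocount
        pwm.append(
            {
                base: math.log2(((column[base] + pseudocount) / total) / background)
                for base in "ACGT"
            }
        )
    return pwm


def scan_promoter(
    sequence: str, pwm: List[Column], threshold: float, upstream: int = 2000
) -> List[Tuple[int, float]]:
    """Return (offset relative to TSS, score) for windows scoring >= threshold.

    The sequence is assumed to start `upstream` bp before the TSS, so a
    2,200 bp input covers -2000 to +200 under this convention.
    """
    sequence = sequence.upper()
    width = len(pwm)
    hits: List[Tuple[int, float]] = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start - upstream, score))
    return hits
```

An enrichment test would then compare how often such hits occur in the promoters of the gene set versus the background promoters, with a Bonferroni correction across the matrices tested.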
Data type: Transcriptome or Gene expression
Sample scope: Multiisolate
Relevance: Medical
Organization: Department of Computer Science, University of Bristol
Literature
  1. PMID: 22675550
Last updated: 2011-05-31