The computed p-values were 0. The principal components were visualized in Spotfire (Tibco, Inc.). The resulting genes were analyzed for concordance and discordance using Array Studio. The causal analysis (upstream analysis) of IPA examines how many known targets of each transcription regulator are present in the DEGs identified by RNA-Seq and microarrays, and also compares their direction of change i.
All data processing was performed using Array Studio software. Mean expression levels were obtained by calculating the geometric means of the RMA-normalized data for the toxicant-treated and control sample groups, respectively. A two-sided t-test was performed using the inference module of Array Studio to determine which genes were significantly differentially expressed between the toxicant-treated and control groups, with Benjamini-Hochberg false discovery rate (FDR) multiple testing correction and an alpha level of 0.
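The Benjamini-Hochberg correction mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the Array Studio implementation; the function name and the example p-values are hypothetical, and the alpha level is left to the caller because the value is truncated in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values from a two-sided t-test.
example_p = [0.01, 0.04, 0.03, 0.005]
example_adjusted = benjamini_hochberg(example_p)
```

A gene would then be called significant when its adjusted p-value falls below the chosen alpha level.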
This module transforms count data to log2-transformed counts per million (logCPM), robustly estimates the mean-variance relationship, and generates a precision weight for each individual normalized observation. Inference tests based on the Voom algorithm were applied to adjust for read depth differences between samples and to estimate changes in gene expression when comparing sample groups.
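The logCPM transformation described above can be illustrated with a short sketch. It follows the published voom formula, log2((count + 0.5) / (library size + 1) × 10^6), and covers only the transformation step, not the mean-variance modelling or the precision weights. The function name and example counts are hypothetical.

```python
import math


def log_cpm(counts, lib_size=None):
    """Convert raw read counts for one sample to voom-style logCPM values.

    counts   -- list of non-negative read counts, one per gene
    lib_size -- total mapped reads for the sample; defaults to sum(counts)
    """
    if lib_size is None:
        lib_size = sum(counts)
    # The 0.5 and 1 offsets keep the logarithm defined for zero counts.
    return [math.log2((c + 0.5) / (lib_size + 1.0) * 1e6) for c in counts]


# Hypothetical counts for three genes in one sample.
example_logcpm = log_cpm([0, 25, 1200])
```

Because each count is divided by the library size, samples sequenced to different depths become comparable on the same scale.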
DEGs from the inference test were selected according to expression changes of more than 1. A complete listing of serum chemistry values is presented in Supplementary Table S2. Supplementary Figures S1A-E provide representative photomicrographs illustrating the histopathological appearance of the liver from a representative rat in each dose group.
ANIT (Supplementary Figure S1A) and MDA (Supplementary Figure S1B) treatment resulted in biliary toxicity characterized by hypertrophy and hyperplasia of the bile duct epithelium, and infiltration of neutrophils into the pericholangiolar space and bile duct lumina. CCl4 (Supplementary Figure S1C) treatment resulted in widespread centrilobular hepatocellular macrovesicular and microvesicular steatosis.
The APAP and DCLF samples were nonetheless further analyzed since they were considered useful to better understand the sensitivity and utility of transcriptomic changes in rat toxicity studies. In toxicogenomic studies, PCA is generally used to analyze the complex multi-dimensional gene expression datasets.
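As a rough illustration of what PCA reports for such datasets, the sketch below computes the fraction of total variance captured by the first principal component, which is the kind of percentage shown on PCA plot axes. It is a plain-Python sketch using power iteration, not the Spotfire or Array Studio procedure; the function name, the arbitrary starting vector, and the example data are assumptions.

```python
import math


def pc1_variance_fraction(samples, n_iter=500):
    """Fraction of total variance captured by principal component 1.

    samples -- list of rows (one per sample), each a list of gene-level
               expression values on a log scale
    """
    n = len(samples)
    n_genes = len(samples[0])
    # Center each gene (column) on its mean across samples.
    means = [sum(row[g] for row in samples) / n for g in range(n_genes)]
    centered = [[row[g] - means[g] for g in range(n_genes)] for row in samples]
    # Sample-by-sample Gram matrix; its nonzero eigenvalues match those of
    # the gene covariance matrix up to a constant factor.
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered]
            for ri in centered]
    total = sum(gram[i][i] for i in range(n))
    if total == 0:
        return 0.0
    # Power iteration for the largest eigenvalue (arbitrary start vector).
    v = [float(i + 1) for i in range(n)]
    for _ in range(n_iter):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            return 0.0
        v = [x / norm for x in w]
    top = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n))
              for i in range(n))
    return top / total


# Hypothetical data: four samples, two genes, all variation along one axis.
example_fraction = pc1_variance_fraction([[0, 0], [1, 2], [2, 4], [3, 6]])
```

When treated and control samples differ strongly, most of the variance tends to fall on the first one or two components, which is why the groups separate visually on the plot.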
A similar assessment of the microarray dataset i. APAP-treated samples did not separate from their respective control samples with either microarray or RNA-Seq, reflecting the low level of gene expression changes, which was highly consistent with the lack of observed histopathological and serum chemistry findings (Supplementary Figures S1A-E and Supplementary Table S2).
Figure 1. Percentages represent the variance captured by principal components 1 and 2 in each analysis. Controls are shown as green circles and hepatotoxicants are colored differently. The beige color represents water-treated control samples. The red samples are DCLF treated. The light and dark green circles represent corn oil control and APAP-treated samples, respectively.
The drug treated samples are shown within the closed circle or oval shaped ring. Encouragingly, the measured gene abundance derived from these two different gene expression methods showed a correlation of 0. An additional computational analysis on the APAP and DCLF gene expression datasets revealed the presence of significant variability for genes expressed at low absolute levels in the microarray platform, likely explaining, at least in part, the poorer correlation for these samples with minimal toxic changes Supplementary Figures S2A,B.
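The inter-platform comparison described above amounts to computing a per-gene log2 fold change on each platform and then correlating the two sets of values. The sketch below assumes that expression values are already on a log2 scale and that the fold change is the difference of group means; both are assumptions, and the function names and numbers are hypothetical.

```python
import math


def log2_fold_change(treated, control):
    """Difference of mean log2 expression between treated and control."""
    return sum(treated) / len(treated) - sum(control) / len(control)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 fold changes for the same four genes on each platform.
rnaseq_fc = [2.1, -1.4, 0.2, 3.0]
microarray_fc = [1.6, -1.0, 0.1, 2.2]
platform_correlation = pearson(rnaseq_fc, microarray_fc)
```

A correlation near 1 would indicate that the two platforms rank and scale expression changes similarly, even if the absolute fold changes differ.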
Figure 2. Overall computational process of RNA-Seq and microarray data analysis. Comparisons at the raw expression, differential expression, and pathway stages are indicated. Figure 3. Scatter plot showing the relative expression levels of genes in terms of log2 FCs for 18, consensus genes, determined by RNA-Seq and microarray. Log2 FC is computed by taking the average of three samples.
The graphs show the overall FC dynamic ranges (log2 transformed) for 18, genes. Overall, the RNA-Seq platform captured a larger number of protein-coding DEGs compared to the microarray platform for all the tested hepatotoxicants (Table 1A: columns 4 and 7).
The RNA-Seq platform identified a modulation of expression for or In contrast, with the microarray platform, only 5. Figure 4. Blue and green bars indicate total number of microarray platform identified DEGs and number of inter-platform overlapping DEGs, respectively.
Figure 5. Size of the filled circle is proportional to fold change difference i. The blue and red spheres indicate down and upregulated DEGs, respectively.
Figure 6. Hierarchically clustered genes (columns) and samples (rows) with dendrograms and clusters (blue colored bars). Red in the heatmap denotes upregulation while blue denotes downregulation.
Table 1C compares the dynamic range of expression for the DEGs detected with the two platforms and supports the better performance of the RNA-Seq platform for the identification of genes across a broad expression range. Furthermore, with microarrays, the FC saturated at the high end and increased background noise was noted at the low end.
Overall, these results showed that RNA-Seq is better at capturing the gene expression changes for genes expressed at overall low levels and more precise in quantifying expression changes for the highly dysregulated genes.
Differentially expressed genes that are common to both platforms were subtracted from the total number of protein-coding DEGs identified in each platform (Table 1D: columns 3 and 4). Our analysis also identified a set of DEGs that were uniquely detected with microarrays. A major goal of this study was to understand whether the biology of the DEGs detected by each platform would lead to a similar understanding of the mechanism of toxicity (Figure 2, Comparison 3).
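The subtraction of shared DEGs described above is a set operation. A minimal sketch is shown below, assuming that each platform's DEGs are available as lists of gene identifiers; the function name and the placeholder gene names are hypothetical and do not come from the study.

```python
def compare_deg_sets(rnaseq_degs, microarray_degs):
    """Split two DEG lists into shared and platform-specific sets."""
    rnaseq = set(rnaseq_degs)
    microarray = set(microarray_degs)
    shared = rnaseq & microarray
    return {
        "shared": shared,
        # Platform-specific DEGs: the total minus the shared genes.
        "rnaseq_only": rnaseq - shared,
        "microarray_only": microarray - shared,
    }


# Placeholder identifiers, not genes reported in the study.
example_split = compare_deg_sets(
    ["GeneA", "GeneB", "GeneC", "GeneD"],
    ["GeneB", "GeneC", "GeneE"],
)
```

The platform-specific sets can then be passed separately to pathway analysis to check whether they add pathways beyond those found with the shared DEGs.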
Many of the top scoring canonical pathways detected by microarrays and RNA-Seq were similar and were relevant to liver toxicity. Finally, it is reassuring that the DEGs of each tested hepatotoxicant detected with the two different gene expression platforms captured distinct liver-associated pathways for each drug, further confirming their distinct mechanisms of toxicity. No additional statistically significant canonical pathways were identified by the microarray-specific DEGs alone.
A total of 10, non-coding RNA transcripts were expressed in the rat liver samples. In total, i. Table 3 summarizes the total number of non-coding DEGs along with the computed FC values for each category for each drug. Figure 7. The red dots in the top right quadrant are significantly up-regulated non-coding DEGs and the dots within the top left quadrant show highly down-regulated non-coding DEGs.
Green color dots denote un-changed non-coding transcripts. Since the biology of lncRNAs is not clearly understood, we analyzed the cis -protein-coding DEGs of differentially expressed lncRNAs and inferred the potential biological insights for the significantly modulated lncRNAs. The tested hepatotoxicants also differently modulated the expression of a total of pseudogenes and processed pseudogenes, translating to 1.
The RNA-Seq and microarray platforms are fundamentally different from each other in terms of gene expression measurements. The former measures all RNA transcript counts, a direct measurement of gene expression, while the latter measures a fluorescence intensity that is due to hybridization with anti-sense probe sequences, an indirect measurement of gene expression.
The advantage of RNA-Seq over microarrays is that it provides an unbiased insight into all transcripts Zhao et al. Thus, RNA-Seq is generally reliable for accurately measuring gene expression level changes. Nevertheless, the key question is whether this improved reliability, accuracy, and sensitivity is sufficient to justify a switch from microarrays to RNA-Seq in the context of toxicogenomic studies. Several studies have compared these two transcriptional profiling platforms for various purposes Bottomly et al.
Our data are consistent with these studies in terms of overall acceptable concordance between the two platforms and regarding the higher sensitivity and better dynamic range observed with RNA-Seq. In addition, our study evaluated the potential utility of RNA-Seq for mechanistic toxicology investigations with more depth by using carefully selected hepatotoxicants with distinct mechanisms of toxicity and in a context relevant to the conduct of exploratory toxicology studies in the pharmaceutical industry.
Additionally, some researchers align probe sequences to a recent release of the Genome or Transcriptome in an attempt to obtain the most up-to-date results. Many commercial vendors release updated annotation files with varying degrees of regularity in an attempt to keep these annotations current. Zhao and Zhang have comprehensively compared these different annotations within the context of the human genome.
Abascal et al. Array Studio uses annotations from both Ensembl. R83 and RefSeq for gene identification from the microarray probes. R83 for gene identification. The identified genes from both platforms were used for comparison.
It may be possible that a few genes have been missed by this annotation mismatch. However, within the context of toxicology studies, these annotation differences may play a limited role, as evidenced by the identification of similar biological pathways and upstream regulators identified with RNA-Seq and microarray DEGs.
This appealing dynamic range feature of RNA-Seq effectively eliminated the saturation bias that is inherent to microarray platforms. Because of this sensitivity, the RNA-Seq platform detected at least three times more protein-coding DEGs for all the hepatotoxicants compared to microarrays, in excellent agreement with reported platform comparison studies Bohman et al.
However, it should be noted that this observation is partly biased, since microarrays do not cover all possible cellular transcripts and since the gene coverage also differs across chip patterns. The DEGs detected with RNA-Seq resulted in more significantly altered pathways compared to microarrays, which suggests that RNA-Seq provides more information about toxicant-induced transcriptomic perturbations.
Nevertheless, there was also a significant overlap in the top modulated canonical pathways identified by the protein-coding DEGs of both platforms, and for the most part, the toxicological interpretation of these transcriptomic changes was quite similar, in agreement with observations by others Su et al. Although pathway analyses using analytical tools like IPA help summarize and interpret the complex biology behind drug-induced transcriptomic perturbations, these analyses are also intrinsically biased by the published knowledgebase without consideration for potential institutional knowledge and omit alternate pathway routes for regulated DEGs.
Pathway annotation is mostly a manual process and all genes and functional relationships are generally not yet fully covered. Moreover, the pathways are not universally defined and different tools identify different pathway results for the same datasets Khatri et al. For example, in the current study, a significant number of highly regulated protein-coding DEGs identified with RNA-Seq and microarrays were not associated with any of the IPA annotated canonical pathways.
Thus, the development of an analysis environment that exploits both canonical pathways and new extended network interactions may improve our understanding of the significance of highly regulated DEGs within the context of liver pathological processes Cerami et al. There has been tremendous progress in the pathway curation and integration process during the past few years and that progress has resulted in novel pathway tools Fabregat et al.
However, these tools are still highly fragmented and not integrated into a single framework for optimal DEGs analysis.
Integration of multiple pathway analysis tools may be needed to better extract the comprehensive biological information present in the DEGs of RNA-Seq and microarrays. Recently, non-coding RNAs have generated significant interest in toxicological and biomarker research Dempsey and Cui, ; Gong et al.
These non-coding RNAs are not typically detected with standard microarray chips based on design, but can be captured and quantified by RNA-Seq.
The RNA-Seq platform in our study uniquely identified a total of differentially regulated non-coding transcripts for all toxicants combined about 5. This suggests that even more non-coding DEGs may be detected with an alternative library prep kit.
The tested hepatotoxicants mainly impacted the expression of lncRNAs, pseudogenes and miRNAs and these have a potential for use as toxicity biomarkers and may offer additional mechanistic insight in some cases Esteller, ; Ling et al.
These non-coding RNAs are generally less stable and are expressed at lower levels compared to the protein coding mRNAs. Moreover, lncRNA expression is highly restricted to certain tissue types such as testis, heart, and liver Derrien et al. The process of quantification of these low abundant tissue-specific lncRNA transcripts remains a challenging and on-going task. Recent studies suggest that lncRNAs bind to chromatin, chromatin modifying proteins, certain transcription factors, and miRNAs.
This binding event significantly regulates a wide range of mechanisms like epigenetic signaling, disrupting polymerase activities and altering miRNA stability Baumgart et al. Additionally, it is now also well-accepted that lncRNAs are connected with various biological processes Kung et al.
Thus, lncRNAs have been recognized as potential markers for liver injury Takahashi et al. Our study uniquely identified a total of differentially regulated lncRNAs across all toxicants combined. Although the biological function of these highly modulated lncRNAs is unclear, there have been a number of reports Zhu et al.
Increased expression of ccnE1 has been reported in human and mouse liver fibrosis Nevzorova et al. While the majority of miRNAs are located within the cell, some miRNAs, commonly known as circulating or extracellular miRNAs, have also been found in the extracellular environment, including various biological fluids Wang et al.
During the past decade, miRNAs have generated a high level of interest in toxicology Clarke et al. Our study identified a total of 21 differentially regulated miRNAs across all toxicants combined. ANIT treatment in mice for 48 h has been shown to reduce the expression of hepatocyte nuclear factor 1-alpha (Hnf1a) Tanaka et al. Interestingly, miR down-regulation correlates with Hnf1a gene down-regulation Coulouarn et al. Pseudogenes are generally produced through a wide range of mechanisms Zhang et al.
A spontaneous mutation in a protein-coding gene can prevent either transcription or translation of the gene, resulting in the formation of a unitary pseudogene. Additionally, duplicated pseudogenes are generated through a tandem doubling of certain sequences. These duplicated and unitary pseudogenes lose their protein-coding capability due to either the loss of promoters or mutations that create premature stop codons Mighell et al.
The co-expression of pseudogenes and their cognate protein-coding genes has not been examined thoroughly within the context of toxicity assessment in a single experiment, as pseudogene probes are generally absent from typical microarray chips. Our RNA-Seq data identified a total of pseudogenes with altered expression from all toxicants combined. Altogether, although it is premature to draw conclusions, it appears that measurement of non-protein-coding transcripts (lncRNAs, miRNAs, and pseudogenes) may provide some useful insights regarding mechanisms of liver toxicity.
Future in vitro and in vivo studies are clearly necessary to further understand the utility for mechanistic molecular toxicology of these non-protein-coding diagnostic and prognostic transcripts.
Microarrays measure the expression of only pre-defined probes (genes), and typical arrays are designed to cover only a portion of protein-coding genes.
Thus, it is currently impossible to detect regulation of non-coding genes i. Furthermore, hybridization can result in mismatch between probes and target molecules, leading to increased noise and higher likelihood of misidentified DEGs. Because of its added advantages, RNA-Seq is progressively replacing microarray technology for many transcriptomic applications Lowe et al.
However, microarrays still offer some advantages. In the present study, we generated 39 and 0. Even for this simple prototype study, this massive amount of data introduced data management and analysis challenges. Secondly, the overall computation time, data storage, and management time for a microarray experiment are much lower. Based on our experience, completely processing and summarizing the DEGs from a set of microarray-generated gene expression data generally takes hours, depending on the amount of transcriptional change in the experiment.
These datasets complemented by public databases such as GEO, DrugMatrix and Array Express have created easily accessible and analyzable databases, which serve as a critical reference for new toxicogenomic data analytics and interpretation.
In contrast, there are no such reference databases available for RNA-Seq data, which currently limits toxicogenomic data interpretation. Fourthly, data processing and analyses are well-established with microarrays; in contrast, as RNA-Seq is still new and evolving, there is not yet a single standardized computational approach for performing an RNA-Seq data analysis.
However, with the recent advancement in computing power, hardware and dedicated computational workflows, this limitation will become rapidly obsolete. A Web site containing these genes was created. Six collagen genes were identified that had not previously been localized within the cornea.
Five apoptosis-related genes were identified, 4 of which had not previously been localized within the cornea. Three genes previously shown to cause corneal diseases were identified.
Reverse transcriptase polymerase chain reaction analysis of genes identified by microarray analysis confirmed the corneal expression of 2 apoptosis-related genes and 1 collagen gene. Conclusions: Microarray analysis of healthy human donor corneas has produced a preliminary, comprehensive database of corneal gene expression. Large-scale analysis of gene expression has the potential to generate large amounts of data, which should be made readily accessible to the scientific community.
The Internet offers many potential advantages as a medium for the maintenance of these large data sets. Clinical Relevance: Identification of structural, apoptosis-related, and disease-causing genes within the cornea by microarrays may increase the understanding of normal and abnormal corneal function with likely relevance to corneal diseases and transplants.
This baseline knowledge should facilitate the identification of alterations from normal gene expression that play important roles in disease pathogenesis.
Efforts to comprehensively study gene expression patterns in normal tissues and altered gene expression patterns in diseased tissues will require a complete knowledge of the human genetic sequence and methods to accurately and simultaneously analyze large amounts of genetic information.
Both of these requirements have recently become available through the ongoing progress of the Human Genome Project and breakthroughs in high-efficiency genetic analysis techniques such as DNA microarrays. DNA microarrays are a new and powerful technique to study the expression of thousands of genes in a single experiment.
Labeled mRNA molecules will bind to complementary sequences on the microarray and can be detected in a semiquantitative manner using automated techniques. Advantages of DNA microarrays include simultaneous screening for the expression of large numbers of genes, the ability to use small amounts of starting material, and mass production, which enables standardized, comparative analysis between samples. The acceptance of this technology is growing rapidly as microarrays are being used in an increasing number of experimental applications.
These include analysis of gene expression in normal embryonic development 3 and pathologic states such as breast cancer 4 and myocardial infarction. Given the successful use of microarray analysis in other biological systems, we sought to apply this technique to study gene expression in human donor corneal tissue. Such analysis may be useful for understanding the genetic basis of normal corneal function as well as corneal disease processes such as graft failure, inflammation, degenerations, and dystrophies.
Knowledge of what genes are or are not expressed in a given corneal disorder could lead to new and definitive treatment strategies, including interventional drugs and gene therapies. These strategies may be particularly relevant and feasible for the cornea because the tissue is relatively less complex, can be manipulated ex vivo, and can be easily assessed visually. Microarray analysis is a feasible method to begin compiling a comprehensive database of genes expressed in human corneas.
Such a database of genes may have broad applications for corneal genetics research. Any comprehensive database of corneal genes would be expected to be relatively large and to grow as more genes are identified in this tissue and as the assembly phase of the Human Genome Project defines novel genes from currently available sequence information.
Such a large database would be most useful if it could be readily updated, freely accessible to the global research community, and effectively interfaced with preexisting gene databases.
Given these desirable features, an Internet Web site could be an ideal format for a comprehensive corneal gene database. Such a Web site could potentially enhance the progress of corneal genetics research by increasing the accessibility of relevant genetic information and facilitating discussion among corneal genetics researchers.
As the proposed comprehensive corneal gene database grows and becomes more clinically relevant, it also could serve as a model for similar efforts in other clinical disciplines. The death to preservation time of all tissues used in this study was less than 12 hours. All tissues used in this study were obtained from donors younger than age 65 years. Standard methods were used to recover phagemids by mass excision protocol pBluescript; Stratagene, La Jolla, Calif.
The number of plasmids excised was 1. The ratio of clones excised to the number of independent clones in the library was 1. Excised clones were used to transfect a large-volume cell culture SOLR; Stratagene, and plasmid "maxi-preps" were performed with a standard kit and protocol Qiagen, Valencia, Calif. Plasmids were digested using restriction endonuclease NotI; Life Technologies, Rockville, Md , phenol-chloroform extracted, and ethanol precipitated.
A final concentration of 2. Microarray analysis of a cDNA library constructed from transplant-quality human donor corneas was performed in duplicate. The first microarray identified the expression of human genes. The second microarray identified the expression of human genes. A total of shared genes were identified on both microarrays. Only the genes confirmed by both microarrays as expressed in the cornea were used in subsequent analyses in this study.
The genes with confirmed corneal expression were analyzed using GenBank 9 and LocusLink. The genes with confirmed corneal expression were used as the basis of a corneal genetics Web site named CorneaNet. Each entry in CorneaNet includes a gene name, symbol, chromosome locus, and GenBank accession number with an active link to the GenBank entry for the specified gene.
CorneaNet is open to contributions from the research community and is updated regularly with genes newly reported in the literature as being expressed in human corneal tissues.
Six types of collagen subunits were included among the confirmed corneal genes identified by microarray analysis (Table 2). Five apoptosis-related genes were included among the confirmed corneal genes identified by microarray analysis (Table 3).
Caspase-like apoptosis regulatory protein 2 is a protein with a homologous sequence to caspase 8 and caspase 10 that may stimulate apoptosis through regulatory effects on caspase 8. Three corneal disease-causing genes were included among the confirmed corneal genes identified by microarray analysis (Table 4). Specific amplification products of the expected sizes were detected for all 3 genes (Figure 1). The recent completion of the sequencing phase of the Human Genome Project provides a wealth of genetic information that should facilitate clinically relevant studies of normal and abnormal cellular processes.
One potentially useful application of this information is the creation of comprehensive databases of genes expressed in a given normal or abnormal tissue or cell type. An initial attempt to investigate quantitative and qualitative aspects of gene expression in the corneal epithelium was performed using the conventional technique of sequencing randomly selected cDNA clones.
DNA microarrays represent a powerful technique to screen large amounts of genetic material for known sequences. This method has been used in ophthalmology to study alterations in gene expression caused by the photoreceptor homeobox gene CRX 42 and elevations in intraocular pressure.
In contrast, if the expression in the experimental sample is lower than in the reference sample, then the spot appears green. Finally, if there is equal expression in the two samples, then the spot appears yellow. The data gathered through microarrays can be used to create gene expression profiles, which show simultaneous changes in the expression of many genes in response to a particular condition or treatment.
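The color readout of a two-color microarray spot can be sketched as a simple classification of the log ratio between the two samples. The green and yellow cases follow the description above; the red case for higher expression in the experimental sample is the usual convention and is assumed here. The function name and the tolerance used to call two intensities "equal" are hypothetical.

```python
import math


def spot_color(experimental, reference, tolerance=0.1):
    """Classify a two-color microarray spot from its two intensities.

    experimental, reference -- positive signal intensities
    tolerance -- hypothetical log2-ratio band treated as equal expression
    """
    log_ratio = math.log2(experimental / reference)
    if log_ratio > tolerance:
        return "red"      # higher in the experimental sample (assumed)
    if log_ratio < -tolerance:
        return "green"    # lower in the experimental sample
    return "yellow"       # roughly equal expression


# Hypothetical intensities for three spots.
example_colors = [spot_color(200, 100), spot_color(50, 100), spot_color(100, 100)]
```

Collecting these log ratios across all spots for a given condition yields the kind of gene expression profile described above.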