PLENARY PAPER
From the Department of Genetics, Boyer Center for
Molecular Medicine, the Section of Hematology, Department of Internal
Medicine, and the Department of Molecular Biophysics and Biochemistry,
Yale University School of Medicine, New Haven, CT; the Department of
Pediatrics, University of Massachusetts Medical School, Worcester, MA;
and Gene Logic, Gaithersburg, MD.
Although the mature neutrophil is one of the better characterized
mammalian cell types, the mechanisms of myeloid differentiation are
incompletely understood at the molecular level. A mouse promyelocytic cell line (MPRO), derived from murine bone marrow cells and arrested developmentally by a dominant-negative retinoic acid receptor, morphologically differentiates to mature neutrophils in the presence of
10 µM retinoic acid. An extensive catalog was prepared of the gene
expression changes that occur during morphologic maturation. To do
this, 3'-end differential display, oligonucleotide chip array
hybridization, and 2-dimensional protein electrophoresis were used. A
large number of genes whose mRNA levels are modulated during
differentiation of MPRO cells were identified. The results suggest the
involvement of several transcription regulatory factors not previously
implicated in this process, but they also emphasize the importance of
events other than the production of new transcription factors.
Furthermore, gene expression patterns were compared at the level of
mRNA and protein, and the correlation between the 2 parameters was studied.
(Blood. 2001;98:513-524)
Studies of normal myeloid maturation from many
laboratories have identified genes that may play critical roles in
myeloid differentiation.1-4 Current studies suggest that
these events are dependent on a cascade of molecular changes that
involve complex modulation of mRNA transcription. Furthermore, studies
of acute leukemia have suggested that the disease arises from the
accumulation of myeloid precursors arrested at early stages of
differentiation and associated, in many cases, with chromosomal
rearrangements that alter the structure of specific transcription
factors.5 Nevertheless, the molecular events underlying
the production of mature myeloid cells are not well understood and
appear to use interacting pathways and networks, the elucidation of
which requires an extensive description of the molecular components
available to the myeloid cell.
An extensive body of information is accumulating with respect to gene
expression profiles of mammalian cells. However, much of the
information available in public databases has been accumulated by the
use of techniques such as single oligonucleotide chips or cDNA arrays
that measure fewer than 6000 of potentially 30 000 to 120 000
transcripts. The more limited range of analyses reported by the serial
analysis of gene expression (SAGE)6,7 technique accurately
estimates changes in levels of the more abundant mRNAs but requires
extensive redundant analyses to measure changes in the patterns of
expression of scarce mRNAs. We have used a modified polymerase chain
reaction (PCR)-based cDNA differential display (DD) method in which
single restriction fragments derived from the 3' end of cDNAs are
separated on a sequencing gel.8,9 Bands from the gel can
be identified initially by sequencing, but then comparison of patterns
from different samples can be made without further sequencing. This
sensitive and reproducible method detects, in principle, most cDNAs
regardless of whether they are represented in existing databases.
Systematic analysis of the function of genes can also be performed at
the protein level. This approach has the advantage of being closest to
function, because proteins perform most of the reactions necessary for
the cell. The most common method of proteome analysis is the
combination of 2-dimensional gel electrophoresis (2DE) to separate and
visualize protein and mass spectrometry (MS) for protein
identification.10 Several such analyses of yeast and of
normal or malignant mammalian cells have been performed. To date,
however, there have been few studies in which both mRNA and protein
have been compared by applying analyses to the same samples. The
studies of Anderson11 and Gygi12 showed that mRNA and protein levels are not well correlated in
yeast or human liver cells. However, other analyses disagree with this
conclusion (Greenbaum et al, manuscript submitted, and Futcher et
al14). Furthermore, global correlations between changes in
mRNA and protein levels have not been examined during the execution of
any developmental program.
The MPRO cell line was derived by transduction of a dominant-negative
retinoic acid receptor construct into normal mouse bone marrow cells.
It is a granulocyte-macrophage colony-stimulating factor
(GM-CSF)-dependent line arrested at a promyelocytic stage of
development.15,16 After treatment with
all-trans retinoic acid (ATRA) most of the cells acquire the
morphology of mature neutrophils and begin to produce neutrophil
lactoferrin and gelatinase, 2 proteins characteristic of neutrophil
secondary granules.17 As such, it offers a valuable model
for studying neutrophil differentiation in vitro.
We now report the analysis of mRNA expression changes during the
process of MPRO cell maturation to neutrophils and compare the results
with a limited analysis of cellular protein composition. mRNA
expression changes were studied by combining the use of oligonucleotide arrays and DD. A database (dbMC) with comprehensive genomic information for the myeloid differentiation program was constructed (accessible at http://www.bioinfo.mbb.yale.edu/expression/neutrophil). We have grouped the changes in mRNA levels of a large number of genes into
6 patterns, with implications for the genetic program of myeloid differentiation.
We also compared 2-dimensional high-resolution gel electrophoretograms
from control cells and cells differentiated for 72 hours in the
presence of ATRA. Fifty protein spots whose relative intensity changed
prominently during differentiation were examined by mass spectrometry.
The results suggest a poor correlation between mRNA expression and
protein abundance, indicating that it may be difficult to
extrapolate directly from individual mRNA changes to corresponding ones
in protein levels (as estimated from 2DE).
Cell lines
RNA isolation and differential display
Oligonucleotide chip analysis of RNA samples
Ten micrograms total RNA from each sample (0, 24, 48, or 72 hours) was used to prepare cDNA. This cDNA was transcribed with T7 RNA polymerase to prepare a fluorescently labeled probe.20,21 Each sample was hybridized to a mouse array chip (Mu11K Array; Affymetrix, Santa Clara, CA) containing oligonucleotide probe sets corresponding to approximately 7000 known genes or ESTs represented by UniGene clusters.22 cDNAs were considered present if their probe set results were rated as such by the GeneChip software (Affymetrix) and if the average difference (AD) between perfect match and mismatch probe pairs was not less than 100 U. If a gene was represented by more than one array probe set, the average of all probe sets for the gene was taken. Genes with AD values between 100 and 200 were considered unchanged because of their low expression levels. Those genes with AD values equal to or more than 200 U at one time point were further studied by rescaling, threshold, and normalization methods described on the MIT Center for Genome Research Web site.13 A value of 20 was assigned to any gene whose AD fell below 20 at a given time point.
Bioinformatics and database development
All the sequences or gene fragments were searched using Blast against GenBank and TIGR gene indices. A database of genes or ESTs whose expression levels changed during myeloid differentiation was constructed containing information for each band or gene. This included GenBank matches, Locus Link or Unigene clusters, expression patterns, tissue distribution, synonym(s) protein name, gene name(s), notations of possible functions, poly A signal and sequence quality, and hyperlinks to the database searches, sequence trace files, and related references. All gene data were then gathered into a cluster file.
Supplementary information is available at http://bioinfo.mbb.yale.edu/expression/neutrophil.
Classification and analysis of DNA fragments
Sequences from differential display analyses were classified as representing known genes, ESTs, genomic sequences, or novel genes as described.19,23 Known genes from both differential display and arrays were clustered into 27 functional categories and searched against SWISS-PROT (http://www.expasy.cbr.nrc.ca/cgi-bin/sprot-search-ful) or PIR (http://www.pir.georgetown.edu/). Information such as function, subcellular location, family and superfamily classification, map position, similarity, synonym(s) protein name, gene name(s), and so on was recorded in a variety of databases.
Northern blot analysis
Thirty micrograms total cellular RNA per lane from time-course MPRO cells were loaded onto 1.2% formaldehyde-agarose gels, then transferred to Hybond-N+ membranes (Amersham Pharmacia Biotech, Uppsala, Sweden). After standard prehybridization, membranes were hybridized overnight at 65°C with radiolabeled cDNA probes (ordered from Research Genetics according to their dbEST Image ID). Membranes were washed at a final stringency of 60°C in 0.1 × SSC.
Immobilized pH gradient 2-dimensional gel electrophoresis and mass spectrometry
Induced MPRO cells collected at 0 and 72 hours were lysed with lysis buffer (540 mg urea, 20 mg dithiothreitol, 20 µL Pharmalyte [3-10], 1.4 mg phenylmethylsulfonyl fluoride, 1 µg each aprotinin, leupeptin, pepstatin A, and antipain, 50 µg TLCK, and 100 µg TPCK/1 mL). We applied 100 µL each MPRO cell lysate (2.5 × 10^6 cells/100 µL) to immobilized pH gradient (IPG) strips (pH 3-10 L; Amersham Pharmacia Biotech), and IPG electrophoresis was conducted for 16 hours (20 100 Vh) using an Immobiline Drystrip Kit (Amersham Pharmacia Biotech).
Electrophoresis in the second dimension was carried out in a 12% sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) gel with the Laemmli-SDS continuous system in a Protean II xi 2-D cell (Bio-Rad) run at 40 mA constant current for 4.5 hours. Proteins were detected by Brilliant Blue G-colloidal staining.24 Protein spots were excised from the gel and digested with trypsin. ACTH clip (average [M+H] 2466.70) and bradykinin (average [M+H] 1061.23) were used for calibration of peptide masses. One microliter of sample digest was mixed with 1.0 µL α-cyano-4-hydroxy cinnamic acid (4.5 mg/mL in 50% CH3CN, 0.05% TFA) matrix solution and 1 µL
calibrants (100 fmol each). The spectra of the peptides were acquired
in reflector/delayed extraction mode on a Voyager-DE STR mass
spectrometer (Perseptive Biosystems, Foster City, CA). Peptides
were identified using the ProFound search
engine.39
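The array filtering rules given in the oligonucleotide chip analysis section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the function names are hypothetical, the thresholds (100 U, 200 U, floor of 20) are taken from the text, the rules are applied to the peak AD across the 4 time points, and the fold change is computed as the ratio of the largest to the smallest floored AD value. The rescaling and normalization steps referenced from the MIT Web site are not reproduced.

```python
PRESENT_MIN_AD = 100   # minimum AD for a gene to count as present (from the text)
ANALYZE_MIN_AD = 200   # minimum AD at one time point for further study (from the text)
FLOOR_AD = 20          # value assigned to any AD below 20 (from the text)


def average_probe_sets(probe_sets):
    """Average AD per time point over all probe sets of one gene."""
    n = len(probe_sets)
    return [sum(column) / n for column in zip(*probe_sets)]


def classify_gene(ad_values, called_present):
    """Classify a gene from its AD time course.

    Assumption: the thresholds are applied to the peak AD over the time course.
    """
    peak = max(ad_values)
    if not called_present or peak < PRESENT_MIN_AD:
        return "absent"
    if peak < ANALYZE_MIN_AD:
        return "low"       # present, but treated as unchanged (low expression)
    return "analyze"       # passed on to rescaling and pattern analysis


def floor_values(ad_values):
    """Raise every AD below the floor to the floor value."""
    return [max(value, FLOOR_AD) for value in ad_values]


def max_fold_change(ad_values):
    """Assumed fold-change definition: largest over smallest floored AD."""
    floored = floor_values(ad_values)
    return max(floored) / min(floored)
```

Flooring the AD values before taking ratios keeps very small or negative AD readings from producing arbitrarily large fold changes.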
Differentiation of MPRO cells
Figure 1 illustrates the morphologic changes in an MPRO cell population representative of those used for RNA expression analysis. Undifferentiated MPRO cells resembled promyelocytes under the light microscope (Figure 1A). After induction with ATRA for 24 hours, the cells morphologically differentiated into metamyelocytes (Figure 1B). At 48 hours, the cells further developed into metamyelocytes and band neutrophils (Figure 1C). At 72 hours, nearly 100% of MPRO cells became mature neutrophils (Figure 1D).
Identification of mRNAs by differential display assay
MPRO cellular mRNA was analyzed at 0, 24, 48, and 72 hours after ATRA treatment. Nine restriction enzymes were used in a 3'-end DD approach. During MPRO differentiation, 1109 fragments corresponding to 837 transcripts were found to change substantially in expression levels (Figure 2). These represented approximately 279 known genes, 112 ESTs, and 59 putative new genes, each with a perfect or fair polyadenylation signal at an appropriate distance from the oligo-dT priming site. The gene information detected by DD was collected in the database dbMCd.
Identification of mRNAs by oligonucleotide chip assay
We used an oligonucleotide chip containing 13 179 probe sets corresponding to approximately 7000 murine genes to analyze patterns of mRNA expression in the same RNA samples used for DD. The information obtained by oligonucleotide arrays was collected in the database dbMCa.
We clustered the genes by their similarity to idealized
expression patterns. For instance, the expression pattern of an ideal gene that is overexpressed (high) at time 0 and underexpressed (low) at
24, 48, and 72 hours would be high-low-low-low (HLLL). Overall, there are
$2^4 - 2 = 14$ idealized patterns, excluding HHHH and LLLL.
Pearson correlation was used as the measure of similarity of each gene
expression pattern,
$x = (x_1, x_2, x_3, x_4)$,
to each of the 14 idealized patterns $y = (y_1, y_2, y_3, y_4)$.
The 4 entries of x and y corresponded to the 4-dimensional gene
expression levels at 0, 24, 48, and 72 hours, respectively. Each gene
was assigned to a cluster labeled by the idealized pattern that had the
maximal correlation with that gene. We selected only genes that
hybridized well compared with the background (considered "present"
by GeneChip software) and had maximal AD amplitude greater than 200 U
in at least 1 of the 4 stages. We further tabulated the 14 patterns
according to whether the gene expression changed at early (0-hour),
intermediate (24- and 48-hour), and late (72-hour) time points and
whether gene expression monotonically increased (up-regulated),
monotonically decreased (down-regulated), or was not monotonic
(transient). Table 1 shows 8 clusters of
104 genes that had significant changes of mRNA levels, arranged
according to the temporal stage and the monotonic/transient changes of
expression levels.
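The pattern assignment described above can be illustrated with a short Python sketch. It is a minimal example under assumptions that the text does not specify: each idealized pattern is encoded with H = 1 and L = 0, the correlation is computed on untransformed AD values, ties go to the first pattern enumerated, and all function names are hypothetical.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a flat profile correlates with nothing
    return sxy / sqrt(sxx * syy)


def idealized_patterns(n_points=4):
    """All H/L patterns over n_points, excluding the two constant ones."""
    patterns = {}
    for letters in product("HL", repeat=n_points):
        label = "".join(letters)
        if len(set(label)) == 1:
            continue  # skip HHHH and LLLL
        # assumption: H is encoded as 1.0 and L as 0.0
        patterns[label] = [1.0 if c == "H" else 0.0 for c in label]
    return patterns


def assign_pattern(expression, patterns):
    """Label of the idealized pattern with maximal Pearson correlation."""
    return max(patterns, key=lambda label: pearson(expression, patterns[label]))


patterns = idealized_patterns()
early_down = assign_pattern([900, 150, 120, 100], patterns)  # hypothetical AD values
late_up = assign_pattern([20, 20, 30, 800], patterns)        # hypothetical AD values
```

Because Pearson correlation is invariant to shifting and positive scaling, the choice of 1 and 0 for H and L does not affect the assignment; only the shape of the time course matters.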
Principal component analysis was used to determine whether the
multidimensional data (4-dimensional in our case) could be presented comprehensively in a simple 2-dimensional graph. First, we found the 4 principal components, which were the axes of the most compact 4-dimensional ellipsoid that encompassed the 4-dimensional cloud of data. Each axis
was a different linear combination of the original 4 variables. Then we
verified that the first 2 principal components (the first 2 largest
axes of the ellipsoid) captured most (95.2%) of the variation of the
data. Therefore, the data could be faithfully projected (with a minor
loss of information) into a 2-dimensional graph, with the 2 largest
principal components as the x- and y-axes. As shown in Figure
3, genes tend to coalesce in clusters,
according to their labels determined by their similarity to an ideal
expression pattern. In summary, a genomic (global) picture of the
distribution of genes according to their similarity to predetermined
idealized multidimensional expression patterns is concisely displayed
in a 2-dimensional graph.
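The projection step described above can be sketched as follows. This is an illustration only: the principal components are obtained here by a cyclic Jacobi eigenvalue iteration on the covariance matrix, a technique chosen so that the example needs only the Python standard library, and not necessarily the procedure used in the study. The function names are hypothetical, and the 95.2% figure reported in the text is not reproduced by this code.

```python
from math import atan2, cos, sin, pi


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def covariance(rows):
    """Return (covariance matrix, mean-centered rows) for genes x time points."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered


def jacobi_eigen(sym, sweeps=50, tol=1e-24):
    """Eigenvalues (descending) and matching unit eigenvectors of a symmetric matrix."""
    d = len(sym)
    a = [row[:] for row in sym]
    v = [[float(i == j) for j in range(d)] for i in range(d)]
    total = sum(x * x for row in a for x in row)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
        if off <= tol * total:
            break
        for p in range(d - 1):
            for q in range(p + 1, d):
                if a[p][q] == 0.0:
                    continue
                # rotation angle that zeroes a[p][q], folded into [-pi/4, pi/4]
                theta = 0.5 * atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                if theta > pi / 4:
                    theta -= pi / 2
                elif theta < -pi / 4:
                    theta += pi / 2
                c, s = cos(theta), sin(theta)
                rot = [[float(i == j) for j in range(d)] for i in range(d)]
                rot[p][p] = rot[q][q] = c
                rot[p][q], rot[q][p] = s, -s
                rot_t = [list(col) for col in zip(*rot)]
                a = matmul(rot_t, matmul(a, rot))
                v = matmul(v, rot)
    order = sorted(range(d), key=lambda i: a[i][i], reverse=True)
    return [a[i][i] for i in order], [[v[k][i] for k in range(d)] for i in order]


def pca_2d(rows):
    """Fraction of variance in the first 2 components and the 2-D coordinates."""
    cov, centered = covariance(rows)
    values, vectors = jacobi_eigen(cov)
    total = sum(values)
    fraction = (values[0] + values[1]) / total if total else 0.0
    coords = [(sum(c * w for c, w in zip(row, vectors[0])),
               sum(c * w for c, w in zip(row, vectors[1]))) for row in centered]
    return fraction, coords
```

Calling `pca_2d` on a list of 4-point expression profiles returns the share of total variance captured by the first 2 principal components together with the coordinates that would be plotted on the x- and y-axes.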
Correlation between array and DD analyses
We have previously demonstrated a correlation coefficient of 0.93 between visual estimates of changes in band intensity on DD and Phosphorimager System (Molecular Dynamics, Sunnyvale, CA) estimates of band intensity and a correlation coefficient of 0.88 between hybridization intensity changes of mRNA on Northern blot analyses and changes in band intensity on DD.19 In a few cases there were clear discrepancies in the pattern of expression of a gene, as estimated by DD and by oligonucleotide chip analysis. We chose the 6 most extreme cases and examined the levels of mRNA change for these genes by Northern blot analysis (Figure 4). In 5 cases, the Northern blot results agreed with the results of the DD analysis. For Gnb2-rs1, the Northern blot results disagreed with the oligonucleotide array, but duplicate bands from DD showed a relatively high level of expression in the 0-hour sample that did not correlate with the Northern blot (Table 2). One possible explanation for these findings is a change in the relative use of different polyadenylation sites after the addition of ATRA to the MPRO cells.
Constructing a database for mRNA level changes during myeloid differentiation
Based on the data obtained above, an in-house database (dbMC) was constructed that included 2 subdatabases, dbMCd and dbMCa, for collecting gene information from DD or oligonucleotide arrays, respectively. Each entry in dbMC is accompanied by a so-called executive summary. The linkage between dbMCd and dbMCa was established by UniGene ID and cluster ID. dbMC contains the temporal expression patterns of genes during the MPRO cell differentiation process, including not only products represented in public databases but also novel transcripts.
Analysis of gene expression patterns during MPRO differentiation
Many of the genes identified in this study were found in myeloid cells or were implicated in myeloid development for the first time. We detected 8 cytokines25 and chemokines whose mRNA levels changed more than 5-fold by arrays and 2-fold by DD during the maturation of MPRO cells (see our Web site, http://bioinfo.mbb.yale.edu/expression/neutrophil). Among these were 2 members of the CC chemokine family. Interleukin-1 (IL-1 ) was
up-regulated at the late stage of differentiation (LLLH pattern,
Table 1).
mRNA for approximately 52 receptors was detected by one or the other
method. A number of the receptors known to be present on mature
neutrophils showed late induction of mRNA, and their levels of
induction were high, indicating that the expression of these products
is a prominent event late in neutrophil maturation (Table
3). Rarely was mRNA for receptors
down-regulated, consistent with myeloid maturation being accompanied by
increasing responsiveness of the cell to a variety of external stimuli.
Expression of mRNA for granule proteins
Neutrophils contain several types of granules that develop at different stages of myeloid maturation.3,17,26 Levels of mRNAs encoding secondary granule proteins, such as lactoferrin, increased as the cells matured (Table 4). The level of mRNA for Mmp9, reported as a tertiary granule protein, increased markedly between 24 and 48 hours after the induction of differentiation, whereas mRNAs for secondary granule proteins either increased less markedly or showed a maximum increase by 24 hours. mRNAs for several primary granule constituents, such as myeloperoxidase and cathepsin G, were present in unstimulated cells and decreased as the cells matured. There was a discrepancy in the measurements of proteoglycan mRNA by DD and oligonucleotide chips, but Northern blots showed that it reached a peak at 48 hours and then declined (Figure 4). Cathepsin D is reported as a primary granule protein, but its pattern of mRNA expression more closely resembled that of secondary granule constituents. In addition to known granule components, mRNAs for several other cathepsins were up-regulated during myeloid differentiation, in parallel with or later than the tertiary granule protein mRNAs.
mRNAs for transcription factors
Transcription factor genes, including several identified at the sites of consistent chromosome rearrangements in acute myeloid leukemia, have been implicated in normal myeloid differentiation and in the expression of neutrophil proteins.2,5,27 However, comprehensive information concerning the expression of these transcription factors during myeloid development is not readily available. Therefore, we compared gene names and identifiers in our databases to those of the transcription factor database Transfac (http://www.transfac.gbf-braunschweig.de/TRANSFAC) and determined which factors contained in this database were present at detectable levels in MPRO cell mRNA, using the Affymetrix software criteria for inclusion of mRNAs from the approximately 200 murine transcription factor probe sets on the oligonucleotide chip. Of these, 54 were expressed and 13 showed changes of 3-fold or more in chip signal (Table 5).
The changes in certain transcription factors, such as the moderate down-regulation of myb and myc and the up-regulation of the Max dimerization protein MAD, were consistent with the shift of the cells from a proliferative to a differentiated state.28 Some changes are more difficult to explain, such as the up-regulation of DP1, a partner for E2f factors in the regulation of S-phase genes, and the mild up-regulation of the Id genes, commonly associated with an inhibition of differentiation by competition with bHLH transcriptional activators.29 The C/EBP family has been extensively studied with respect to myeloid
differentiation.2,30 Absolute levels of the C/EBP C/EBP
We detected a number of other transcription factors that are broadly expressed or that have been reported in other studies of hematopoiesis (Table 5). Some of the factors that were most strongly induced during differentiation have been studied in other contexts but not previously implicated in hematopoiesis, such as a mammalian homologue to the Drosophila enhancer of split gene, a transcriptional silencer. The mammalian gene is expressed at relatively high levels as measured by the oligonucleotide chip and is a candidate for mediation of the silencing of growth-related genes in the maturing neutrophil. Another candidate transcriptional silencer, Tif1b, may serve as a corepressor for the KRAB domain family of zinc finger transcription factors and also may mediate binding of the heterochromatin protein HP1 to DNA.33 There were 26 transcription factors whose mRNAs showed no significant changes by oligonucleotide chip analysis and were not identified as differentially regulated genes by differential display assays. PU.1, a factor necessary for the production of neutrophils and the expression of several neutrophil genes,34 showed less than a 3-fold increase in mRNA, below the threshold for a significant change. Other candidate hematopoietic transcription factors, such as PEBP1aB2 (AML1), GATA-1, and SP-2, were represented on the oligonucleotide chips, but their mRNA levels were so low that they were reported as absent in this study. The possibility that small changes in the levels or ratios of some transcription factors could produce marked changes in transcription potentially limits the ability of data generated by present methods to explain transcriptional changes during differentiation.
Protein expression patterns of MPRO cells during ATRA induction
We visually compared the 2DE patterns from MPRO cells at the same time points used for mRNA analysis.
In most cases the peptides identified for a given protein were derived from regions along the entire length of the protein, indicating the observed products were not the result of proteolytic degradation. These data must be considered with several caveats: membrane and other hydrophobic proteins and very basic proteins are not well displayed by the standard 2DE approach, and proteins present at low levels will be missed.35 In addition, to simplify MS analysis, we used a Coomassie dye stain rather than silver to visualize proteins, and this decreased the sensitivity of detection of minor proteins. The MS method we used was sufficiently sensitive to identify proteins that could barely be visualized by colloidal blue staining. However, a limitation of the method for the mouse is that the current database lacks predicted amino acid sequences for a substantial fraction of murine genes. In addition, very small proteins give only a few peptides, making statistically confident identification difficult.
Figure 5 shows the analytical colloidal
blue-stained 2DE IPG reference maps of differentiated MPRO cells.
Expression patterns of more than 500 protein spots were detected and
observed through the entire series of gels. Protein spots could easily
be cross-matched to each other, indicating the reproducibility of the
method. As marked on the gel pictures (Figure 5), 50 proteins with a
wide range of molecular weights (1 to 200 kd), isoelectric points (4 to
9), and abundances were subjected to MS protein identification. The
results are presented in Table 6.
Comparing the theoretical molecular weight and pI of each protein with the observed values, we confidently identified 28 proteins in the expected position on the gels (spots 1 to 28). Some of the other proteins with strong matches to the murine databases migrated to a somewhat unexpected pI position. Nine spots gave clear peptide peaks on mass spectrometry but did not match any known gene. Their identification will require amino acid sequence analysis or availability of more extensive murine databases. We searched for the expression patterns of the genes cognate to the expressed proteins in dbMC (Table 6). Nineteen genes were found in dbMC, the mRNA for 5 genes was reported as absent, and 13 genes were present during MPRO differentiation. Comparison of the expression patterns showed only 4 genes of 18 present on the oligonucleotide chips whose expression was consistent at the RNA level and protein level. None of these was on the list of the genes that were differentially expressed significantly (5-fold or greater change by array or 2-fold or greater change by DD).
We explored the temporal patterns of gene expression during myeloid development. A database has been developed to provide a reference for later research on the molecular mechanisms underlying normal myeloid development. The MPRO cell system morphologically mimics normal myeloid differentiation and biochemically proceeds further toward mature neutrophils than most other in vitro systems. Because the arrest in differentiation of MPRO cells growing in the absence of ATRA is not physiologic, there is a theoretical risk that gene expression in these cells is not coordinated in the way that it is in normal differentiation. It is encouraging that, for the most part, the timing of expression of genes for proteins of the various neutrophil granules is consistent with the timing of the morphologic and biochemical appearance of these granule components during normal myeloid differentiation.
The DD technique provides certain advantages for detecting and comparing mRNA levels in different samples. First, the method is, in principle, similar to competitive RT-PCR, and, with the use of stringent PCR conditions, is expected to be about as reliable. Second, display patterns are reproducible. Third, the method detects the levels not only of RNAs already represented in the database but also of unknown RNA species that may represent "new" genes. Fourth, closely related genes can be distinguished regardless of cross-hybridization, provided there are some single nucleotide differences in the 3' end sequence. Limitations associated with this technique are that numerous gels are necessary to get complete information and that comparison of the levels of different mRNAs is only approximate because of the differential amplification of bands of different size or sequence.
Oligonucleotide chip analysis is a fast and effective means of accessing mRNA expression patterns.20 Cluster analysis of groups of samples by this approach is effective.
However, the present results indicate that alternative methods of verification are desirable before the data on an unexpected change in a particular gene are definitively accepted. To obtain the broadest range of information from the myeloid differentiation process, both differential display and oligonucleotide chip techniques were applied in the current study. As a result, 65.3% of the observed changes in mRNA levels came from the differential display method and 41.5% came from oligonucleotide chip assays. Our data showed in general that changes in expression pattern by the 2 methods agreed qualitatively but that there was some quantitative variation. Our results indicate that DD may be a more accurate way to detect changes in levels of gene expression than the oligonucleotide chip assay. However, improvements in the types of oligonucleotides used in arrays may close this gap in the future.
The mRNAs for a limited number of transcription factors vary in a pattern correlating with that of the mRNAs for primary or secondary granule proteins. However, more detailed information is needed, and the underlying mechanisms of granule gene regulation remain unclear. The number of potential positive and negative regulatory factors found here is small enough to make in vivo studies, such as chromatin immunoprecipitation, feasible.
The oligonucleotide chip used in this study focused on known genes, whereas the DD method samples all polyadenylated transcripts. The latter method generated a large number of products not associated with known genes, in part because the mouse genome is not as well represented in the database as the human genome. However, our experience with DD and human mRNAs indicates that substantial fractions of the products represented as ESTs or not represented at all in the public databases are cDNA copies from introns, hnRNA, or other RNA with internal A runs.
Approximately 59 sequences obtained from gel-display bands had significant changes in the level of expression and a sequence that did not match that for any named gene in the public databases. Of these, 38 had plausible or excellent polyA signals. This is only an approximate estimate of the number of new genes found36 because a fraction of the mRNAs for known genes still had poor polyA signals. In addition, the full 3' untranslated region is often not known for characterized genes, and in some cases these new genes may prove to be identical to products identified by the oligonucleotide chips when more complete sequences are obtained. At the least, their presence indicates that a substantial fraction of the regulatory or functional circuitry of maturing myeloid cells remains unexplored and that valuable tools for their investigation will emerge from a combination of RNA expression studies and analysis of emerging genomic sequences.
The desired end point for the description of gene expression in a biologic system is not only the analysis of mRNA transcript levels but also the accurate measurement of protein abundance. The developments in 2DE and new MS instrumentation make it possible to accomplish this work rapidly and efficiently. In this study, we attempted to identify a number of the proteins differentially expressed between uninduced and ATRA-differentiated MPRO cells and to examine the relation between mRNA and protein expression levels for these genes representing the same state. For protein levels based on estimated intensity of Coomassie dye staining in 2DE, there was poor correlation between changes in mRNA levels and estimated protein levels. Other groups have studied the correlation between mRNA and protein levels in yeast and liver cells.11,12,14 In the liver cell experiments,11,12 correlation coefficients of 0.4 to less than 0.5 were observed.
In an extensive study in yeast,11,12 the correlation coefficient was high if the most abundant mRNAs and proteins were considered. If a handful of these products was omitted, the remaining correlation coefficient was 0.4 or less. However, one could restore some of the correlation by averaging individual data points into broad proteomic categories.37 The discrepancies between mRNA and protein levels in MPRO cells appear to be substantially larger than those observed for yeast. Possible causes for the discrepancies include translational regulation, differential expression of certain mRNAs at various stages of cell growth in vitro, post-translational protein modification that varies with the stage of maturation of the cells, and selective degradation or excretion of proteins in vivo. Furthermore, here we are focusing on a developmental time-course, whereas the yeast study concentrated on the organism in vegetative growth.
New techniques, equipment, and bioinformatic analysis tools must be developed to make such systematic, global, and quantitative analyses feasible. The initial studies of protein expression presented here provide a cautionary note for efforts to interpret cell composition and function in relation to mRNA levels. Discrepancies we observed between gene expression and protein abundance suggest that selective post-transcriptional controls may be at least as important as changes in mRNA levels in determining the protein composition of neutrophils and that they are phenomena less well explored than transcriptional control. Analysis of mRNA expression patterns is itself only a small beginning toward a genome-wide description of cellular components.
We thank Dr S. Tsai (Fred Hutchinson Cancer Research Center) for his kind gift of the MPRO cell line, Dr Fuki M. Hisama (Yale University School of Medicine) for helpful advice, and the staff at Gene Logic Inc for data and support.
Submitted December 4, 2000; accepted March 28, 2001.
Supported by grants from the National Institutes of Health (NIH) (CA42556) and Gene Logic (A143558, DK54369, and HL63357). Z.L. is supported by NIH grant HL 63357. P.E.N. is supported by NIH grant DK 54369, grants from the Arthritis Foundation and the Charles H. Hood Foundation, and the Pierce Family Cancer Research Fund. M.G. is supported by the Keck Foundation and by NIH grant GM54160-04.
L.W. and S.Y. contributed equally to this research.
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked "advertisement" in accordance with 18 U.S.C. section 1734.
Reprints: Sherman M. Weissman, Department of Genetics, Boyer Center for Molecular Medicine, Yale University School of Medicine, Rm 336, 295 Congress Ave, New Haven, CT 06536-0812; e-mail: sherman.weissman@yale.edu.
1. Lawson ND, Berliner N. Neutrophil maturation and the role of retinoic acid. Exp Hematol. 1999;27:1355-1367.
2. Tenen DG, Hromas R, Licht JD, Zhang DE. Transcription factors, normal myeloid development, and leukemia. Blood. 1997;90:489-519.
3. Sigurdsson F, Khanna-Gupta A, Lawson N, Berliner N. Control of late neutrophil-specific gene expression: insights into regulation of myeloid differentiation. Semin Hematol. 1997;34:303-310.
4. Lenny N, Westendorf JJ, Hiebert SW. Transcriptional regulation during myelopoiesis. Mol Biol Rep. 1997;24:157-168.
5. Yunis JJ, Tanzer J. Molecular mechanisms of hematologic malignancies. Crit Rev Oncog. 1993;4:161-190.
6. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270:484-487.
7. Stollberg J, Urschitz J, Urban Z, Boyd CD. A quantitative evaluation of SAGE. Genome Res. 2000;10:1241-1248.
8. Subrahmanyam YV, Baskaran N, Newburger PE, Weissman SM. A modified method for the display of 3'-end restriction fragments of cDNAs: molecular profiling of gene expression in neutrophils. Methods Enzymol. 1999;303:272-297.
9. Subrahmanyam YVBK, Yamaga S, Newburger PE, Weissman SM. A modified approach for the efficient display of 3' end restriction fragments of cDNAs. In: Leslie RA, Robertson HA, eds. Differential Display: A Practical Approach. Practical Approach Series. Oxford, UK: Oxford University Press; 2000:101-129.
10. Appella E, Arnott D, Sakaguchi K, Wirth PJ. Proteome mapping by two-dimensional polyacrylamide gel electrophoresis in combination with mass spectrometric protein sequence analysis. EXS. 2000;88:1-27.
11. Anderson NL, Anderson NG. Proteome and proteomics: new technologies, new concepts, and new words. Electrophoresis. 1998;19:1853-1861.
12. Gygi SP, Rochon Y, Franza BR, Aebersold R. Correlation between protein and mRNA abundance in yeast. Mol Cell Biol. 1999;19:1720-1730.
13. The Whitehead Institute for Biomedical Research/MIT Center for Genome Research. Molecular Pattern Recognition Web site. Available at: www.genome.wi.mit.edu/MPR/analysis.html#RS. Accessed May 4, 2001.
14. Futcher B, Latter GI, Monardo P, McLaughlin CS, Garrels JI. A sampling of the yeast proteome. Mol Cell Biol. 1999;19:7357-7368.
15. Tsai S, Collins SJ. A dominant negative retinoic acid receptor blocks neutrophil differentiation at the promyelocyte stage. Proc Natl Acad Sci U S A. 1993;90:7153-7157.
16. Johnson M, Caiazzo T, Molina JM, Donahue R, Groopman J. Inhibition of bone marrow myelopoiesis and erythropoiesis in vitro by anti-retroviral nucleoside derivatives. Br J Haematol. 1988;70:137-141.