HEMATOPOIESIS
From the Department of Medicine, Section of Hematology
and Oncology, University of Illinois at Chicago.
Transcription factors (TFs) and the regulatory proteins that
control them play key roles in hematopoiesis, controlling basic processes of cell growth and differentiation; disruption of these processes may lead to leukemogenesis. Here we attempt to identify functionally novel and partially characterized TFs/regulatory proteins
that are expressed in undifferentiated hematopoietic tissue. We
surveyed our database of 15 970 genes/expressed sequence tags
(ESTs) representing the normal human CD34+ cells
transcriptosome (http://westsun.hema.uic.edu/cd34.html), using the
UniGene annotation text descriptor, to identify genes with motifs
consistent with transcriptional regulators; 285 genes were identified.
We also extracted the human homologues of the TFs reported in the
murine stem cell database (SCdb; http://stemcell.princeton.edu/), selecting an additional 45 genes/ESTs. An exhaustive literature search
of each of these 330 unique genes was performed to determine if
any had been previously reported and to obtain additional
characterizing information. Of the resulting gene list, 106 were
considered to be potential TFs. Overall, the transcriptional regulator
dataset consists of 165 novel or poorly characterized genes, including 25 that appeared to be TFs. Among these novel and poorly characterized genes are a cell growth regulator with ring finger domain protein (CGR19, Hs.59106), an RB-associated KRAB repressor
(RBAK, Hs.7222), a death-associated transcription factor 1 (DATF1, Hs.155313), and a p38-interacting protein
(P38IP, Hs.171185). The identification of these
novel and partially characterized potential transcriptional regulators
adds a wealth of information to understanding the molecular aspects of
hematopoiesis and hematopoietic disorders.
(Blood. 2002;100:107-119)
Transcription factors (TFs) play a critical
role in the process of lineage commitment and differentiation in
hematopoietic tissue.1-4 Several such factors are known to
control the basic molecular mechanisms that underlie this process, and
their expression is tightly regulated in a stage- and lineage-specific
manner.5 For example, the level of expression of PU.1 and
GATA binding proteins plays a major regulatory role in myeloid
development, with PU.1 being up-regulated with myeloid
differentiation,6 whereas GATA1 and GATA2 are
down-regulated.6,7 Disruption in the expression, sequence,
or structure of critical TFs or their associated regulatory proteins
can upset the delicate balance between proliferation and
differentiation and lead to leukemogenesis. Most of the consistent
translocations in myeloid leukemias that have been analyzed to date
result in a fusion protein that alters the normal function of a TF or a
related regulatory protein8,9; it is increasingly
recognized that these genes might also contribute to leukemia by
functional inactivation by mutation10 or chromosomal translocation.11-13 It has been speculated that the
majority of translocations that have not yet been fully characterized
probably also involve transcriptional regulatory
proteins.14-16 Thus, the identification of novel
transcriptional regulators, especially those that are located near
translocation breakpoints, may help to specify new leukemia-related
proteins, leading to better understanding and treatment of this disease.
In the present study, we took a global approach to identify novel and
known transcriptional regulators that might participate in
hematopoiesis and leukemogenesis by surveying databases of genes that
are expressed in normal hematopoietic stem cells. We searched our
previously reported database of 15 970 transcripts that are present in
human bone marrow CD34 antigen-positive cells17 to
identify those with functional motifs consistent with transcriptional regulators. We also searched a murine stem cell database18
to find the human homologues of TFs expressed in this tissue. Here we
report the results of our search, which identified 330 genes that are
potential transcriptional regulators, including 106 TFs, of which 25 are novel or poorly characterized. These TFs, especially those novel
ones that have not been reported previously, may represent new pathways
in hematopoiesis or leukemogenesis that have not yet been explored.
The human CD34+ transcriptosome database
The murine stem cell database
Selection of genes from the human CD34+ transcriptosome database
The 15 970 genes in the human CD34+ transcriptosome database were searched for cDNAs that encode known TFs, and for those containing motifs that are frequently found in TFs and their interacting proteins. The analysis was based on a text search of the UniGene descriptor of the clones in the CD34+ database, rather than a direct homology search of the clone sequence. UniGene is a database that automatically collects and partitions GenBank and expressed sequence tag (EST) sequences into a nonredundant set of gene-oriented clusters by establishing sequence overlaps; each cluster represents a single potential transcript. Each cluster is annotated with a descriptor of the transcript that is the result of automated searches for sequence homologies to proteins from 8 organisms, using both nucleotide and protein sequence alignment; thus, a fair amount of functional prediction is available for each gene cluster even if it represents an EST sequence that has not been further studied. Each cluster is assigned a chromosomal location, based on sequence alignment. Details of the construction and updating of the UniGene database are available at http://www.ncbi.nlm.nih.gov/UniGene/.
The UniGene cluster descriptors contained in the
CD34+ transcriptosome database were searched for
terms that are thought likely to annotate TFs, corepressors or
coactivators, nuclear factors, and other DNA-interacting proteins. The
resulting genes were updated, corrected for redundancy, and verified
through homology screens. The database was visually inspected, and an
additional 6 genes of known function, which clearly did not contain
transcriptional regulatory activity, were removed from the
database. A total of 285 genes resulted. Table
1 presents these genes,
categorized according to their function or functional motifs, with
their UniGene number, chromosomal location, and UniGene descriptor. The
cDNAs in each category are presented in the order from highest
abundance to lowest, based on the measured level of expression in
CD34+ cells, as reported in the
database.17
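The descriptor-based selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the search terms are borrowed from the category names reported in this article (the exact term list used in the study is not given), the record layout is hypothetical, and the last toy record is invented to show a non-match.

```python
import re

# Assumed search terms, taken from the category names reported in this
# article; the study's actual term list is not reproduced here.
SEARCH_TERMS = [
    "transcription factor", "zinc finger", "ring finger", "forkhead",
    "homeobox", "helix-loop-helix", "leucine zipper", "repressor",
    "activator", "enhancer", "oncogene", "nuclear", "PHD", "POU",
]


def matching_terms(descriptor):
    """Return the search terms that occur as whole words in a descriptor."""
    hits = []
    for term in SEARCH_TERMS:
        # Lookarounds stop a term from matching inside a longer word.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
        if re.search(pattern, descriptor, flags=re.IGNORECASE):
            hits.append(term)
    return hits


def select_candidates(clusters):
    """Keep clusters with at least one matching term, one entry per cluster ID."""
    selected = {}
    for cluster_id, descriptor in clusters:
        hits = matching_terms(descriptor)
        if hits and cluster_id not in selected:
            selected[cluster_id] = (descriptor, hits)
    return selected


# Toy records: (UniGene cluster ID, descriptor).
clusters = [
    ("Hs.155313", "death-associated transcription factor 1"),
    ("Hs.109526", "zinc finger protein 198"),
    ("Hs.109526", "zinc finger protein 198"),          # redundant entry
    ("Hs.0000", "hypothetical housekeeping enzyme"),   # invented non-match
]

for cluster_id, (descriptor, hits) in select_candidates(clusters).items():
    print(cluster_id, descriptor, hits)
```

A text match of this kind only nominates candidates; as the section notes, the resulting list still had to be corrected for redundancy, verified by homology screens, and inspected visually.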
Selection of genes from the murine stem cell database
The TF category of the murine hematopoietic stem cell database was analyzed to identify the human homologues of known and novel TFs expressed in human bone marrow CD34+ cells, by cross-referencing the murine and human UniGene databases. The murine UniGene clusters corresponding to each of the 161 TFs listed in the murine database were matched with the human clusters in the UniGene database version 129, resulting in 155 homologous human clusters. A total of 145 human genes remained after updating to UniGene version 135 and removing redundant entries. Of these 145 clusters, 87 were represented in the human CD34+ transcriptosome database, including 30 that had already been identified by our search using text descriptors. These 30 clusters are indicated with an asterisk in Table 1. Analysis of the remaining 57 human genes for homology to their assigned UniGene cluster or to a corresponding TIGR entry, and exclusion of those whose known function was obviously not that of a transcriptional regulator, resulted in 45 additional genes/ESTs. These additional 45 genes are listed in Table 2, and each entry includes the murine gene and its presumed human counterpart, its human UniGene cluster ID and descriptors, its chromosomal location, and the level of expression in human CD34+ cells. Of the 58 clusters that are not present in the CD34+ transcriptosome database, 38 were thought to be unexpressed in human CD34+ cells, based on an expression level less than 3-fold over background in the CD34+ transcriptosome database, and the remaining 20 could not be evaluated because they had not been included in the original expression studies that resulted in the CD34+ transcriptosome database.
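The cross-referencing step amounts to set operations on cluster IDs, which the following Python sketch illustrates. The function name and the placeholder IDs are hypothetical and do not correspond to real UniGene clusters; the final assertions only check that the counts reported in the text are mutually consistent.

```python
def partition_homologues(human_homologues, cd34_clusters, text_search_hits):
    """Split human homologue cluster IDs by their status in the CD34+ database."""
    in_cd34 = human_homologues & cd34_clusters       # represented in the database
    already_found = in_cd34 & text_search_hits       # also found by the text search
    new_candidates = in_cd34 - text_search_hits      # present but not yet selected
    absent = human_homologues - cd34_clusters        # not in the database
    return in_cd34, already_found, new_candidates, absent


# Toy placeholder IDs (not real UniGene clusters).
homologues = {"Hs.A", "Hs.B", "Hs.C", "Hs.D", "Hs.E"}
cd34_db = {"Hs.A", "Hs.B", "Hs.C", "Hs.X"}
text_hits = {"Hs.A", "Hs.X"}

in_cd34, already_found, new_candidates, absent = partition_homologues(
    homologues, cd34_db, text_hits
)
print(sorted(new_candidates))  # ['Hs.B', 'Hs.C']

# Consistency check on the counts reported in the text.
total, present, overlap = 145, 87, 30
assert present - overlap == 57       # clusters analyzed further
assert total - present == 58         # clusters absent from the CD34+ database
assert 38 + 20 == total - present    # unexpressed + not evaluable
```

Note that the 57 remaining clusters were reduced to 45 by a homology and function review, a manual step that this sketch does not model.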
Literature analysis of the TF database
After combining the datasets mined from the human and murine databases, the total number of potential TFs or regulatory proteins was determined to be 330. This includes 106 genes that are recognized as TFs and 224 genes in other categories, which include zinc fingers (90 genes), enhancers (14 genes), activators (8 genes), forkhead (11 genes), oncogenes (20 genes), ring finger (16 genes), and the combination of helix-loop-helix, homeobox, leucine zipper, nuclear, PHD, POU, and repressor categories (21 genes). The remaining 44 cDNAs represent genes that are functionally characterized as transcriptional regulators but lacked any search terms used in our mining protocol. A literature search of each of these 330 genes was performed to determine what was known about each one, emphasizing the discovery of novel genes. The following convention was used to summarize our search results: K = known gene, well characterized; PC = partially characterized, the gene was reported and some preliminary studies have been performed to indicate its function; N = novel gene, no functional information other than its chromosomal location and sequence homology to a known gene or gene family has been reported. These summaries are given in Tables 1 and 2. As a result of the literature search, 165 (50%) of the 330 transcriptional regulators identified were found to be known genes, 86 (26%) have been partially characterized, and 79 (24%) are novel. The partially characterized and novel transcriptional regulators have been further categorized by their relative level of abundance in CD34+ cells, with 92 expressing at low level (≥ 3-fold to < 10-fold over background), 27 expressing at intermediate level (≥ 10-fold to < 25-fold), 28 at high level (≥ 25-fold to < 100-fold), and 18 expressing at very high levels (≥ 100-fold), using the conventions reported in the CD34+ transcriptosome database.17
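The abundance categories can be expressed as a simple binning function, sketched below in Python. The function name and the toy fold values are hypothetical; the lower bound of each bin is assumed to be inclusive, and values below 3-fold over background are treated as not expressed, following the cutoff stated for the murine homologues.

```python
from collections import Counter


def expression_level(fold_over_background):
    """Map a fold-over-background value to an abundance category."""
    if fold_over_background < 3:
        return "not expressed"
    if fold_over_background < 10:
        return "low"
    if fold_over_background < 25:
        return "intermediate"
    if fold_over_background < 100:
        return "high"
    return "very high"


# Toy fold values chosen to sit on and around the bin boundaries.
toy_folds = [1.5, 3, 9.9, 10, 24, 25, 99, 100, 250]
tally = Counter(expression_level(fold) for fold in toy_folds)
print(dict(tally))

# Counts reported in the text for the partially characterized and novel genes.
reported = {"low": 92, "intermediate": 27, "high": 28, "very high": 18}
assert sum(reported.values()) == 165   # equals 86 PC + 79 N
```

The final assertion confirms that the four reported abundance counts add up to the 165 partially characterized and novel genes.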
In the current study, we emphasized the identification of novel TFs.
Based on our literature search of the 106 identified TFs, 81 appear to
be well characterized, known genes, whereas 18 have been partially
characterized and 7 represent truly novel genes. These 25 partially
characterized and novel genes are listed in Table
3 along with details of
their presumed function and the supporting literature references.
The current report presents our initial attempt to describe the TFs and related regulatory proteins that are present in the human CD34+ transcriptosome. The study is based primarily on a survey of a human CD34+ transcriptosome database, supplemented by homologues identified in a murine stem cell database and cross-referenced to the CD34+ database. The human CD34+ transcriptosome database was prepared by hybridization of filter arrays, selecting transcripts that are common to both human and baboon bone marrow CD34 antigen-positive cells.17 This database is considered an accurate portrayal of the transcriptosome of the CD34+ cell and was estimated to contain 50% to 75% of the transcripts expressed in this tissue. It contains 15 970 genes/ESTs expressed in CD34+ cells and lists their relative level of expression; random sampling of selected transcripts verified (by semiquantitative reverse transcriptase-polymerase chain reaction) that most were expressed at the predicted level. The murine database (http://stemcell.princeton.edu/) was the result of a cDNA library study in which a stem cell-depleted (AA4.1-negative) cDNA library was subtracted from a mouse fetal liver hematopoietic stem cell (Sca-positive, AA4.1-positive, Kit-positive, Lin-negative/low) cDNA library.18 The subtracted library represents genome-wide gene expression in mouse hematopoietic stem cells devoid of housekeeping genes. Sequence information on each of these clones was compared by BLAST against GenBank nonredundant protein and nucleotide databases, the EST database, Swiss-Prot, and mouse and human DOTS contigs. Each clone was categorized according to its sequence homology to genes of known functions, resulting in a "transcription factor" category containing 161 entries.
The current study was based on a search of the UniGene text descriptors in the CD34+ transcriptosome database for domains generally present in TFs and their regulatory proteins, whereas the mining of the murine stem cell database relied on homology between the mouse TFs and human genes. The study resulted in the identification of 330 genes that are likely to be transcriptional regulators expressed in human CD34+ cells. Because this transcriptional regulator database was prepared using text descriptors rather than primary sequence analysis, it should only be regarded as a preliminary database survey, limited by the accuracy of the sequence searches compiled by UniGene and by the contents of the databases that were analyzed. Because this study relied heavily on the UniGene database, a considerable number of potential transcriptional regulators might have been missed because of the absence of search terms in the text descriptors, or because the UniGene database does not contain complete cDNA sequences for all human genes. This explains in part why the additional 45 TFs from the murine database were not selected during the CD34+ transcriptosome database analysis. Despite these limitations, we believe that this gene list will prove to be very useful for further studies of normal and malignant hematopoiesis.
One of the most striking features of this list is that many of the genes have been assigned functional roles in numerous other tissues besides bone marrow. Also of note is the identification of 165 partially characterized and novel genes, 11 of which are expressed at a very high level in CD34+ cells, suggesting that they have an important role in this tissue but have not been previously recognized as such. Some of the interesting novel or partially characterized genes include zinc finger protein 161 (ZFP161, Hs.156000), a cell growth regulator protein with a ring finger domain (CGR19, Hs.59106), zinc finger protein 198 (ZNF198, Hs.109526), RB-associated KRAB repressor (RBAK, Hs.7222), death-associated transcription factor 1 (DATF1, Hs.155313), and a p38-interacting protein (P38IP, Hs.171185). The human ZFP161 protein is highly homologous (98%) to ZF5, a putative murine repressor for MYC, with a growth-inhibitory function.19 We anticipate that both ZFP161 and RBAK20 are associated factors for 2 functionally very important proteins, MYC and RB, respectively, and may play important regulatory roles in cellular functions such as proliferation, differentiation, and apoptosis; to our knowledge, these genes have not been previously evaluated in hematopoiesis or leukemia. Another interesting protein is zinc finger protein 198 (ZNF198). This gene has not been functionally characterized, but it is reported to be involved in the t(8;13) translocation,21 resulting in a fusion protein with fibroblast growth factor receptor 1 (FGFR1). Studies of these and other novel genes are underway to ascertain their potential role in cell proliferation, differentiation, and apoptosis in the hematopoietic system. Detailed studies will be required to verify that each of these genes is indeed expressed in hematopoietic CD34+ cells at the predicted level, to obtain the complete coding sequence for the partial cDNAs/ESTs in the database, and to verify the assigned chromosomal location. We predict that some of these genes may be disrupted by chromosomal translocations, thereby contributing to leukemogenesis. Overall, the database presented here represents a wealth of potential new information to aid in understanding the molecular aspects of normal and malignant hematopoiesis.
Submitted October 22, 2001; accepted February 26, 2002.
Supported by Public Health Service grant P01-75606 (to C.A.W.).
The publication costs of this article were defrayed in part by page charge payment. Therefore, and solely to indicate this fact, this article is hereby marked "advertisement" in accordance with 18 U.S.C. section 1734.
Reprints: Carol A. Westbrook, Department of Medicine, Section of Hematology and Oncology, 900 S Ashland Ave, M/C 734, Chicago, IL 60607; e-mail: cwcw{at}uic.edu.
1. Scott E, Simon M, Anastasi J, Singh H. Requirement of transcription factor PU.1 in the development of multiple hematopoietic lineages. Science. 1994;265:1573-1577.
2. Tenen D, Hromas RR, Licht J, Yamamishi D, Zhang D. Transcription factors, normal myeloid development, and leukemia. Blood. 1997;90:489-519.
3. van Oostveen J, Bijl J, Raaphorst F, Walboomers J, Meijer C. The role of homeobox genes in normal hematopoiesis and hematological malignancies. Leukemia. 1999;13:1675-1690.
4. Orkin SH. Transcription factors and hematopoietic development. J Biol Chem. 1995;270:4955-4958.
5. Lawrence H, Sauvageau G, Ahmadi N, et al. Stage- and lineage-specific expression of the HOXA10 homeobox gene in normal and leukemic hematopoietic cells. Exp Hematol. 1995;23:1160-1166.
6. Voso M, Burn T, Wulf G, et al. Inhibition of hematopoiesis by competitive binding of transcription factor PU.1. Proc Natl Acad Sci U S A. 1994;91:7932-7936.
7. Lee M, Temizer D, Clifford J, Quertermous T. Cloning of the GATA-binding protein that regulates endothelin-1 gene expression in endothelial cells. J Biol Chem. 1991;266:16188-16192.
8. de The H, Lavau C, Marchio A, et al. The PML-RAR alpha fusion mRNA generated by the t(15;17) translocation in acute promyelocytic leukemia encodes a functionally altered RAR. Cell. 1991;66:675-684.
9. McNeil S, Zeng C, Harrington K, et al. The t(8;21) chromosomal translocation in acute myelogenous leukemia modifies intranuclear targeting of the AML1/CBFalpha2 transcription factor. Proc Natl Acad Sci U S A. 1999;96:14882-14887.
10. Pabst T, Mueller B, Zhang P, et al. Dominant-negative mutations of CEBPA, encoding CCAAT/enhancer binding protein-alpha (C/EBPalpha), in acute myeloid leukemia. Nat Genet. 2001;27:263-270.
11. Brown D, Kogan S, Lagasse E, et al. A PMLRARalpha transgene initiates murine acute promyelocytic leukemia. Proc Natl Acad Sci U S A. 1997;94:2551-2556.
12. Golub T, Barker G, Bohlander S, et al. Fusion of the TEL gene on 12p13 to the AML1 gene on 21q22 in acute lymphoblastic leukemia. Proc Natl Acad Sci U S A. 1995;92:4917-4921.
13. Look A. Oncogenic transcription factors in the human acute leukemias. Science. 1997;278:1059-1064.
14. Ahuja H, Hong J, Aplan P, et al. t(9;11)(p22;p15) in acute myeloid leukemia results in a fusion between NUP98 and the gene encoding transcriptional coactivators p52 and p75-lens epithelium-derived growth factor (LEDGF). Cancer Res. 2000;60:6227-6229.
15. Kroon E, Thorsteinsdottir U, Mayotte N, Nakamura T, Sauvageau G. NUP98-HOXA9 expression in hemopoietic stem cells induces chronic and acute myeloid leukemias in mice. EMBO J. 2001;20:350-361.
16. Kulkarni S, Reiter A, Smedley D, Goldman J, Cross N. The genomic structure of ZNF198 and location of breakpoints in the t(8;13) myeloproliferative syndrome. Genomics. 1999;55:118-121.
17. Gomes I, Sharma T, Mahmud N, et al. Highly abundant genes in the transcriptosome of human and baboon CD34 antigen-positive bone marrow cells. Blood. 2001;98:93-99.
18. Phillips R, Ernst R, Brunk B, et al. The genetic program of hematopoietic stem cells. Science. 2000;288:1635-1640.
19. Sobek-Klocke I, Disque-Kochem C, Ronsiek M, et al. The human gene ZFP161 on 18p11.21-pter encodes a putative c-myc repressor and is homologous to murine Zfp161 (Chr 17) and Zfp161-rs1 (X Chr). Genomics. 1997;43:156-164.
20. Skapek S, Jansen D, Wei T, et al. Cloning and characterization of a novel Kruppel-associated box family transcriptional repressor that interacts with the retinoblastoma gene product, RB. J Biol Chem. 2000;275:7212-7223.
21. Xiao S, McCarthy J, Aster J, Fletcher J. ZNF198-FGFR1 transforming activity depends on a novel proline-rich ZNF198 oligomerization domain. Blood. 2000;96:699-704.
22. Wey E, Schafer BW. Identification of novel DNA binding sites recognized by the transcription factor mPOU (POU6F1). Biochem Biophys Res Commun. 1996;220:274-279.
23. Albert TK, Lemaire M, van Berkum NL, et al. Isolation and characterization of human orthologs of yeast CCR4-NOT complex subunits. Nucleic Acids Res. 2000;28:809-817.
24. Hopfner R, Mousli M, Garnier JM, et al. Genomic structure and chromosomal mapping of the gene coding for ICBP90, a protein involved in the regulation of the topoisomerase IIalpha gene expression. Gene. 2001;266:15-23.
25. Knoepfler PS, Kamps MP. The Pbx family of proteins is strongly upregulated by a post-transcriptional mechanism during retinoic acid-induced differentiation of P19 embryonal carcinoma cells. Mech Dev. 1997;63:5-14.
26. Teraoka Y, Naruse TK, Oka A, et al. Genetic polymorphisms in the cell growth regulated gene, SC1 telomeric of the HLA-C gene and lack of association of psoriasis vulgaris. Tissue Antigens. 2000;55:206-211.
27. Lu R, Misra V. Zhangfei: a second cellular protein interacts with herpes simplex virus accessory factor HCF in a manner similar to Luman and VP16. Nucleic Acids Res. 2000;28:2446-2454.
28. Garcia-Domingo D, Leonardo E, Grandien A, et al. DIO-1 is a gene involved in onset of apoptosis in vitro, whose misexpression disrupts limb development. Proc Natl Acad Sci U S A. 1999;96:7992-7997.
29. Robb L, Mifsud L, Hartley L, et al. Epicardin: a novel basic helix-loop-helix transcription factor gene expressed in epicardium, branchial arch myoblasts, and mesenchyme of developing lung, gut, kidney, and gonads. Dev Dyn. 1998;213:105-113.
30. Ottolenghi C, Veitia R, Barbieri M, et al. The human doublesex-related gene, DMRT2, is homologous to a gene involved in somitogenesis and encodes a potential bicistronic transcript. Genomics. 2000;64:179-186.
31. Di Rocco G, Pennuto M, Illi B, et al. Interplay of the E box, the cyclic AMP response element, and HTF4/HEB in transcriptional regulation of the neurospecific, neurotrophin-inducible vgf gene. Mol Cell Biol. 1997;17:1244-1253.
32. Prevot D, Morel AP, Voeltzel T, et al. Relationships of the antiproliferative proteins BTG1 and BTG2 with CAF1, the human homolog of a component of the yeast CCR4 transcriptional complex: involvement in estrogen receptor alpha signaling pathway. J Biol Chem. 2001;276:9640-9648.
33. Fletcher CF, Jenkins NA, Copeland NG, et al. Exon structure of the nuclear factor I DNA-binding domain from C. elegans to mammals. Mamm Genome. 1999;10:390-396.
34. Przyborski SA, Damjanov I, Knowles BB, et al. Differential expression of the zinc finger gene TCF17 in testicular tumors. Cancer Res. 1998;58:4598-4601.
35. Degar BA, Baskaran N, Hulspas R, et al. The homeodomain gene Pitx2 is expressed in primitive hematopoietic stem/progenitor cells but not in their differentiated progeny. Exp Hematol. 2001;29:894-902.
36. Yoshima T, Yura T, Yanagi H. Novel testis-specific protein that interacts with heat shock factor 2. Gene. 1998;214:139-146.
37. Horikawa I, Tanaka H, Yuasa Y, et al. Molecular cloning of a novel human cDNA on chromosome 1q21 and its mouse homolog encoding a nuclear protein with DNA-binding ability. Biochem Biophys Res Commun. 1995;208:999-1007.
38. Kiss H, Kedra D, Kiss C, et al. The LZTFL1 gene is a part of a transcriptional map covering 250 kb within the common eliminated region 1 (C3CER1) in 3p21.3. Genomics. 2001;73:10-19.
© 2002 by The American Society of Hematology.