ISCID News Editor
Member # 1417
posted 28. November 2004 02:07
BMC Evolutionary Biology
Copyright © 2004 Glusman et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
BMC Evol Biol. 2004; 4: 43.
doi: 10.1186/1471-2148-4-43. Published online 2004 November 4.
An enigmatic fourth runt domain gene in the fugu genome: ancestral gene loss versus accelerated evolution
Gustavo Glusman, Amardeep Kaur, Leroy Hood, and Lee Rowen
Received September 7, 2004; Accepted November 4, 2004.
The runt domain transcription factors are key regulators of developmental processes in bilaterians, involved both in cell proliferation and differentiation, and their disruption usually leads to disease. Three runt domain genes have been described in each vertebrate genome (the RUNX gene family), but only one in other chordates. Therefore, the common ancestor of vertebrates has been thought to have had a single runt domain gene.
Analysis of the genome draft of the fugu pufferfish (Takifugu rubripes) reveals the existence of a fourth runt domain gene, FrRUNT, in addition to the orthologs of human RUNX1, RUNX2 and RUNX3. The tiny FrRUNT packs six exons and two putative promoters in just 3 kb of genomic sequence. The first exon is located within an intron of FrSUPT3H, the ortholog of human SUPT3H, and the first exon of FrSUPT3H resides within the first intron of FrRUNT. The two gene structures are therefore "interlocked". In the human genome, SUPT3H is instead interlocked with RUNX2. FrRUNT has no detectable ortholog in the genomes of mammals, birds or amphibians. We consider alternative explanations for an apparent contradiction between the phylogenetic data and the comparison of the genomic neighborhoods of human and fugu runt domain genes. We hypothesize that an ancient RUNT locus was lost in the tetrapod lineage, together with FrFSTL6, a member of a novel family of follistatin-like genes.
Our results suggest that the runt domain family may have started expanding in chordates much earlier than previously thought, and exemplify the importance of detailed analysis of whole-genome draft sequence to provide new insights into gene evolution.
Results - Paragraphs 1, 2 and 3
The fugu genome has at least four runt domain genes
Our search for RD genes in the fugu draft yielded four distinct genomic scaffolds (Fig. 1 and Table 1), each containing a single, complete RD gene. Each scaffold had one or more sequence gaps, some within the RD genes, others between them and their neighbors. We employed a directed sequencing approach to obtain the additional sequence needed to close the gaps in these four scaffolds and to improve sequence quality.
We studied the four scaffold sequences using the GESTALT Workbench and constructed hypothetical gene structures for the fugu RD genes by maximizing similarity to known vertebrate RD proteins. Three of the four RD genes found in the fugu genome have clear one-to-one similarity relationships with the three mammalian RUNX genes (see phylogenetic analysis below). They have been assumed to be their orthologs [14,17]; we call them FrRUNX1, FrRUNX2 and FrRUNX3 (Fig. 1). Their genomic structures are similar to those of their human counterparts, but their sizes have evolved differently. RUNX3 is the smallest of the three human RUNX genes, while in fugu FrRUNX3 is the largest (Table 1), and FrRUNX2 is significantly larger than FrRUNX1. FrRUNX1 has acquired an additional intron that is not present in human RUNX1 or in any other RD gene. This intron is just 65 bp long, has canonical splice signals, and is in phase 0 with respect to the protein reading frame, at the beginning of the runt domain. An additional intron has been described at the 5' end of the coding region, yielding a short form that would be locally non-homologous to the other RD genes. A detailed comparison of human RUNX2 and FrRUNX2 has been published. In both human and fugu, RUNX3 has the highest G+C content of the RD genes, while the G+C content of RUNX2 differs significantly between the two species (Table 1).
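For readers unfamiliar with the terms used for the extra FrRUNX1 intron, the following is a minimal illustrative sketch, not the authors' method. It assumes the usual definitions: an intron has "canonical splice signals" when it begins with GT and ends with AG, and its "phase" is the number of coding bases upstream of it, modulo 3, so that phase 0 means the intron falls between two codons. The function names, the toy sequence and the coding offset are hypothetical.

```python
def intron_phase(coding_bases_before_intron: int) -> int:
    """Phase of an intron: coding bases upstream of it, modulo 3.

    Phase 0 means the intron sits exactly between two codons.
    """
    return coding_bases_before_intron % 3


def has_canonical_splice_signals(intron_seq: str) -> bool:
    """True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    s = intron_seq.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


# Hypothetical 65 bp intron; the real FrRUNX1 sequence is not reproduced here.
toy_intron = "GT" + "A" * 61 + "AG"

# Hypothetical count of coding bases upstream of the intron (a multiple of 3).
toy_coding_offset = 300

print(len(toy_intron))
print(has_canonical_splice_signals(toy_intron))
print(intron_phase(toy_coding_offset))
```

An intron in phase 0 can be gained or lost without splitting a codon, which is one reason the phase is usually reported alongside the splice signals.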
The fugu RUNT gene
In addition to the three RUNX genes, the fugu genome has a fourth, more divergent runt domain gene, which we named FrRUNT. FrRUNT is an extremely compact gene, spanning just 3 kb of genomic sequence (Fig. 2). Based on sequence analysis only, FrRUNT appears to have two promoters, with an intron separating the hypothetical distal promoter (P1) and first exon from the main body of the gene. This intron is usually very long in RUNX genes. It is indeed the longest intron observed in FrRUNT, but it is nevertheless very short, spanning just 1372 bp. There is a local concentration of CpG dinucleotides 200–300 bp upstream of exon 2 (Figs. 1, 2), suggesting that an incipient CpG island might function as a proximal promoter (P2). The G+C content is not elevated in this area, as is also the case for the CpG islands of the fugu RUNX genes (Fig. 1). The main body of the gene is split into five exons, separated by much shorter introns (69–190 bp long), all of which have canonical splice signals. The longest predicted FrRUNT product is 294 amino acids long, in contrast with the 496 aa, 463 aa and 421 aa observed for FrRUNX1, FrRUNX2 and FrRUNX3, respectively. The small number of exons in FrRUNT leaves little room for alternative splicing by exon skipping without compromising functionally important domains of the protein. The overall compactness of the gene makes the incorporation of as yet undetected exons improbable. Several cryptic splice sites within the exons could enable splicing variants that alter exon length.
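The distinction drawn above between CpG concentration and G+C content can be illustrated with a short sketch. It assumes the commonly used observed/expected CpG ratio (number of CG dinucleotides times sequence length, divided by the product of the C and G counts); the excerpt does not say which measure the authors used, so this is an illustration only. The function names and toy sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    s = seq.upper()
    if not s:
        return 0.0
    return (s.count("G") + s.count("C")) / len(s)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    s = seq.upper()
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        return 0.0
    return s.count("CG") * len(s) / (c * g)


# Two hypothetical sequences with identical base composition
# but different numbers of CG dinucleotides.
cpg_rich = "CGCGCG"
cpg_poor = "CCCGGG"

for name, seq in (("cpg_rich", cpg_rich), ("cpg_poor", cpg_poor)):
    print(name, gc_content(seq), round(cpg_obs_exp(seq), 3))
```

Both toy sequences have the same G+C content, yet their CpG ratios differ threefold, which is the kind of situation described for the region upstream of exon 2: CpG dinucleotides are concentrated even though G+C content is not elevated.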
We identified a fourth runt domain gene in the fugu genome, which appears to represent either a pufferfish-specific, fast-evolving derivative of RUNX2, or a direct descendant of the ancestral chordate RUNT gene. We find the latter hypothesis more reasonable. This novel gene evolved in parallel with the vertebrate RUNX genes, and while it has been preserved in pufferfishes, it appears to have been lost entirely in tetrapods. This suggests that the ancestral vertebrate was more complex than previously suspected.
By studying a very limited set of fugu genomic regions, namely the scaffolds related to RD genes, we have identified seven apparently functional fugu genes that are absent from the human genome (Fig. 5), and were probably lost early in tetrapod history. In the process of identifying relevant homologs for one of these genes (FrFSTL6), we have identified a new family of follistatin-like genes in the human genome. Phylogenetic analysis of the RD protein sequences led to results that contradict those derived from comparative genomics, but we showed that the two could be reconciled into a coherent evolutionary model. These results underscore the importance of obtaining complete genomic sequences of strongly divergent vertebrates, and the value to be derived by performing detailed and integrated analyses of their gene complements.
[Emphases added by ISCID News Editor]
[ 07. December 2004, 23:26: Message edited by: Moderator ]