Research Article

Identification and complete sequencing of novel human transcripts through the use of mouse orthologs and testis cDNA sequences

Published: December 30, 2004
Genet. Mol. Res. 3 (4) : 493-511
Cite this Article:
E.N. Ferreira, L.C. Pires, R.B. Parmigiani, F. Bettoni, R.D. Puga, D.G. Pinheiro, L.Eduardo C. Andrade, L.O. Cruz, T.L. Degaki, M. Faria, F. Festa, D. Giannella-Neto, R.R. Giorgi, G.H. Goldman, F. Granja, A. Gruber, C. Hackel, F. Henrique-Silva, B. Malnic, C.V.B. Manzini, S.K.N. Marie, N.M. Martinez-Rossi, S.M. Oba-Shinjo, M.Ines M.C. Pardini, P. Rahal, C.A. Rainho, S.R. Rogatto, C.M. Romano, V. Rodrigues, M.M. Sales, M. Savoldi, I.D.C.G. da Silva, N.P. da Silva, S.J. de Souza, E.H. Tajara, W.A. Silva, A.J.G. Simpson, M.C. Sogayar, A.A. Camargo, D.M. Carraro (2004). Identification and complete sequencing of novel human transcripts through the use of mouse orthologs and testis cDNA sequences. Genet. Mol. Res. 3(4): 493-511.

Abstract

The correct identification of all human genes, and their derived transcripts, has not yet been achieved, and it remains one of the major aims of the worldwide genomics community. Computational programs suggest the existence of 30,000 to 40,000 human genes. However, definitive gene identification can only be achieved by experimental approaches. We used two distinct methodologies, one based on the alignment of mouse orthologous sequences to the human genome, and another based on the construction of a high-quality human testis cDNA library, in an attempt to identify new human transcripts within the human genome sequence. We generated 47 complete human transcript sequences, comprising 27 unannotated and 20 annotated sequences. Eight of these transcripts are variants of previously known genes. These transcripts were characterized according to size, number of exons, and chromosomal localization, and a search for protein domains was undertaken based on their putative open reading frames. In silico expression analysis suggests that some of these transcripts are expressed at low levels and in a restricted set of tissues.
