Bioinformatics

Support vector machines for novel class detection in Bioinformatics

E. J. Spinosa and de Carvalho, A. C. P. L. F., Support vector machines for novel class detection in Bioinformatics, vol. 4, pp. 608-615, 2005.

Novelty detection techniques might be a promising way of dealing with high-dimensional classification problems in Bioinformatics. We present preliminary results of the use of a one-class support vector machine approach to detect novel classes in two Bioinformatics databases. The results are consistent with theory and warrant further investigation.

Mining ORESTES no-match database: can we still contribute to cancer transcriptome?

R. da Silva Fonseca, Carraro, D. Maria, and Brentani, H., Mining ORESTES no-match database: can we still contribute to cancer transcriptome?, vol. 5, pp. 24-32, 2006.

The Human Cancer Genome Project generated about 1 million expressed sequence tags by the ORESTES method, principally with the aim of obtaining data from cancer. Of this total, 341,680 showed no similarity with sequences in the public transcript databases and are referred to as “no-match”. Some of them represent low-abundance or difficult-to-detect human transcripts, but others represent genomic contamination or immature mRNA. We ran a bioinformatics pipeline to determine the novelty of ORESTES “no-match” datasets from prostate or breast tissues.

Reconstruction of phylogenetic trees using the ant colony optimization paradigm

M. Perretto and Lopes, H. Silvério, Reconstruction of phylogenetic trees using the ant colony optimization paradigm, vol. 4, pp. 581-589, 2005.

We developed a new approach for the reconstruction of phylogenetic trees using ant colony optimization metaheuristics. A tree is constructed using a fully connected graph and the problem is approached similarly to the well-known traveling salesman problem. This methodology was used to develop an algorithm for constructing a phylogenetic tree using a pheromone matrix. Two data sets were tested with the algorithm: complete mitochondrial genomes from mammals and DNA sequences of the p53 gene from several eutherians.
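The traveling-salesman-style formulation described above can be illustrated with a generic ant colony loop. This is a minimal sketch under stated assumptions, not the authors' algorithm: the distance matrix `DIST` is made up, the names `build_path` and `aco_order` and the parameters `alpha`, `beta`, and `rho` are hypothetical, and the step that turns the resulting ordering of taxa into a tree is left out.

```python
import random

# Made-up pairwise distances between five hypothetical taxa.
DIST = [
    [0.0, 2.0, 6.0, 7.0, 9.0],
    [2.0, 0.0, 5.0, 6.0, 8.0],
    [6.0, 5.0, 0.0, 3.0, 7.0],
    [7.0, 6.0, 3.0, 0.0, 6.0],
    [9.0, 8.0, 7.0, 6.0, 0.0],
]

def path_length(path, dist):
    """Sum of distances between consecutive nodes on an open path."""
    return sum(dist[a][b] for a, b in zip(path, path[1:]))

def build_path(dist, pher, rng, alpha=1.0, beta=2.0):
    """One ant visits every node once, favoring strong pheromone and short edges."""
    n = len(dist)
    current = rng.randrange(n)
    path, unvisited = [current], set(range(n)) - {current}
    while unvisited:
        cands = sorted(unvisited)
        weights = [
            (pher[current][j] ** alpha) * ((1.0 / dist[current][j]) ** beta)
            for j in cands
        ]
        current = rng.choices(cands, weights=weights, k=1)[0]
        path.append(current)
        unvisited.remove(current)
    return path

def aco_order(dist, ants=10, iterations=50, rho=0.1, seed=0):
    """Return the shortest path found, its length, and the pheromone matrix."""
    rng = random.Random(seed)
    n = len(dist)
    pher = [[1.0] * n for _ in range(n)]
    best_path, best_len = None, float("inf")
    for _ in range(iterations):
        paths = [build_path(dist, pher, rng) for _ in range(ants)]
        # Evaporation: every edge loses a fraction rho of its pheromone.
        for i in range(n):
            for j in range(n):
                pher[i][j] *= 1.0 - rho
        # Deposit: shorter paths add more pheromone to the edges they used.
        for p in paths:
            length = path_length(p, dist)
            if length < best_len:
                best_path, best_len = p, length
            for a, b in zip(p, p[1:]):
                pher[a][b] += 1.0 / length
                pher[b][a] += 1.0 / length
    return best_path, best_len, pher

order, length, pheromone = aco_order(DIST)
print(order, length)
```

In this sketch, taxa that end up adjacent in `order` are the ones the colony found to be close, which is the kind of information a tree-building step could then use.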

BayBoots: a model-free Bayesian tool to identify class markers from gene expression data

R. Z. N. Vêncio, Patrão, D. F. C., Baptista, C. S., Pereira, C. A. B., and Zingales, B., BayBoots: a model-free Bayesian tool to identify class markers from gene expression data, vol. 5, pp. 138-142, 2006.

One of the goals of gene expression experiments is the identification of differentially expressed genes among populations that could be used as markers. For this purpose, we implemented a model-free Bayesian approach in a user-friendly and freely available web-based tool called BayBoots. Despite a common misunderstanding that Bayesian and model-free approaches are incompatible, we merged them in the BayBoots implementation using the kernel density estimator and Rubin’s Bayesian Bootstrap.
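Rubin's Bayesian bootstrap, mentioned above, replaces resampling with random Dirichlet(1, ..., 1) weights over the observed values. The sketch below is not the BayBoots code: it covers only the weighting step and omits the kernel density estimator, the function names `bayesian_bootstrap_means` and `prob_higher` are hypothetical, and the expression values are made up for illustration.

```python
import random

def bayesian_bootstrap_means(values, draws, rng):
    """Posterior draws of the mean under Rubin's Bayesian bootstrap."""
    samples = []
    for _ in range(draws):
        # Normalized Exponential(1) variates are Dirichlet(1, ..., 1) weights.
        g = [rng.expovariate(1.0) for _ in values]
        total = sum(g)
        samples.append(sum(w * v for w, v in zip(g, values)) / total)
    return samples

def prob_higher(class_a, class_b, draws=2000, seed=0):
    """Estimate the posterior probability that mean(class_a) > mean(class_b)."""
    rng = random.Random(seed)
    a = bayesian_bootstrap_means(class_a, draws, rng)
    b = bayesian_bootstrap_means(class_b, draws, rng)
    return sum(x > y for x, y in zip(a, b)) / draws

# Made-up log-expression values of one gene in two classes.
class_a = [8.1, 7.9, 8.4, 8.0, 8.3]
class_b = [6.9, 7.2, 7.0, 7.4, 6.8]
p = prob_higher(class_a, class_b)
print(p)
```

Because no distribution is assumed for the expression values, the approach stays model-free while still producing a posterior probability that can be used to rank candidate marker genes.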

Identification of a new Schistosoma mansoni membrane-bound protein through bioinformatic analysis

F. C. Cardoso, Pinho, J. M. R., Azevedo, V., and Oliveira, S. C., Identification of a new Schistosoma mansoni membrane-bound protein through bioinformatic analysis, vol. 5, pp. 609-618, 2006.

Progress in schistosome genome research has enabled investigators to move rapidly from genome sequences to vaccine development. Proteins bound to the surface of parasites are potential vaccine candidates, or they can be used for diagnosis. We analyzed 4342 proteins deduced from the Schistosoma mansoni transcriptome with bioinformatic computer programs. Thirty-four proteins had membrane-bound motifs. Within this group, we selected the Sm29 protein to be further characterized by in silico analysis.

Application of latent semantic indexing to evaluate the similarity of sets of sequences without multiple alignments character-by-character

B. R. G. M. Couto, Ladeira, A. P., and Santos, M. A., Application of latent semantic indexing to evaluate the similarity of sets of sequences without multiple alignments character-by-character, vol. 6, pp. 983-999, 2007.

Most molecular analyses, including phylogenetic inference, are based on sequence alignments. We present an algorithm that estimates relatedness between biomolecules without the requirement of sequence alignment by using a protein frequency matrix that is reduced by singular value decomposition (SVD), in a latent semantic index information retrieval system. Two databases were used: one with 832 proteins from 13 mitochondrial gene families and another composed of 1000 sequences from nine types of proteins retrieved from GenBank.
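The alignment-free idea above can be sketched as follows: count short subsequences in each protein, reduce the count matrix with a truncated SVD, and compare sequences by cosine similarity in the reduced space. The abstract does not specify how the frequency matrix is built, so the use of 2-mer counts is an assumption, as are the names `kmer_matrix`, `latent_vectors`, and `cosine`, the chosen rank, and the toy sequences. NumPy is assumed to be available.

```python
from itertools import product

import numpy as np

AMINO = "ACDEFGHIKLMNPQRSTVWY"

def kmer_matrix(seqs, k=2):
    """Rows are k-mers, columns are sequences, entries are counts."""
    index = {"".join(p): i for i, p in enumerate(product(AMINO, repeat=k))}
    m = np.zeros((len(index), len(seqs)))
    for col, s in enumerate(seqs):
        for i in range(len(s) - k + 1):
            row = index.get(s[i:i + k])
            if row is not None:
                m[row, col] += 1
    return m

def latent_vectors(matrix, rank):
    """Project each sequence (column) onto the top `rank` singular directions."""
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (s[:rank, None] * vt[:rank]).T  # one row per sequence

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up toy sequences: two near-identical pairs with unrelated composition.
seqs = [
    "MKVLAAGIVGLMKVLAAG",
    "MKVLAAGIVGLMKVLSAG",
    "PQRSTWYPQRSTWYPQRS",
    "PQRSTWYPQRSTWHPQRS",
]
vecs = latent_vectors(kmer_matrix(seqs), rank=2)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

In this toy case the first two sequences should score as far more similar to each other than to the third, without any alignment being computed.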

Intraflagellar transport complex in Leishmania spp. In silico genome-wide screening and annotation of gene function

J. J. S. Gouveia, Vasconcelos, E. J. R., Pacheco, A. C. L., Araújo-Filho, R., Maia, A. R. S., Kamimura, M. T., Costa, M. P., Viana, D. A., Costa, R. B., Maggioni, R., and Oliveira, D. M., Intraflagellar transport complex in Leishmania spp. In silico genome-wide screening and annotation of gene function, vol. 6, pp. 766-798, 2007.

Flagella are constructed and maintained through the highly conserved process of intraflagellar transport (IFT), which is a rapid movement of particles along the axonemal microtubules of cilia/flagella. Particles that are transported by IFT are composed of several protein subunits comprising two complexes (A and B), which are conserved among green algae, nematodes, and vertebrates.

VSQual: a visual system to assist DNA sequencing quality control

E. Binneck, Silva, J. Flávio V., Neumaier, N., Farias, J. Renato B., and Nepomuceno, A. L., VSQual: a visual system to assist DNA sequencing quality control, vol. 3, pp. 474-482, 2004.

A lack of pliant software tools that support small- to medium-scale DNA sequencing efforts is a major hindrance for recording and using laboratory workflow information to monitor the overall quality of data production. Here we describe VSQual, a set of Perl programs intended to provide simple and powerful tools to check several quality features of the sequencing data generated by automated DNA sequencing machines. The core program of VSQual is a flexible Perl-based pipeline, designed to be accessible and useful for both programmers and non-programmers.

Update of microbial genome programs for bacteria and archaea

P. Borges San Celestino, de Carvalho, L. Rodrigues, de Freitas, L. Martins, Martins, N. Florêncio, Pacheco, L. Gustavo Ca, Miyoshi, A., Azevedo, V., and Dorella, F. Alves, Update of microbial genome programs for bacteria and archaea, vol. 3, pp. 421-431, 2004.

Since the Haemophilus influenzae genome sequence was completed in 1995, 172 other prokaryotic genomes have been completely sequenced, while 508 projects are underway. Besides pathogens, organisms important in several other fields, such as biotechnology and bioremediation, have also been sequenced. Institutions choose the organisms they wish to sequence according to the importance of these species to them, the availability of the microbes, and the similarity of a species of interest to others that have been sequenced previously.
