Bioinformatics Reviews: 2001

(175 References)

Achard, F., G. Vaysseix, et al. (2001). "XML, bioinformatics and data integration." Bioinformatics 17(2): 115-25.

            Motivation: The eXtensible Markup Language (XML) is an emerging standard for structuring documents, notably for the World Wide Web. In this paper, the authors present XML and examine its use as a data language for bioinformatics. In particular, XML is compared to other languages, and some of the potential uses of XML in bioinformatics applications are presented. The authors propose to adopt XML for data interchange between databases and other sources of data. Finally the discussion is illustrated by a test case of a pedigree data model in XML. Contact: Emmanuel.Barillot@infobiogen.fr
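
As a rough illustration of what a pedigree record in XML might look like, here is a minimal Python sketch. The element and attribute names (pedigree, individual, father, mother) are hypothetical and are not taken from the paper's actual data model; the sketch only shows how parent-child links could be read back from such a document using the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical pedigree document; tag and attribute names are assumptions,
# not the schema proposed by Achard et al.
PEDIGREE_XML = """
<pedigree id="family1">
  <individual id="I1" sex="male"/>
  <individual id="I2" sex="female"/>
  <individual id="I3" sex="female" father="I1" mother="I2"/>
</pedigree>
"""


def parents_by_child(xml_text):
    """Map each individual with a recorded parent to a (father, mother) pair."""
    root = ET.fromstring(xml_text.strip())
    relations = {}
    for person in root.findall("individual"):
        father = person.get("father")
        mother = person.get("mother")
        if father or mother:
            relations[person.get("id")] = (father, mother)
    return relations


print(parents_by_child(PEDIGREE_XML))
```

Because the structure is carried by the markup itself, any XML-aware tool can exchange or validate such records without a custom parser, which is the interchange argument the abstract makes.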

 

Adam, B. L., A. Vlahou, et al. (2001). "Proteomic approaches to biomarker discovery in prostate and bladder cancers." Proteomics 1(10): 1264-70.

            Proteomic technologies, including high resolution two-dimensional electrophoresis (2-DE), antibody/protein arrays, and advances in mass spectrometry (MS), are providing the tools needed to discover and identify disease associated biomarkers. Although application of these technologies to search for potential diagnostic/prognostic biomarkers associated with prostate and bladder cancer has been somewhat limited to date, proteins either overexpressed or underexpressed have been detected in both these urological cancers. Recent advances in mass spectrometry, especially platforms that permit rapid "fingerprint" profiling of multiple biomarkers, and tandem mass spectrometers for protein identification, will most assuredly enhance the discovery, identification, and characterization of potential cancer associated biomarkers. Furthermore, application of laser capture microdissection microscopes has provided a rapid and reproducible approach to procure pure populations of cells. This technology coupled to 2-DE and MS has significantly aided the elucidation of the differential expression profiles between disease, benign and normal prostate and bladder cell populations. Finally, development and application of learning algorithms and bioinformatics to the data generated by these proteomic technologies will be essential in determining the clinical potential of a protein biomarker. The purpose of this review is to provide the reader with an overview of the application of these technologies in the search and identification of potential diagnostic/prognostic biomarkers for prostate and bladder cancers.

 

Akutsu, T. (2001). "[Algorithms for inferring genetic networks]." Tanpakushitsu Kakusan Koso 46(16 Suppl): 2505-9.

           

Alix, A. J. (2001). "[A turning point in the knowledge of the structure-function-activity relations of elastin]." J Soc Biol 195(2): 181-93.

            This review presents the latest results from our research group on the molecular structures (atomic level) of tropoelastin, elastin and elastin derived peptides, studied essentially by bioinformatics methods (theoretical predictions and molecular modelling) linked to experimental circular dichroism spectroscopic studies. We had already characterized both the local secondary structure and some parts of the tertiary structure of the tropoelastin and elastin molecules (human, bovine...), by using theoretical predictions (local secondary structure, linear epitopes...) and/or experimental data (optical spectroscopic methods: Raman scattering, infrared absorption, circular dichroism). Except for the cross-linking regions, which are in helical conformations, the whole tropoelastin structure displays many beta-reverse turns, which usually belong to irregular structures in proteins. These turns play a key role in the orientation of other regular structures (alpha-helix, beta-strand); thus they are very important in the native protein 3D architecture. This is particularly true for human tropoelastin, because its sequence is rich in glycines and prolines, and these residues are frequently found in beta-turns (a beta-turn is made of four consecutive residues which are stabilized by a hydrogen bond). Several types of beta-turns can be defined by the values of the dihedral angles phi and psi of the two central residues. Thus, by using a very recently updated set of propensities for the amino acid residues to belong to given types of reverse beta-turns (extracted from a reference set of known 3-D structures of globular proteins), we have determined, using our home-made software COUDES, for all possible tetrapeptides of the human tropoelastin sequence, the distribution and the characterization of the possible types of turns. It is shown that the locations and/or the types of these reverse beta-turns reveal a regularity and are not all random. This confirms our hypothesis that intra-molecular elasticity of tropoelastin could be explained by the possibility of transitions between conformations involving short beta-strands and beta-turns. This result is of great interest in the construction (by using molecular biology) of elastic biomaterials derived from the elastin sequence (particularly, the elastin derived peptides corresponding to the sequence exon 21--(exon 24--exon 24...)). Our study also permits prediction of the conformations of specific elastin derived peptides which could have interesting biological activity. Peptides resulting from the degradation of elastin, the insoluble polymer of tropoelastin that is responsible for the elasticity of vertebrate tissues, can induce biological effects, notably the regulation of matrix metalloproteinase (MMP) activity. Recently, it was proposed that some elastin derived hexapeptides resulting from circular permutations of VGVAPG (a three fold repetition sequence in exon 24 of human tropoelastin) possess MMP-1 production and activation regulation properties. This effect depends on the presence of the tropoelastin specific membrane receptor, the 67 kDa EBP (Elastin Binding Protein). Our results, obtained by using both circular dichroism spectroscopy and linear predictions, confirmed the hypothesis of a structure dependent mechanism with a possibly occurring type VIII beta-turn on the first four residues of the GXXPG consensus sequence, which is present only among the active peptides. Thus, we have performed extensive molecular dynamics studies, in both implicit and explicit solvent, on these active and inactive elastin derived hexapeptides. Using our own analysis method of pattern recognition of the types of beta-reverse-turns followed during the molecular dynamics trajectory, we found that active and inactive peptides effectively form two well distinct conformational groups, in which active peptides preferentially adopt a conformation close to a type VIII GXXP beta-reverse-turn. The structural role of the C terminal G residue could also be explained. Additional molecular simulations on (VGVAPG)2 and (VGVAPG)3 show the formation of two or three GXXP tetrapeptides adopting a structure close to a type VIII beta-reverse-turn, suggesting a local conformational preference for this motif. This observation of a specific structural single and/or repeated motif is in agreement with the circular dichroism spectra of the involved (VGVAPG)1, (VGVAPG)2 and (VGVAPG)3 peptides and then it can be proposed that their biological activities have to be linear. The final aim of this type of work is to understand more about the sequence/structure/function/activity relationships of those structured peptides in order to propose specific sequences (corresponding to specific structures) for best biological activity results.

 

Andrade, M. A., C. Petosa, et al. (2001). "Comparison of ARM and HEAT protein repeats." J Mol Biol 309(1): 1-18.

            ARM and HEAT motifs are tandemly repeated sequences of approximately 50 amino acid residues that occur in a wide variety of eukaryotic proteins. An exhaustive search of sequence databases detected new family members and revealed that at least 1 in 500 eukaryotic protein sequences contain such repeats. It also rendered the similarity between ARM and HEAT repeats, believed to be evolutionarily related, readily apparent. All the proteins identified in the database searches could be clustered by sequence similarity into four groups: canonical ARM-repeat proteins and three groups of the more divergent HEAT-repeat proteins. This allowed us to build improved sequence profiles for the automatic detection of repeat motifs. Inspection of these profiles indicated that the individual repeat motifs of all four classes share a common set of seven highly conserved hydrophobic residues, which in proteins of known three-dimensional structure are buried within or between repeats. However, the motifs differ at several specific residue positions, suggesting important structural or functional differences among the classes. Our results illustrate that ARM and HEAT-repeat proteins, while having a common phylogenetic origin, have since diverged significantly. We discuss evolutionary scenarios that could account for the great diversity of repeats observed.

 

Baba, Y. (2001). "Development of novel biomedicine based on genome science." Eur J Pharm Sci 13(1): 3-4.

            Towards the post genomic sequencing era, conventional drug discovery is being drastically improved by genomic technologies and computational advances. The completion of the entire genome sequence of many experimental organisms, as well as of the human genome, allows us to compare several genomic sequences (comparative genomics) to get valuable information for gene discovery and functional genomics. Pharmacogenomic studies and chemical genomic investigations are quickly becoming fundamental techniques for genomic drug discovery. Additionally, progress in microchip and microarray technology has been stimulating genomic drug discovery studies. This paper reviews recent progress in human genome research, basic elements in the new strategy for drug discovery based on genome science, and future perspectives for the bio and pharmaceutical industries.

 

Barbier-Brygoo, H., F. Gaymard, et al. (2001). "Strategies to identify transport systems in plants." Trends Plant Sci 6(12): 577-85.

            Since the first molecular structures of plant transporters were discovered over a decade ago, considerable advances have been made in the study of plant membrane transport, but we still do not understand transport regulation. The genes encoding the transport systems in the various cell membranes are still to be identified, as are the physiological roles of most transport systems. A wide variety of complementary strategies are now available to study transport systems in plants, including forward and reverse genetics, proteomics, and in silico exploitation of the huge amount of information contained in the completely known genomic sequence of Arabidopsis.

 

Barratt, M. D. and R. A. Rodford (2001). "The computational prediction of toxicity." Curr Opin Chem Biol 5(4): 383-8.

            Recent developments in the prediction of toxicity from chemical structure have been reviewed. Attention has been drawn to some of the problems that can be encountered in the area of predictive toxicology, including the need for a multi-disciplinary approach and the need to address mechanisms of action. Progress has been hampered by the sparseness of good quality toxicological data. Perhaps too much effort has been devoted to exploring new statistical methods rather than to the creation of data sets for hitherto uninvestigated toxicological endpoints and/or classes of chemicals.

 

Bartlett, J. (2001). "Technology evaluation: SAGE, Genzyme molecular oncology." Curr Opin Mol Ther 3(1): 85-96.

            Genzyme Molecular Oncology (GMO) is using its SAGE (Serial Analysis of Gene Expression) combinatorial chemistry technology to screen compound libraries. SAGE is a high-throughput, high-efficiency method to simultaneously detect and measure the expression levels of genes expressed in a cell at a given time, including rare genes. SAGE can be used in a wide variety of applications to identify disease-related genes, to analyze the effect of drugs on tissues and to provide insights into disease pathways. It works by isolating short fragments of genetic information from the expressed genes that are present in the cell being studied. These short sequences, called SAGE tags, are linked together for efficient sequencing. The sequence data are then analyzed to identify each gene expressed in the cell and the levels at which each gene is expressed. This information forms a library that can be used to analyze the differences in gene expression between cells [293437]. By December 1999, GMO had identified a set of 40 genes from 3.5 million transcripts that were expressed at elevated levels in all cancer tissue but not seen in normal tissue. The company hope these may provide diagnostic markers or therapeutic targets. The studies also provided data furthering the understanding of the way cells use their genome [349968]. GMO has signed a collaborative agreement with the National Cancer Institute (NCI) to search for new drug candidates in the field of cancer chemotherapy. The collaboration combines GMO's SAGE technology with the NCI's extensive array of 60 cell-based cancer screens. Under the agreement, the NCI will evaluate Genzyme's library consisting of one million compounds against selected cancer screens to identify compounds with anticancer properties [255082]. Xenometrix granted a license agreement for gene expression profiling to GMO in February 1999, giving the company access to claims covered in issued US and European patents. The license is non-exclusive and covers the collection of gene expression profiles utilizing all methods including high-density microarrays [315329]. Ontogeny (now Curis Inc) and GMO have entered into a collaboration to study genes for the potential discovery of therapeutic products. GMO will use its SAGE technology to produce libraries of RNA supplied by Ontogeny. The libraries will be put through Ontogeny's screening program [279417]. Under an agreement made in August 1998, Bayer will use SAGE technology to identify genes and thus potential therapeutics [317452]. GMO and Hexagen signed an agreement in March 1998 on the use of SAGE technology in Hexagen's disease gene discovery programs. The first phase of the collaboration will focus on the use of SAGE in studies within Hexagen's type II diabetes gene discovery program. Hexagen has designed these studies to discover susceptibility genes for diabetes and to provide gene expression information for genes associated with type II diabetes [280012]. GMO signed a five-year agreement with Johns Hopkins University School of Medicine (JHU) in July 1997 for research leading to the identification of cancer-related genes. Under the terms of the agreement, JHU researchers will use the SAGE technology to identify and analyze gene expression in cancer. The power of SAGE in finding rare genes was confirmed in a study of gastrointestinal cancer by JHU researchers published in the May 27, 1997 issue of Science. The study showed that of almost 50,000 genes expressed in normal gastrointestinal cells and gastrointestinal tumor cells, 86% of the genes were present at five or fewer copies per cell. Only 51% of those low-abundance genes were recorded in the GenBank database of known genes in the human genome [257128].
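
The tag-counting step described in this abstract can be sketched in a few lines of Python. This is a deliberately simplified illustration, not GMO's pipeline: it assumes each tag is the 10 bases following a CATG anchoring site in a sequenced concatemer, and it ignores ditag handling, reverse complements and sequencing errors. The function names and the example sequence are hypothetical.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring-enzyme recognition site
TAG_LENGTH = 10   # assumed tag length in bases


def extract_tags(concatemer):
    """Return the fixed-length tags that follow each anchor site."""
    tags = []
    start = concatemer.find(ANCHOR)
    while start != -1:
        tag_start = start + len(ANCHOR)
        tag = concatemer[tag_start:tag_start + TAG_LENGTH]
        if len(tag) == TAG_LENGTH:
            tags.append(tag)
        # Resume the search after the tag so an anchor-like motif
        # inside a tag is not mistaken for a new anchor.
        start = concatemer.find(ANCHOR, tag_start + TAG_LENGTH)
    return tags


def tag_counts(concatemers):
    """Count how often each tag occurs across all sequenced concatemers."""
    counts = Counter()
    for concatemer in concatemers:
        counts.update(extract_tags(concatemer))
    return counts


example = "CATG" + "A" * 10 + "CATG" + "C" * 10 + "CATG" + "A" * 10
print(tag_counts([example]).most_common())
```

The relative count of a tag serves as the estimate of its transcript's abundance; comparing such count tables between two cell types is what allows differences in gene expression to be read off.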

 

Baxter, S. M. and J. S. Fetrow (2001). "Sequence- and structure-based protein function prediction from genomic information." Curr Opin Drug Discov Devel 4(3): 291-5.

            Existing functional annotation transfer is fraught with inaccuracies that may hinder forward interpretation and mining of genomic data. Hand-curation of the annotation placed into databases is not practical. In lieu of experimental evidence, computational biological approaches offer high-throughput tools to predict function accurately; however, these methods are still notably deficient in defining and describing the complexity of protein function. Enriching genomic sequences obtained from sequencing efforts and expression array methods with protein function information and classification will be an efficient first step for incorporating genomic data into drug discovery programs.

 

Bonneau, R., J. Tsai, et al. (2001). "Functional inferences from blind ab initio protein structure predictions." J Struct Biol 134(2-3): 186-90.

            Ab initio protein structure prediction methods have improved dramatically in the past several years. Because these methods require only the sequence of the protein of interest, they are potentially applicable to the open reading frames in the many organisms whose sequences have been and will be determined. Ab initio methods cannot currently produce models of high enough resolution for use in rational drug design, but there is an exciting potential for using the methods for functional annotation of protein sequences on a genomic scale. Here we illustrate how functional insights can be obtained from low-resolution predicted structures using examples from blind ab initio structure predictions from the third and fourth critical assessment of structure prediction (CASP3, CASP4) experiments.

 

Bornholdt, S. (2001). "Modeling genetic networks and their evolution: a complex dynamical systems perspective." Biol Chem 382(9): 1289-99.

            After finishing the sequence of the human genome, a functional understanding of genome dynamics is the next major step on the agenda of the biosciences. New approaches, such as microarray techniques, and new methods of bioinformatics provide powerful tools aiming in this direction. In the last few years, important parts of genome organization and dynamics in a number of model organisms have been determined. However, an integrated view of gene regulation on a genomic scale is still lacking. Here, genome function is discussed from a complex dynamical systems perspective: which dynamical properties can a large genomic system exhibit in principle, given the local mechanisms governing the small subsystems that we know today? Models of artificial genetic networks are used to explore dynamical principles and possible emergent dynamical phenomena in networks of genetic switches. One observes evolution of robustness and dynamical self-organization in large networks of artificial regulators that are based on the dynamic mechanism of transcriptional regulators as observed in biological gene regulation. Possible biological observables and ways of experimental testing of global phenomena in genome function and dynamics are discussed. Models of artificial genetic networks provide a tool to address questions in genome dynamics and their evolution and allow simulation studies in evolutionary genomics.

 

Brazma, A. and J. Vilo (2001). "Gene expression data analysis." Microbes Infect 3(10): 823-9.

            Microarrays are one of the latest breakthroughs in experimental molecular biology, which allow monitoring of gene expression for tens of thousands of genes in parallel and are already producing huge amounts of valuable data. Analysis and handling of such data is becoming one of the major bottlenecks in the utilization of the technology. The raw microarray data are images, which have to be transformed into gene expression matrices, tables where rows represent genes, columns represent various samples such as tissues or experimental conditions, and numbers in each cell characterize the expression level of the particular gene in the particular sample. These matrices have to be analyzed further if any knowledge about the underlying biological processes is to be extracted. In this paper we concentrate on discussing bioinformatics methods used for such analysis. We briefly discuss supervised and unsupervised data analysis and its applications, such as predicting gene function classes and cancer classification as well as some possible future directions.
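
The gene expression matrix described in this abstract (rows are genes, columns are samples) can be illustrated with a small Python sketch. The gene names, sample names and values below are made up, and Pearson correlation is used only as one common similarity measure for unsupervised analysis; the abstract does not prescribe a particular measure.

```python
from math import sqrt

# Hypothetical toy matrix: one row per gene, one column per sample.
samples = ["tissue_A", "tissue_B", "tissue_C", "tissue_D"]
expression = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.0],
    "gene3": [4.0, 3.0, 2.0, 1.0],
}


def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def most_similar(gene, matrix):
    """Return the other gene whose profile correlates best with `gene`."""
    profile = matrix[gene]
    others = [g for g in matrix if g != gene]
    return max(others, key=lambda g: pearson(profile, matrix[g]))


print(most_similar("gene1", expression))
```

Grouping genes by profile similarity in this way is the starting point for clustering; supervised methods would instead use known labels on the columns (for example tumour class) to train a classifier.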

 

Brenner, S. E. (2001). "A tour of structural genomics." Nat Rev Genet 2(10): 801-9.

            Structural genomics projects aim to provide an experimental or computational three-dimensional model structure for all of the tractable macromolecules that are encoded by complete genomes. To this end, pilot centres worldwide are now exploring the feasibility of large-scale structure determination. Their experimental structures and computational models are expected to yield insight into the molecular function and mechanism of thousands of proteins. The pervasiveness of this information is likely to change the use of structure in molecular biology and biochemistry.

 

Brizuela, L., P. Braun, et al. (2001). "FLEXGene repository: from sequenced genomes to gene repositories for high-throughput functional biology and proteomics." Mol Biochem Parasitol 118(2): 155-65.

            The vast amount of information generated by the human genome sequencing project and related projects has given rise to a new paradigm in experimental biology. This new paradigm invokes the experimentation and data analysis at genome-wide scales, as well as the generation of new technologies and resources that take full advantage of the available sequence information. The Institute of Proteomics at Harvard Medical School is building a comprehensive, characterized, arrayed and flexible gene repository that will allow full exploitation of the genomic information by enabling functional genomics as well as protein expression, purification and analysis at genome wide scale. The FLEXGene repository (Full Length EXpression-ready) will contain clones representing the complete set of open reading frames (ORFs) of different organisms including H. sapiens and several pathogens and model organisms. The clones are constructed using recombination-based cloning technology so that hundreds or thousands of coding regions can be transferred into any expression vector in a parallel and timely mode, allowing the broadest variety of experiments to be carried out.

 

Brookes, A. J. (2001). "Rethinking genetic strategies to study complex diseases." Trends Mol Med 7(11): 512-6.

            Understanding the genetic basis of complex diseases is turning out to be difficult, prompting a widespread (re-)evaluation of the relevant issues. 'Forward' and 'reverse' genetics strategies have been applied arguably in a manner only suitable for much simpler diseases. It would now be beneficial to pay detailed attention to experimental design, and to increase study scales dramatically. Ultimately, this would lead to completely hypothesis-free, truly comprehensive, multi-platform investigations. Such studies would maximize the chances of finding data patterns indicative of real etiology, although many aspects of complex disease causation might simply be too intricate and inconsistent to ever be deciphered. Therefore, considerable technology development is an immediate priority, along with parallel advances in bioinformatics and biostatistics systems aimed at discriminating between marginal signals and background noise within extremely large, diverse and complex data sets. Community standards and open data sharing will be essential ingredients for success in this exciting 21st-century challenge.

 

Brosch, R., A. S. Pym, et al. (2001). "The evolution of mycobacterial pathogenicity: clues from comparative genomics." Trends Microbiol 9(9): 452-8.

            Comparative genomics, and related technologies, are helping to unravel the molecular basis of the pathogenesis, host range, evolution and phenotypic differences of the slow-growing mycobacteria. In the highly conserved Mycobacterium tuberculosis complex, where single-nucleotide polymorphisms are rare, insertion and deletion events (InDels) are the principal source of genome plasticity. InDels result from recombinational or insertion sequence (IS)-mediated events, expansion of repetitive DNA sequences, or replication errors based on repetitive motifs that remove blocks of genes or contract coding sequences. Comparative genomic analyses also suggest that loss of genes is part of the ongoing evolution of the slow-growing mycobacterial pathogens and might also explain how the vaccine strain BCG became attenuated.

 

Califano, A. (2001). "Advances in sequence analysis." Curr Opin Struct Biol 11(3): 330-3.

            In its early days, the entire field of computational biology revolved almost entirely around biological sequence analysis. Over the past few years, however, a number of new non-sequence-based areas of investigation have become mainstream, from the analysis of gene expression data from microarrays, to whole-genome association discovery, and to the reverse engineering of gene regulatory pathways. Nonetheless, with the completion of private and public efforts to map the human genome, as well as those of other organisms, sequence data continue to be a veritable mother lode of valuable biological information that can be mined in a variety of contexts. Furthermore, the integration of sequence data with a variety of alternative information is providing valuable and fundamentally new insight into biological processes, as well as an array of new computational methodologies for the analysis of biological data.

 

Cho, Y. and V. Walbot (2001). "Computational methods for gene annotation: the Arabidopsis genome." Curr Opin Biotechnol 12(2): 126-30.

            Since the structure of the DNA molecule was identified half a century ago, the complete genome sequence has been determined for 37 prokaryotes and several eukaryotes. With the exponential growth of genetic information, bioinformatics has attempted to predict gene locations and functions in cyberspace prior to experimental confirmation at the bench.

 

Claverie, J. M. (2001). "[Transcriptome analysis in cancerology: bioinformatics aspects]." Bull Cancer 88(3): 269-76.

            Recent technological advances (e.g. various DNA arrays and chips) allow the measurement of expression level (mRNA abundance) for thousands of genes simultaneously, over multiple conditions or time. Initially developed and tested on model systems such as yeast or in vitro cell line cultures, these techniques have recently begun to be applied to the analysis of human cancers. Initial results are promising, and large-scale gene expression profiling is now expected to become a clinical tool for better tumour identification, prognosis, and optimal treatment design. It is thus important that clinicians become familiar with the theoretical principles underlying the interpretation of gene expression profiles as used in three different contexts: gene discovery, tumour class prediction, and molecular diagnosis. This is the purpose of the present article.

 

Coppel, R. L. (2001). "Bioinformatics and the malaria genome: facilitating access and exploitation of sequence information." Mol Biochem Parasitol 118(2): 139-45.

            The torrent of sequence information unleashed by the various genome sequencing projects, including that of Plasmodium falciparum, will lead to an unprecedented increase in the data available for research purposes. The scientific community is struggling to develop ways to assimilate this information and ensure that it is fully analysed in a way that enables rapid development of new therapeutic and diagnostic advances. This is particularly so for the field of tropical medicine where many of the scientists have had limited training in the area of Bioinformatics and may be further hampered by poor access to the sequence data. A number of collections of malaria genome sequence are available, each with its own advantages and disadvantages; however, further improvements in these information resources are needed. In particular, there would be great benefit in integrating genomic sequence and functional genomics results with the large amount of pre-existing knowledge related to parasite biology and immunological interactions with the host. Attempts to achieve this include the PlasmoDB database, and the lessons learned in this effort could be of great utility to other organism-specific databases.

 

Cowman, A. F. (2001). "Functional analysis of drug resistance in Plasmodium falciparum in the post-genomic era." Int J Parasitol 31(9): 871-8.

            Malaria has plagued humans throughout recorded history and results in the death of over 2 million people per year. The protozoan parasite Plasmodium falciparum causes the most severe form of malaria in humans. Chemotherapy has become one of the major control strategies for this parasite; however, the development of drug resistance to virtually all of the currently available drugs is causing a crisis in the use and deployment of these compounds for prophylaxis and treatment of this disease. The genome sequence of P. falciparum is providing the informational base for the use of whole-genome strategies such as bioinformatics, microarrays and genetic mapping. These approaches, together with the availability of a high-resolution genome linkage map consisting of hundreds of microsatellite markers and the advanced technologies of transfection and proteomics, will facilitate an integrated approach to address important biological questions. In this review we will discuss strategies to identify novel genes involved in the molecular mechanisms used by the parasite to circumvent the lethal effect of current chemotherapeutic agents.

 

Dahl, S. G., O. Edvardsen, et al. (2001). "Bioinformatics and receptor mechanisms of psychotropic drugs." Biotechnol Annu Rev 7: 165-77.

            One important aspect in biotechnology is gene discovery and target validation for drug discovery. Information from the human genome (HUGO) project may be used to deduce the amino acid sequence of all proteins produced in the human body. However, knowing the amino acid sequence of a protein is not the same as knowing its function. Identification of novel molecular targets for discovery of new, safer and more efficient therapeutic drugs from the human genome sequences requires multidisciplinary research efforts, including proteomics, structural biology and bioinformatics. In addition to possible effects on gene expression, most of the currently used therapeutic drugs either have enzymes or membrane proteins as their molecular targets of action. These membrane proteins include transporters of small molecules across cell membranes, ion channels, or receptors that convey signals from one side of a membrane to the other. Our research group as well as others have used computational techniques, along with biotechnology, molecular biology and other experimental techniques, to construct detailed 3-dimensional models of transporter proteins and G-protein coupled receptors (GPCRs), which are the molecular targets of action of psychotropic drugs. The models have been used to simulate the molecular dynamics and study the ligand binding and signal transduction mechanisms of these receptors. The use of bioinformatics, as exemplified in our modelling of GPCRs, is only one of the key factors for success in post-genomic research for new targets for therapeutic drugs.

 

Danielsen, M. (2001). "Bioinformatics of nuclear receptors." Methods Mol Biol 176: 3-22.

           

Davidson, D. and R. Baldock (2001). "Bioinformatics beyond sequence: mapping gene function in the embryo." Nat Rev Genet 2(6): 409-17.

            The spatio-temporal expression pattern of a gene during development is a valuable piece of information. But there is no way to compare precisely the patterns of expression of different genes, or the way the patterns are changed in a mutant. One way to solve this problem is to construct digital reference images of development (a bioinformatics framework), to which expression patterns can be mapped and stored, then compared. Such frameworks are under active development in several model systems. They will form the basis of powerful and integrated gene expression databases, which facilitate comparisons between genes, tissues and species.

 

Davis, D. R., J. B. McAlpine, et al. (2001). "Enterococcus faecalis multi-drug resistance transporters: application for antibiotic discovery." J Mol Microbiol Biotechnol 3(2): 179-84.

            Using bioinformatics approaches, 34 potential multidrug resistance (MDR) transporter sequences representing 4 different transporter families were identified in the unannotated Enterococcus faecalis database (TIGR). A functional genomics campaign generating single-gene insertional disruptions revealed several genes whose absence confers significant hypersensitivities to known antimicrobials. We constructed specific strains, disrupted in a variety of previously unpublished, putative MDR transporter genes, as tools to improve the success of whole-cell antimicrobial screening and discovery. Each of the potential transporters was inactivated at the gene level and then phenotypically characterized, both with single disruption mutants and with 2-gene mutants built upon a delta norA deleted strain background.

 

De Groot, A. S., A. Bosma, et al. (2001). "From genome to vaccine: in silico predictions, ex vivo verification." Vaccine 19(31): 4385-95.

            Bioinformatics tools enable researchers to move rapidly from genome sequence to vaccine design. EpiMer and EpiMatrix are computer-driven pattern-matching algorithms that identify T cell epitopes. Conservatrix, BlastiMer, and Patent-Blast permit the analysis of protein sequences for highly conserved regions, for homology with other known proteins, and for homology with previously patented epitopes, respectively. Two applications of these tools to epitope-driven vaccine design are described in this review. Using Conservatrix and EpiMatrix, we analyzed more than 10000 HIV-1 sequences and identified peptides that were potentially immunostimulatory and highly conserved across HIV-1 clades. MHC binding assays and CTL assays have been carried out: 50 (69%) of the 72 candidate epitopes bound in assays with cell lines expressing the corresponding MHC molecule; 15 of the 24 B7 peptides (63%) stimulated gamma-interferon release in ELISpot assays. These results lend support to the bioinformatics approach to selecting novel, conserved, HIV-1 CTL epitopes. EpiMatrix was also applied to the entire 'proteome' derived from two Mycobacterium tuberculosis (Mtb) genomes. Using EpiMatrix, BlastiMer, and Patent-Blast, we narrowed the list of putative Mtb epitopes to be tested in vitro from 1600000 to 3000, a 99.8% reduction. The pace of vaccine design will accelerate when these and other bioinformatics tools are systematically applied to whole genomes and used in combination with in vitro methods for screening and confirming epitopes.
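            The percentages quoted in this abstract can be recomputed from the counts it gives. The sketch below is only an arithmetic check; the helper names `percent` and `percent_reduction` are hypothetical and do not come from the paper or from the EpiMatrix tools.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def percent_reduction(before, after):
    """Return the relative reduction from `before` to `after` as a percentage."""
    return 100.0 * (before - after) / before


# Counts taken from the abstract above.
mhc_binding = percent(50, 72)                 # candidate epitopes that bound in MHC assays
elispot = percent(15, 24)                     # B7 peptides that stimulated gamma-interferon release
reduction = percent_reduction(1_600_000, 3_000)  # putative Mtb epitopes before and after filtering

print(f"MHC binding: {mhc_binding:.1f}%")
print(f"ELISpot positive: {elispot:.1f}%")
print(f"Epitope list reduction: {reduction:.1f}%")
```

            The ELISpot fraction is exactly 62.5%, which the abstract reports as 63%; the other two values agree with the reported 69% and 99.8% after rounding.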

 

De Luca, V. and P. Laflamme (2001). "The expanding universe of alkaloid biosynthesis." Curr Opin Plant Biol 4(3): 225-33.

            Characterization of many of the major gene families responsible for the generation of central intermediates and for their decoration, together with the development of large genomics and proteomics databases, has revolutionized our capability to identify exotic and interesting natural-product pathways. Over the next few years, these tools will facilitate dramatic advances in our knowledge of the biosynthesis of alkaloids, which will far surpass that which we have learned in the past 50 years. These tools will also be exploited for the rapid characterization of regulatory genes, which control the development of specialized cell factories for alkaloid biosynthesis.

 

de Vos, W. M. (2001). "Advances in genomics for microbial food fermentations and safety." Curr Opin Biotechnol 12(5): 493-8.

            The exponentially growing collection of genomic sequence information, the high-throughput analysis of expression products, and the ability to order this information using advanced bioinformatics are expected to affect biotechnology and life sciences in a profound and unprecedented way. These developments offer many possibilities to improve the functionality of fermentations by food-grade microorganisms and to increase the microbial safety of foods. It will be necessary to combine functional studies with comparative genomics approaches to provide effective strategies for improving the functionality and safety of foods.

 

Degrave, W. M., S. Melville, et al. (2001). "Parasite genome initiatives." Int J Parasitol 31(5-6): 532-6.

            During 1993-1994, scientists from developing and developed countries planned and initiated a number of parasite genome projects and several consortiums for the mapping and sequencing of these medium-sized genomes were established, often based on already ongoing scientific collaborations. Financial and other support came from WHO/TDR, Wellcome Trust and other funding agencies. Thus, the genomes of Plasmodium falciparum, Schistosoma mansoni, Trypanosoma cruzi, Leishmania major, Trypanosoma brucei, Brugia malayi and other pathogenic nematodes are now under study. From an initial phase of network formation, mapping efforts and resource building (EST, GSS, phage, cosmid, BAC and YAC library constructions), sequencing was initiated in gene discovery projects but soon also on a small chromosome, and now on a fully fledged genome scale. Proteomics, functional analysis, genetic manipulation and microarray analysis are ongoing to different degrees in the respective genome initiatives, and as the funding for the whole genome sequencing becomes secured, most of the participating laboratories, apart from larger sequencing centres, become oriented to post-genomics. Bioinformatics networks are being expanded, including in developing countries, for data mining, annotation and in-depth analysis.

 

Dhiman, N., R. Bonilla, et al. (2001). "Gene expression microarrays: a 21st century tool for directed vaccine design." Vaccine 20(1-2): 22-30.

            DNA microarray technology is a new and powerful tool that allows the simultaneous analysis of a large number of nucleic acid hybridization experiments in a rapid and efficient fashion. The development of the DNA microarray chip has been driven by modern techniques of microelectronic fabrication, miniaturization and integration to produce what is referred to as "laboratory-on-chip" devices. The application of DNA chip technology includes the comprehensive analysis of multiple gene mutations and expressed sequences with regard to newer drug designs, host-pathogen interactions and the design of new vaccines. An advantage of microarray technology is that it can assist researchers to better define and understand the expression profile of a given genotype associated with disease, adverse effects from exposure to certain stimuli, or the ability to understand or predict immune responses to specific antigens. This paper briefly reviews DNA microarray technology and its implications with special reference to vaccine design. The technical aspects comprising array manufacturing and design, array hybridization, formatting, scanning and data handling are also briefly discussed.

 

Dixon, D. M. (2001). "US-Japan workshops in medical mycology: past, present and future." Nippon Ishinkin Gakkai Zasshi 42(2): 75-80.

            The Extramural Mycology Program of the National Institutes of Health (NIH), National Institute of Allergy and Infectious Diseases (NIAID) has organized and implemented a five-workshop series in medical mycology during a critical period in the evolution of contemporary medical mycology (1992 to 2000; http://www.niaid.nih.gov/research/dmid.htm). The goals of the workshop series were to: initiate interactions; build collaborations; identify research needs; turn needs into opportunities; stimulate molecular research in medical mycology; and summarize recommendations emerging from the workshop proceedings. A recurring recommendation in the series was to foster communications within and beyond the field of medical mycology. US-Japan interactions were noted as one specific example of potential information exchange for mutual benefit. The first formal action directed at this recommendation was the workshop "Emergence and Recognition of Fungal Diseases" convened under the auspices of the US-Japan Cooperative Medical Science Program (USJCMSP; http://www.niaid.nih.gov/dmid/us%5Fjapan/default.htm) in Bethesda, Maryland USA on 30 June 1999 (D.M. Dixon & T. Matsumoto, co-chairs). A major goal of the workshop was to present contemporary medical mycology to the Joint Committee of the USJCMSP through representative research presentations in order to make the Committee aware of current status in the field, and the potential for scientific interactions. The second formal action is the workshop "Medical Perspectives of Fungal Genome Studies", under the auspices of the Japanese Society for Medical Mycology, scheduled for 28 November 2000 in Tokyo, Japan (T. Matsumoto & D.M. Dixon, co-chairs). The NIAID Mycology Workshop series recommended interactions between the following groups: academic and pharmaceutical; medical and molecular (model systems); medical and plant pathogens; basic and clinical; mycologists and immunologists. The first two US-Japan workshops can be viewed as consistent with these recommendations, and serve as a Western/Eastern gateway for exchange. The focus of the second US-Japan workshop on genome projects for the medically important fungi provides an excellent model for international communications. Given the tsunami of information that is flowing from genomics and bioinformatics, it is clear that global interactions will be essential in managing and interpreting the data.

 

Edwards, Y. J. and A. Cottage (2001). "Prediction of protein structure and function by using bioinformatics." Methods Mol Biol 175: 341-75.

           

Edwards, Y. J. and S. M. Brocklehurst (2001). "Finding genes in genomic nucleotide sequences by using bioinformatics." Methods Mol Biol 175: 235-47.

           

Emanuelsson, O. and G. von Heijne (2001). "Prediction of organellar targeting signals." Biochim Biophys Acta 1541(1-2): 114-9.

            The subcellular location of a protein is an important characteristic with functional implications, and hence the problem of predicting subcellular localization from the amino acid sequence has received a fair amount of attention from the bioinformatics community. This review attempts to summarize the present state of the art in the field.

 

Engels, M. F. and P. Venkatarangan (2001). "Smart screening: approaches to efficient HTS." Curr Opin Drug Discov Devel 4(3): 275-83.

            Faced with the prospect of a rising number of potential drug targets and given almost unlimited access to internal and external chemistry resources, the 'brute-force' approach to high-throughput screening (HTS) is becoming increasingly unattractive. Pharmaceutical companies realize that they have both to increase the scope of screening experiments and improve on the efficiency of the screening process per se. In acknowledging this development, hybrid screening strategies have been suggested that unite in silico and in vitro screening in one integrated process. The partnering of both screening approaches in one process is believed to exploit the potential of HTS in much smarter and more cost-efficient ways. This review will describe some recent applications of this new screening paradigm and discuss the impact of integrating these novel strategies in the drug discovery process.

 

Fagerlund, T. H. and O. Braaten (2001). "No pain relief from codeine...? An introduction to pharmacogenomics." Acta Anaesthesiol Scand 45(2): 140-9.

            Drug treatment remains a mainstay of medicine. In some situations a drug unexpectedly has no effect, or unforeseen serious side effects occur. For the patient this represents a dangerous and potentially life-threatening situation. It certainly is a distressing experience for the doctor. At the societal level, adverse drug reactions represent a leading cause of disease and death. Genetic variation often underlies these unexpected situations. Pharmacogenetics is the term used about genetically determined variability in the metabolism of drugs. Pharmacogenomics usually refers to drug discovery based on knowledge of genes, but it is a discipline that offers insight into aetiologic mechanisms, and possible prevention and treatment. There is a trend towards a definition of pharmacogenomics that includes both pharmacogenetics and pharmacogenomics as defined above. Our article is an introduction to pharmacogenomics, using the broader definition. Biotechnological methods cannot be understood without a grasp of basic medical genetics, and we provide a brush-up on the fundamentals. We then outline pharmacogenetics, giving examples of genetically based variation in drug metabolising enzymes, drug receptors and drug transporting proteins. Modern biotechnology would be unthinkable without the aid of computers, and we briefly touch upon the field of bioinformatics. Finally, we give an overview of pharmacogenomics in the narrower sense. The rapidly growing field of pharmacogenomics is going to influence our everyday practice of medicine in the immediate future.

 

Fenselau, C. and P. A. Demirev (2001). "Characterization of intact microorganisms by MALDI mass spectrometry." Mass Spectrom Rev 20(4): 157-71.

            The application of MALDI mass spectrometry to desorb protein biomarkers from intact viruses, bacteria, fungus, and spores is the focus of this review. Instrumentation, sample collection, sample preparation, and algorithms for data analysis are summarized. Optimally these analyses should be carried out in less than five minutes. Successful applications are discussed from biotechnology, cell biology, and the pharmaceutical industry.

 

Fey, S. J. and P. M. Larsen (2001). "2D or not 2D. Two-dimensional gel electrophoresis." Curr Opin Chem Biol 5(1): 26-33.

            2D gel electrophoresis is the technology that everyone loves to hate: it requires manual dexterity and precision to reproduce precisely and is thus not well-suited as a high-throughput technology. Although almost everyone would like to replace it, the resolution and sensitivity it offers are exquisite and unsurpassed if one wants a global view of cellular activity. There have been several recent developments, for example, the detection of low abundance proteins, and the resolution possible with narrow-range IPG gels.

 

Foster, J. A. (2001). "Evolutionary computation." Nat Rev Genet 2(6): 428-36.

            Evolution does not require DNA, or even living organisms. In computer science, the field known as 'evolutionary computation' uses evolution as an algorithmic tool, implementing random variation, reproduction and selection by altering and moving data within a computer. This harnesses the power of evolution as an alternative to the more traditional ways to design software or hardware. Research into evolutionary computation should be of interest to geneticists, as evolved programs often reveal properties - such as robustness and non-expressed DNA - that are analogous to many biological phenomena.
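            The loop of random variation, reproduction and selection described in this abstract can be illustrated with a toy example. The sketch below is not taken from the review: the bit-counting fitness function, the population size, the mutation rate and all function names are assumptions chosen only to keep the example small.

```python
import random


def fitness(bits):
    """Toy fitness (an assumption for illustration): the number of 1s in the bit string."""
    return sum(bits)


def mutate(bits, rate, rng):
    """Random variation: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(length=20, pop_size=30, generations=200, rate=0.05, seed=0):
    """Evolve bit strings by repeated selection, reproduction and mutation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population unchanged.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # Reproduction with variation: refill the population with mutated copies.
        offspring = [mutate(rng.choice(survivors), rate, rng)
                     for _ in range(pop_size - len(survivors))]
        population = survivors + offspring
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

            Because the fittest individuals survive each generation unchanged, the best fitness in the population never decreases from one generation to the next. Real evolutionary computation systems typically add recombination and evolve richer structures such as programs or circuit designs.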

 

Frantz, G. D., T. Q. Pham, et al. (2001). "Detection of novel gene expression in paraffin-embedded tissues by isotopic in situ hybridization in tissue microarrays." J Pathol 195(1): 87-96.

            Correlating altered gene expression patterns with particular disease states is a critical step in understanding disease processes and developing treatment strategies. Many thousands of novel gene sequences have recently been annotated in public and private databases and are now available for analysis. Tissue-specific expression patterns of these sequences can be evaluated physically on DNA arrays and other high throughput assays, or virtually by bioinformatics mining of expressed sequence tag (EST) databases. As a secondary screening tool, in situ hybridisation (ISH) not only confirms tissue specificity, but also reveals what is often valuable information about cell-type expression patterns of novel sequences. Due to their availability and long-term stability at room temperature, formalin-fixed paraffin-embedded clinical specimens provide an invaluable resource for evaluating expression patterns of novel human genes. We describe a high-throughput approach for identifying and quantifying the expression of novel genes in paraffin-embedded human tissues using isotopic in situ hybridisation and tissue microarrays (TMA).

 

Gedeck, P. and P. Willett (2001). "Visual and computational analysis of structure--activity relationships in high-throughput screening data." Curr Opin Chem Biol 5(4): 389-95.

            Novel analytic methods are required to assimilate the large volumes of structural and bioassay data generated by combinatorial chemistry and high-throughput screening programmes in the pharmaceutical and agrochemical industries. Recent work in visualisation and data mining has been used to develop structure--activity relationships from such chemical-biological datasets.

 

Gentle, C. R., W. Z. Golinski, et al. (2001). "Computational studies of 'whiplash' injuries." Proc Inst Mech Eng [H] 215(2): 181-9.

            The term 'whiplash' was initially used to describe injuries to the neck caused by the head being forced backwards during a rear-end collision in cars without head restraints. The addition of head restraints in the 1970s was expected to solve this problem by preventing excessive extension of the neck but experience suggests the problem still exists. This paper reviews available experimental studies of whiplash and uses the data to construct a finite element model which is capable of dynamically simulating whiplash collisions and predicting the forces in all the relevant neck ligaments. For the first time, it is shown that trauma occurs long before the head hits the head restraint as a result of displacement between the head and the torso caused by the head's inertia leading to markedly different acceleration histories. It is concluded that experimental and computational studies must be used together to produce progress in biomechanical studies.

 

Glassbrook, N. and J. Ryals (2001). "A systematic approach to biochemical profiling." Curr Opin Plant Biol 4(3): 186-90.

            Sequencing of the Arabidopsis thaliana genome is complete. The analytical tools for determining gene function by altering and monitoring gene expression are relatively well developed, and are generating large volumes of valuable data. Recent advances in techniques for the analysis of small molecules allow researchers to apply biochemical profiling as another powerful approach to functional genomics and metabolic research.

 

Glynne, R. J. and S. R. Watson (2001). "The immune system and gene expression microarrays--new answers to old questions." J Pathol 195(1): 20-30.

            The recent increase in availability of gene expression technologies has the potential to dramatically expand our understanding of cellular immunology in molecular detail. Expression levels of tens of thousands of genes can be measured in dozens of samples in only a few days, and this data can be integrated with sequence informatics to tentatively assign some (limited) functional information to a majority of these genes. In this review we discuss some initial applications of these new tools to the fields of lymphocyte and monocyte differentiation pathways, the tolerance or immunity decision process, and B cell transformation. These examples illustrate the power of unbiased, 'wide-net', approaches both to drive immunological research in previously unexpected directions and to confirm classic tenets of immunology.

 

Goldgar, D. E. (2001). "Major strengths and weaknesses of model-free methods." Adv Genet 42: 241-51.

            This chapter discusses some of the principal advantages and disadvantages inherent in the use of model-free (MF) methods. The principal advantage is that one does not need to specify, a priori, a genetic model for the trait of interest, which often is not known for many complex phenotypes of interest. On the other hand, as with all nonparametric approaches, use of model-free methods results in reduced power for detection of linkage compared with model-based methods when the model is correctly specified. The MF methods also have a potential for computational simplicity and are ideally suited for analysis of specific relative sets such as affected sibpairs. The MF methods are ideally suited to the analysis of quantitative traits for which finding and implementing a suitable genetic model for use in a parametric linkage analysis may be cumbersome. On the other hand, for discrete traits, most model-free methods allow for only a simple definition of "affected," making it difficult to consider such factors as age at onset, diagnostic accuracy of phenotype, or sex-specific disease risks. A factor that can be viewed as both a strength and weakness of MF methods is the large number of statistical approaches and implementation options of model-free methods; while providing a number of choices for the more sophisticated users, such variety also may lead to the risk of overanalysis of the data by selecting the approach that gives the desired result. In the end, the choice between model-free and model-based methods will largely depend on the nature of the phenotype under study and the existing knowledge base about its underlying mode of inheritance.

 

Golub, T. R. (2001). "Genomic approaches to the pathogenesis of hematologic malignancy." Curr Opin Hematol 8(4): 252-61.

            Recent advances in genome technologies and computational biology have facilitated genome-wide views of hematologic malignancy. In particular, comparative gene expression methods using DNA microarrays have allowed for the analysis of gene expression patterns in both primary patient material and model systems of hematopoietic development. This review provides an overview of the basic technologies underlying these approaches and provides a summary of recent progress in the genome-wide molecular classification of human acute leukemias and lymphomas and of initial attempts to define oncogene-mediated transcriptional programs using DNA microarrays.

 

Grandi, G. (2001). "Antibacterial vaccine design using genomics and proteomics." Trends Biotechnol 19(5): 181-8.

            After 200 years of practice, vaccinology has proved to be very effective in preventing infectious diseases. However, several human and animal pathogens exist for which vaccines have not yet been discovered. As for other fields of medical sciences, it is expected that vaccinology will greatly benefit from the emerging genomics technologies such as bioinformatics, proteomics and DNA microarrays. In this article the potential of these technologies applied to bacterial pathogens is analyzed, taking into account the few existing examples of their application in vaccine discovery.

 

Grant, S. G. and W. P. Blackstock (2001). "Proteomics in neuroscience: from protein to network." J Neurosci 21(21): 8315-8.

            Proteomic tools offer a new platform for studies of complex biological functions involving large numbers and networks of proteins. Intracellular networks of proteins perform key functions in neurons and glia. The unicellular eukaryote Saccharomyces cerevisiae has been the prototype for eukaryotic proteomic studies, and when combined with genomics, microarrays, genetics, and pharmacology, new insights into the integrated function of the cell emerge. The anatomical complexity of the nervous system both in cell types and in the vast number of synapses introduces novel technical and biological issues regarding the subcellular organization of protein networks. Here we will discuss the technology of proteomics and its applications to the nervous system.

 

Grant, W. N. and M. E. Viney (2001). "Post-genomic nematode parasitology." Int J Parasitol 31(9): 879-88.

            The future direction of post-genomic nematode parasitology should focus on the function of the genes that are defined by large-scale expressed sequence tag sequencing and on broader questions about the genetic basis of parasitism. Functional characterisation will require the application of high throughput technologies that have been developed in other fields, including genome mapping strategies and DNA microarray analysis. These will be greatly aided by the development and application of appropriate model organisms. It is crucial that the field make the transition from a narrow focus on one or a few genes at a time to a focus on whole genomes in order to fully realise the potential of the expressed sequence tag and other genomic projects currently under way.

 

Gras, R. and M. Muller (2001). "Computational aspects of protein identification by mass spectrometry." Curr Opin Mol Ther 3(6): 526-32.

            Recent developments in proteomics and genomics provide huge quantities of data to analyze. Automatic interpretation of mass spectrometry data has become essential for high-throughput processes aiming to study complete proteomes. There exist two main sources of mass spectrometric data: peptide mass fingerprint and fragmentation spectra, both of which require specific bioinformatic algorithms. We present a survey of these algorithms and discuss the efficiency of the different approaches and the possible improvements that may lead to a complete automatic high-throughput identification process.

 

Gray, S. G. and T. J. Ekstrom (2001). "The human histone deacetylase family." Exp Cell Res 262(2): 75-83.

            Since the identification of the first histone deacetylase (Taunton et al., Science 272, 408-411), several new members have been isolated. They can loosely be separated into entities on the basis of their similarity to various yeast histone deacetylases. The first class is represented by its closeness to the yeast Rpd3-like proteins, and the second most recently discovered class has similarities to yeast Hda1-like proteins. However, due to the fact that several different research groups isolated the Hda1-like histone deacetylases independently, there have been various different nomenclatures used to describe the various members, which can lead to confusion in the interpretation of this family's functions and interactions. With the discovery of another novel murine histone deacetylase, homologous to yeast Sir2, the number of members of this family is set to increase, as 7 human homologues of this gene have been isolated. In the light of these recent discoveries, we have examined the literature data and conducted a database analysis of the isolated histone deacetylases and potential candidates. The results obtained suggest that the number of histone deacetylases within the human genome may be as high as 17 and are discussed in relation to their homology to the yeast histone deacetylases.

 

Guillouzo, A. (2001). "Applications of biotechnology to pharmacology and toxicology." Cell Mol Biol (Noisy-le-grand) 47(8): 1301-8.

            Strategies for the development of new, more efficient drugs at a lower cost and for the evaluation of the effects of chemicals and metals on tissue and cell function are changing considerably. This is made possible by recent progress in various areas, particularly biotechnology and bioinformatics. The recent sequencing of the human genome and the design of increasingly sophisticated technologies will largely influence the fields of pharmacology and toxicology. Thus, identification of new molecular targets, development of more powerful cell models, design of miniaturized and automated tests for high throughput screening of thousands of compounds synthesized by combinatorial chemistry, and progress in genomic and proteomic technologies that permit simultaneous analysis of thousands of genes and their products offer new investigative approaches that will be widely extended in the near future.

 

Hamadeh, H. K., P. Bushel, et al. (2001). "Discovery in toxicology: mediation by gene expression array technology." J Biochem Mol Toxicol 15(5): 231-42.

            Toxicogenomics is a term that represents the merging of toxicology with novel genomics techniques. Data generated in the new-age era of toxicology is relatively complex, requires new bioinformatics tools for adequate interpretation, and allows for the rapid generation of testable hypotheses. Hazard identification and risk assessment processes will advance from the use of genomics techniques, which will lead to greater understanding of mechanism(s) of action of toxicants, development of novel biomarkers of exposure and effect, and better identification of sensitive subpopulations.

 

Hamilton, B. A. and W. N. Frankel (2001). "Of mice and genome sequence." Cell 107(1): 13-6.

            Availability of the mouse genome sequence will have a major impact on the study of vertebrate evolution, mammalian biology, and animal models of human disease. Resources to explore genome biology in mice will maximize the effect of this watershed event.

 

Hansch, C., A. Kurup, et al. (2001). "Chem-bioinformatics and QSAR: a review of QSAR lacking positive hydrophobic terms." Chem Rev 101(3): 619-72.

           

Hastie, N. (2001). "Future perspectives." Essays Biochem 37: 121-7.

            Transcription is coupled to splicing and other post-transcriptional processes. The importance of transcription factors in developmental biology and disease is underlined by genetic analysis in flies and humans. The genome project is identifying large numbers of novel transcription factors and RNA-binding proteins. Proteins may have multiple functions, acting at the transcriptional and post-transcriptional levels. The vast amount of novel biological information requires new, high-throughput approaches and bioinformatics.

 

Hasty, J., D. McMillen, et al. (2001). "Computational studies of gene regulatory networks: in numero molecular biology." Nat Rev Genet 2(4): 268-79.

            Remarkable progress in genomic research is leading to a complete map of the building blocks of biology. Knowledge of this map is, in turn, setting the stage for a fundamental description of cellular function at the DNA level. Such a description will entail an understanding of gene regulation, in which proteins often regulate their own production or that of other proteins in a complex web of interactions. The implications of the underlying logic of genetic networks are difficult to deduce through experimental techniques alone, and successful approaches will probably involve the union of new experiments and computational modelling techniques.

 

Hondermarck, H., A. S. Vercoutter-Edouart, et al. (2001). "Proteomics of breast cancer for marker discovery and signal pathway profiling." Proteomics 1(10): 1216-32.

            Breast cancer is the most common form of cancer among women and the identification of markers to discriminate tumorigenic from normal cells, as well as the different stages of this pathology, is of critical importance. Two-dimensional electrophoresis has been used before for studying breast cancer, but the progressive completion of human genomic sequencing and the introduction of mass spectrometry, combined with advanced bioinformatics for protein identification, have considerably increased the possibilities for characterizing new markers and therapeutic targets. Breast cancer proteomics has already identified markers of potential clinical interest (such as the molecular chaperone 14-3-3 sigma) and technological innovations such as large scale and high throughput analysis are now driving the field. Methods in functional proteomics have also been developed to study the intracellular signaling pathways that underlie the development of breast cancer. As illustrated with fibroblast growth factor-2, a mitogen and motogen factor for breast cancer cells, proteomics is a powerful approach to identify signaling proteins and to decipher the complex signaling circuitry involved in tumor growth. Together with genomics, proteomics is well on the way to molecularly characterizing the different types of breast tumor, and thus defining new therapeutic targets for future treatment.

 

Ikeo, K. (2001). "[Getting the sequence world: How to use multiple alignment software]." Tanpakushitsu Kakusan Koso 46(9): 1299-305.

           

Imai, E., M. Takenaka, et al. (2001). "[Gene therapy and tissue engineering in nephrology and renal transplantation]." Nippon Rinsho 59(1): 65-71.

            The Human Genome Project will be completed in 2003, and the whole DNA sequence of the human genome will soon be available. This should affect the therapy of progressive renal diseases, for which there is currently no effective remedy. Gene therapy, renal engineering and the generation of new drugs can be achieved by using information from the human genome. In this context, we describe our recent endeavors concerning gene therapy of the transplanted kidney, the search for renal stem cells and reprogramming factors, and the exploration of genes related to renal fibrosis. Mature bioinformatics can facilitate the above post-genome projects.

 

Imanishi, T. and S. Miyazaki (2001). "[Comparison with other sequences: sequence similarity searches]." Tanpakushitsu Kakusan Koso 46(7): 856-62.

           

Imura, H. (2001). "[Perspectives on postgenome medicine in the 21st century]." Nippon Rinsho 59(1): 7-10.

            Because the human genome project was almost completed in 2000, the year 2001 is the first year of the postgenomic era. A variety of postgenome studies will be done in the next decade, including functional, comparative and structural genomics. These studies may open new areas in medicine, because disease susceptibility and drug metabolism could be predicted from the genetic characteristics of individuals. Genome studies may also shed light on cell biology, brain research and regenerative medicine and promote these fields. Bioinformatics will become a basis of postgenome biology and medicine.

 

Ishikawa, K. and G. Tsujimoto (2001). "[New strategy on medical research after completion of genome sequencing]." Nippon Yakurigaku Zasshi 118(3): 170-6.

            Real advances in biotechnology have made it possible to complete sequencing of the whole human genome within a short time. Although the genome contains a huge amount of information about biological functions and interest is now directed to studies using genomic information, the genomic strategy is not clearly understood. The following four studies on strategy after completion of the genomic sequence were therefore presented and discussed at the 74th Annual Meeting of the Japanese Pharmacological Society: 1) Asthma and atopic dermatitis: models for genetic and genomic investigations of complex genetic diseases, by W.C.O. Cookson (University of Oxford, Asthma Genetics Group, Wellcome Trust Centre for Human Genetics); 2) Molecular classification by global gene expression profiling: application on oncogenomic research, by H. Aburatani (Genome Science Division, Research Center for Advanced Science and Technology, University of Tokyo); 3) Functional genomic search of disease-related genes using microarrays with normalized rat cDNA library, by G. Tsujimoto, et al. (Department of Molecular, Cell Pharmacology, National Children's Medical Research Center); and 4) Acute ischemic change of mRNA expression in the hippocampus by GeneChip array analysis: a starting point for post-genome strategy, by S. Asai, et al.

 

Ito, T., T. Chiba, et al. (2001). "Exploring the protein interactome using comprehensive two-hybrid projects." Trends Biotechnol 19(10 Suppl): S23-7.

            Large-scale two-hybrid projects were used in an approach to examine protein-protein interactions. Despite the various limitations of this approach, these projects revealed a wealth of novel interactions, and the protein interactome may be much larger than expected.

 

Jones, D. T. (2001). "Protein structure prediction in genomics." Brief Bioinform 2(2): 111-25.

            As the number of completely sequenced genomes rapidly increases, including now the complete Human Genome sequence, the post-genomic problems of genome-scale protein structure determination and the issue of gene function identification become ever more pressing. In fact, these problems can be seen as interrelated in that experimentally determining or predicting the structure of proteins encoded by genes of interest is one possible means to glean subtle hints as to the functions of these genes. The applicability of this approach to gene characterisation is reviewed, along with a brief survey of the reliability of large-scale protein structure prediction methods and the prospects for the development of new prediction methods.

 

Jung, D. R., R. Kapur, et al. (2001). "Topographical and physicochemical modification of material surface to enable patterning of living cells." Crit Rev Biotechnol 21(2): 111-54.

            Precise control of the architecture of multiple cells in culture and in vivo via precise engineering of the material surface properties is described as cell patterning. Substrate patterning by control of the surface physicochemical and topographic features enables selective localization and phenotypic and genotypic control of living cells. In culture, control over spatial and temporal dynamics of cells and heterotypic interactions draws inspiration from in vivo embryogenesis and haptotaxis. Patterned arrays of single or multiple cell types in culture serve as model systems for exploration of cell-cell and cell-matrix interactions. More recently, the patterned arrays and assemblies of tissues have found practical applications in the fields of biosensors and cell-based assays for drug discovery. Although the field of cell patterning has its origins early in this century, an improved understanding of cell-substrate interactions and the use of microfabrication techniques borrowed from the microelectronics industry have enabled significant recent progress. This review presents the important early discoveries and emphasizes results of recent state-of-the-art cell patterning methods. The review concludes by illustrating the growing impact of cell patterning in the areas of bioelectronic devices and cell-based assays for drug discovery.

 

Jungblut, P. R. (2001). "Proteome analysis of bacterial pathogens." Microbes Infect 3(10): 831-40.

            Combining two-dimensional electrophoresis with mass spectrometry resulted in a powerful technology ideally suited to recognize and identify proteins of pathogenic microorganisms. This classical proteome analysis is now complemented by capillary chromatography/mass spectrometry combinations, miniaturization by chip technology and protein interaction investigations. Comparative proteomics is used to reveal vaccine candidates and pathogenicity factors. Immunoproteomics identifies specific and nonspecific antigens. For managing the huge amounts of data, bioinformatics is a valuable instrument for the construction of complex protein databases.

 

Kallioniemi, O. P. (2001). "Biochip technologies in cancer research." Ann Med 33(2): 142-7.

            Development of high-throughput 'biochip' technologies has dramatically enhanced our ability to study biology and explore the molecular basis of disease. Biochips enable massively parallel molecular analyses to be carried out in a miniaturized format with a very high throughput. This review will highlight applications of the various biochip technologies in cancer research, including analysis of 1) disease predisposition by using single-nucleotide polymorphism (SNP) microarrays, 2) global gene expression patterns by cDNA microarrays, 3) concentrations, functional activities or interactions of proteins with proteomic biochips, and 4) cell types or tissues as well as clinical endpoints associated with molecular targets by using tissue microarrays. One can predict that individual cancer risks can, in the future, be estimated accurately by a microarray profile of multiple SNPs in critical genes. Diagnostics of cancer will be facilitated by biochip readout of activity levels of thousands of genes and proteins. Biochip diagnostics coupled with informatics solutions will form the basis of individualized treatment decisions for cancer patients.

 

Kellam, P. (2001). "Post-genomic virology: the impact of bioinformatics, microarrays and proteomics on investigating host and pathogen interactions." Rev Med Virol 11(5): 313-29.

            Post-genomic research encompasses many diverse aspects of modern science. These include the two broad subject areas of computational biology (bioinformatics) and functional genomics. Laboratory based functional genomics aims to measure and assess either the messenger RNA (mRNA) levels (transcriptome studies) or the protein content (proteome studies) of cells and tissues. All of these methods have been applied recently to the study of host and pathogen interactions for both bacteria and viruses. A basic overview of the technology is given in this review together with approaches to data analysis. The wealth of information produced from even these preliminary studies has shown the generalities, subtleties and specificities of host-pathogen interactions. Such research should ultimately result in new methods for diagnosing and treating infectious diseases.

 

Kidera, A. (2001). "[Knowing similarity in protein 3D structure]." Tanpakushitsu Kakusan Koso 46(15): 2198-204.

           

Kim, J. (2001). "Descartes' fly: the geometry of genomic annotation." Funct Integr Genomics 1(4): 241-9.

            The completion of the Drosophila melanogaster genome marks another significant milestone in the growth of sequence information. But it also contributes to the ever-widening gap between sequence information and biological knowledge. One important approach to reducing this gap is theoretical inference through computational technologies. Many computer programs have been designed to annotate genomic sequence information with biologically relevant information. Here, I suggest that all of these methods have a common structure in which the sequence fragments are "coordinated" by some method of description such as Hidden Markov models. The key to the algorithms lies in constructing the most efficient set of coordinates that allow extrapolation and interpolation from existing knowledge. Efficient extrapolation and interpolation are produced if the sequence fragments acquire a natural geometrical structure in the coordinated description. Finding such a coordinate frame is an inductive problem with no algorithmic solution. The greater part of the problem of genomic annotation lies in biological modeling of the data rather than in algorithmic improvements.

 

Kobayashi, K. (2001). "[Getting functional information from your sequence by the use of protein signature databases]." Tanpakushitsu Kakusan Koso 46(14): 2098-103.

           

Kondo, S. (2001). "[Computer simulation as a tool to study complex phenomena of biology]." Tanpakushitsu Kakusan Koso 46(16 Suppl): 2461-7.

           

Korenberg, M. J., R. David, et al. (2001). "Parallel cascade identification and its application to protein family prediction." J Biotechnol 91(1): 35-47.

            Parallel cascade identification is a method for modeling dynamic systems with possibly high order nonlinearities and lengthy memory, given only input/output data for the system gathered in an experiment. While the method was originally proposed for nonlinear system identification, two recent papers have illustrated its utility for protein family prediction. One strength of this approach is the capability of training effective parallel cascade classifiers from very little training data. Indeed, when the amount of training exemplars is limited, and when distinctions between a small number of categories suffice, parallel cascade identification can outperform some state-of-the-art techniques. Moreover, the unusual approach taken by this method enables it to be effectively combined with other techniques to significantly improve accuracy. In this paper, parallel cascade identification is first reviewed, and its use in a variety of different fields is surveyed. Then protein family prediction via this method is considered in detail, and some particularly useful applications are pointed out.

 

Krawetz, S. A. and D. D. Womble (2001). "Design and implementation of an introductory course for computer applications in molecular genetics. A case study." Mol Biotechnol 17(1): 27-41.

            Formal training in computational biology was initiated at Wayne State University in 1990 to meet the needs of the faculty. This was still at a time when the molecular databases and analysis tools could be housed in what is now equivalent to a modern but dated desktop computer. In 1995 the course was expanded to include graduate students to provide these senior students with a foundation in computational biology. This course has armed our students with a requisite set of basic skills that are necessary for a successful career in molecular genetics. It is now an integral component of the graduate program of the Center for Molecular Medicine and Genetics and our experiences in course delivery have been detailed (BioInformatics Methods and Protocols, S. Misener and S. A. Krawetz, eds., Humana Press, Totowa, NJ, 2000). The course was expanded to a campus-wide unlimited enrollment program for the summer of 2000 to address the needs of our student body. In this review we present our experience with delivering a multidisciplinary campus-wide computational biology course to a new and widely diverse student body.

 

Kurella, M., L. L. Hsiao, et al. (2001). "DNA microarray analysis of complex biologic processes." J Am Soc Nephrol 12(5): 1072-8.

            DNA microarrays, or gene chips, allow surveys of gene expression (i.e., mRNA expression) in a highly parallel and comprehensive manner. The pattern of gene expression produced, known as the expression profile, depicts the subset of gene transcripts expressed in a cell or tissue. At its most fundamental level, the expression profile can address qualitatively which genes are expressed in disease states. However, with the aid of bioinformatics tools such as cluster analysis, self-organizing maps, and principal component analysis, more sophisticated questions can be answered. Microarrays can be used to characterize the functions of novel genes, identify genes in a biologic pathway, analyze genetic variation, and identify therapeutic drug targets. Moreover, the expression profile can be used as a tissue or disease "fingerprint." This review details the fabrication of arrays, data management tools, and applications of microarrays to the field of renal research and the future of clinical practice.

 

Kuroda, Y., E. Chikayama, et al. (2001). "[A protein domain selection system for high-throughput structural genomics]." Tanpakushitsu Kakusan Koso 46(14): 2066-72.

           

Kusunoki, M. (2001). "[Acquisition of structural data of biological macromolecules: how to utilize PDB]." Tanpakushitsu Kakusan Koso 46(13): 2003-8.

           

Legrain, P., J. Wojcik, et al. (2001). "Protein--protein interaction maps: a lead towards cellular functions." Trends Genet 17(6): 346-52.

            The availability of complete genome sequences now permits the development of tools for functional biology on a proteomic scale. Several experimental approaches or in silico algorithms aim at clustering proteins into networks with biological significance. Among those, the yeast two-hybrid system is the technology of choice to detect protein-protein interactions. Recently, optimized versions were applied at a genomic scale, leading to databases on the web. However, as with any other 'genetic' assay, yeast two-hybrid assays are prone to false positives and false negatives. Here we discuss these various technologies, their general limitations and the potential advances they make possible, especially when in combination with other functional genomics or bioinformatics analyses.

 

Lesch, K. P. (2001). "Molecular foundation of anxiety disorders." J Neural Transm 108(6): 717-46.

            Genetic epidemiology has assembled convincing evidence that anxiety and related disorders are influenced by genetic factors and that the genetic component is highly complex, polygenic, and epistatic. Although several genes which may contribute to the genetic variance of anxiety-related traits or modify the phenotypic expression of pathologic anxiety are currently under investigation, molecular genetics has so far failed to identify a genomic variation that consistently contributes to susceptibility to anxiety disorders. Investigation of gene-gene and gene-environment interactions in humans and nonhuman primates as well as gene inactivation studies in mice further intensify the identification of genes that are essential for development and adult plasticity of the brain related to complex anxiety responses. Because the modes of inheritance of anxiety disorders are complex, it has been concluded that multiple genes of small effect, in interaction with each other and with nongenetic neurodevelopmental events, produce vulnerability to the disorder. Future research directions will take advantage of the completion of the sequencing of the human and mouse genomes coinciding with the revolution in bioinformatics. More than 1.4 million single nucleotide polymorphisms (SNPs) in the human genome have been identified. This collection should allow the initiation of genome-wide linkage disequilibrium mapping of the genes influencing anxiety in the human population. Integration of these emerging tools and technologies for genetic analysis will provide the groundwork for an advanced stage of gene identification and functional studies in anxiety and related disorders.

 

Luo, Z. and D. H. Geschwind (2001). "Microarray applications in neuroscience." Neurobiol Dis 8(2): 183-93.

            Advances in all facets of technology from molecular biology to imaging and computational biology offer unprecedented opportunities for improving our understanding of the brain in health and disease. Oligonucleotide and cDNA microarray analysis, using a variety of "DNA chips," is a recently developed high-throughput technique that allows for tour-de-force analysis of gene expression. We review this powerful technique, developed in genetics laboratories, with reference to applications in neurologic diseases in humans and the use of animal models. The typical microarray experiment is multistaged and includes preparation or purchase of arrays, preparation of target DNA and probe, target DNA hybridization, microarray scanning, and image analysis. The power and pitfalls of this technology are discussed in the context of neuroscience paradigms. Since unprecedented amounts of data are produced from microarray experiments, bioinformatics and modeling expertise are increasingly becoming critical components of this approach.

 

Lupas, A. N., C. P. Ponting, et al. (2001). "On the evolution of protein folds: are similar motifs in different protein folds the result of convergence, insertion, or relics of an ancient peptide world?" J Struct Biol 134(2-3): 191-203.

            This paper presents and discusses evidence suggesting how the diversity of domain folds in existence today might have evolved from peptide ancestors. We apply a structure similarity detection method to detect instances where localized regions of different protein folds contain highly similar sequences and structures. Results of performing an all-on-all comparison of known structures are described and compared with other recently published findings. The numerous instances of local sequence and structure similarities within different protein folds, together with evidence from proteins containing sequence and structure repeats, argue in favor of the evolution of modern single polypeptide domains from ancient short peptide ancestors (antecedent domain segments (ADSs)). In this model, ancient protein structures were formed by self-assembling aggregates of short polypeptides. Subsequently, and perhaps concomitantly with the evolution of higher fidelity DNA replication and repair systems, single polypeptide domains arose from the fusion of ADS genes. Thus modern protein domains may have a polyphyletic origin.

 

Luscombe, N. M., D. Greenbaum, et al. (2001). "What is bioinformatics? A proposed definition and overview of the field." Methods Inf Med 40(4): 346-58.

            BACKGROUND: The recent flood of data from genome sequences and functional genomics has given rise to a new field, bioinformatics, which combines elements of biology and computer science. OBJECTIVES: Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. METHODS: Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale. RESULTS AND CONCLUSIONS: Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.

 

Ma, D. (2001). "Applications of yeast in drug discovery." Prog Drug Res 57: 117-62.

            The yeast Saccharomyces cerevisiae is perhaps the best-studied eukaryotic organism. Its experimental tractability, combined with the remarkable conservation of gene function throughout evolution, makes yeast the ideal model genetic organism. Yeast is a non-pathogenic model of fungal pathogens used to identify antifungal targets suitable for drug development and to elucidate mechanisms of action of antifungal agents. As a model of fundamental cellular processes and metabolic pathways of the human, yeast has improved our understanding and facilitated the molecular analysis of many disease genes. The completion of the Saccharomyces genome sequence helped launch the post-genomic era, focusing on functional analyses of whole genomes. Yeast paved the way for the systematic analysis of large and complex genomes by serving as a test bed for novel experimental approaches and technologies, tools that are fast becoming the standard in drug discovery research.

 

Maecker, B., M. von Bergwelt-Baildon, et al. (2001). "Linking genomics to immunotherapy by reverse immunology--'immunomics' in the new millennium." Curr Mol Med 1(5): 609-19.

            The disclosure of the human genome sequence and rapid advances in genomic expression profiling have revolutionized our knowledge about molecular changes in malignant diseases. Rapidly growing gene expression databases and improvements in bioinformatics tools set the stage for new approaches using large-scale molecular information to develop specific therapeutics in cancer. On one hand, the ability to detect clusters of genes differentially expressed in normal and malignant tissue may lead to widely applicable targeting of defined molecular structures. On the other hand, analyzing the 'molecular fingerprint' of an individual tumor raises the possibility of developing customized therapeutics. One approach to use the emerging new datasets for the development of novel therapeutics is to identify genes that are specifically expressed in tumors as targets for immune intervention. This review will focus on the process from in silico analysis of expression databases and screening of potential candidate genes by bioinformatics to the in vitro and in vivo analysis to determine the immunogenicity of candidate tumor antigens. Basic biological principles of 'reverse immunology' as well as technical advantages and difficulties will be addressed.

 

Maggio, E. T. and K. Ramnarayan (2001). "Recent developments in computational proteomics." Trends Biotechnol 19(7): 266-72.

            The mapping of the human genome was completed earlier this year and efforts are underway to understand the role of gene products (i.e. proteins) in biological pathways and human disease and to exploit their functional roles to derive protein therapeutics and protein-based drugs. A key component to the next revolution in the 'post-genomic' era will be the increasingly widespread use of protein structure in rational experimental design. Improvements in quality, availability and utility of large-scale 3D and 4D protein structural information are enabling a revolution in rational design, having particular impact on drug discovery and optimization. New computational methodologies now yield modeled structures that are, in many cases, quantitatively comparable with crystal structures, at a fraction of the cost.

 

Mahalingam, S., K. Clark, et al. (2001). "Antiviral potential of chemokines." Bioessays 23</