Bioinformatics Reviews: 2002

(187 References)

Altman, R. B. and T. E. Klein (2002). "Challenges for biomedical informatics and pharmacogenomics." Annu Rev Pharmacol Toxicol 42: 113-33.

            Pharmacogenomics requires the integration and analysis of genomic, molecular, cellular, and clinical data, and it thus offers a remarkable set of challenges to biomedical informatics. These include infrastructural challenges such as the creation of data models and databases for storing these data, the integration of these data with external databases, the extraction of information from natural language text, and the protection of databases with sensitive information. There are also scientific challenges in creating tools to support gene expression analysis, three-dimensional structural analysis, and comparative genomic analysis. In this review, we summarize the current uses of informatics within pharmacogenomics and show how the technical challenges that remain for biomedical informatics are typical of those that will be confronted in the postgenomic era.

 

Andrews, P. D., I. S. Harper, et al. (2002). "To 5D and beyond: quantitative fluorescence microscopy in the postgenomic era." Traffic 3(1): 29-36.

            Digital fluorescence microscopy is now a standard technology for assaying molecular localisation in cells and tissues. The choice of laser scanning (LSM) and wide-field microscopes (WFM) largely depends on the type of sample, with LSMs performing best on thick samples and WFMs performing best on thin ones. These systems are increasingly used to collect large multidimensional datasets. We propose a unified image structure that considers space, time, and fluorescence wavelength as integral parts of the image. Moreover, the application of fluorescence imaging to large-scale screening means that large datasets are now routinely acquired. We propose that analysis of these data requires querying tools based on relational databases and describe one such system.

 

Aravind, L. and L. M. Iyer (2002). "Intraproteomic networks: new forays into predicting interaction partners." Genome Res 12(8): 1156-8.

           

Bains, W., R. Gilbert, et al. (2002). "Evolutionary computational methods to predict oral bioavailability QSPRs." Curr Opin Drug Discov Devel 5(1): 44-51.

            This review discusses evolutionary and adaptive methods for predicting oral bioavailability (OB) from chemical structure. Genetic Programming (GP), a specific form of evolutionary computing, is compared with some other advanced computational methods for OB prediction. The results show that classifying drugs into 'high' and 'low' OB classes on the basis of their structure alone is solvable, and initial models are already producing output that would be useful for pharmaceutical research. The results also suggest that quantitative prediction of OB will be tractable. Critical aspects of the solution will involve the use of techniques that can: (i) handle problems with a very large number of variables (high dimensionality); (ii) cope with 'noisy' data; and (iii) implement binary choices to sub-classify molecules with behavior that are qualitatively different. Detailed quantitative predictions will emerge from more refined models that are hybrids derived from mechanistic models of the biology of oral absorption and the power of advanced computing techniques to predict the behavior of the components of those models in silico.

 

Bancroft, I. (2002). "Insights into cereal genomes from two draft genome sequences of rice." Genome Biol 3(6): REVIEWS1015.

            Draft genome sequences have been reported for two subspecies of rice. The drafts include the sequences of an estimated 99% of all rice genes and provide major advances in our understanding of the content and complexity of cereal genomes in general and the rice genome in particular.

 

Baron, M. (2002). "Manic-depression genes and the new millennium: poised for discovery." Mol Psychiatry 7(4): 342-58.

            Manic-depressive illness is a common psychiatric disorder with complex etiology that likely involves multiple genes and non-genetic influences. The uncertain path to gene discovery has spurred considerable debate over genetic findings and gene-finding strategies. In this article, I review the main findings, with a focus on: (1) putative linked loci on chromosomes 1q31-32, 4p16, 6pter-p24, 10p14, 10q21-26, 12q23-24, 13q31-32, 18p11, 18q21-23, 21q22, 22q11-13, and Xq24-28; and (2) association studies with candidate genes, dynamic mutations, mitochondrial mutations, and chromosomal aberrations. Although no gene has been identified, promising findings are emerging. I then discuss the challenges and opportunities ahead, with special emphasis on gene-finding methods, in particular questions pertaining to phenotype definition, linkage and association mapping, gene markers, sampling, study population, multigene systems, lessons from other disorders, animal models, and bioinformatics. The progress to date, together with rapid advances in genomics, analytical and computational methods, and bioinformatics, holds promise for new insights into the genetics of manic-depression in the new millennium.

 

Barratt, C. L., D. C. Hughes, et al. (2002). "Functional genomics in reproductive medicine." Hum Fertil (Camb) 5(1): 3-5.

            The British Fertility Society organised a workshop on Functional Genomics in Reproductive Medicine at the University of Birmingham on 13-14 September 2001. The primary aim was to inform delegates about the power of the technology that has been made available after completion of the sequencing of the human genome, and to stimulate debate about using functional genomics to address both clinical and scientific questions in reproductive medicine. Three specific areas were addressed: proteomics, gene expression and bioinformatics. Although the sophistication and plethora of techniques available were obvious, major limitations in the technology were also discussed. The future promises to be very challenging indeed.

 

Bayat, A. (2002). "Science, medicine, and the future: Bioinformatics." Bmj 324(7344): 1018-22.

           

Beutler, B. and M. Rehli (2002). "Evolution of the TIR, tolls and TLRs: functional inferences from computational biology." Curr Top Microbiol Immunol 270: 1-21.

            The mammalian toll-like receptors (TLRs) are products of an evolutionary process that began prior to the separation of plants and animals. The most conserved protein motif within the TLRs is the TIR, which denotes Toll, the Interleukin-1 receptor, and plant disease Resistance genes. To trace the ancestry of the TLRs, it is desirable to draw upon the sequences of TIR domains from TLRs of diverse vertebrate species, including species with known dates of divergence (i.e., representatives of Mammalia and Aves) in order to establish a relationship between time and genetic divergence. It appears that a gene ancestral to modern TLRs 1 and 6 duplicated approximately 130 million years ago, only shortly before the speciation event that led to humans and mice. Though it is not represented in mice, TLR10 split from the TLR[1/6] precursor about 300 million years ago. The origins of other TLRs are more ancient, dating to the origins of vertebrate life, and some present-day vertebrate species appear to have many more TLRs than others. Moreover, the patterns of TLR expression are quite variable at the level of tissues, even among closely related species. A given TLR in species that are related by descent from a common ancestor may acquire different duties within each descendant line, so that some microbial inducers are avidly recognized in one species but not in others; likewise the intensity and the anatomic location of an innate immune response may vary considerably. In this review, we discuss the computational methods used to analyze divergence of the TIR, and the conclusions that may be safely drawn.

 

Bevan, M. (2002). "Genomics and plant cells: application of genomics strategies to Arabidopsis cell biology." Philos Trans R Soc Lond B Biol Sci 357(1422): 731-6.

            In this review I seek to describe how the complete catalogue of plant genes and proteins, revealed by genome sequencing, can provide novel insights into cell biology. Many new analytical methods have been developed to digest the flood of genome sequence data, including analysis of the transcriptome, proteome and metabolites. High-throughput analysis of protein targeting and other methods will ascribe new information to proteins and create important links with other large datasets. To fulfil the potential revealed by this genomic information, many challenges have to be met. Among these are organizational changes needed to create common datasets accessible to all scientists, and bioinformatics solutions to capture and integrate diverse datasets. Once harnessed, these new strategies will irrevocably change the way we conduct plant science.

 

Bickmore, W. A. and H. G. Sutherland (2002). "Addressing protein localization within the nucleus." Embo J 21(6): 1248-54.

            Bridging the gap between the number of gene sequences in databases and the number of gene products that have been functionally characterized in any way is a major challenge for biology. A key characteristic of proteins, which can begin to elucidate their possible functions, is their subcellular location. A number of experimental approaches can reveal the subcellular localization of proteins in mammalian cells. However, genome databases now contain predicted sequences for a large number of potentially novel proteins that have yet to be studied in any way, let alone have their subcellular localization determined. Here we ask whether using bioinformatics tools to analyse the sequence of proteins whose subnuclear localizations have been determined can reveal characteristics or signatures that might allow us to predict localization for novel protein sequences.

 

Blaschke, C., L. Hirschman, et al. (2002). "Information extraction in molecular biology." Brief Bioinform 3(2): 154-65.

            Information extraction has become a very active field in bioinformatics recently and a number of interesting papers have been published. Most of the efforts have been concentrated on a few specific problems, such as the detection of protein-protein interactions and the analysis of DNA expression arrays, although it is obvious that there are many other interesting areas of potential application (document retrieval, protein functional description, and detection of disease-related genes to name a few). Paradoxically, these exciting developments have not yet crystallised into general agreement on a set of standard evaluation criteria, such as the ones developed in fields such as protein structure prediction, which makes it very difficult to compare performance across these different systems. In this review we introduce the general field of information extraction, we outline the status of the applications in molecular biology, and we then discuss some ideas about possible standards for evaluation that are needed for the future development of the field.

 

Bloom, G. C., P. Gieser, et al. (2002). "Linking image quantitation and data analysis." Methods Mol Biol 184: 15-27.

           

Bornberg-Bauer, E. and N. W. Paton (2002). "Conceptual data modelling for bioinformatics." Brief Bioinform 3(2): 166-80.

            Current research in the biosciences depends heavily on the effective exploitation of huge amounts of data. These are in disparate formats, remotely dispersed, and based on the different vocabularies of various disciplines. Furthermore, data are often stored or distributed using formats that leave implicit many important features relating to the structure and semantics of the data. Conceptual data modelling involves the development of implementation-independent models that capture and make explicit the principal structural properties of data. Entities such as a biopolymer or a reaction, and their relations, eg catalyses, can be formalised using a conceptual data model. Conceptual models are implementation-independent and can be transformed in systematic ways for implementation using different platforms, eg traditional database management systems. This paper describes the basics of the most widely used conceptual modelling notations, the ER (entity-relationship) model and the class diagrams of the UML (unified modelling language), and illustrates their use through several examples from bioinformatics. In particular, models are presented for protein structures and motifs, and for genomic sequences.

 

Braam, G. B., H. A. Bluyssen, et al. (2002). "[Gene-expression analysis using DNA microarrays]." Ned Tijdschr Geneeskd 146(40): 1867-73.

            Parallel to the efforts to unravel the human genome code, techniques are currently being developed to analyse the activity of all genes and proteins in a cell population or tissue. The most advanced of these functional genomic techniques is that used to study gene expression using DNA microarrays, also known as 'DNA chips'. This allows the expression of thousands of different genes to be compared in two different samples (for example, one from a sick person and one from a healthy one). Bioinformatics is essential in this technique. The expression profiles obtained in this way can be used to characterise complex biological situations (e.g., cell division and apoptosis) and diseases. There have already been reports on the opportunities in the diagnostic work-up for leukaemias and breast cancer. There are also applications on the more basic level, such as discovering precisely how the transcription apparatus works, and finding new genes and identifying their role. The use of microarrays in medicine is still in its infancy. It is anticipated that this and similar genome-wide analysis techniques will help in the elucidation of pathophysiological mechanisms, in making diagnoses and prognoses, and in monitoring treatment. The justifiable enthusiasm should, however, be accompanied by quality control, international standardisation and a critical approach towards the interpretation of results.

 

Breinbauer, R., I. R. Vetter, et al. (2002). "From protein domains to drug candidates-natural products as guiding principles in the design and synthesis of compound libraries." Angew Chem Int Ed Engl 41(16): 2879-90.

            In the continuing effort to find small molecules that alter protein function and ultimately might lead to new drugs, combinatorial chemistry has emerged as a very powerful tool. Contrary to original expectations that large libraries would result in the discovery of many hit and lead structures, it has been recognized that the biological relevance, design, and diversity of the library are more important. As the universe of conceivable compounds is almost infinite, the question arises: where is a biologically validated starting point from which to build a combinatorial library? Nature itself might provide an answer: natural products have been evolved to bind to proteins. Recent results in structural biology and bioinformatics indicate that the number of distinct protein families and folds is fairly limited. Often the same structural domain is used by many proteins in a more or less modified form created by divergent evolution. Recent progress in solid-phase organic synthesis has enabled the synthesis of combinatorial libraries based on the structure of complex natural products. It can be envisioned that natural-product-based combinatorial synthesis may permit hit or lead compounds to be found with enhanced probability and quality.

 

Brendel, V. and W. Zhu (2002). "Computational modeling of gene structure in Arabidopsis thaliana." Plant Mol Biol 48(1-2): 49-58.

            Computational gene identification by sequence inspection remains a challenging problem. For a typical Arabidopsis thaliana gene with five exons, at least one of the exons is expected to have at least one of its borders predicted incorrectly by ab initio gene finding programs. More detailed analysis for individual genomic loci can often resolve the uncertainty on the basis of EST evidence or similarity to potential protein homologues. Such methods are part of the routine annotation process. However, because the EST and protein databases are constantly growing, in many cases original annotation must be re-evaluated, extended, and corrected on the basis of the latest evidence. The Arabidopsis Genome Initiative is undertaking this task on the whole-genome scale via its participating genome centers. The current Arabidopsis genome annotation provides an excellent starting point for assessing the protein repertoire of a flowering plant. More accurate whole-genome annotation will require the combination of high-throughput and individual gene experimental approaches and computational methods. The purpose of this article is to discuss tools available to an individual researcher to evaluate gene structure prediction for a particular locus.

 

Brive, L. and R. Abagyan (2002). "Computational structural proteomics." Ernst Schering Res Found Workshop(38): 149-66.

           

Brizuela, L., A. Richardson, et al. (2002). "The FLEXGene repository: exploiting the fruits of the genome projects by creating a needed resource to face the challenges of the post-genomic era." Arch Med Res 33(4): 318-24.

            Thanks to the results of the multiple completed and ongoing genome sequencing projects and to the newly available recombination-based cloning techniques, it is now possible to build gene repositories with no precedent in their composition, formatting, and potential. This new type of gene repository is necessary to address the challenges imposed by the post-genomic era, i.e., experimentation on a genome-wide scale. We are building the FLEXGene (Full Length EXpression-ready) repository. This unique resource will contain clones representing the complete ORFeome of different organisms, including Homo sapiens as well as several pathogens and model organisms. It will consist of a comprehensive, characterized (sequence-verified), and arrayed gene repository. This resource will allow full exploitation of the genomic information by enabling genome-wide scale experimentation at the level of functional/phenotypic assays as well as at the level of protein expression, purification, and analysis. Here we describe the rationale and construction of this resource and focus on the data obtained from the Saccharomyces cerevisiae project.

 

Buchanan, S. G. (2002). "Structural genomics: bridging functional genomics and structure-based drug design." Curr Opin Drug Discov Devel 5(3): 367-81.

            Considerable advances in structural genomics have been witnessed in the last year. Several pilot studies have begun to report their initial results, and new centers have been funded to join the endeavor. The legacies of the genome sequencing efforts, namely high-throughput molecular biology and whole-organism genome sequences, have been integrated as front-end modules for structural genomics pipelines. Impressive advances have been made in NMR spectroscopy and X-ray crystallography. New methods in structural bioinformatics and computational chemistry have been published that provide the means to exploit the wealth of new information in drug discovery. Not surprisingly, the biopharmaceutical industry has been quick to recognize the benefits of these new developments and has begun to adopt them. This article reviews recent results from structural genomics initiatives and the potential applications of new information and technologies in the drug discovery process.

 

Buchanan, S. G., J. M. Sauder, et al. (2002). "The promise of structural genomics in the discovery of new antimicrobial agents." Curr Pharm Des 8(13): 1173-88.

            Structural Genomics stands out among the emerging fields of proteomics since it influences the drug discovery process at so many points. Recent developments in protein expression technologies, x-ray crystallography and NMR spectroscopy provide the essential elements for high-throughput structure determination platforms. Bioinformatics methods to interrogate the resulting data will provide comprehensive, genome-wide databases of protein structure. Genomic sequencing and methods for high-throughput expression and protein purification are furthest advanced for microbial genes and so these have been the early targets for structural genomics initiatives. The information will be invaluable in understanding gene function, designing broad-spectrum small molecule inhibitors and in better understanding drug-host interactions.

 

Cacabelos, R. (2002). "Pharmacogenomics in Alzheimer's disease." Mini Rev Med Chem 2(1): 59-84.

            Alzheimer's disease (AD) is a complex disorder associated with multiple genetic defects either mutational or of susceptibility. Information available on AD genetics does not explain in full the etiopathogenesis of AD, suggesting that environmental factors and/or epigenetic phenomena may also contribute to AD pathology and phenotypic expression of dementia. The genomics of AD is still in its infancy, but is helping to understand novel aspects of the disease including genetic epidemiology, multifactorial risk factors, pathogenic mechanisms associated with genetic networks and genetically-regulated metabolic cascades. AD genomics is also helping to develop new strategies in pharmacogenomic research and prevention. Functional genomics, proteomics, pharmacogenomics, high-throughput methods, combinatorial chemistry and modern bioinformatics will greatly contribute to accelerate drug development for AD and other complex disorders. Main genes involved in AD include mutational loci (APP, PS1, PS2, TAU) and multiple susceptibility loci (APOE, A2M, AACT, LRP1, IL1A, TNF, ACE, BACE, BCHE, CST3, MTHFR, GSK3B, NOS) distributed across the human genome. Genomic associations integrate bigenic, trigenic, tetragenic or polygenic matrix models to investigate the genomic organization of AD in comparison to the control population. Similar genetic models are used in pharmacogenomics to elucidate genotype-specific responses of AD patients to a particular drug or combination of drugs. Using APOE-related monogenic models it has been demonstrated that the therapeutic response to drugs in AD is genotype-specific. A multifactorial therapy combining 3 different drugs yielded positive results during the 6-12 months in approximately 60% of the patients. With this therapeutic strategy, APOE-4/4 carriers were the worst responders, and patients with the APOE-3/4 genotype were the best responders. In bigenic and trigenic models it was possible to differentiate the influential effect of PS1 and PS2 polymorphic variants on mental performance in response to multifactorial therapy. The application of functional genomics to AD can be a suitable strategy for harmonization in molecular diagnosis and drug clinical trials. Furthermore, the pharmacogenomics of AD may contribute in the future to optimise drug development and therapeutics, increasing efficacy and safety, and reducing side-effects and unnecessary costs.

 

Cariou, A., J. D. Chiche, et al. (2002). "The era of genomics: impact on sepsis clinical trial design." Crit Care Med 30(5 Suppl): S341-8.

            OBJECTIVE: This article aims to address the predictable impact of genetics on the design of clinical trials in the field of critical care medicine, with emphasis on the pathophysiology of sepsis and its treatment. DATA SOURCES: Published articles reporting studies on sepsis and septic shock or assessing the influence of genetics and pharmacogenomics in the treatment of critical illnesses. DATA ANALYSIS: Because most common diseases including sepsis have been shown to be influenced by inherited differences in our genes, completion of the Human Genome Project and the concomitant publication of the human single nucleotide polymorphism map both contribute to change our approach to medicine. Advances in genotyping techniques and bioinformatics enabling detection of single nucleotide polymorphisms have caused an explosion in pharmacogenomics, the research dealing with the interactions of an individual's genotype and the outcome of a drug therapy. Pharmacogenomics will undoubtedly be used to improve future health care and clinical research in different ways. Whereas treatment allocation has been based mainly on phenotype, genetic characterization will help researchers to identify suitable subjects for clinical trials, to facilitate interpretation of the results of clinical trials, and to identify novel targets for future drugs or new markets for current products. As interindividual variability in drug response is a substantial clinical problem, the second major objective of pharmacogenomic research is to decrease adverse responses to therapy through determination of adequate therapeutic targets and genetic polymorphisms that alter drug specificity and toxicity. Ultimately, genetic information will be used to select the most effective therapeutic agent and the optimal dosage to elicit the expected drug response for a given individual. Implementation of genetic criteria for stratification of patient populations and individual assessment of treatment risks and benefits emerges as a major challenge to the pharmaceutical industry. CONCLUSIONS: In the future, technologies such as gene chip array will enhance genetic medicine and provide novel insights into a patient's susceptibility to disease, enabling a better assessment of prognostic risk factors, quicker diagnosis, and accurate prediction of individual responsiveness to drugs. The predictable consequences of such an approach on the prevention and treatment of diseases could revolutionize medicine.

 

Chakravarti, D. N. (2002). "From the decline and fall of protein chemistry to proteomics." Biotechniques Suppl: 2-3.

           

Chakravarti, D. N., B. Chakravarti, et al. (2002). "Informatic tools for proteome profiling." Biotechniques Suppl: 4-10, 12-5.

            In recent years, the practice of proteomics research has experienced a dramatic shift within the pharmaceutical and biotechnology industry with the widespread implementation of novel applications. The areas of interest extend all the way from discovery of novel drug, vaccine, and diagnostic targets, characterization of protein-based products, toxicology, and identification of surrogate markers of activity in clinical research, to the ability to provide information on the mechanisms of drug action. The power of two-dimensional gel electrophoresis as well as advances in mass spectrometric techniques combined with sequence database correlation have enabled speed and accuracy in identification of proteins in complex mixtures. This article surveys currently available software and informatic tools related to these methods for proteome profiling. The broad acceptance of these technologies, however, has not been accompanied by significant advances in the informatics and software tools necessary to support the analysis and management of the massive amounts of data generated in the process. In this context, this article also discusses the importance of relational databases for protein identification data management.

 

Chance, M. R., A. R. Bresnick, et al. (2002). "Structural genomics: a pipeline for providing structures for the biologist." Protein Sci 11(4): 723-38.

           

Chen, C. W., C. H. Huang, et al. (2002). "Once the circle has been broken: dynamics and evolution of Streptomyces chromosomes." Trends Genet 18(10): 522-9.

            Chromosomal instability has been a hallmark of Streptomyces genetics. Deletions and circularization often occur in the less-conserved terminal sequences of the linear chromosomes, which contain swarms of transposable elements and other horizontally transferred elements. Intermolecular recombination involving these regions also generates gross exchanges, resulting in terminal inverted repeats of heterogeneous size and context. The structural instability is evidently related to evolution of the Streptomyces chromosomes, which is postulated to involve linearization of hypothetical circular progenitors via integration of a linear plasmid. This scenario is supported by several bioinformatic analyses.

 

Chiu, W., M. L. Baker, et al. (2002). "Deriving folds of macromolecular complexes through electron cryomicroscopy and bioinformatics approaches." Curr Opin Struct Biol 12(2): 263-9.

            Intermediate-resolution (7-9 Å) structures of large macromolecular complexes can be obtained by electron cryomicroscopy. This structural information, combined with bioinformatics data for the individual protein components or domains, can lead to a fold model for the entire complex. Such approaches have been demonstrated with the 6.8 Å structure of the rice dwarf virus to derive models for the major capsid shell proteins.

 

Clark, D. E. and P. D. Grootenhuis (2002). "Progress in computational methods for the prediction of ADMET properties." Curr Opin Drug Discov Devel 5(3): 382-90.

            This review surveys recent progress in the development and application of computational techniques for the prediction of absorption, distribution, metabolism, elimination and toxicity (ADMET) properties, including intestinal permeability, blood-brain barrier penetration, active transport/efflux, aqueous solubility, metabolism and toxicity. While much effort continues to be expended in this field with some success on existing datasets, perhaps the most pressing need at this time is for larger, high-quality sets of experimental data to provide a sound basis for model building.

 

Cole, J. and F. Isik (2002). "Human genomics and microarrays: implications for the plastic surgeon." Plast Reconstr Surg 110(3): 849-58.

            The Human Genome Project was launched in 1989 in an effort to sequence the entire span of human DNA. Although coding sequences are important in identifying mutations, the static order of DNA does not explain how a cell or organism may respond to normal and abnormal biological processes. By examining the mRNA content of a cell, researchers can determine which genes are being activated in response to a stimulus. Traditional methods in molecular biology generally work on a "one gene: one experiment" basis, which means that the throughput is very limited and the "whole picture" of gene function is hard to obtain. To study each of the 60,000 to 80,000 genes in the human genome under each biological circumstance is not practical. Recently, microarrays (also known as gene or DNA chips) have emerged; these allow for the simultaneous determination of expression for thousands of genes and analysis of genome-wide mRNA expression. The purpose of this article is twofold: first, to provide the clinical plastic surgeon with a working knowledge and understanding of the fields of genomics, microarrays, and bioinformatics and second, to present a case to illustrate how these technologies can be applied in the study of wound healing.

 

Cole, S. T. (2002). "Comparative and functional genomics of the Mycobacterium tuberculosis complex." Microbiology 148(Pt 10): 2919-28.

           

Croston, G. E. (2002). "Functional cell-based uHTS in chemical genomic drug discovery." Trends Biotechnol 20(3): 110-5.

            The availability of genomic information significantly increases the number of potential targets available for drug discovery, although the function of many targets and their relationship to disease is unknown. In a chemical genomic research approach, ultra-high throughput screening (uHTS) of genomic targets takes place early in the drug discovery process, before target validation. Target-selective modulators then provide drug leads and pharmacological research tools to validate target function. Effective implementation of a chemical genomic strategy requires assays that can perform uHTS for large numbers of genomic targets. Cell-based functional assays are capable of the uHTS throughput required for chemical genomic research, and their functional nature provides distinct advantages over ligand-binding assays in the identification of target-selective modulators.

 

Dandekar, T. and R. Sauerborn (2002). "Comparative genome analysis and pathway reconstruction." Pharmacogenomics 3(2): 245-56.

            Pathway reconstruction builds on genome and biochemical data with the aim of reconstructing higher level interactions between identified enzymes in a specific genome, in particular the different enzyme pathways (species or individual/patient). Metabolite flow in a pathway is analyzed by different tools, such as elementary mode analysis. This reveals key enzymes and pharmacological targets in the enzyme network. An overview of bioinformatic tools and algorithms for these tasks, application examples and recent results from these techniques are presented. Target selection, drug development and optimization can all be sped up using these approaches.

 

Davidson, E. H., J. P. Rast, et al. (2002). "A genomic regulatory network for development." Science 295(5560): 1669-78.

            Development of the body plan is controlled by large networks of regulatory genes. A gene regulatory network that controls the specification of endoderm and mesoderm in the sea urchin embryo is summarized here. The network was derived from large-scale perturbation analyses, in combination with computational methodologies, genomic data, cis-regulatory analysis, and molecular embryology. The network contains over 40 genes at present, and each node can be directly verified at the DNA sequence level by cis-regulatory analysis. Its architecture reveals specific and general aspects of development, such as how given cells generate their ordained fates in the embryo and why the process moves inexorably forward in developmental time.

 

De Groot, A. S., H. Sbai, et al. (2002). "Immuno-informatics: Mining genomes for vaccine components." Immunol Cell Biol 80(3): 255-69.

            The complete genome sequences of more than 60 microbes have been completed in the past decade. Concurrently, a series of new informatics tools, designed to harness this new wealth of information, have been developed. Some of these new tools allow researchers to select regions of microbial genomes that trigger immune responses. These regions, termed epitopes, are ideal components of vaccines. When the new tools are used to search for epitopes, this search is usually coupled with in vitro screening methods; an approach that has been termed computational immunology or immuno-informatics. Researchers are now implementing these combined methods to scan genomic sequences for vaccine components. They are thereby expanding the number of different proteins that can be screened for vaccine development, while narrowing this search to those regions of the proteins that are extremely likely to induce an immune response. As the tools improve, it may soon be feasible to skip over many of the in vitro screening steps, moving directly from genome sequence to vaccine design. The present article reviews the work of several groups engaged in the development of immuno-informatics tools and illustrates the application of these tools to the process of vaccine discovery.

 

de Jong, H. (2002). "Modeling and simulation of genetic regulatory systems: a literature review." J Comput Biol 9(1): 67-103.

            In order to understand the functioning of organisms on the molecular level, we need to know which genes are expressed, when and where in the organism, and to which extent. The regulation of gene expression is achieved through genetic regulatory systems structured by networks of interactions between DNA, RNA, proteins, and small molecules. As most genetic regulatory networks of interest involve many components connected through interlocking positive and negative feedback loops, an intuitive understanding of their dynamics is hard to obtain. As a consequence, formal methods and computer tools for the modeling and simulation of genetic regulatory networks will be indispensable. This paper reviews formalisms that have been employed in mathematical biology and bioinformatics to describe genetic regulatory systems, in particular directed graphs, Bayesian networks, Boolean networks and their generalizations, ordinary and partial differential equations, qualitative differential equations, stochastic equations, and rule-based formalisms. In addition, the paper discusses how these formalisms have been used in the simulation of the behavior of actual regulatory systems.
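
As an illustration of one formalism named in the abstract above, Boolean networks, the following minimal sketch updates a three-gene network synchronously. The gene names (a, b, c) and the update rules are hypothetical and chosen only to show a negative feedback loop; they are not taken from the review.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical rules: each gene's next state is a Boolean function
# of the current state of the network.
RULES: Dict[str, Callable[[State], bool]] = {
    "a": lambda s: not s["c"],  # c represses a
    "b": lambda s: s["a"],      # a activates b
    "c": lambda s: s["b"],      # b activates c
}


def step(state: State) -> State:
    """Synchronous update: every gene reads the same current state."""
    return {gene: rule(state) for gene, rule in RULES.items()}


def trajectory(state: State, max_steps: int = 20) -> List[State]:
    """Iterate until a state repeats (an attractor is reached) or max_steps runs out."""
    seen = [state]
    for _ in range(max_steps):
        state = step(state)
        repeated = state in seen
        seen.append(state)
        if repeated:
            break
    return seen


if __name__ == "__main__":
    for s in trajectory({"a": True, "b": False, "c": False}):
        print(s)
```

With these assumed rules the network never settles into a fixed point; it cycles, which is the kind of behavior that is hard to see by inspection once feedback loops interlock.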

 

Dougherty, E. R., J. Barrera, et al. (2002). "Inference from clustering with application to gene-expression microarrays." J Comput Biol 9(1): 105-26.

            There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. A large amount of generated output is available over the web.
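
The evaluation loop described in this abstract (model each process as a mean plus independent noise, generate points, cluster them, count the points clustered incorrectly) can be sketched as follows. This is not the authors' toolbox: the function names, the Gaussian noise model, and the plain K-means implementation are assumptions made for illustration, and the error is taken as the minimum number of mismatches over all relabelings of the clusters.

```python
import itertools
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def generate(means: Sequence[Point], sigma: float, n_per_class: int,
             rng: random.Random) -> Tuple[List[Point], List[int]]:
    """Draw n_per_class points per process: mean plus independent Gaussian noise."""
    points, labels = [], []
    for k, mean in enumerate(means):
        for _ in range(n_per_class):
            points.append(tuple(m + rng.gauss(0.0, sigma) for m in mean))
            labels.append(k)
    return points, labels


def sq_dist(p: Point, q: Point) -> float:
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points: List[Point], k: int, rng: random.Random,
           iters: int = 50) -> List[int]:
    """Plain K-means (Lloyd iterations) started from k randomly chosen points."""
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign


def clustering_error(true_labels: Sequence[int], assign: Sequence[int],
                     k: int) -> int:
    """Number of misclustered points, minimized over cluster relabelings."""
    best = len(true_labels)
    for perm in itertools.permutations(range(k)):
        errors = sum(1 for t, a in zip(true_labels, assign) if perm[a] != t)
        best = min(best, errors)
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    pts, labels = generate([(0.0,), (10.0,)], sigma=0.1, n_per_class=20, rng=rng)
    print(clustering_error(labels, kmeans(pts, 2, rng), 2))
```

Repeating the run while raising `sigma` or changing `n_per_class` gives a rough picture of how the error depends on process variance and replication, which is the question the abstract raises. Trying all relabelings is only practical for a small number of clusters.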

 

Dougherty, T. J., J. F. Barrett, et al. (2002). "Microbial genomics and novel antibiotic discovery: new technology to search for new drugs." Curr Pharm Des 8(13): 1119-35.

            The process of prokaryotic drug discovery has been a model of success for over fifty years, yet the number of exploited bacterial targets is a mere fraction, less than 0.1% of the potential targets (based on total number of bacterial genes identified by gene sequence projects). To better understand the potential for drug intervention, multiple paradigms have been established in the pharmaceutical industry, all with some semblance of commonality and uniqueness to provide proprietary positioning, yet no company has been successful to date in taking a genomics approach to the finish line of having a genomics-based drug on the market. Within this overview, we provide a strategic overview of a sample process for the identification, validation and exploitation of novel antibacterial targets ascertained through a bioinformatics-based genomics drug discovery program.

 

Doytchinova, I. A. and D. R. Flower (2002). "Quantitative approaches to computational vaccinology." Immunol Cell Biol 80(3): 270-9.

            This article reviews the newly released JenPep database and two new powerful techniques for T-cell epitope prediction: (i) the additive method; and (ii) a 3D-Quantitative Structure Activity Relationships (3D-QSAR) method, based on Comparative Molecular Similarity Indices Analysis (CoMSIA). The JenPep database is a family of relational databases supporting the growing need of immunoinformaticians for quantitative data on peptide binding to major histocompatibility complexes and to the Transporters associated with Antigen Processing (TAP). It also contains an annotated list of T-cell epitopes. The database is available free via the Internet (http://www.jenner.ac.uk/JenPep). The additive prediction method is based on the assumption that the binding affinity of a peptide depends on the contributions from each amino acid as well as on the interactions between the adjacent and every second side-chain. In the 3D-QSAR approach, the influence of five physicochemical properties (steric bulk, electrostatic potential, local hydrophobicity, hydrogen-bond donor and hydrogen-bond acceptor abilities) on the affinity of peptides binding to MHC molecules was considered. Both methods were exemplified through their application to the well-studied problem of peptides binding to the human class I MHC molecule HLA-A*0201.
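
The additive assumption stated in this abstract can be written as a score that sums a constant, one contribution per amino acid at each position, and interaction terms for adjacent (i, i+1) and every-second (i, i+2) side-chains. The sketch below shows only that bookkeeping. The function name, the table layout, and every coefficient value are hypothetical; real coefficients would have to be fitted to measured binding data.

```python
from typing import Dict, Tuple

SingleTable = Dict[Tuple[int, str], float]      # (position, residue) -> contribution
PairTable = Dict[Tuple[int, str, str], float]   # (position, residue, partner) -> contribution


def additive_score(peptide: str, const: float, single: SingleTable,
                   adjacent: PairTable, second: PairTable) -> float:
    """Sum position-specific and pairwise contributions; missing terms count as 0."""
    score = const
    for i, aa in enumerate(peptide):
        score += single.get((i, aa), 0.0)
    for i in range(len(peptide) - 1):   # adjacent side-chains (i, i+1)
        score += adjacent.get((i, peptide[i], peptide[i + 1]), 0.0)
    for i in range(len(peptide) - 2):   # every second side-chain (i, i+2)
        score += second.get((i, peptide[i], peptide[i + 2]), 0.0)
    return score


if __name__ == "__main__":
    # Hypothetical coefficients for a toy three-residue peptide.
    print(additive_score(
        "LLF",
        const=5.0,
        single={(0, "L"): 0.5, (1, "L"): 0.25},
        adjacent={(0, "L", "L"): 0.125},
        second={(0, "L", "F"): -0.25},
    ))
```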

 

Eddy, S. R. (2002). "Computational genomics of noncoding RNA genes." Cell 109(2): 137-40.

            The number of known noncoding RNA genes is expanding rapidly. Computational analysis of genome sequences, which has been revolutionary for protein gene analysis, should also be able to address questions of the number and diversity of noncoding RNA genes. However, noncoding RNAs present computational genomics with a new set of challenges.

 

Egan, W. J. and G. Lauri (2002). "Prediction of intestinal permeability." Adv Drug Deliv Rev 54(3): 273-89.

            This review focuses on computational methods for the prediction of passive intestinal permeability. Existing computational models are surveyed and assessed in terms of descriptors, model type/complexity, speed of computation, predictive performance, and interpretability. Challenges to the successful computational prediction of intestinal permeability, i.e. data quantity, measurement imprecision, confounding factors such as solubility, metabolism, or active efflux, and the need for robust statistical methods, are also discussed.

 

Escribano, J. and M. Coca-Prados (2002). "Bioinformatics and reanalysis of subtracted expressed sequence tags from the human ciliary body: Identification of novel biological functions." Mol Vis 8: 315-32.

            PURPOSE: The ciliary body is largely known for its major roles in the regulation of aqueous humor secretion, intraocular pressure, and accommodation of the lens. In this review article we applied bioinformatics to re-examine hundreds of expressed sequence tags (ESTs) previously isolated by subtractive hybridization from a human ciliary body library [1]. The DNA sequences of these clones have been recently added to the web site of NEIBank. METHODS: DNA sequence comparisons of subtracted ESTs were performed against all entries in the last available release of the non-redundant database containing GenBank, EMBL, DDBJ and PDB sequences using the BlastN program accessed through NCBI's BLAST services on the internet (NCBI). Sequences were also compared and mapped using the Blast search program provided through the Internet by the Human Genome Project (UCSC). RESULTS: A total number of 284 independent ESTs were classified in 17 functional groups. Analysis of their relationships allowed us to define the expression of five major groups of known genes: (i) protein synthesis, folding, secretion and degradation (20%); (ii) energy supply and biosynthesis (12%); (iii) contractility and cytoskeleton structure (6%); (iv) cellular signaling and cell cycle regulation (7%); and (v) nerve cell related tasks (2%), including neuropeptide processing and putative non-visual phototransduction and circadian rhythm control. The largest group contains unidentified sequences, a total of 105 sequences, accounting for 37% of ESTs. The unidentified sequences show similarity to genomic non-coding regions, or genes of unknown function. CONCLUSIONS: The most highly represented EST corresponds to myocilin, a gene involved in glaucoma. The data also confirm the secretory functions of the ciliary epithelium and its high metabolism, as well as the presence of a neuroendocrine peptidergic system presumably involved in the regulation of the intraocular pressure and/or aqueous humor secretion. Additional genes may be related to a non-visual phototransduction cascade and/or to circadian rhythms. Overall, this initial group of subtracted ESTs can help uncover novel physiological functions of the ciliary body in health and in disease, as well as novel candidate genes for ocular diseases.

 

Fabrega, S., P. Durand, et al. (2002). "[The active site of human glucocerebrosidase: structural predictions and experimental validations]." J Soc Biol 196(2): 151-60.

            Gaucher disease is a lysosomal storage disorder caused by a deficiency in glucocerebrosidase which cleaves the beta-glucosidic linkage of glucosylceramide, a normal intermediate in glycolipid metabolism. Glucocerebrosidase belongs to the clan GH-A of glycoside hydrolases, a large group of enzymes which function with retention of the anomeric configuration at the hydrolysis site. Accurate three-dimensional (3D) structure data for glucocerebrosidase should help to better understand the molecular bases of Gaucher disease. As such 3D structure data were not available, we used the two-dimensional hydrophobic cluster analysis (HCA) method to make structure predictions for the catalytic domains of clan GH-A glycoside hydrolases. We found that all the enzymes of clan GH-A may share a similar catalytic domain consisting of an (alpha/beta)8 barrel with the critical acid/base and nucleophile residues located at the C-terminal ends of strands beta 4 and beta 7, respectively. In the case of glucocerebrosidase, Glu 235 was predicted to be the putative acid/base catalyst whereas the nucleophile was located at Glu 340. Next, in order to obtain experimental evidence supporting these HCA-based predictions, we used retroviral vectors to express, in murine null cells, E235A and E340A mutant proteins, in which alanine residues unable to participate in the enzymatic reaction replace the presumed critical glutamic acid residues. Both mutants were found to be catalytically inactive although they were correctly folded/processed and sorted to the lysosome. Thus, Glu 235 and Glu 340 do indeed play key roles in the active site of human glucocerebrosidase as predicted by the HCA analysis. In a broader perspective, our work points out that bioinformatics approaches may be highly useful for generating structure-function predictions based on sequence-structure interrelationships, especially in the context of a rapid increase in protein sequence information through genome sequencing.

 

Fairlamb, A. H. (2002). "Metabolic pathway analysis in trypanosomes and malaria parasites." Philos Trans R Soc Lond B Biol Sci 357(1417): 101-7.

            Identification of novel drug targets is required for the development of new classes of drugs to overcome drug resistance and replace less efficacious treatments. In theory, knowledge of the entire genome of a pathogen identifies every potential drug target in any given microbe. In practice, the sheer complexity and the inadequate or inaccurate annotation of genomic information makes target identification and selection somewhat more difficult. Analysis of metabolic pathways provides a useful conceptual framework for the identification of potential drug targets and also for improving our understanding of microbial responses to nutritional, chemical and other environmental stresses. A number of metabolic databases are available as tools for such analyses. The strengths and weaknesses of this approach are discussed.

 

Fielden, M. R., J. B. Matthews, et al. (2002). "In silico approaches to mechanistic and predictive toxicology: an introduction to bioinformatics for toxicologists." Crit Rev Toxicol 32(2): 67-112.

            Bioinformatics, or in silico biology, is a rapidly growing field that encompasses the theory and application of computational approaches to model, predict, and explain biological function at the molecular level. This information rich field requires new skills and new understanding of genome-scale studies in order to take advantage of the rapidly increasing amount of sequence, expression, and structure information in public and private databases. Toxicologists are poised to take advantage of the large public databases in an effort to decipher the molecular basis of toxicity. With the advent of high-throughput sequencing and computational methodologies, expressed sequences can be rapidly detected and quantitated in target tissues by database searching. Novel genes can also be isolated in silico, while their function can be predicted and characterized by virtue of sequence homology to other known proteins. Genomic DNA sequence data can be exploited to predict target genes and their modes of regulation, as well as identify susceptible genotypes based on single nucleotide polymorphism data. In addition, highly parallel gene expression profiling technologies will allow toxicologists to mine large databases of gene expression data to discover molecular biomarkers and other diagnostic and prognostic genes or expression profiles. This review serves to introduce to toxicologists the concepts of in silico biology most relevant to mechanistic and predictive toxicology, while highlighting the applicability of in silico methods using select examples.

 

Frank, A. O., P. W. Walsh, et al. (2002). "Computational fluid dynamics and stent design." Artif Organs 26(7): 614-21.

            Stents are small, usually metallic tubes that are intended to prop open arteries blocked with atherosclerotic plaques. While stents have been used successfully in recent years, they still suffer from failure due to the development of new tissue in the stented segment (restenosis). Variations in the failure rates associated with different stent designs have led researchers to investigate the role of near-wall flow patterns. While there is no direct evidence yet, the patterns of flow stagnation as the blood flows past the stent struts may affect the restenosis process. Computational fluid dynamics (CFD) approaches are well suited for obtaining detailed information on stent flow patterns. Many CFD simulations make use of a two-dimensional model. The strong dependence of flow stagnation on stent strut spacing has been clearly demonstrated. These results have been employed to interpret the results of in vitro experiments designed to elucidate the mechanisms of restenosis.

 

Friedman, N. and N. Kaminski (2002). "Statistical methods for analyzing gene expression data for cancer research." Ernst Schering Res Found Workshop(38): 109-31.

           

Frishman, D., A. Kaps, et al. (2002). "Online genomics facilities in the new millennium." Pharmacogenomics 3(2): 265-71.

            The review begins by providing a brief typology of biological databases on the Internet, illustrated by examples of the most influential resources of each kind. We then take an insider look at one typical on-line genomic resource -- the yeast genome database hosted at the Munich Information Center for Protein Sequences (MIPS) -- and explain how and why it has evolved from a basic sequence repository to a multidomain knowledge base. The role of community efforts in curating and annotating genome data is discussed. The crucial role of data integration and interoperability in developing next-generation genomic facilities is underscored.

 

Fryer, R. M., J. Randall, et al. (2002). "Global analysis of gene expression: methods, interpretation, and pitfalls." Exp Nephrol 10(2): 64-74.

            Over the past 15 years, global analysis of mRNA expression has emerged as a powerful strategy for biological discovery. Using the power of parallel processing, robotics, and computer-based informatics, a number of high-throughput methods have been devised. These include DNA microarrays, serial analysis of gene expression, quantitative RT-PCR, differential-display RT-PCR, and massively parallel signature sequencing. Each of these methods has inherent advantages and disadvantages, often related to expense, technical difficulty, specificity, and reliability. Further, the ability to generate large data sets of gene expression has led to new challenges in bioinformatics. Nonetheless, this technological revolution is transforming disease classification, gene discovery, and our understanding of regulatory gene networks.

 

Fukami-Kobayashi, K. and N. Saito (2002). "[How to make good use of CLUSTALW]." Tanpakushitsu Kakusan Koso 47(9): 1237-9.

           

Gabius, H. J., S. Andre, et al. (2002). "The sugar code: functional lectinomics." Biochim Biophys Acta 1572(2-3): 165-77.

            Analysis of the genome and proteome assumes the focus of attention in efforts to relate biochemical coding with cell functionality. Among other chores in energy metabolism, the talents of carbohydrates to establish a high-density coding system give reason for a paradigmatic shift. The sequence complexity of glycans and glycan-processing enzymes (glycosyltransferases, glycosidases and enzymes introducing substituents such as sulfotransferases), the growing evidence for the importance of glycans from transgenic and knock-out animal models and the correlation of defects in glycosylation with diseases are substantial assets to portray oligosaccharides as code words in their own right. Matching the pace of progress in the work on glycoconjugates, the increasing level of refinement of our knowledge about lectins (definition of this term: carbohydrate-binding proteins, excluding sugar-specific antibodies, receptors of free mono- or disaccharides for transport or chemotaxis and enzymes modifying the bound carbohydrate) epitomizes the sphere of action of the sugar code (functional lectinomics). It encompasses, among other activities, intra- and intercellular transport processes, sensor branches of innate immunity, regulation of cell-cell (matrix) adhesion or migration and positive/negative growth control with implications for differentiation and malignancy. The Q & A approach taken in this review lists a series of arguments in a stepwise manner to make the reader wonder why it is only a rather recent process that the concept of the sugar code has taken root in deciphering the mechanistic versatility of biological information storage and transfer.

 

Gaucher, E. A., X. Gu, et al. (2002). "Predicting functional divergence in protein evolution by site-specific rate shifts." Trends Biochem Sci 27(6): 315-21.

            Most modern tools that analyze protein evolution allow individual sites to mutate at constant rates over the history of the protein family. However, Walter Fitch observed in the 1970s that, if a protein changes its function, the mutability of individual sites might also change. This observation is captured in the "non-homogeneous gamma model", which extracts functional information from gene families by examining the different rates at which individual sites evolve. This model has recently been coupled with structural and molecular biology to identify sites that are likely to be involved in changing function within the gene family. Applying this to multiple gene families highlights the widespread divergence of functional behavior among proteins to generate paralogs and orthologs.

 

Gendel, S. M. (2002). "Sequence analysis for assessing potential allergenicity." Ann N Y Acad Sci 964: 87-98.

            Sequence analysis plays an important role in assessing the potential allergenicity of proteins used in transgenic foods, particularly for proteins that have not previously been part of the food supply. Sequence comparisons are used to indicate potential unexpected cross reactivity to existing allergens and to assess the potential for developing new sensitivities. Although the concept of using sequence analysis is straightforward, implementing a bioinformatic analysis that is accurate and complete can be complex. Several factors need to be considered, including the design and content of the sequence database, the analysis strategy, and the criteria for evaluating the results.

 

Gentzel, M., T. Kocher, et al. (2002). "Proteomics in biological research: the challenge to make proteins speak." Ernst Schering Res Found Workshop(38): 167-89.

           

Gerhold, D. L., R. V. Jensen, et al. (2002). "Better therapeutics through microarrays." Nat Genet 32 Suppl: 547-51.

            DNA microarrays are an integral part of the process for therapeutic discovery, optimization and clinical validation. At an early stage, investigators use arrays to prioritize a few genes as potential therapeutic targets on the basis of various criteria. Subsequently, gene expression analysis assists in drug discovery and toxicology by eliminating poor compounds and optimizing the selection of promising leads. Integral to this process is the use of sophisticated statistics, mathematics and bioinformatics to define statistically valid observations and to deduce complex patterns of phenotypes and biological pathways. In short, microarrays are redefining the drug discovery process by providing greater knowledge at each step and by illuminating the complex workings of biological systems.

 

Gerlai, R. (2002). "Phenomics: fiction or the future?" Trends Neurosci 25(10): 506-9.

            The ease with which genetic mutations can be induced in or introduced into mammalian organisms, such as the mouse, has created a significant need for phenotypic analysis. Developments in computer technology, instrumentation and bioinformatics, as well as in numerous neuroscience disciplines, will help to meet the demands set by the molecular revolution. As a result, the field of 'phenomics' is being born. This will integrate multidisciplinary research, with the goal of understanding the complex phenotypic consequences of genetic mutations at the level of the organism. This paper focuses on one of the disciplines that show promising developments, behavioral science.

 

Gieser, P., G. C. Bloom, et al. (2002). "Introduction to microarray experimentation and analysis." Methods Mol Biol 184: 29-49.

           

Gohil, K. and L. Packer (2002). "Bioflavonoid-rich botanical extracts show antioxidant and gene regulatory activity." Ann N Y Acad Sci 957: 70-7.

            Reactive oxygen and nitrogen metabolites are obligatory and essential products of metabolism. Unregulated increase in their production is associated with a number of chronic illnesses. Diets rich in fruits, vegetables, and wines are implicated in the prevention of chronic diseases. Molecular mechanisms by which fruits and vegetables confer their disease-preventive actions are poorly defined. However, recent developments in the fields of genomics and bioinformatics provide powerful tools to investigate the mechanisms by which botanicals affect cellular functions. This monograph illustrates the potential of large-scale messenger RNA analysis to unravel the role of transcription in mediating the effects of botanical extracts with antioxidant properties. The application of microarrays and oligonucleotide arrays shows multiple effects of antioxidant extracts on the expression of a broad spectrum of genes.

 

Goldsmith, L. J. (2002). "Power and sample size considerations in molecular biology." Methods Mol Biol 184: 111-30.

           

Goodman, N. (2002). "Biological data becomes computer literate: new advances in bioinformatics." Curr Opin Biotechnol 13(1): 68-71.

            Bioinformatics is an art and science concerned with the use of computing in biological research areas such as genomics, transcriptomics, proteomics, genetics, and evolution. This review paints a broad picture of bioinformatics, drawing examples from genomic sequencing and microarray analysis. I highlight the role of bioinformatics at multiple points along the path from high-tech data generation to biological discovery.

 

Goto, S. (2002). "[Mastering Web-based analysis of gene networks]." Tanpakushitsu Kakusan Koso 47(5): 635-41.

           

Grass, G. M. and P. J. Sinko (2002). "Physiologically-based pharmacokinetic simulation modelling." Adv Drug Deliv Rev 54(3): 433-51.

            Drug selection is now widely viewed as an important and relatively new, yet largely unsolved, bottleneck in the drug discovery and development process. In order to achieve an efficient selection process, high quality, rapid, predictive and correlative ADME models are required in order for them to be confidently used to support critical financial decisions. Systems that can be relied upon to accurately predict performance in humans have not existed, and decisions have been made using tools whose capabilities could not be verified until candidates went to clinical trial, leading to the high failure rates historically observed. However, with the sequencing of the human genome, advances in proteomics, the anticipation of the identification of a vastly greater number of potential targets for drug discovery, and the potential of pharmacogenomics to require individualized evaluation of drug kinetics as well as drug effects, there is an urgent need for rapid and accurately computed pharmacokinetic properties.

 

Graves, P. R. and T. A. Haystead (2002). "Molecular biologist's guide to proteomics." Microbiol Mol Biol Rev 66(1): 39-63.

            The emergence of proteomics, the large-scale analysis of proteins, has been inspired by the realization that the final product of a gene is inherently more complex and closer to function than the gene itself. Shortfalls in the ability of bioinformatics to predict both the existence and function of genes have also illustrated the need for protein analysis. Moreover, only through the study of proteins can posttranslational modifications be determined, which can profoundly affect protein function. Proteomics has been enabled by the accumulation of both DNA and protein sequence databases, improvements in mass spectrometry, and the development of computer algorithms for database searching. In this review, we describe why proteomics is important, how it is conducted, and how it can be applied to complement other existing technologies. We conclude that currently, the most practical application of proteomics is the analysis of target proteins as opposed to entire proteomes. This type of proteomics, referred to as functional proteomics, is always driven by a specific biological question. In this way, protein identification and characterization has a meaningful outcome. We discuss some of the advantages of a functional proteomics approach and provide examples of how different methodologies can be utilized to address a wide variety of biological problems.

 

Gupta, R., L. J. Jensen, et al. (2002). "Orphan protein function and its relation to glycosylation." Ernst Schering Res Found Workshop(38): 276-94.

           

Guzey, C. and O. Spigset (2002). "Genotyping of drug targets: a method to predict adverse drug reactions?" Drug Saf 25(8): 553-60.

            In the last decades, advances in molecular biology have led to modern pharmacogenetics, which started as a science that focused on investigating drug metabolising enzymes and genetic determinants of pharmacokinetic variability. As more evidence has become available on the structure of drug targets and the genes coding for them, increasing attention has been directed towards pharmacodynamic explanations of variability in therapeutic response as well as in the risk for adverse drug reactions. Traditionally, genetic drug safety research has focused on variations in single genes whose functions are known to be related to given adverse drug reactions. A few such examples, malignant hyperthermia, the long QT syndrome, venous thromboembolic disease, tardive dyskinesia, and drug addiction, are presented in this article. In the future, results from the Human Genome Project together with tools such as DNA microarray technology, high-output screening systems and advanced bioinformatics, will permit a more thorough elucidation than is currently possible of the genetic components of adverse drug reactions. By screening for a large number of single nucleotide polymorphisms (SNPs), SNP patterns associated with adverse drug reactions can be discovered even though the functions of the SNPs as such are completely unknown. On the basis of these findings, it can be expected that pharmacogenetic research will identify situations where a drug should be avoided in certain individuals in order to reduce the risk for adverse drug reactions. If so, it will be feasible to use molecular diagnostics to select drugs that are safe for the individual patient.

 

Haberkorn, U., A. Altmann, et al. (2002). "Functional genomics and proteomics--the role of nuclear medicine." Eur J Nucl Med Mol Imaging 29(1): 115-32.

            Now that the sequencing of the human genome has been completed, the basic challenges are finding the genes, locating their coding regions and predicting their functions. This will result in a new understanding of human biology as well as in the design of new molecular structures as potential novel diagnostic or drug discovery targets. The assessment of gene function may be performed using the tools of the genome program. These tools represent high-throughput methods used to evaluate changes in the expression of many or all genes of an organism at the same time in order to investigate genetic pathways for normal development and disease. This will lead to a shift in the scientific paradigm: In the pre-proteomics era, functional assignments were derived from hypothesis-driven experiments designed to understand specific cellular processes. The new tools describe proteins on a proteome-wide scale, thereby creating a new way of doing cell research which results in the determination of three-dimensional protein structures and the description of protein networks. These descriptions may then be used for the design of new hypotheses and experiments in the traditional physiological, biochemical and pharmacological sense. The evaluation of genetically manipulated animals or newly designed biomolecules will require a thorough understanding of physiology, biochemistry and pharmacology and the experimental approaches will involve many new technologies, including in vivo imaging with single-photon emission tomography and positron emission tomography. Nuclear medicine procedures may be applied for the determination of gene function and regulation using established and new tracers or using in vivo reporter genes such as enzymes, receptors, antigens or transporters. Pharmacogenomics will identify new surrogate markers for therapy monitoring which may represent potential new tracers for imaging. Also, drug distribution studies for new therapeutic biomolecules are needed, at least during preclinical stages of drug development. Finally, new biomolecules will be developed by bioengineering methods which may be used for isotope-based diagnosis and treatment of disease.

 

Halfon, M. S. and A. M. Michelson (2002). "Exploring genetic regulatory networks in metazoan development: methods and models." Physiol Genomics 10(3): 131-43.

            One of the foremost challenges of 21st century biological research will be to decipher the complex genetic regulatory networks responsible for embryonic development. The recent explosion of whole genome sequence data and of genome-wide transcriptional profiling methods, such as microarrays, coupled with the development of sophisticated computational tools for exploiting and analyzing genomic data, provide a significant starting point for regulatory network analysis. In this article we review some of the main methodological issues surrounding genome annotation, transcriptional profiling, and computational prediction of cis-regulatory elements and discuss how the power of model genetic organisms can be used to experimentally verify and extend the results of genomic research.

 

Halperin, I., B. Ma, et al. (2002). "Principles of docking: An overview of search algorithms and a guide to scoring functions." Proteins 47(4): 409-43.

            The docking field has come of age. The time is ripe to present the principles of docking, reviewing the current state of the field. Two reasons are largely responsible for the maturity of the computational docking area. First, the early optimism that the very presence of the "correct" native conformation within the list of predicted docked conformations signals a near solution to the docking problem, has been replaced by the stark realization of the extreme difficulty of the next scoring/ranking step. Second, in the last couple of years more realistic approaches to handling molecular flexibility in docking schemes have emerged. As in folding, these derive from concepts abstracted from statistical mechanics, namely, populations. Docking and folding are interrelated. From the purely physical standpoint, binding and folding are analogous processes, with similar underlying principles. Computationally, the tools developed for docking will be tremendously useful for folding. For large, multidomain proteins, domain docking is probably the only rational way, mimicking the hierarchical nature of protein folding. The complexity of the problem is huge. Here we divide the computational docking problem into its two separate components. As in folding, solving the docking problem involves efficient search (and matching) algorithms, which cover the relevant conformational space, and selective scoring functions, which are both efficient and effectively discriminate between native and non-native solutions. It is universally recognized that docking of drugs is immensely important. However, protein-protein docking is equally so, relating to recognition, cellular pathways, and macromolecular assemblies. Proteins function when they are bound to other molecules. Consequently, we present the review from both the computational and the biological points of view. Although large, it covers only partially the extensive body of literature, relating to small (drug) and to large protein-protein molecule docking, to rigid and to flexible. Unfortunately, when reviewing these, a major difficulty in assessing the results is the non-uniformity in the formats in which they are presented in the literature. Consequently, we further propose a way to rectify it here.

 

Hansch, C., D. Hoekman, et al. (2002). "Chem-bioinformatics: comparative QSAR at the interface between chemistry and biology." Chem Rev 102(3): 783-812.

           

Helfrich, J. P. (2002). "Raw data to knowledge warehouse in proteomic-based drug discovery: a scientific data management issue." Biotechniques Suppl: 48-50, 52-3.

           

Hirano, H. (2002). "[Analysis of expressed proteome]." Tanpakushitsu Kakusan Koso 47(8 Suppl): 889-97.

           

Hocker, B., S. Schmidt, et al. (2002). "A common evolutionary origin of two elementary enzyme folds." FEBS Lett 510(3): 133-5.

            The (beta alpha)(8)-barrel is the most frequent and most versatile fold among enzymes [Hocker et al., Curr. Opin. Biotechnol. 12 (2001) 376-381; Wierenga, FEBS Lett. 492 (2001) 193-198]. Structural and functional evidence suggests that (beta alpha)(8)-barrels evolved from an ancestral half-barrel, which consisted of four (beta alpha) units stabilized by dimerization [Lang et al., Science 289 (2000) 1546-550; Hocker et al., Nat. Struct. Biol. 8 (2001) 32-36; Gerlt and Babbitt, Nat. Struct. Biol. 8 (2001) 5-7]. Here, by performing a comprehensive database search, we detect a striking and unexpected structural and amino acid sequence similarity between (beta alpha)(4) half-barrels and members of the (beta alpha)(5) flavodoxin-like fold. These findings provoke the hypothesis that a large fraction of the modern-day enzymes evolved from a basic structural building block, which can be identified by a combination of sequence and structural analyses.

 

Holland, K. T. and R. A. Bojar (2002). "Cosmetics: what is their influence on the skin microflora?" Am J Clin Dermatol 3(7): 445-9.

            Human skin has a resident, transient and temporary resident microflora. This article considers the possibilities of topical products influencing the balance of the microflora. The resident micro-organisms are in a dynamic equilibrium with the host tissue and the microflora may be considered an integral component of the normal human skin. The great majority of these micro-organisms are gram-positive and reside on the skin surface and in the follicles. The host has a variety of structures, molecules and mechanisms which restrict the transient and temporary residents, as well as controlling the population and dominance of the resident group. These include local skin anatomy, hydration, nutrients and inhibitors of various types. The resident microflora is beneficial in occupying a niche and denying its access to transients, which may be harmful and infectious. Also, the residents are important in modifying the immune system. In the healthy host the microflora causes few and temporary problems. Therefore, it is of interest that topical products have little or no effect on the ecology of the microflora. A range of mechanisms by which long-term use of cosmetics may influence the microflora are considered. Although the risks associated are low, it is argued that it is necessary to monitor these changes in ecology and use technologies of modeling and bioinformatics to predict outcomes, whether good, neutral or of concern.

 

Holmes, I. (2002). "Transcendent elements: whole-genome transposon screens and open evolutionary questions." Genome Res 12(8): 1152-5.

           

Homma, K. and K. Nishikawa (2002). "[Protein structure information provided by the GTOP database and its applications]." Tanpakushitsu Kakusan Koso 47(8 Suppl): 1076-82.

           

Hurley, J. H., D. E. Anderson, et al. (2002). "Structural genomics and signaling domains." Trends Biochem Sci 27(1): 48-53.

            Many novel signal transduction domains are being identified in the wake of genome sequencing projects and improved sensitivity in homology-detection techniques. The functions of these domains are being discovered by hypothesis-driven experiments and structural genomics approaches. This article reviews the recent highlights of research on modular signaling domains, and the relative contributions and limitations of the various approaches being used.

 

Ichihara, H. and H. Toh (2002). "[Extraction of information about protein interaction using evolutionary trace]." Tanpakushitsu Kakusan Koso 47(13): 1863-9.

           

Iglesias, P. A. and A. Levchenko (2002). "Modeling the cell's guidance system." Sci STKE 2002(148): RE12.

            Cell locomotion can be directed by external gradients of diffusible substances leading to chemotaxis. Recently, the mechanisms of gradient sensing, the cell guidance system, came under scrutiny both in experimental analysis and computational modeling. Here, we review several recent computational models of gradient sensing in eukaryotic cells, demonstrating why some of them predict little sensitivity to changes in the gradient and response "locking," whereas others predict high gradient sensitivity at the expense of signal gain. We also propose a way to view chemotaxis regulation as a highly coupled combination of semi-independent control modules, which simplifies the modeling of this complex cellular behavior.

 

Ito, T. (2002). "[Exploring interaction networks of the yeast proteins]." Tanpakushitsu Kakusan Koso 47(8 Suppl): 898-905.

           

Iwakawa, M., T. Imai, et al. (2002). "[RadGenomics project]." Nippon Igaku Hoshasen Gakkai Zasshi 62(9): 484-9.

            Human health conditions are largely determined by a complex interplay among genetic susceptibility, environmental factors, and aging. The RadGenomics project, which began in April 2001, promotes analysis of genes in response to irradiation, identification of their allelic variants in the human population, development of an effective procedure for quantitating individual radio-sensitivity, and analysis of the interrelationship between genetic heterogeneity and susceptibility to irradiation. Major groups of genes with which the project will concern itself include DNA repair genes, cell cycle genes, oncogenes, tumor suppressor genes, genes for programmed cell death, genes for signal transduction, and genes for oxidative processes. The outcome of the RadGenomics project should lead to improved protocols for personalized radiotherapy and reduce the possible side effects of treatment. The project will contribute to future research on the molecular mechanisms of radiation sensitivity in humans and stimulate the development of new high-throughput technology for a broader application of the biological and medical sciences. Identification of functionally important polymorphisms in the radiation response genes may determine individual differences in sensitivity to radiation exposure. The staff members, who are specialists in a variety of fields including genome science, radiation biology, medical science, molecular biology, and bioinformatics, have come to the RadGenomics project from various universities, companies, and research institutes.

 

Jain, K. K. (2002). "Recent advances in oncoproteomics." Curr Opin Mol Ther 4(3): 203-9.

            Advances in proteomics are contributing to the understanding of pathophysiology of cancer, cancer diagnosis and anticancer drug discovery. Laser capture microdissection (LCM) provides an ideal method for extraction of cells from specimens in which the exact morphologies of both the captured cells and the surrounding tissue are preserved. Differentially expressed proteins in tumor tissue are found by comparing the protein expression patterns generated using SELDI (surface-enhanced laser desorption/ionization)-based protein chip technology. Proteomic technologies have been used for the study of cancer of various organs. Continued refinement of techniques and methods to determine the abundance and status of proteins in vivo holds great promise for future study of cancer and development of personalized cancer therapies.

 

Ji, Y. (2002). "The role of genomics in the discovery of novel targets for antibiotic therapy." Pharmacogenomics 3(3): 315-23.

            The emergence of antibiotic resistance and multi-drug resistance in bacterial pathogens underscores the need for the development of novel classes of antibiotics. The availability of complete genome sequence data from many important human pathogens provides a wealth of fundamental information. This allows us to define each gene and thus to better understand molecular pathogenesis. New techniques have enabled the identification and characterization of genes that are critical for bacterial growth and survival during infection. The combination of genome sequence data and new technologies makes it possible to systematically explore the function of each open reading frame in a genome and identify any potential molecular targets for drug discovery. With particular emphasis on antibacterial therapy, this review discusses genome-based technologies and their important applications to anti-infective drug discovery.

 

Ji, Z. L., J. Z. Liu, et al. (2002). "[Strategies of functional analysis of new genes]." Sheng Wu Gong Cheng Xue Bao 18(1): 117-20.

            Functional analysis of new genes plays a central role in the postgenomic era. Here we review several main strategies, including bioinformatics, gene transduction, antisense technology, gene silencing induced by RNA interference (RNAi), transgenic and gene knockout approaches, and artificial chromosome transduction.

 

Jiang, B., H. Bussey, et al. (2002). "Novel strategies in antifungal lead discovery." Curr Opin Microbiol 5(5): 466-71.

            There have been significant developments in fungal genomics over the past year. The recently released genome sequences of Aspergillus fumigatus and Cryptococcus neoformans have provided unprecedented opportunities for comparative genomics studies of many clinically relevant fungal pathogens. Emerging experimental analysis tools, such as fitness profiling and protein microarrays, have greatly enhanced our ability to conduct genome-wide functional studies.

 

Jorgensen, W. L. and E. M. Duffy (2002). "Prediction of drug solubility from structure." Adv Drug Deliv Rev 54(3): 355-66.

            The aqueous solubility of a drug is an important factor affecting its bioavailability. Numerous computational methods have been developed for the prediction of aqueous solubility from a compound's structure. A review is provided of the methodology and quality of results for the most useful procedures including the model implemented in the QikProp program. Viable methods now exist for predictions with less than 1 log unit uncertainty, which is adequate for prescreening synthetic candidates or design of combinatorial libraries. Further progress with predictive methods would require an experimental database of highly accurate solubilities for a large, diverse collection of drug-like molecules.

 

Kaban, L. B. (2002). "Biomedical technology revolution: opportunities and challenges for oral and maxillofacial surgeons." Int J Oral Maxillofac Surg 31(1): 1-12.

            During this 45-minute presentation, I have tried to describe my vision of the exciting future that awaits us. I have tried to impart my enthusiasm for the opportunities provided to us as surgeons by the advances in molecular biology and genetics, imaging, surgical technology and bioinformatics. Most of all, I hope I have transmitted my optimism for the future to our younger members. I think the following statement or observation by the great educator Margaret Mead accurately summarizes our current situation regarding the application of all this new knowledge that will become available to us as surgeons: 'We are now at the point where we must educate people (surgeons) in what nobody knew yesterday, and prepare in our schools (training programs) for what no one knows yet but what some people must know tomorrow.'

 

Katoh, M. (2002). "Strabismus (STB)/Vang-like (VANGL) gene family (Review)." Int J Mol Med 10(1): 11-5.

            Strabismus 1 (STB1/VANGL2) and Strabismus 2 (STB2/VANGL1), which have been cloned and characterized using bioinformatics and cDNA-PCR, are human homologues of Drosophila tissue polarity gene strabismus (stbm)/Van Gogh (Vang). STB1 and STB2 are tetra-membrane-spanning proteins with 73.1% total-amino-acid identity. Serine-rich domain and Strabismus-homology (STH1 and STH2) domains are conserved among human STB1, STB2, Xenopus Stbm, and Drosophila Stbm. STH2 domain with the C-terminal Ser/Thr-X-Val motif is implicated in binding with Dishevelled (DVL) proteins. STB1 gene is clustered with CASQ1 gene on human chromosome 1q21-q23, while STB2 gene is clustered with CASQ2 gene on human chromosome 1p13. STB1 and STB2 genes are located around cancer susceptibility loci or recombination hot spots in the human genome. STB1 is moderately expressed in K-562 (leukemia), G-361 (melanoma), and MKN7 (gastric cancer) cells. STB2 is highly expressed in MKN28, MKN74 (gastric cancer), BxPC-3, PSN-1, and Hs766T (pancreatic cancer) cells. On the other hand, STB1 and STB2 are significantly down-regulated in several cancer cell lines and primary tumors. Xenopus homologue of human STB1 and STB2 negatively regulates the WNT - beta-catenin signaling pathway. Loss-of-function mutations of genes encoding negative regulators of WNT - beta-catenin signaling pathway lead to carcinogenesis. Based on functional aspects and human chromosomal loci, STB1 gene and STB2 gene are predicted to be potent tumor suppressor gene candidates. STB1 and STB2 might be suitable targets for tissue engineering in the field of regenerative medicine and for chemoprevention and treatment in the field of clinical oncology.

 

Katoh, M. (2002). "GIPC gene family (Review)." Int J Mol Med 9(6): 585-9.

            GIPC1/GIPC/RGS19IP1, GIPC2, and GIPC3 genes constitute the human GIPC gene family. GIPC1 and GIPC2 show 62.0% total-amino-acid identity. GIPC1 and GIPC3 show 59.9% total-amino-acid identity. GIPC2 and GIPC3 show 55.3% total-amino-acid identity. GIPCs are proteins with central PDZ domain and GIPC homology (GH1 and GH2) domains. PDZ, GH1, and GH2 domains are conserved among human GIPCs, Xenopus GIPC/Kermit, and Drosophila GIPC/LP09416. Bioinformatics revealed that GIPC genes are linked to prostanoid receptor genes and DNAJB genes in the human genome as follows: GIPC1 gene is linked to prostaglandin E receptor 1 (PTGER1) gene and DNAJB1 gene in human chromosome 19p13.2-p13.1 region; GIPC2 gene to prostaglandin F receptor (PTGFR) gene and DNAJB4 gene in human chromosome 1p31.1-p22.3 region; GIPC3 gene to thromboxane A2 receptor (TBXA2R) gene in human chromosome 19p13.3 region. GIPC1 and GIPC2 mRNAs are expressed together in OKAJIMA, TMK1, MKN45 and KATO-III cells derived from diffuse-type of gastric cancer, and are up-regulated in several cases of primary gastric cancer. PDZ domain of GIPC family proteins interacts with Frizzled-3 (FZD3) class of WNT receptor, insulin-like growth factor-I (IGF1) receptor, receptor tyrosine kinase TrkA, TGF-beta type III receptor (TGF-beta RIII), integrin alpha6A subunit, transmembrane glycoprotein 5T4, and RGS19/RGS-GAIP. Because RGS19 is a member of the RGS family that regulates heterotrimeric G-protein signaling, GIPCs might be scaffold proteins linking heterotrimeric G-proteins to seven-transmembrane-type WNT receptor or to receptor tyrosine kinases. Therefore, GIPC1, GIPC2 and GIPC3 might play key roles in carcinogenesis and embryogenesis through modulation of growth factor signaling and cell adhesion.

 

Katze, M. G., Y. He, et al. (2002). "Viruses and interferon: a fight for supremacy." Nat Rev Immunol 2(9): 675-87.

            The action of interferons (IFNs) on virus-infected cells and surrounding tissues elicits an antiviral state that is characterized by the expression and antiviral activity of IFN-stimulated genes. In turn, viruses encode mechanisms to counteract the host response and support efficient viral replication, thereby minimizing the therapeutic antiviral power of IFNs. In this review, we discuss the interplay between the IFN system and four medically important and challenging viruses -- influenza, hepatitis C, herpes simplex and vaccinia -- to highlight the diversity of viral strategies. Understanding the complex network of cellular antiviral processes and virus-host interactions should aid in identifying new and common targets for the therapeutic intervention of virus infection. This effort must take advantage of the recent developments in functional genomics, bioinformatics and other emerging technologies.

 

Kennedy, S. (2002). "The role of proteomics in toxicology: identification of biomarkers of toxicity by protein expression analysis." Biomarkers 7(4): 269-90.

            Proteomics, i.e. the high throughput separation, display and identification of proteins, has the potential to be a powerful tool in drug development. It could increase the predictability of early drug development and identify non-invasive biomarkers of toxicity or efficacy. This review provides an introduction to modern proteomics, with particular reference to applications in toxicology. A literature search was carried out to identify studies in two broad classes: screening/predictive toxicology, and mechanistic toxicology. The strengths and limitations of current methods and the likely impact of techniques in drug development are also considered. Proteomics can increase the speed and sensitivity of toxicological screening by identifying protein markers of toxicity. Proteomics studies have already provided insights into the mechanisms of action of a wide range of substances, from metals to peroxisome proliferators. Current limitations involving speed of throughput are being overcome by increasing automation and the development of new techniques. The isotope-coded affinity tag (ICAT) method appears particularly promising. The application of proteomics to drug development has given rise to the new field of pharmacoproteomics. New associations between proteins and toxicopathological effects are constantly being identified, and major progress is on the horizon as we move into the post-genomic era.

 

Kidera, A. and M. Ikeguchi (2002). "[Protein structural dynamics]." Tanpakushitsu Kakusan Koso 47(8 Suppl): 1052-7.

           

Kinoshita, K. (2002). "[Insight into the relation between protein structure and function]." Tanpakushitsu Kakusan Koso 47(8 Suppl): 1064-70.
