• A complete analysis of HA and NA genes of influenza A viruses.

      Shi, Weifeng; Lei, Fumin; Zhu, Chaodong; Sievers, Fabian; Higgins, Desmond G; The Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland. weifeng.shi@ucd.ie (2010-12)
      More and more nucleotide sequences of type A influenza virus are available in public databases. Although these sequences have been the focus of many molecular epidemiological and phylogenetic analyses, most studies only deal with a few representative sequences. In this paper, we present a complete analysis of all Haemagglutinin (HA) and Neuraminidase (NA) gene sequences available to allow large scale analyses of the evolution and epidemiology of type A influenza.
    • Detecting microRNA activity from gene expression data.

      Madden, Stephen F; Carpenter, Susan B; Jeffery, Ian B; Björkbacka, Harry; Fitzgerald, Katherine A; O'Neill, Luke A; Higgins, Desmond G; School of Medicine and Medical Science, Conway Institute, University College Dublin, Dublin, Ireland. (2010)
      BACKGROUND: MicroRNAs (miRNAs) are non-coding RNAs that regulate gene expression by binding to the messenger RNA (mRNA) of protein coding genes. They control gene expression by either inhibiting translation or inducing mRNA degradation. A number of computational techniques have been developed to identify the targets of miRNAs. In this study we used predicted miRNA-gene interactions to analyse mRNA gene expression microarray data to predict miRNAs associated with particular diseases or conditions. RESULTS: Here we combine correspondence analysis, between group analysis and co-inertia analysis (CIA) to determine which miRNAs are associated with differences in gene expression levels in microarray data sets. Using a database of miRNA target predictions from TargetScan, TargetScanS, PicTar4way, PicTar5way, and miRanda and combining these data with gene expression levels from sets of microarrays, this method produces a ranked list of miRNAs associated with a specified split in samples. We applied this to three different microarray datasets, a papillary thyroid carcinoma dataset, an in-house dataset of lipopolysaccharide treated mouse macrophages, and a multi-tissue dataset. In each case we were able to identify miRNAs of biological importance. CONCLUSIONS: We describe a technique to integrate gene expression data and miRNA target predictions from multiple sources.
    • Duplicability of self-interacting human genes.

      Pérez-Bercoff, Asa; Makino, Takashi; McLysaght, Aoife; Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin 2, Ireland. (2010)
      BACKGROUND: There is increasing interest in the evolution of protein-protein interactions because this should ultimately be informative of the patterns of evolution of new protein functions within the cell. One model proposes that the evolution of new protein-protein interactions and protein complexes proceeds through the duplication of self-interacting genes. This model is supported by data from yeast. We examined the relationship between gene duplication and self-interaction in the human genome. RESULTS: We investigated the patterns of self-interaction and duplication among 34808 interactions encoded by 8881 human genes, and show that self-interacting proteins are encoded by genes with higher duplicability than genes whose proteins lack this type of interaction. We show that this result is robust against the system used to define duplicate genes. Finally we compared the presence of self-interactions amongst proteins whose genes have duplicated either through whole-genome duplication (WGD) or small-scale duplication (SSD), and show that the former tend to have more interactions in general. After controlling for age differences between the two sets of duplicates this result can be explained by the time since the gene duplication. CONCLUSIONS: Genes encoding self-interacting proteins tend to have higher duplicability than genes encoding proteins that lack self-interactions. Moreover these duplicate genes have more often arisen through whole-genome rather than small-scale duplication. Finally, self-interacting WGD genes tend to have more interaction partners in general in the PIN, which can be explained by their overall greater age. This work adds to our growing knowledge of the importance of contextual factors in gene duplicability.
    • A genome-wide scan for common alleles affecting risk for autism.

      Anney, Richard; Klei, Lambertus; Pinto, Dalila; Regan, Regina; Conroy, Judith; Magalhaes, Tiago R; Correia, Catarina; Abrahams, Brett S; Sykes, Nuala; Pagnamenta, Alistair T; et al. (2010-10-15)
      Although autism spectrum disorders (ASDs) have a substantial genetic basis, most of the known genetic risk has been traced to rare variants, principally copy number variants (CNVs). To identify common risk variation, the Autism Genome Project (AGP) Consortium genotyped 1558 rigorously defined ASD families for 1 million single-nucleotide polymorphisms (SNPs) and analyzed these SNP genotypes for association with ASD. In one of four primary association analyses, the association signal for marker rs4141463, located within MACROD2, crossed the genome-wide association significance threshold of P < 5 × 10⁻⁸. When a smaller replication sample was analyzed, the risk allele at rs4141463 was again over-transmitted; yet, consistent with the winner's curse, its effect size in the replication sample was much smaller; and, for the combined samples, the association signal barely fell below the P < 5 × 10⁻⁸ threshold. Exploratory analyses of phenotypic subtypes yielded no significant associations after correction for multiple testing. They did, however, yield strong signals within several genes, KIAA0564, PLD5, POU6F2, ST8SIA2 and TAF1C.
    • Recode-2: new design, new search tools, and many more genes.

      Bekaert, Michaël; Firth, Andrew E; Zhang, Yan; Gladyshev, Vadim N; Atkins, John F; Baranov, Pavel V; School of Biology and Environmental Science, University College Dublin, BioSciences Institute, University College Cork, Ireland. (2010-01)
      'Recoding' is a term used to describe non-standard read-out of the genetic code, and encompasses such phenomena as programmed ribosomal frameshifting, stop codon readthrough, selenocysteine insertion and translational bypassing. Although only a small proportion of genes utilize recoding in protein synthesis, accurate annotation of 'recoded' genes lags far behind annotation of 'standard' genes. In order to address this issue, provide a service to researchers in the field, and offer training data for developers of gene-annotation software, we have gathered together known cases of recoding within the Recode database. Recode-2 is an improved and updated version of the database. It provides access to detailed information on genes known to utilize translational recoding and allows complex search queries, browsing of recoding data and enhanced visualization of annotated sequence elements. At present, the Recode-2 database stores information on approximately 1500 genes that are known to utilize recoding in their expression, an approximately threefold increase over the previous version of the database. Recode-2 is available at http://recode.ucc.ie.