Gordon Smyth: projects

Analysis of RNA sequencing data

We have developed the limma and edgeR software packages, which are widely used around the world for analyzing gene expression experiments. The edgeR package implements exact tests and generalized linear models for RNA-seq counts based on the negative binomial distribution. The limma package implements linear modelling and gene set testing approaches. Both packages use empirical Bayes methods to borrow strength between genes in genomic experiments.
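
As a rough illustration of what "borrowing strength between genes" means, the sketch below shrinks each gene's sample variance towards a common prior value, weighting the two by their degrees of freedom. This is the standard form of a moderated variance estimate. The function name, the example variances, and the prior values are assumptions made for illustration: in practice the prior variance and prior degrees of freedom are estimated from all genes, not supplied by hand, and this Python sketch is not the limma or edgeR implementation.

```python
def moderated_variance(s2, df, prior_s2, prior_df):
    """Shrink a gene-wise sample variance towards a prior variance.

    s2, df             : the gene's own sample variance and residual degrees of freedom
    prior_s2, prior_df : prior variance and prior degrees of freedom
                         (hypothetical fixed values here; normally estimated
                         from the whole set of genes)
    """
    return (prior_df * prior_s2 + df * s2) / (prior_df + df)


# Three genes with very different sample variances, each from 4 residual df.
gene_variances = [0.02, 0.50, 3.00]
shrunk = [
    moderated_variance(s2, df=4, prior_s2=0.5, prior_df=4)
    for s2 in gene_variances
]
```

With equal weight on the prior and the data, extreme variances are pulled halfway towards the prior value, which stabilises inference when each gene has only a few replicates.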

Team members: Yunshun Chen, Wei Shi, Yang Liao, Yifang Hu


Sequence read alignment and quantification

Next-generation sequencing produces huge numbers of DNA or RNA sequence reads. We have developed a read aligner called Subread, which works well for mapping both short and long reads. It has many applications, but is especially fast and robust relative to alternatives when applied to RNA-seq data. We have also developed a summarisation tool called featureCounts, which quantifies the abundance of genomic features such as genes, exons and promoters.
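
To give a sense of what read summarisation involves, the following is a minimal sketch that assigns mapped reads to genomic features by interval overlap. It is a simplified illustration only, not the featureCounts algorithm: the function name, the tuple layouts, the half-open coordinate convention, and the rule that reads overlapping more than one feature are left uncounted are all assumptions made for this example.

```python
def count_reads(reads, features):
    """Count reads per feature by simple interval overlap.

    reads    : list of (chrom, start, end) tuples, half-open coordinates
    features : list of (name, chrom, start, end) tuples, half-open coordinates

    A read is assigned to a feature if they share at least one base.
    Reads overlapping no feature, or more than one feature, are not counted.
    """
    counts = {name: 0 for name, _, _, _ in features}
    for chrom, start, end in reads:
        hits = [
            name
            for name, f_chrom, f_start, f_end in features
            if f_chrom == chrom and start < f_end and f_start < end
        ]
        if len(hits) == 1:  # skip unassigned and ambiguous reads
            counts[hits[0]] += 1
    return counts


# Hypothetical toy data: two overlapping genes and five reads.
features = [("geneA", "chr1", 100, 200), ("geneB", "chr1", 180, 300)]
reads = [
    ("chr1", 90, 120),   # overlaps geneA only
    ("chr1", 150, 170),  # overlaps geneA only
    ("chr1", 190, 210),  # overlaps both genes, so it is skipped
    ("chr1", 250, 280),  # overlaps geneB only
    ("chr2", 100, 150),  # different chromosome, unassigned
]
gene_counts = count_reads(reads, features)
```

This linear scan over all features for every read is fine for a toy example. A practical tool needs an indexed lookup to handle millions of reads efficiently.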

Team members: Wei Shi, Yang Liao


Analysis of ChIP sequencing data

We have developed methods for assessing differential binding of epigenetic histone marks and of transcription factors. We are currently developing methods for detecting DNA-DNA interactions using Hi-C and ChIA-PET data.

Team members: Aaron Lun, Wei Shi


Expression signature analysis applied to stem cells, breast cancer and lung cancer

We have developed a number of gene set test methods for assessing the behavior of co-regulated sets of genes that represent higher-level biological processes. We have collaborated extensively with the Visvader/Lindeman laboratory for nearly a decade and have successfully applied gene signature analyses to study adult stem cells and the origins of breast cancer.

Team members: Göknur Giner, Yunshun Chen, Yifang Hu