PseudotimeDE: inference of differential gene expression along cell pseudotime with valid p-values from single-cell RNA sequencing data | Genome Biology

Dongyuan Song, Jingyi Jessica Li


Single-cell RNA sequencing (scRNA-seq) technologies have developed rapidly and can now measure the expression levels of thousands of genes, in the form of messenger RNAs, in hundreds to millions of individual cells. To uncover the continuous biological processes (e.g., differentiation, immune response, and carcinogenesis) underlying a batch of simultaneously measured cells, researchers perform an analysis called “pseudotime inference” or “trajectory inference,” which orders cells into a branching tree structure based on the similarity of their gene expression levels. Each root-to-leaf path in that tree is called a “cell lineage,” and the cells assigned to it receive pseudotime values that indicate their relative positions in that lineage. An important question is then which genes undergo expression changes in each cell lineage, as these genes may play functional roles in the biological process corresponding to that lineage.

To address this question, Mr. Dongyuan Song, a second-year Bioinformatics Ph.D. student in Dr. Jingyi Jessica Li’s lab, developed PseudotimeDE, a statistical method for detecting differentially expressed (DE) genes along cell pseudotime, published in the journal Genome Biology. Unlike existing methods, PseudotimeDE takes the uncertainty of pseudotime inference into account. As a result, it outputs valid p-values, which are essential for controlling the false discovery rate. PseudotimeDE also outperforms existing methods in detection power. Overall, PseudotimeDE is an effective bioinformatics tool for studying gene expression dynamics from static scRNA-seq data.
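
To illustrate why valid p-values matter for false discovery rate control, here is a minimal Python sketch of the Benjamini-Hochberg procedure, a standard way to turn per-gene p-values into a list of discoveries at a target false discovery rate. It is an illustration only: the article does not say which multiple-testing procedure is used with PseudotimeDE, and the function name `benjamini_hochberg` and the example p-values are hypothetical. The procedure's guarantee relies on the input p-values being valid, which is why miscalibrated p-values can lead to more false discoveries than intended.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if that hypothesis is rejected
    by the Benjamini-Hochberg step-up procedure at FDR level alpha."""
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis whose rank is at most max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for four genes (not from the paper).
    example_p_values = [0.01, 0.04, 0.03, 0.5]
    print(benjamini_hochberg(example_p_values, alpha=0.05))
```

In this sketch, each entry of the returned list corresponds to one gene, in the same order as the input p-values.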



To investigate molecular mechanisms underlying cell state changes, a crucial analysis is to identify differentially expressed (DE) genes along the pseudotime inferred from single-cell RNA-sequencing data. However, existing methods do not account for pseudotime inference uncertainty, and they have either ill-posed p-values or restrictive models. Here we propose PseudotimeDE, a DE gene identification method that adapts to various pseudotime inference methods, accounts for pseudotime inference uncertainty, and outputs well-calibrated p-values. Comprehensive simulations and real-data applications verify that PseudotimeDE outperforms existing methods in false discovery rate control and power.
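
As a rough illustration of what testing a gene for differential expression along pseudotime involves, the sketch below computes a permutation p-value for the association between one gene's expression and the cells' pseudotime values. It is a generic, simplified example and not the PseudotimeDE algorithm: it uses the absolute Pearson correlation as the test statistic (so it only picks up roughly linear trends), it treats the pseudotime values as fixed and therefore does not account for pseudotime inference uncertainty, and the names `pearson` and `permutation_p_value` and the toy data are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(expression, pseudotime, n_perm=200, seed=0):
    """Permutation p-value for association between expression and pseudotime.

    Shuffling pseudotime breaks any link with expression, so the shuffled
    statistics approximate the null distribution of the test statistic.
    """
    rng = random.Random(seed)
    observed = abs(pearson(expression, pseudotime))
    shuffled = list(pseudotime)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(expression, shuffled)) >= observed:
            n_extreme += 1
    # Adding 1 to numerator and denominator keeps the p-value strictly positive.
    return (1 + n_extreme) / (1 + n_perm)


if __name__ == "__main__":
    # Toy data (hypothetical): 20 cells, one gene rising steadily with pseudotime.
    toy_pseudotime = [float(i) for i in range(20)]
    toy_expression = [float(i) for i in range(20)]
    print(permutation_p_value(toy_expression, toy_pseudotime))
```

With `n_perm` permutations, the smallest p-value this sketch can return is 1 / (`n_perm` + 1), so more permutations are needed to resolve very small p-values.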

Media Contact: 

Leticia Ortiz | Marketing & Communications | Building a community around data science in biomedicine