Peripheral blood is an attractive biofluid for clinical research because it is easy to collect and contains nucleic acids and other biomolecules originating from a variety of bodily tissues. However, the high prevalence of erythrocytes in peripheral blood presents challenges for RNA-seq-based approaches. Both mRNA-seq and microRNA-seq, two techniques often employed to study disease biomarkers in the blood transcriptome, are made less informative by the presence of several highly abundant erythroid-specific transcripts in the sequencing libraries. These overabundant transcripts exhaust molecular machinery during library prep and consume flow cell capacity during sequencing, leading to dramatically decreased detection of low-abundance transcripts and, ultimately, impaired discovery of potential biomarkers.
Here we present a single-tube workflow that enables the generation of combined mRNA and microRNA libraries from whole blood while depleting the highly abundant RNA species it contains. The workflow relies on a novel technique in which poly(A)-tailed RNA species are selectively reverse transcribed and then sheared by RNase H into fragments of useful length. The mRNA fragments are subsequently processed along with the small RNAs present. Depletion of globin mRNA, of the erythroid-specific microRNAs miR-486-5p, miR-451a, and miR-92a-3p, and of ribosomal RNA is achieved by three distinct strategies, none of which involves physically removing target transcripts from the reaction solution. Employing these strategies reduces globin mRNA by >70% and erythroid-specific microRNAs by >90%, with ribosomal RNA contamination making up ~5% of sequencing reads. We show that, together, these strategies increase detection of low-abundance microRNAs and mRNAs with minimal effect on quantitation of non-target transcripts.