Bioinformatics

The NWGC has the computational infrastructure needed to manage, store, and share the vast amounts of data generated by large-scale sequencing projects. We’ve developed and implemented automated pipelines for processing data from our Illumina, Element, PacBio, and Oxford Nanopore sequencers. Our RNA and DNA pipelines, designed for major national programs such as AoU, TOPMed, GREGoR, and SMaHT, efficiently map data to reference genomes, perform quality control, identify variants (including SNVs, indels, and SVs), and annotate them using national databases.

We have a dedicated high-performance computing system that includes analysis cluster servers (~3,000 total CPU cores), storage (2.4 PB usable), and a backup system (4 LTO-8, 12 LTO-7, and 4 LTO-6 drives, with native capacities of 12 TB, 6 TB, and 2.5 TB per cartridge, respectively). These components are interconnected by high-speed, low-latency 10GbE, 40GbE, and 100GbE networking, allowing data to move quickly through the analysis pipeline. Our dedicated tape library provides the throughput required for data archiving, backups, and disaster recovery. We deliver the relevant data files using Globus, our high-throughput transfer software. Our team of skilled bioinformaticians is available to support post-sequencing analysis and can provide the expertise needed to advance your research project.