David Zhang

By bridging bioinformatics and engineering, I translate genetic and transcriptomic data into software that delivers real-world impact. I have led cross-functional projects across the full software development lifecycle, from prototyping innovative solutions to implementing and maintaining robust, production-ready pipelines.

Work Experience

Senior bioinformatics engineer

CoSyne Therapeutics

London, UK (hybrid)

Present - 2024

  • Lead the optimisation and scaling of machine learning tools for single-cell Perturb-seq data comprising millions of cells. Collaborate closely with AI, engineering, and computational biology teams, ensuring key internal stakeholders are consistently informed of progress. Apply these tools to generate actionable insights and inform strategic decisions around company direction.
  • Design and deploy a data pipeline to ingest, tidy and version-control data for the CoSyne knowledge graph. Automate the release of the graph to AWS using Terraform and CI/CD, improving the efficiency and traceability of data updates.
  • Build and maintain infrastructure tooling including Docker images, Terraform modules, CI/CD workflows and cruft templates to streamline bioinformatics analyses.

Senior bioinformatics software engineer

Congenica

Hinxton, UK (hybrid)

2024 - 2022

  • Developed scalable Nextflow pipelines to process solid tumour DNA-sequencing data, covering alignment, variant calling, driver mutation annotation, and therapy matching.
  • Collaborated with clinical and bioinformatics teams to investigate driver variant misclassifications. Led the design, refinement, and implementation of solutions within an agile scrum team, translating complex scientific concepts for engineers without a bioinformatics background so that development stayed accurate and aligned.
  • Built Python and R packages to improve the efficiency of clinical verification, reducing the time taken by 2 weeks per quarterly release.

Bioinformatician internship (2 months)

Verge Genomics

London, UK (remote)

2021

  • Created a reproducible aberrant splicing detection pipeline using Docker for drug target discovery in C9orf72 ALS patients.

Education

PhD, Bioinformatics

University College London

London, UK

2022 - 2017

  • Analysed bulk RNA-sequencing data with the aim of improving the diagnosis rate of rare disease patients. Focussed on detection of aberrant splicing events as a strategy to prioritise pathogenic variants.
  • Released R/Bioconductor packages that enable bioinformatics analyses and interpretation. Championed best practices for software development through teaching workshops and courses.

MSc, Neuroscience

University College London

London, UK

2016 - 2015

  • Grade: Merit (68%)

BSc, Biomedical Science

University College London

London, UK

2015 - 2012

  • Grade: 2:1 (69%)

Open-source software

Web development

Present - 2022

  • Portfolio website: Showcases my favourite open-source contributions. Built with Django and deployed using PythonAnywhere.

Rust packages

2024

  • tuni: Unify transcript identifiers across different samples.

Python packages

2023 - 2021

  • autogroceries: Use Selenium to automate your grocery shop.
  • stravaboard: An extendable Streamlit dashboard for tracking Strava runs.

R packages

2022 - 2020

  • ggtranscript: Visualising transcript structure and annotation using ggplot2.
  • dasper: Detection of aberrant splicing events in RNA-sequencing data.

Selected Publications

A complete list of my publications is available via Google Scholar

ggtranscript: an R package for the visualization and interpretation of transcript isoforms using ggplot2

Bioinformatics

2022

  • Role: Co-first author, R package developer.

Developmental Consequences of Defective ATG7-Mediated Autophagy in Humans

The New England Journal of Medicine

2021

  • Role: Analyst

Megadepth: efficient coverage quantification for BigWigs and BAMs

Bioinformatics

2021

  • Role: R package developer