2026 Future Talent Program – Data Scientist, Causal Network – Co-op – MSD USA

USA
September 20, 2025

Job Description

R364594

The Future Talent Program features Cooperative (Co-op) education that lasts up to 6 months and will include one or more projects. These opportunities in our Research and Development Division can provide you with valuable professional development and a chance to see if we are the right company for your long-term goals.

2026 Future Talent Program - Data Scientist, Causal Network - Co-op

The Precision Genetics group within the Data, AI and Genome Sciences (DAGS) Department is recruiting a Co-op student to process and analyze proteomic data generated by large consortia using different platforms. We are seeking a self-motivated student to work alongside our company scientists to develop a comprehensive data processing and analysis pipeline. This pipeline will enable cross-platform comparison and integrative analysis with other ‘omics data, facilitating robust biomarker discovery and mechanistic insights. This project offers a valuable opportunity for a Co-op student to gain hands-on experience with state-of-the-art data analysis approaches and contribute to advanced method development in proteomic data analysis.


In this exciting role, you will be:

1. Contributing to Data Collection: Large-scale plasma proteomic datasets generated using Olink, SomaLogic, and mass spectrometry technologies will become available to our company starting Q4 2025 and continuing through 2026-2028. A major data source is the UKB-PPP consortium, of which our Company is a member. Complementary whole genome sequencing and blood transcriptomic data from UK Biobank participants will also be accessible. The student will familiarize themselves with these datasets, their formats, and associated study designs.

2. Implementing Data Processing and Analysis Pipeline: Preprocessing and quality control (QC) of proteomic data are essential, especially given the heterogeneity of technologies and cohorts. The student will participate in implementing standardized QC pipelines to ensure comparability across datasets. Genetic association analyses, such as protein quantitative trait locus (pQTL) mapping, will be performed. Then, proteomic data will be stratified by clinical phenotypes (e.g., disease status), followed by construction of association and causal networks for each subgroup. This phase will provide the student with experience in best practices for data processing, QC, and integrative analysis.

3. Performing Exploratory Analyses and Contributing to Method Development: Multiple therapeutic areas at our Company, including neurodegenerative diseases and immunology, urgently need proteomic biomarkers for patient subgrouping and drug development. Beyond identifying differentially abundant proteins, the student will explore differential regulation of proteins and biological pathways by comparing proteomic networks across technologies, cohorts, and diseases. Network comparison methods are rapidly evolving, and the student will have the opportunity to contribute to the development and application of advanced computational approaches in this area.
 
Learning Outcomes for the Co-op Student
•    Gain expertise in handling large-scale proteomic and multi-omics datasets.
•    Develop skills in data preprocessing, quality control, and genetic association analyses.
•    Learn to construct and interpret biological networks and perform integrative analyses.
•    Contribute to cutting-edge method development in proteomic data integration and network comparison.
•    Collaborate with interdisciplinary teams across our company and consortium partners.

Education
•    Candidates must be currently enrolled in at least a Bachelor’s degree program, majoring in an analytical science such as mathematics, statistics, computer science, physics, computational biology, or a related field of study.
•    Students enrolled in graduate programs (MS or PhD) are highly encouraged to apply. 
•    Candidates must be available to work full-time for 6 months in 2026.  

Required Experience and Skills:


•    Candidates must have excellent academic achievement and analytical ability. 
•    Candidates must demonstrate an understanding of multi-omics data analysis.
•    Candidates must be proficient in R/Python.
•    Candidates must possess good oral and written communication skills.

Please note that this position may be closed before the posted end date or may remain open longer, at the discretion of the company.

Under New York City, Colorado State, Washington State, and California State law, the Company is required to provide a reasonable estimate of the salary range for this job. Final determinations with respect to salary will take into account a number of factors, which may include, but are not limited to, the primary work location and the chosen candidate’s relevant skills, experience, and education.

Salary range:

The salary range for this role is $39,600.00-$105,500.00 USD.

FTP2026

RL2026