MPI does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, disability, veteran status, marital status, or based on an individual's status in any group or class protected by applicable federal, state or local law. MPI encourages applications from minorities, women, the disabled, protected veterans and all other qualified applicants.
The Bio-tech Company is looking for a motivated, experienced and creative Data Engineer. This individual will be responsible for building reliable, distributed data pipelines to handle millions of raw microscopy images and their extracted features. This will allow the Data Science and Machine Learning teams to fully leverage the Company's data to accelerate internal drug discovery efforts.
* Design and architect a data warehouse to support downstream analytics
* Manage and improve the data lake
* Work with Data Scientists to incorporate image processing workflow into data pipelines
* Build and manage the Company's databases of trillions of chemical structures
* Expert in engineering big data pipelines using modern technologies and cloud infrastructures
* Experience in cloud computing, preferably AWS
* Experience with large-scale distributed data processing (Spark, Hadoop)
* Experience in Linux environments and with SQL
* Experience with pipeline managers (Luigi, Airflow, Nextflow)
* Highly proficient in Python and the PyData stack (numpy, pandas, scipy, dask, etc.)
* Given the growth and fast-paced environment, the Company requires a thoughtful and high-energy Data Engineer who can partner with the wider organization
* Can work in a fast-paced environment
* Goal- and reward-driven
* A passion for advancing human health and supporting research into life-changing drugs
The Company is a biotechnology company that specializes in building platforms to develop treatments for historically intractable causes of human disease. The Company's technology platform provides quantitative and comprehensive affinity binding data at a huge scale.
* You could be part of a team working to cure intractable causes of human disease
* Compensation is competitive
* Remote until further notice, then return to the office