Bioinformatics Engineer (Mid-level)
Job Type: Contractor
Job Location: Remote (anywhere)
Primary Skills: Data Science
Position Type: Contract (12+ months)
Project Scope and Brief Description:
- The position is in the bioinformatics space and principally involves writing new bioinformatics workflows and pipelines and maintaining existing ones, such as a Eukaryote Genome Annotation Pipeline.
- As such, the role requires knowledge of cloud technologies (AWS, Kubernetes, container orchestration) as well as experience with industry-level scientific workflow management.
Responsibilities:
- Design, develop, optimize, and maintain scalable bioinformatics workflows for processing and analyzing large-scale genomics datasets, both in the cloud and in-house
- Build a flexible, modular architecture into the workflows so that analysis components and algorithms can be exchanged
- Implement the bioinformatics data processing pipelines using workflow management tools and programming languages such as Python
- Work with team members to perform quality control and validation of pipelines to ensure accuracy and reproducibility of results
- Document the development processes, including code, workflows, data flow diagrams, and standard operating procedures, following software development and DataOps best practices
Skills / Experience:
Required Qualifications:
- Previous experience developing industrial-scale scientific data workflows.
- Strong programming skills in Python, including data science libraries such as NumPy, pandas, NetworkX, and Matplotlib.
- Working knowledge of container technologies (such as Docker, containerd, or Podman) and container orchestration.
- Experience with data pipeline tools (such as Argo, Ray, Airflow, redun, or Nextflow).
- Familiarity with the AWS platform (IAM, EC2, S3, CloudWatch, Spot Instances) and with Kubernetes, EKS, ECS, AWS Batch, or other cloud compute architectures.
- Ability to work both independently and collaboratively, with good communication skills and an interest in learning new technologies.
Preferred Qualifications:
- Specific experience analyzing large genomics datasets
- Familiarity with common bioinformatics tools and data types for the analysis of next-generation sequencing (NGS) data
- Familiarity with statistical methods and tools commonly used in bioinformatics analyses such as gene expression or ChIP-seq
- Knowledge of additional programming languages such as C, Rust, Perl, R, or Unix shell