Knowledge Engineer 

We are leaders in the field of data integration and knowledge representation in the Healthcare and Life Sciences domain. We are looking for someone who is part scientist, part engineer and part philosopher, and who can create and maintain complex Linked Data graphs that combine multiple data sources to represent intricate entities in the biomedical domain. Projects take many forms and require a diverse and flexible skillset that lets you solve data integration and analytical problems quickly and proficiently.

 

What you get to do every day:

  • Design, implement and maintain complex Linked Data models
  • Uncover data and their relationships from a variety of sources: relational databases, flat files and RDF documents
  • Provide efficient data integration with public domain data from bioinformatics resources such as EBI and NCBI to satisfy our customers’ requirements
  • Create and instantiate OWL ontologies to expose semantics encoded in the data
  • Construct and maintain ETL pipelines to keep Linked Data resources up to date from various sources

 

Requirements

  • A degree in a STEM field within the life sciences domain
  • Expertise with two or more of the following technologies: SQL, Neo4j, Hadoop, RDF, OWL, SPARQL
  • 1+ years of experience as a relational database administrator
  • 1+ years of experience with one of the following programming languages: Java, C++ or Ruby
  • Working experience with one of: Python, PHP, Bash, JavaScript or R

Good to have

  • PhD in a relevant field (Biology, CS, Biochemistry)
  • 2+ years of experience creating and maintaining ETL pipelines using a framework such as KNIME, Pipeline Pilot, Talend or Mule
  • Excellent technical writing skills
  • Experience with software engineering
  • Experience with machine learning
  • Working knowledge of creating and instantiating OWL 2 ontologies
  • Working knowledge of NoSQL technologies

For more information, contact Bob Stanley: rstanley@io-informatics.com

The Sentient Platform adds significant value in any data-intensive industry that needs to reduce the time and cost of product discovery, development and marketing.