Senior Data Scientist
We are looking for an experienced Data Scientist to work with large clinical text datasets (electronic medical records), using LLMs to extract structured information such as cancer diagnosis, stage, treatment, and timing. In this role, you will process clinical documents, engineer prompts, evaluate outputs against human-labeled data, and collaborate closely with clinical and data science teams.

Job (Project) Description
Customertimes is a global digital engineering, product development, and technology consulting company. Headquartered in New York, we have a team of 1300+ experts and offices in 12 countries.
Requirements:
- Strong proficiency in Python;
- 3+ years of relevant working experience in a technical capacity, with a focus on ML. Prior experience with LLMs is strongly preferred;
- Familiarity with LLM-based prompt engineering and text processing workflows;
- Basic familiarity with Spark, Databricks, and MLflow environments (for interacting with infrastructure, not deep configuration);
- Experience working in AWS environments;
- Comfort with object-oriented programming concepts (e.g., using/extending classes, Pydantic models);
- Experience working with Anthropic models (preferred) or other major LLMs (OpenAI, Llama, etc.);
- Familiarity with RAG (Retrieval-Augmented Generation) concepts (not core, but potentially useful in the future);
- Basic exposure to classical NLP (e.g., LSTM architectures, non-LLM text processing);
- Healthcare or clinical text processing experience is a strong plus, but not required;
- Understanding of model evaluation metrics (classification, regression) and statistical validation basics.
Responsibilities:
- Process electronic medical record documents into a form suitable for LLMs;
- Design and craft effective prompts for information extraction;
- Evaluate LLM performance against human-labeled ground truth;
- Collaborate with clinicians, epidemiologists, and statisticians to align extracted data with clinical meaning;
- Troubleshoot and iterate extraction pipelines;
- Integrate with internal ML infrastructure and production environments.
What We Offer:
- Monthly payment in USD;
- 100% remote work opportunity;
- Important: Benefits will be shared during the selection process depending on the candidate's location.
