Senior Platform Data Engineer

Location: Work from Home
Job Category: Business Strategy and Innovations
Schedule: Days
Work Type: Full time
Department: IT Data Management Division
Date posted: 04/16/2026
Job ID: R-95109

Job Summary

The Senior Platform Data Engineer owns the roadmap, priorities, platform standards, and architecture reviews, and provides formal input on performance reviews. This position makes clinical data ready for AI at scale by owning the shared data products, retrieval infrastructure, and platform administration that the entire AI portfolio depends on. Areas of ownership include real-time data feeds, reusable clinical data models and feature pipelines, RAG retrieval infrastructure (ingestion, chunking, embeddings, vector DB, retrieval pipelines), and Databricks platform administration.

Job Duties

  • Streams data from Epic SDE, ADT feeds, lab results, and other clinical sources into Databricks for downstream model consumption.

  • Curates shared clinical feature tables (patient demographics, labs, vitals, diagnoses, utilization history, imaging metadata) in Databricks/Unity Catalog that multiple AI programs consume for model training, validation, and monitoring.

  • Owns RAG Infrastructure, the shared retrieval-augmented generation platform that agentic and generative AI programs use to ground LLM outputs in organizational knowledge.

  • Designs and operates document ingestion pipelines: normalizing clinical documents, policies, guidelines, and unstructured data sources into formats ready for embedding and retrieval.

  • Implements and optimizes chunking strategies tailored to healthcare content (e.g., preserving clinical note structure, section-aware chunking for guidelines and protocols).

  • Manages the embedding pipeline: selecting, tuning, and versioning embedding models (using domain-specific clinical models where they outperform general-purpose ones).

  • Administers the vector database: schema design, indexing, metadata management, access controls, and performance tuning.

  • Builds and maintains retrieval pipelines: hybrid search (vector + keyword/BM25), reranking, and relevance filtering to maximize retrieval precision for downstream agents and LLM applications.

  • Establishes data quality gates for RAG: automated profiling, completeness checks, and accuracy scoring before content enters the vector store.

  • Monitors retrieval quality metrics (Precision@K, Recall@K, MRR) and continuously optimizes retrieval performance.

  • Configures Databricks workspaces and manages Unity Catalog governance.

  • Sets cluster policies, manages compute, and monitors cost.

  • Manages users, groups, and access control.

  • Serves as administrator for the Feature Store.
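
For candidates less familiar with the retrieval quality metrics named in the duties above (Precision@K, Recall@K, MRR), the following is a minimal illustrative sketch in plain Python. It is not Geisinger's evaluation code. The function names, the document IDs, and the convention of dividing Precision@K by K are assumptions made for this example only.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are relevant (divides by k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant IDs that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Hypothetical example: one query's ranked results and its labeled relevant set.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}
p3 = precision_at_k(retrieved, relevant, k=3)
r3 = recall_at_k(retrieved, relevant, k=3)
mrr = mean_reciprocal_rank([(retrieved, relevant), (["d9"], {"d9"})])
```

In practice these metrics would be computed over a labeled query set and tracked over time. The choice of K and the source of relevance labels depend on the application.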

Work is typically performed in an office environment. Accountable for satisfying all job specific obligations and complying with all organization policies and procedures. The specific statements in this profile are not intended to be all-inclusive. They represent typical elements considered necessary to successfully perform the job.

*Relevant experience may be a combination of related work experience and degree obtained (Master's Degree = 2 years).

Position Details

Key Technologies:

  • Databricks (Delta Live Tables, Feature Store, PySpark, Unity Catalog)
  • Epic SDE / epic-ws for real-time clinical data extraction
  • Vector databases (Pinecone, Weaviate, Qdrant, or Databricks Vector Search)
  • Embedding models and pipelines (clinical domain-specific and general-purpose)
  • SQL, pandas
  • Streaming and batch ingestion patterns
  • CDIS Data Warehouse (source system for batch clinical data)

Required Skills & Qualifications:

  • 5+ years in data engineering, with strong experience building both batch and streaming data pipelines
  • Expert-level Databricks skills: Delta Live Tables, PySpark, Unity Catalog, Feature Store
  • Hands-on experience with real-time data ingestion (Kafka, Spark Structured Streaming, or comparable frameworks)
  • Strong SQL and Python (pandas, PySpark) skills for data transformation and feature engineering
  • Experience administering Databricks workspaces: cluster policies, compute management, access controls, cost monitoring
  • Familiarity with clinical data models and healthcare data sources (EHR extracts, ADT feeds, lab results, claims data) strongly preferred
  • Experience with Epic data extraction methods (SDE, FHIR, epic-ws) a significant plus
  • Understanding of data governance principles: lineage, quality monitoring, access controls

Education

Bachelor's Degree in a related field of study (Required); Master's Degree in a related field of study (Preferred)

Experience

Minimum of 5 years of relevant experience* (Required)

About Geisinger

Founded more than 100 years ago by Abigail Geisinger, the system now includes ten hospital campuses, a 550,000-member health plan, two research centers and the Geisinger Commonwealth School of Medicine. With nearly 24,000 employees and more than 1,700 employed physicians, Geisinger boosts its hometown economies in Pennsylvania by billions of dollars annually. Learn more at geisinger.org or connect with us on Facebook, Instagram, LinkedIn and Twitter.

Equal Opportunity Employer

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, pregnancy, genetic information, disability, status as a protected veteran, or any other protected category under applicable federal, state, and local laws.

Our Vision & Values

Everything we do is about making better health easier for our patients, our members, our students, our Geisinger family and our communities.

KINDNESS: We strive to treat everyone as we would hope to be treated ourselves.

EXCELLENCE: We treasure colleagues who humbly strive for excellence.

LEARNING: We share our knowledge with the best and brightest to better prepare the caregivers for tomorrow.

INNOVATION: We constantly seek new and better ways to care for our patients, our members, our community, and the nation.

SAFETY: We provide a safe environment for our patients and members and the Geisinger family.

Our Benefits

We offer healthcare benefits for full time and part time positions from day one, including vision, dental and prescription coverage.
