
Forward-Deployed Data Engineer

SkyPoint Cloud | Portland, Oregon, US

About Skypoint

Skypoint is a HITRUST r2-certified agentic AI platform for healthcare operations, designed to accelerate productivity and operational efficiency across healthcare organizations. Our platform enables healthcare providers, payers, and senior care organizations to unify fragmented data, model industry-specific ontologies, and deploy AI agents that automate workflows and support better, faster decision-making. Founded in 2020 in Portland, Oregon, Skypoint has grown to a team of over 75 employees and now serves more than 100 customers. We are proud to be recognized on Deloitte's 2024 and 2025 Technology Fast 500™, celebrating the fastest-growing technology companies in North America, and to be featured on the Inc. 5000 list in 2025, reflecting our strong and sustained revenue growth over the past three years.

About the Role

We are looking for a Forward-Deployed Data Engineer who thrives at the intersection of technical craftsmanship and client impact. This is a hands-on engineering role embedded within our customer-facing delivery team, working directly with healthcare clients — across payer, provider, and health system environments — to design, build, and optimize the data infrastructure that powers their most critical analytics and AI initiatives.

You are a builder at heart, but you understand that the best data pipelines are the ones that serve real people making real decisions. You are fluent in SQL and dbt, meticulous about data modeling, and energized by the challenge of turning messy, complex healthcare data into clean, reliable, well-governed data products.

You also bring an AI-first mindset to your craft. You reach for AI-assisted coding tools instinctively, you think about how the pipelines you build today can power agentic workflows tomorrow, and you are genuinely excited about what it means to build data infrastructure for a world where AI agents are first-class consumers of data.
What You'll Do

  • Design, build, and maintain scalable ELT/ETL pipelines that ingest, transform, and serve healthcare data across cloud platforms including Databricks and Snowflake
  • Develop robust dbt projects — models, tests, documentation, macros, and packages — that serve as the transformation layer for client data platforms
  • Build and manage data pipelines handling complex healthcare data types: claims, clinical, eligibility, provider, and financial datasets
  • Implement data quality frameworks, testing strategies, and observability tooling to ensure pipeline reliability and data trustworthiness (see the sketch after this list)
  • Optimize query performance, warehouse configurations, and pipeline orchestration for cost-efficiency and speed
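
A minimal illustration of the data quality work described above, in Python with pandas. The claims columns and checks here are hypothetical, chosen for illustration only; in practice these rules would typically be expressed as dbt tests or observability tooling:

    import pandas as pd

    # Hypothetical claims extract; column names are illustrative assumptions.
    claims = pd.DataFrame({
        "claim_id": ["C001", "C002", "C002", "C003"],
        "member_id": ["M10", "M11", "M11", None],
        "paid_amount": [125.00, 80.50, 80.50, 42.00],
    })

    def check_claims(df: pd.DataFrame) -> list[str]:
        """Return human-readable data quality failures for a claims extract."""
        failures = []
        # Primary key integrity: claim_id must be unique and non-null.
        if df["claim_id"].isna().any():
            failures.append("claim_id contains nulls")
        duplicates = int(df["claim_id"].duplicated().sum())
        if duplicates:
            failures.append(f"{duplicates} duplicate claim_id value(s)")
        # Referential sanity: every claim should reference a member.
        if df["member_id"].isna().any():
            failures.append("member_id contains nulls")
        # Domain check: paid amounts should never be negative.
        if (df["paid_amount"] < 0).any():
            failures.append("negative paid_amount values")
        return failures

    for problem in check_claims(claims):
        print("FAIL:", problem)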

Data Modeling & Warehouse Design

  • Design dimensional models and star schema architectures that are clean, well-documented, and optimized for downstream analytical and AI consumption
  • Build and maintain semantic and conformed data layers that serve as the authoritative source for reporting, ML features, and agentic workflows
  • Establish and enforce data modeling standards, naming conventions, and layering patterns (raw, staging, intermediate, mart) within client environments
  • Work closely with analytics engineers and data consumers to ensure models meet business requirements without sacrificing technical rigor

Client Engagement & Technical Communication

  • Work directly with client data and engineering teams throughout project delivery — translating requirements, reviewing existing architectures, and aligning on technical approaches
  • Participate in client working sessions and technical discussions, clearly communicating data modeling decisions, trade-offs, and recommendations
  • Produce clean technical documentation — data dictionaries, lineage diagrams, architecture overviews — that clients can actually use and maintain
  • Act as a reliable, knowledgeable partner to client teams, building credibility through consistent delivery and clear communication

Agentic AI & AI-First Engineering

  • Build the data foundations that make agentic AI systems reliable: clean, well-governed data products with clear semantics and dependable freshness SLAs
  • Collaborate with AI engineers and analytics leads to ensure data pipelines meet the requirements of LLM-powered and agentic applications — including vector-ready outputs, structured tool-use schemas, and streaming data patterns where applicable
  • Use AI-assisted coding tools (GitHub Copilot, Cursor, or equivalent) as a core part of your development workflow — not occasionally, but as a default
  • Stay current on how agentic AI systems consume and interact with data, and apply that understanding to how you design and document data products

What You Bring

  • 4+ years of data engineering experience, with meaningful exposure to healthcare data environments — payer, provider, and/or health system experience strongly preferred
  • Working familiarity with healthcare data concepts and standards: claims (medical, pharmacy, dental), eligibility, HL7/FHIR, EHR/EMR data structures, HEDIS, and encounter data
  • Understanding of healthcare data sensitivity and compliance considerations, including HIPAA-compliant data handling and de-identification patterns

Core Technical Skills

  • Advanced SQL proficiency — you write complex, performant queries and understand how to optimize them across both Snowflake and Databricks environments
  • Expert-level proficiency in Power BI — including complex DAX, data modeling, deployment pipelines, row-level security, and enterprise governance
  • Deep, hands-on dbt expertise — you have built and maintained production dbt projects and are comfortable with advanced features: macros, packages, incremental models, snapshots, and test frameworks
  • Proven experience designing star schema and dimensional models — you know the difference between a fact and a dimension table in your sleep, and you know when to break the rules
  • Strong experience with Databricks — Delta Lake, Unity Catalog, Spark SQL, notebook-based development, and workflow orchestration
  • Strong experience with Snowflake — including performance optimization, Snowpark, data sharing, and cost governance
  • Proficiency in Python for pipeline development, data transformation scripting, and automation
  • Experience with pipeline orchestration tools such as Airflow, Prefect, Dagster, or equivalent (see the sketch after this list)
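
On the orchestration side, a minimal sketch of the daily ELT pattern this role works in, using the Airflow 2.x API; the DAG name, schedule, and task callables are hypothetical stand-ins for real Databricks, Snowflake, and dbt steps:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    # Placeholder callables; real tasks would trigger extracts, dbt runs, etc.
    def extract():
        print("land source extracts in the raw layer")

    def transform():
        print("run dbt models: staging -> intermediate -> mart")

    def publish():
        print("publish marts and refresh downstream consumers")

    with DAG(
        dag_id="claims_daily_elt",   # hypothetical pipeline name
        start_date=datetime(2025, 1, 1),
        schedule="@daily",           # Airflow 2.4+ keyword
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_publish = PythonOperator(task_id="publish", python_callable=publish)

        # Linear dependency: extract, then transform, then publish.
        t_extract >> t_transform >> t_publish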

AI-First Tooling & Mindset

  • Demonstrated adoption of AI-assisted coding tools (GitHub Copilot, Cursor, Amazon CodeWhisperer, or equivalent) as a daily productivity standard — not an occasional experiment
  • Enthusiasm for agentic AI and a clear understanding of what it means to build data products for AI agents as consumers, not just human analysts
  • Comfort with the data requirements of AI systems: structured schemas, embedding-ready outputs, retrieval-friendly data products, and reliable freshness guarantees
  • Curiosity and initiative in applying new AI tooling to engineering challenges — you look for ways to move faster and build better with the tools available
  • Clear, confident technical communication — you can explain a data model to a data analyst and a pipeline architecture to a platform engineer without losing either audience
  • Experience working in client-facing or cross-functional delivery environments where your work is visible and your decisions have direct business impact
  • Strong documentation habits — you treat docs as part of the deliverable, not an afterthought
  • Comfort with ambiguity and evolving requirements, common in healthcare data environments where source systems are messy and specifications change

Nice to Have

  • Experience with Microsoft Fabric — Fabric Lakehouses, Dataflows Gen2, Fabric Notebooks, or OneLake
  • Exposure to vector databases (Pinecone, pgvector, Azure AI Search) and RAG pipeline patterns for AI-powered applications
  • Experience building data pipelines that feed agentic or LLM-powered workflows — tool schemas, structured outputs, or real-time data serving
  • Familiarity with healthcare interoperability platforms (Redox, Health Gorilla, Rhapsody) or FHIR API integrations
  • Exposure to population health, risk stratification, or quality measure (HEDIS, STAR) reporting data
  • dbt, Databricks, or Snowflake certifications
  • Experience with streaming data platforms (Kafka, Kinesis, or Databricks Structured Streaming) for near-real-time pipeline patterns

Why This Role

  • Do real engineering work that matters — the pipelines you build directly power healthcare decisions that affect real patients and populations
  • Work at the cutting edge of healthcare data modernization alongside engineers who take craft seriously
  • Be part of a team where AI-first is a genuine operating principle, not a buzzword — you will be expected and supported to build with the best tools available
  • Grow your exposure to agentic AI and the infrastructure patterns that will define the next generation of data systems
  • Competitive compensation, comprehensive benefits, and a flexible remote-first culture

Skypoint is an Equal Opportunity Employer. We do not discriminate based on race, color, religion, sex, national origin, age, disability, veteran status, or any other protected characteristic.

Responsibilities

  • Collaborate with clients to design data infrastructure
  • Build and optimize data pipelines for healthcare analytics
  • Align data models with governance and AI-driven workflows (see the sketch after this list)
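
As a concrete example of the modeling work behind these responsibilities, here is a minimal sketch in Python with pandas that splits a flat extract into a dimension and a fact table. The source columns are hypothetical, and in a client environment this layering would live in dbt models on Databricks or Snowflake:

    import pandas as pd

    # Hypothetical flat claims extract as it might arrive from a source system.
    flat = pd.DataFrame({
        "claim_id": ["C1", "C2", "C3"],
        "member_id": ["M1", "M1", "M2"],
        "member_name": ["Ada Park", "Ada Park", "Lee Tran"],
        "service_date": ["2025-01-03", "2025-01-10", "2025-01-12"],
        "paid_amount": [120.00, 75.50, 300.00],
    })

    # Dimension: one row per member, deduplicated from the extract.
    dim_member = (
        flat[["member_id", "member_name"]]
        .drop_duplicates()
        .reset_index(drop=True)
    )

    # Fact: one row per claim, keeping the member foreign key and the measures.
    fact_claim = flat[["claim_id", "member_id", "service_date", "paid_amount"]]

    print(dim_member)
    print(fact_claim)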

Qualifications

  • BS/BA in Computer Science or related field
  • Experience with SQL, dbt, and data modeling
  • Experience with healthcare data preferred

Required Skills

SQL, dbt, data modeling, data governance, AI-assisted tooling

Keywords

data engineering, forward-deployed, healthcare

Interested in this role?

Apply now and take the next step in your career.
