Job Description
About Us
Staq is a leading Banking-as-a-Service (BaaS) and embedded finance platform, transforming the way businesses integrate banking and financial services. At Staq, we empower our clients to innovate, expand, and streamline their financial services offerings using our cutting-edge platform. Our mission is to bridge the gap between traditional banking and the digital era by providing seamless, scalable, and secure financial solutions.
The Role
Our agents, recommendation systems, and automations are only as good as the data they consume. An agent giving financial advice needs rich, accurate, timely context about a user’s accounts, transactions, spending patterns, and financial goals. A recommendation engine needs well-structured feature data. An automation trigger needs reliable signals.
Right now that data plumbing doesn’t have a dedicated owner. As we scale from one product to an SDK that multiple banking applications use, the data layer becomes a shared dependency that every AI feature builds on top of. This role owns the pipelines that feed the intelligence platform, the evaluation data that tells us if our AI is working, and the infrastructure that lets us iterate on data quality without slowing down AI development.
Key Responsibilities
Context & Feature Pipelines for AI
Build and maintain the data pipelines that transform raw financial data (Plaid transactions, bank accounts, credit data, subscription records) into the enriched context that agents consume at runtime
Design the feature store or context layer that serves real-time and batch features to agents, recommendation engines, and automation triggers
Ensure data freshness, quality, and consistency across all pipelines feeding the intelligence platform
Build the context enrichment that makes the difference between a generic chatbot and a financial assistant that actually understands a user’s financial situation
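To make the context-enrichment idea concrete, here is a minimal sketch of turning raw transactions into the kind of summarized context an agent could consume at runtime. The transaction shape (`amount`, `category` fields, with negative amounts as spending) and the function name are illustrative assumptions, not the platform's actual schema:

```python
from collections import defaultdict

def enrich_context(transactions: list[dict]) -> dict:
    """Summarize raw transactions into agent-ready spending context.

    Assumes each transaction looks like:
    {"amount": -12.99, "category": "subscriptions"}
    where negative amounts represent spending.
    """
    spend_by_category = defaultdict(float)
    for tx in transactions:
        if tx["amount"] < 0:
            spend_by_category[tx["category"]] += -tx["amount"]
    return {
        "total_spend": round(sum(spend_by_category.values()), 2),
        "spend_by_category": dict(spend_by_category),
        # The category the user spends most on, or None if no spending.
        "top_category": max(spend_by_category, key=spend_by_category.get)
        if spend_by_category else None,
    }
```

In practice this summary would be one slice of a much richer context document (accounts, goals, credit data), served from the feature store rather than computed per request.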
Evaluation & Observability Data
Build the data infrastructure for AI evaluation — collecting agent decisions, recommendation results, automation outcomes, and user feedback into queryable, analyzable datasets
Own the LLM observability data layer — structured collection of call latencies, token usage, cost per flow, error rates, and model performance metrics across all agent and automation flows
Create dashboards and data products that let the AI team measure agent quality, recommendation relevance, automation success rates, and LLM operational health
Support A/B testing and experiment tracking data infrastructure so we can iterate on AI behavior with evidence, not intuition
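As a sketch of the observability data layer described above, the record below captures one LLM call's latency, token usage, and cost, and rolls records up into per-flow metrics. The record fields, flow names, and per-token pricing are hypothetical placeholders, not actual platform values:

```python
from dataclasses import dataclass

# Hypothetical pricing table (USD per 1K tokens); real model rates vary.
PRICE_PER_1K = {"model-a": 0.002}

@dataclass
class LLMCallRecord:
    flow: str             # e.g. a hypothetical "spending-insights-agent"
    model: str
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int
    error: bool = False

    @property
    def cost_usd(self) -> float:
        rate = PRICE_PER_1K.get(self.model, 0.0)
        return (self.prompt_tokens + self.completion_tokens) / 1000 * rate

def summarize(records: list[LLMCallRecord]) -> dict:
    """Roll per-call records up into per-flow operational metrics."""
    n = len(records)
    return {
        "calls": n,
        "error_rate": sum(r.error for r in records) / n,
        "median_latency_ms": sorted(r.latency_ms for r in records)[n // 2],
        "total_cost_usd": sum(r.cost_usd for r in records),
    }
```

Dashboards and experiment analysis would then query these rollups per flow and per model version rather than raw call logs.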
SDK Data Contracts
Design data contracts and schemas that serve both Zeen and future banking applications that plug into the intelligence platform SDK
Own the ingestion layer for partner and third-party data sources — as the SDK expands to other banks, each will bring their own data formats and integration patterns
Build the feedback loops that connect production outcomes back to agent and recommendation improvement
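A data contract in this sense can be as simple as an agreed field-to-type mapping that every producer is validated against. The sketch below is a hand-rolled illustration with invented field names; a real platform would more likely use a schema registry or a validation library:

```python
# A minimal versioned contract: field name -> expected Python type.
# Field names here are illustrative, not the platform's actual schema.
TRANSACTION_CONTRACT_V1 = {
    "account_id": str,
    "amount": float,
    "currency": str,
    "posted_at": str,  # ISO-8601 timestamp as a string
}

def conforms(record: dict, contract: dict) -> list[str]:
    """Return a list of violations; an empty list means the record conforms."""
    errors = []
    for field, expected in contract.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(
                f"wrong type for {field}: expected {expected.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    return errors
```

Validating partner data at the ingestion boundary like this is what lets each new bank bring its own formats without breaking downstream AI consumers.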
Data Quality & Operations
Own data quality monitoring, validation, and alerting across all pipelines
Build data lineage tracking so we can trace any agent decision back to the data that informed it
Ensure PII handling in data pipelines aligns with platform policy — financial data requires careful treatment, and the AI layer has strict boundaries around what data reaches LLMs
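One way the PII boundary above can be enforced in a pipeline is a redaction step applied to every record before it reaches an LLM prompt. The restricted field names and masking rule below are illustrative assumptions, not the actual platform policy:

```python
import re

# Fields that must never reach an LLM, per a hypothetical platform policy.
PII_FIELDS = {"account_number", "ssn", "full_name"}

def redact_for_llm(record: dict) -> dict:
    """Drop restricted fields and mask long digit runs in string values."""
    safe = {k: v for k, v in record.items() if k not in PII_FIELDS}
    for k, v in safe.items():
        if isinstance(v, str):
            # Mask runs of 8+ digits (card/account-like), keeping the last four.
            safe[k] = re.sub(
                r"\d{4,}(?=\d{4})", lambda m: "*" * len(m.group()), v
            )
    return safe
```

Combined with lineage tracking, a step like this also leaves an auditable answer to "exactly what data did this agent see?"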
Technical Environment
Python for pipeline development; SQL for analytics and data modeling
Financial data sources: Plaid, partner APIs, internal domain services (banking, credit, subscriptions, journal/ledger)
OpenTelemetry traces and structured artifacts as data sources for AI evaluation
Cloud-native infrastructure; containerized services
Financial data with strict handling requirements
What We Are Looking For
Must Have
3+ years building and operating production data pipelines
Strong Python and SQL; experience with data transformation frameworks
Experience designing schemas and data contracts for consumption by application services or ML/AI systems
Understanding of data quality practices — validation, monitoring, alerting on pipeline failures
Comfort working with sensitive financial data and an understanding of why data-handling discipline matters
Strong Signals
Experience building data infrastructure that feeds AI/ML systems (feature stores, context pipelines, evaluation datasets)
Fintech or financial services background
Familiarity with observability data (OpenTelemetry, structured logs) as a data source
Experience building monitoring and analytics for LLM systems — latency tracking, cost attribution, and performance dashboards
Experience with data lineage, audit trails, or data governance
Exposure to real-time streaming alongside batch processing
Experience designing data contracts for multi-tenant or multi-product platforms