Platform Investigations Engineer

White Hat Gaming is a state-of-the-art iGaming platform providing a secure, scalable and flexible modular Casino and Sportsbook Player Account Management solution.

We offer operators choice, from our proprietary Player Account Management (PAM) to a full white-label solution. WHG provides market-leading content including Kambi Sportsbook, CRM tools, all payment options, and more than 3000 games from 120 leading games providers.

With over 500 talented colleagues from around the world, we offer a dynamic, collaborative environment where your ideas can flourish alongside industry leaders. Join us and be at the forefront of iGaming innovation!

Summary:

We are looking for a Platform Investigations Engineer to join our engineering team. This is not a traditional support or software development role. You will act as a technical detective, parachuted into complex, ambiguous situations across a very broad software ecosystem to identify root causes, resolve data integrity issues, and recommend concrete fixes.

You will work across a large-scale event-driven platform comprising a monolithic core application, distributed microservices, and Kafka-driven messaging infrastructure. Investigations are varied and unpredictable: no two days are the same, and the ability to follow evidence across system boundaries is essential.

This role reports to the Head of Engineering and is fully remote.

Your day to day:

Investigation & Root Cause Analysis

Own end-to-end investigation of ad-hoc technical issues raised by product, operations, or engineering teams.
Diagnose data integrity problems by tracing issues across event streams, APIs, and database layers.
Analyse Kafka event histories to reconstruct system state and identify where data diverged from expected behaviour.
Produce clear root cause analysis (RCA) documentation that explains what happened, why, and how to prevent recurrence.

Data Analysis & Resolution

Write complex SQL queries against production and staging databases to investigate data anomalies, gaps, and inconsistencies.
Understand normalised and denormalised data models, and navigate across multiple schemas with confidence.
Propose targeted data remediation scripts or operational procedures to resolve identified data issues safely.
Validate proposed fixes in lower environments before escalating to production resolution recommendations.

API & System Interaction

Use REST APIs to probe system state, trigger reprocessing workflows, and verify expected outputs.
Construct and execute API requests using Postman, and interpret responses, including error payloads, pagination, and hypermedia links.
Work with engineering and product teams to understand service contracts and identify where API behaviour diverges from the specification.

Code Reading & Technical Collaboration

Read and interpret Scala source code to understand business logic, event handling, and data transformations, without necessarily writing new code.
Collaborate closely with software engineers to understand system internals and escalate findings with sufficient context for developers to act.
Contribute to post-incident reviews and help engineering teams understand systemic weaknesses exposed by investigations.

Documentation & Communication

Produce concise, structured investigation reports suitable for both technical and non-technical audiences.
Maintain a knowledge base of known issues, recurring patterns, and resolution playbooks.
Communicate investigation status and findings clearly to stakeholders, managing expectations around timelines and scope.

AI and In-House Tooling

Use in-house tooling to aid investigations and RCA.
Give tool owners feedback on your experience, including bug reports and suggested improvements.
Apply AI (Claude) intelligently and sceptically; use it as one of the tools in your toolbox, not as a crutch or shortcut to hard-won understanding.

Out of scope:

Software Development - this is not a software development role; feature development, bug fixes, DDL, etc., are done by other teams.
Infrastructure changes and set-up - although knowledge of AWS Cloud services is useful, this is not an infra-ops role.
Regular out-of-hours support - NOC and other support teams handle OOH calls. This role is for EU / UK business hours.

What we are looking for:

Technical Foundations

Solid background in software development - you should be able to think like a developer even if you are no longer writing production code.
Demonstrable experience investigating complex systems issues in a professional environment.
Strong analytical mindset with the ability to form and test hypotheses methodically.

SQL & Data

Proficient in writing complex SQL, including multi-table joins, subqueries, window functions and aggregations.
Comfortable working with unfamiliar data models and reverse-engineering schemas without full documentation.
Experience working with relational databases (e.g. PostgreSQL, MySQL or similar) in a professional context.
Ability to reason about data consistency, referential integrity and the impact of partial failures on stored state.

APIs & Tooling

Strong understanding of REST API design principles, including HTTP methods, status codes, request/response structures and authentication patterns.
Proficient with Postman or equivalent tools for constructing, executing and inspecting API requests.
Ability to interpret API responses, including error structures, timestamps and identifiers, to trace issues across services.

Distributed & Event-Driven Systems

Working knowledge of event-driven architectures and asynchronous processing patterns.
Familiarity with Apache Kafka, including topic structure, consumer groups, offset management and the role of events in driving system state.
Understanding of the challenges inherent in distributed systems, including eventual consistency, message ordering, idempotency and failure modes.
Experience with or exposure to microservices and the ability to reason about inter-service dependencies.

Code Comprehension

Able to read and understand Scala code (Functional and OO), particularly around data transformation logic, event handlers and domain models. Writing Scala is not required.
Some knowledge of Cats Effect or ZIO.
Comfortable navigating large codebases to understand system behaviour without step-by-step guidance.

Investigation & Collaboration

Highly curious, with a genuine interest in solving complex, ambiguous technical problems.
Rigorous and methodical, documenting your reasoning and validating findings before reaching conclusions.
Self-directed and able to navigate large, unfamiliar systems with minimal guidance.
Strong collaboration and communication skills, with the ability to work effectively alongside software engineers, data teams and product stakeholders while adapting technical findings for different audiences.
Comfortable working in uncertain environments where investigations rarely have obvious starting points, using structured problem-solving to identify root causes.

Nice to have:

Experience with Kafka tooling such as Kafka CLI, Kafdrop, Confluent Control Center, or equivalent.
Familiarity with monitoring and observability tools (e.g. Datadog, Grafana, Kibana) for cross-referencing system behaviour during an investigation.
Exposure to cloud infrastructure (AWS or GCP) and understanding how infrastructure-level events can surface as application issues.
Scripting ability in Python, Bash, or similar for automating repetitive investigation steps.
Experience with incident management processes, RCA frameworks, or post-mortem facilitation.

How we approach things:

Dynamic Medium-Sized Environment: We have a can-do ethos where innovation is encouraged and action is valued.
Core Values at Heart: We live by Teamwork, Innovation, Trust, and Integrity in everything we do.
Results-Oriented Focus: We prioritise getting things done while supporting each other to reach both collective and individual goals.
Open Collaboration: Our open-door policy fosters collaboration across all levels and departments, where ideas flow freely.
Global Team: We are truly a global team with people from various countries and cultures contributing to our success.

What we offer:

A remote and flexible working schedule.
Generous time off, which varies by country of residence.
Discretionary annual performance bonus.
Training and other learning & development opportunities to support you through your career progression.
A hardware and software allowance, or work equipment, is provided so that you have the right tools to get the job done.
Various well-being programmes and initiatives.

We are an equal opportunities employer and welcome applications from all suitably qualified persons regardless of their race, gender, disability, religion/belief, sexual orientation, or age.

By submitting your application, you agree that we process your data in accordance with our Privacy Policy to manage your application for any of the positions we offer.

Tech - Platform

Remote
