The Client:
Under NDA. Our client is a large social network with millions of users, where people meet, chat, shop, and can earn real money. The US-based company operates in a complex, continuous, high-traffic environment and uses a data platform for streaming data and advanced analytics to quickly gain insights into customer behavior.

Engage Customers in the Game with Advanced Analytics
The Business and Technical Challenges:
- The client's on-premises infrastructure limited their ability to innovate and scale analytics. Analysts worked with batch-processed historical data and lacked tools for generating fast, actionable reports on customers' in-game behavior.
- There was no test environment for validating analytics hypotheses, which created bottlenecks and slowed decision-making.
To overcome these limitations, the client needed to:
- Modernize their platform's data architecture
- Enable continuous data processing and analytics
- Create an environment for scalable insights and experimentation
- Migrate the data infrastructure from on-premises to the AWS cloud
- Replicate databases incrementally using CDC for analytical workloads
- Build an analytical data warehouse as a single source of truth
The Solution:
We supported the client through a full lift-and-shift migration from on-premises to AWS and re-architected the data infrastructure to enable advanced analytics at scale.
We helped by:
- Designing a comprehensive cloud migration and modernization strategy
- Building a robust analytical data warehouse
- Developing near real-time streaming jobs on Kubernetes (EKS)
- Setting up Change Data Capture (CDC) using Debezium → Kafka → Hudi for incremental database replication
- Reworking Airflow DAGs to meet SLAs and improve performance
- Introducing a test environment for analysts to experiment safely and efficiently
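As a rough illustration of the incremental replication idea behind the Debezium → Kafka → Hudi step, the sketch below applies Debezium-style change events to an in-memory replica. It is not the project's code: the sample rows, the `id` key, and the function name `apply_change_events` are hypothetical, and a plain dict stands in for the Hudi table. The event fields (`op`, `before`, `after`, `ts_ms`) follow Debezium's change event envelope; using `ts_ms` to discard late-arriving events is an assumption made for this example.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_change_events(
    table: Dict[Any, Row],
    events: Iterable[Dict[str, Any]],
    key: str = "id",
) -> Dict[Any, Row]:
    """Apply Debezium-style change events to a replica keyed by `key`.

    Only changed rows are touched, which is what makes the replication
    incremental rather than a full reload.
    """
    last_seen: Dict[Any, int] = {}  # newest event timestamp applied per key
    for event in events:
        op = event["op"]  # "c" create, "u" update, "d" delete, "r" snapshot read
        ts = event["ts_ms"]
        # Deletes carry the row in "before"; all other ops carry it in "after".
        row = event["before"] if op == "d" else event["after"]
        pk = row[key]
        # Assumption: skip events older than one already applied for this key.
        if ts < last_seen.get(pk, -1):
            continue
        last_seen[pk] = ts
        if op == "d":
            table.pop(pk, None)
        else:
            table[pk] = dict(row)  # upsert: insert or overwrite
    return table


# Hypothetical sample stream, including one late (stale) update.
EVENTS = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "ann"}, "ts_ms": 1},
    {"op": "c", "before": None, "after": {"id": 2, "name": "bob"}, "ts_ms": 2},
    {"op": "u", "before": {"id": 1, "name": "ann"},
     "after": {"id": 1, "name": "anna"}, "ts_ms": 3},
    {"op": "d", "before": {"id": 2, "name": "bob"}, "after": None, "ts_ms": 4},
    {"op": "u", "before": {"id": 1, "name": "ann"},
     "after": {"id": 1, "name": "stale"}, "ts_ms": 2},
]

replica = apply_change_events({}, EVENTS)
```

After the sample stream is applied, `replica` holds only the latest version of row 1, and row 2 is gone. In a real pipeline the events would be consumed from Kafka topics and the upsert would be performed by the table format rather than by hand.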

The Tech Stack Used in the Project:
- AWS EMR
- Apache Airflow
- Apache Spark
- Trino/Presto
- Kubernetes
- Debezium/Hudi
- Kafka
- Spark Streaming
- Terraform (IaC)
The Result:
We helped optimize our client's data architecture, ensuring efficient data flow, scalable infrastructure, and faster analytics. Our engineers helped to:
- establish modern cloud infrastructure for large-scale data processing;
- design a data warehouse along with tools for rapid data insights;
- design and implement a streaming pipeline for customer events;
- introduce Change Data Capture to enable incremental replication of relational databases;
- optimize and fine-tune ETL jobs to reduce the daily DAG run time;
- significantly improve overall system stability, fault tolerance, and reliability.
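The case study does not say how the daily DAG run time was reduced. One common way to reason about it is to compare the serial run time of a DAG with its critical path, since the critical path is the lower bound once independent tasks run in parallel. The sketch below is a hypothetical example of that calculation: the task names, durations, and the function `critical_path_minutes` are illustrative assumptions, not figures from the project, and it assumes an acyclic graph with enough worker slots to run all ready tasks at once.

```python
from functools import lru_cache
from typing import Dict, List


def critical_path_minutes(
    durations: Dict[str, int],
    upstream: Dict[str, List[str]],
) -> int:
    """Return the length of the longest dependency chain in a DAG.

    `durations` maps task name to run time in minutes; `upstream` maps a
    task to the tasks it must wait for. Assumes the graph has no cycles.
    """

    @lru_cache(maxsize=None)
    def finish(task: str) -> int:
        # A task starts when its slowest upstream dependency finishes.
        start = max((finish(dep) for dep in upstream.get(task, [])), default=0)
        return start + durations[task]

    return max(finish(task) for task in durations)


# Hypothetical daily ETL DAG (names and minutes are made up).
DURATIONS = {"extract_users": 20, "extract_events": 30, "transform": 40, "load": 10}
UPSTREAM = {"transform": ["extract_users", "extract_events"], "load": ["transform"]}

serial_minutes = sum(DURATIONS.values())
parallel_minutes = critical_path_minutes(DURATIONS, UPSTREAM)
```

In this made-up DAG the tasks take 100 minutes when run one after another, while the critical path (`extract_events`, `transform`, `load`) takes 80 minutes, so tuning effort pays off most on the tasks that sit on that path.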

The Data Security:
- AWS security best practices were followed in the design and configuration of the infrastructure.
- Data in transit is encrypted.
- Users are granted fine-grained access to warehouse data, ensuring appropriate levels of visibility and control.