How We Simplified Our Data Ingestion & Transformation Process

As Grab grew from a small startup to an organisation serving millions of customers and driver partners, making day-to-day data-driven decisions became paramount. We needed a system to efficiently ingest data from mobile apps and backend systems and then make it available to analytics and engineering teams. Thanks to modern data processing frameworks, ingesting data is usually not a big issue. At Grab's scale, however, it is a non-trivial task. We had to prepare for two key scenarios:...

March 3, 2019 · 9 min · Yichao Wang, Roman Atachiants, Oscar Cassetti, Corey Scott

A Lean and Scalable Data Pipeline to Capture Large Scale Events and Support Experimentation Platform

Fast product development and rapid innovation require running many controlled online experiments on large user groups. This is challenging on multiple fronts, including culture, organisation, engineering, and trustworthiness. To address these challenges, we need a holistic view of all our systems and their interactions. For a holistic view, don’t just track systems closely related to your experiments. This mitigates the risk of a positive outcome on specific systems translating into a negative global outcome....

January 16, 2019 · 15 min · Oscar Cassetti, Roman Atachiants

Querying Big Data in Real-Time with Presto & Grab's TalariaDB

Enabling the millions of transactions and connections that take place every day on our platform requires data-driven decision making, and these decisions need to be based on real-time data. For example, an experiment might inadvertently cause a significant increase in waiting time for riders. Without the right tools and setup, we might only learn the reason for this longer waiting time much later. That would negatively impact our driver partners’ livelihoods and our customers’ Grab experience....

January 2, 2019 · 10 min · Roman Atachiants, Oscar Cassetti