Building a Faster, Leaner Vector Search Library in Go

It all started innocently enough with a gaming side project. I wanted to create a fast and clever chat-bot for an NPC (Non-Player Character) that could keep up with players’ questions in real time. Imagine a smart in-game assistant that could respond naturally to player queries without breaking immersion. But, as these things go, the project quickly became something more ambitious. Over weekends and late nights, my simple chat-bot idea grew into a full-blown vector search library, kelindar/search....

October 23, 2024 · 4 min · Roman Atachiants

Toying with Columnar K/V Store

As I iterate over the API design and the implementation of kelindar/column, I’ve started to build some toy examples. One such toy example I wanted to explore is how one might go about building a key-value cache using columnar storage. Now, let me first start by saying that doing this is not efficient and you are probably better off using a traditional key-value store organised as a B+Tree or a hash map. In fact, you’d be better off just using map[string]string than using the columnar store....

September 10, 2021 · 5 min · Roman Atachiants

Benchmarking Columnar Store Concurrency

In recent months, I’ve been working on a side project of mine where I’m trying to build a high-performance, in-memory columnar datastore: kelindar/column. One of the initial builds had a huge lock around every update and read, which is not really scalable since it’s rather prone to lock contention. I recently improved this by introducing a sharded mutex, essentially an array of 128 locks with some padding around them to avoid false sharing....

September 9, 2021 · 3 min · Roman Atachiants

Building a Columnar, In-Memory Store for Modern Hardware

Besides working on the new experimentation platform at Careem, over the past few weekends I’ve been toying with building a high-performance columnar, in-memory store in Go. Most importantly, I wondered if it is possible to build a nice API while having data laid out in a very cache-friendly way and supporting indexing. In addition, most columnar data stores are designed primarily for OLAP and there aren’t many OLTP or HTAP stores built in a columnar fashion, so can we build something for OLTP?...

June 26, 2021 · 6 min · Roman Atachiants

Being a Principal Engineer at Grab

Over the past few years Grab has grown from a small startup to one of the largest technology companies in South-East Asia. Along with the company’s growth, the number of microservices, features and teams also grew substantially. At the time of writing this blog, we have around 350 microservices powering our super-app. A great engineering team is a critical component of our success. As an engineer you have two career paths in front of you: an individual contributor role, or a management role....

September 25, 2019 · 6 min · Roman Atachiants

How We Simplified Our Data Ingestion & Transformation Process

Introduction As Grab grew from a small startup to an organisation serving millions of customers and driver partners, making day-to-day data-driven decisions became paramount. We needed a system to efficiently ingest data from mobile apps and backend systems and then make it available for analytics and engineering teams. Thanks to modern data processing frameworks, ingesting data isn’t a big issue. However, at Grab scale it is a non-trivial task. We had to prepare for two key scenarios:...

March 3, 2019 · 9 min · Yichao Wang, Roman Atachiants, Oscar Cassetti, Corey Scott

A Lean and Scalable Data Pipeline to Capture Large Scale Events and Support Experimentation Platform

Introduction Fast product development and rapid innovation require running many controlled online experiments on large user groups. This is challenging on multiple fronts, including cultural, organisational, engineering, and trustworthiness. To address these challenges we need a holistic view of all our systems and their interactions: For a holistic view, don’t just track systems closely related to your experiments. This mitigates the risk of a positive outcome on specific systems translating into a negative global outcome....

January 16, 2019 · 15 min · Oscar Cassetti, Roman Atachiants

Querying Big Data in Real-Time with Presto & Grab's TalariaDB

Introduction Enabling the millions and millions of transactions and connections that take place every day on our platform requires data-driven decision making. And these decisions need to be made based on real-time data. For example, an experiment might inadvertently cause a significant increase of waiting time for riders. Without the right tools and setup, we might only know the reason for this longer waiting time much later. And that would negatively impact our driver partners’ livelihoods and our customers’ Grab experience....

January 2, 2019 · 10 min · Roman Atachiants, Oscar Cassetti

Orchestrating Chaos using Grab's Experimentation Platform

Background To everyday users, Grab is an app to book a ride, order food, or make a payment. To engineers, Grab is a distributed system of many services that interact via remote procedure call (RPC), sometimes called a microservice architecture. Hundreds of Grab services run on thousands of machines with engineers making changes every day. In such a complex setup, things can always go wrong. Fortunately, many of the Grab app’s internal services are not critical for user actions like booking a car....

November 23, 2018 · 8 min · Roman Atachiants, Tharaka Wijebandara

Reliable and Scalable Feature Toggles and A/B Testing SDK at Grab

Imagine this scenario. You’re on one of several teams working on a sophisticated ride allocation service. Your team is responsible for the core booking allocation engine. You’re tasked with increasing the efficiency of the booking allocation algorithm for allocating drivers to passengers. You know this requires a fairly large overhaul of the implementation which will take several weeks. Meanwhile other team members need to continue ongoing work on related areas of the codebase....

November 2, 2018 · 10 min · Roman Atachiants