Technical debt and how to avoid it

Technical debt is what we take on when we use quick and easy shortcuts during development instead of opting for a more robust and generic solution. We promise to fix it later, i.e., to pay the debt, yet forget about it. This results in a growing list of TODOs and refactorings, debt-ridden code, and an overall reduction in confidence in the platform.

Why you accumulate technical debt

The prime cause of accumu […]

Processing personal data of users while respecting privacy protection laws

When working on projects that collect personally identifiable information of users (referred to as PII hereafter), it is a legal obligation to comply with local privacy protection laws such as GDPR and CCPA. However, this does not mean that whole teams and business units that have no valid reason to access PII should be blocked from using this data altogether. Here are a few techniques that I have used or come ac […]

Writing tests for an Apache Beam sliding-window-based streaming pipeline with late event triggers

I've seen engineers struggle to write functional tests for streaming pipelines, especially when event ordering is important and late-arriving data adds complexity. In this post, I will explain how to control the watermarks and test sliding-window-based aggregation logic, while also considering late events. If you are new to windowing in Apache Beam, I suggest you go through the concepts bef […]

Running the LLaMA AI Language Model on a Laptop

LLaMA is an open source large language model built by Meta. It is quite small compared to similar models like GPT-3, and so has the potential to run on everyday hardware, at least for fun, like I did. It is impressive how complex AI models like these can be packaged into files of a few gigabytes and launched anywhere. The trained model of LLaMA was only made available to resear […]

Infrastructure as Code on Google Cloud Platform using Terraform

Terraform is a tool designed for provisioning infrastructure on various platforms using the Infrastructure as Code (IaC) paradigm. In this approach, you define configuration files to create resources on cloud platforms, instead of creating them manually through the UI. As always, let's get our hands dirty and see how we can utilize Terraform to create resources on GCP. Here is a list of the f […]