Chess Challenges?!
Hey! Hey! Hey! 
I'm Arvind Abraham, a Senior Software Engineer with a passion for coding and optimizing big data processes. With a strong interest in Machine Learning & Data Engineering, I thrive on tackling complex challenges and mastering new skills on the fly.
With 6 years of experience at Kayzen, a cutting-edge German AdTech startup, and at leading MNCs like Zscaler and Nokia Alcatel-Lucent, I've had the privilege of working across diverse domains alongside some of the brightest minds in the industry.
I see software engineers as real-life superheroes, using code to harness the power of electricity and make the world a better place.
Always eager to improve and learn, I've recently started contributing to open-source projects, further expanding my horizons.
Tech Stacks
Certifications, Honours and Achievements
In-person cybersecurity meet with Micha Weis (Head of Finance Cyber Unit, Ministry of Finance, Israel)
First Prize in Debugging (C/C++) at an inter-college fest associated with Anna University
Bagged Gold & Silver medals from HelpAge India for raising the most funds, a streak that started when I was 10 years old
Received commendation from Kapil Sibal, Minister of Human Resource Development, for successfully completing the inaugural batch of CCE
Thrilled to have completed a 15,000-foot skydive over Prague, an incredible honor and adventure
Experience
Zscaler | Bangalore | May 2024 - Present
- [DataEng]
- Built a real-time data streaming pipeline on Azure using ADX and Event Hubs, provisioned via ARM templates.
Kayzen | Berlin & Bangalore | Aug 2019 - May 2024
- [ML]
- Built an E2E ML backend for retargeting users, driving a median monthly revenue of $30k.
- Implemented ML solutions for A/B Testing, Loss Notification Analysis, Model Deployment & Validation, and Feature Pipelines, while also creating insightful dashboards.
- [DataEng]
- Optimized performance of large-scale Big Data ETL processes involving Spark, Kafka, ClickHouse, SQL, Hadoop, and Airflow.
- Optimized Parquet ETL, reducing data transfer time from ClickHouse to HDFS from over 24 hours to 40-50 minutes for 1.5 TB of data.
- Improved JSON data ingestion from Kafka to ClickHouse on-premise, increasing throughput from 48,000 to 93,000 messages per second.
- Architected the Data Lake & Data Warehouse (DL & DWH) using technologies such as Hive, Spark, HDFS, ClickHouse, Iceberg, Delta, and Dremio.
- Reduced data ingestion time into MySQL from 70 minutes to 3 minutes for 10 hours' worth of batch-refresh data.
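The ClickHouse ingestion throughput gains above hinge on batching: row-by-row writes are dominated by per-insert overhead, so the usual fix is to accumulate messages and flush them in large batches. A minimal, hypothetical sketch of that idea — the `flush` callback stands in for a real ClickHouse client call, and none of this is the actual Kayzen code:

```python
from typing import Callable, Iterable, List

def batch_insert(messages: Iterable[str],
                 flush: Callable[[List[str]], None],
                 batch_size: int = 10_000) -> int:
    """Accumulate messages and flush them in large batches.

    Inserting one row per INSERT is dominated by per-request overhead;
    flushing tens of thousands of rows at a time is how ingestion
    pipelines typically raise throughput. `flush` is a placeholder for
    the real database client call (hypothetical, for illustration).
    """
    buffer: List[str] = []
    flushed = 0
    for msg in messages:
        buffer.append(msg)
        if len(buffer) >= batch_size:
            flush(buffer)
            flushed += len(buffer)
            buffer = []
    if buffer:  # flush the final partial batch
        flush(buffer)
        flushed += len(buffer)
    return flushed
```

In practice the batch size is tuned against latency requirements and memory, and a time-based flush is added so low-traffic topics don't sit in the buffer indefinitely.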
- [Dev-Ops]
- Pioneered the adoption of Apache Airflow during its early development phase.
- Developed frameworks for database operations, focusing on scalability, sharding, configuration, and migration tasks.
- Handled DevOps responsibilities, including debugging, monitoring, and maintaining on-premises servers across three data centers.
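A common building block of such database frameworks is deterministic shard routing: every service must map the same key to the same shard. A hypothetical sketch of that piece (the key format and shard count are illustrative, not the actual framework):

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Deterministically route a key to a shard.

    A stable hash is used instead of Python's builtin hash(), which is
    salted per process and would route the same key differently across
    services. Hypothetical sketch, for illustration only.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Usage: pick one of 8 database shards for a (hypothetical) campaign id.
shard = shard_for("campaign:12345", 8)
```

Plain hash-mod routing reshuffles most keys when `num_shards` changes; frameworks that need to resize online typically move to consistent hashing for that reason.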
Nokia, Bengaluru, India | On-Site | Aug 2018 - Aug 2019
- [R&D] Secure integration of Schema Registry with Kafka.
- [Security] Development of scripts to generate SSL certificates for internal authentication on K8s clusters with 1k nodes.
- [BDD - Radish Framework] Performance-tested Kafka integration with KSQL, Kafka Connect, and Schema Registry.
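As a rough illustration of the certificate-generation bullet, a script wrapping the `openssl` CLI might look like the sketch below. This is hypothetical (not the actual Nokia scripts) and assumes `openssl` is on the PATH; a real cluster setup would sign per-node CSRs against an internal CA rather than self-sign:

```python
import os
import subprocess

def generate_self_signed_cert(common_name: str, out_dir: str) -> tuple:
    """Generate a self-signed certificate/key pair via the openssl CLI.

    Hypothetical sketch of a certificate-generation script; production
    clusters would instead issue a CSR per node and sign it with an
    internal CA so all nodes chain to one trust root.
    """
    key_path = os.path.join(out_dir, f"{common_name}.key")
    crt_path = os.path.join(out_dir, f"{common_name}.crt")
    subprocess.run(
        ["openssl", "req", "-x509", "-newkey", "rsa:2048", "-nodes",
         "-days", "365", "-subj", f"/CN={common_name}",
         "-keyout", key_path, "-out", crt_path],
        check=True, capture_output=True,
    )
    return key_path, crt_path
```

At the scale mentioned (clusters of ~1k nodes) the value of scripting this is less the openssl invocation itself than automating distribution and rotation of the resulting key material.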
Publications
Proceedings of the 5th International Conference on Cyber Security & Privacy in Communication Networks (ICCS) 2019 | NIT, Kurukshetra, India
Publication Date: 30 December 2019
Phishing URLs are known to change intermittently, which causes detection models to become obsolete over time. In this work we examine the efficiency of a phishing detection model in terms of model drift: that is, given a trained phishing detection model, how long will it maintain its performance?
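The drift question above — how long a frozen model stays accurate — can be illustrated with a toy experiment: fix a decision rule once, then evaluate it on later time windows where the benign distribution shifts toward the phishing one. The data and "model" below are entirely synthetic and illustrative, not the paper's actual method or dataset:

```python
import random

def make_window(shift: float, n: int = 1000, seed: int = 0):
    """Synthetic scored URLs: phishing examples score higher on a single
    feature. `shift` moves the benign distribution toward the phishing
    one, mimicking distribution drift over time. Illustrative only."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        if rng.random() < 0.5:
            data.append((rng.gauss(8.0, 1.0), 1))          # phishing
        else:
            data.append((rng.gauss(4.0 + shift, 1.0), 0))  # benign, drifting
    return data

def accuracy(threshold, window):
    hits = sum((x > threshold) == bool(y) for x, y in window)
    return hits / len(window)

# "Train" once by fixing a threshold midway between the initial class
# means, then evaluate the frozen model on later, drifted windows.
threshold = 6.0
monthly_acc = [round(accuracy(threshold, make_window(shift, seed=s)), 3)
               for s, shift in enumerate([0.0, 1.0, 2.0, 3.0])]
print(monthly_acc)  # accuracy decays as the distributions overlap
```

Monitoring exactly this kind of per-window accuracy curve is how one would decide when a deployed detection model needs retraining.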
- 🔭 Working on multiple projects, including data engineering, software development, and monitoring.
- 🌱 Learning to contribute to open source.
- 👯 Looking to collaborate on Apache Airflow.
- 📫 How to reach me: arvindeybram@gmail.com
- 😄 Pronouns: He/Him
- ⚡ Fun fact: Can do a handstand.
Apart from work, I love watching Anime.
My never-give-up attitude is my ninja way.