Hello, I'm

Henry Richard J

Data Engineer

Building scalable pipelines, optimizing big data workflows, and delivering cloud-based analytics solutions. Helping organizations turn complex data into actionable insights.


About Me

Data Engineer with 4+ years of experience building scalable pipelines, optimizing big data workflows, and delivering cloud-based analytics solutions. Proficient in Python, PySpark, SQL, and Databricks with deep expertise in ETL development, data migration, and workflow orchestration using Apache Airflow.

Hands-on experience with Azure (ADF, DevOps, Function Apps) and Microsoft Fabric. Proven track record of transforming opaque legacy systems into modern, robust, and auditable data platforms.

Chennai, Tamil Nadu, India

Technical Skills

Languages & Frameworks

Python, PySpark, SQL, C#

Data & ETL Tools

Databricks, Apache Airflow, Azure Data Factory (ADF), Delta Lake, Microsoft Fabric

Cloud & DevOps

Microsoft Azure, Azure DevOps, Azure Function Apps, Azure Bot Framework

Databases

MySQL, Snowflake, Delta Lake

AI / LLM

LangChain, LLM Integration, Databricks Genie API

Other Tools

FastAPI, ReactJS, Streamlit, Plotly, Jinja2, Playwright, Domo, Tableau

Work Experience

Data Engineer | Agilisium Consulting

Mar 2022 – Present

Client: Union Chimique Belge (UCB) (Feb 2024 – Present)

  • SP-PSP Internalization: Eliminated a third-party dependency by re-engineering transformation pipelines in PySpark and SQL within Databricks, reducing licensing and maintenance risk. Improved pipeline auditability via version-controlled notebooks.
  • HUB Migration: Owned end-to-end migration of 16 datasets from a legacy system into UCB's Azure ADF and Databricks platform. Optimized data refresh to complete 2 hours ahead of SLA. Drove post-UAT change requests under tight deadlines.
  • AI Data Analytics Chatbot: Led the architecture of a Microsoft Teams bot using the Databricks Genie API and LangChain. Drove org-wide self-service analytics adoption, reducing time-to-insight from hours to seconds.
  • Git-Deploy Optimization: Delivered a 73% reduction in Git deployment time by replacing git clone with the Databricks Repos Sync API in Azure DevOps.
  • Framework Optimization: Reduced ADF pipeline processing time by 57% by introducing an Azure Function App for event-driven SFTP triggers.

Client: National Broadcasting Company (NBC) (Jun 2022 – Feb 2024)

  • Pipeline Optimization: Proactively migrated a scraping bottleneck from Selenium to Playwright, reducing TikTok extraction runtime by 67%. Restored Meta social analytics continuity via a Graph API upgrade.
  • Snowflake to Databricks Migration: Migrated business transformation logic, unifying compute and reducing query overhead. Validated accuracy through row-level benchmarking.
  • Domo to Tableau Migration: Converted Domo Dataflows to Snowflake-native views. Automated Source-to-Target Mapping docs via Python. Rebuilt pipelines as Airflow DAGs with zero stakeholder disruption.
  • Snaplogic to Databricks Migration: Authored validation scripts for row-by-row comparisons. Resolved production bugs and implemented automated failure alerting, reducing mean time to detection.

Data Engineer Trainee (Mar 2022 – May 2022)

  • Completed intensive onboarding covering Apache Airflow, Snowflake, and end-to-end data pipeline development.
  • Built sample pipeline integrating Spotify Web API with MySQL, visualizing insights using Streamlit and Plotly.

Personal Projects


Airflow Orchestration Framework

Python, MySQL, Jinja2, FastAPI, Apache Airflow, ReactJS

Engineered a self-hosted orchestration platform on Oracle Cloud VM with reusable DAG templates and a FastAPI backend, eliminating boilerplate. Built a ReactJS frontend for declarative task config.

NextGen Data Ingestion Framework

PySpark, Microsoft Fabric, LangChain, LLM

Built a configuration-driven framework supporting 5+ source types with zero code changes per new source. Designed a Medallion architecture on Microsoft Fabric and integrated an LLM to autonomously parse email requests.

Stocks Aerial View

C#, MySQL

Developed a Windows desktop app for NSE stock analysis with live data scraping, automated pivot point calculations, and one-click spreadsheet export functionality.

Certifications

  • Databricks Certified Data Engineer – Associate
  • PCEP – Certified Entry-Level Python Programmer

Achievements

  • Global Town Hall 2026 – Outstanding Contribution Award, Agilisium
  • Synergy Award, Agilisium Consulting (2024)
  • CEO's Team – Existing Customer Eminence Award (2023)

Education

B.E. Computer Science

Sri Lakshmi Ammal Engineering College

2017 – 2021 | CGPA: 7.75

Languages

English (Professional), Tamil (Native)