Hi there, I'm

Hemesh RM
Data Solutions Engineer

I build scalable data pipelines, ML-driven applications, and analytical systems that optimize performance, reduce costs, and deliver actionable insights.

About Me

I'm a Computer Science Engineer with a passion for data science, and I've been expanding my toolkit to include database architecture, machine learning, and front-end development. Currently, I'm pursuing my master’s in Computer Science at Indiana University Bloomington, where I refine my skills and tackle tech challenges head-on. My journey through code has been driven by a love for puzzles—each project is a unique problem waiting to be solved.

Every data set presents a new challenge, a hidden pattern just waiting to be uncovered. I enjoy diving into messy data, debugging complex algorithms, and turning chaos into clarity. With an insatiable appetite for learning and a knack for finding solutions others might miss, I invite you to explore my portfolio and see how I blend technical expertise with creative problem-solving.

Software Engineering
Data Scientist
Data Analyst
Data Engineering

Experience

Software Engineering Intern

June 2024 - Dec 2024
Reinsurance Group of America, Chesterfield, MO
  • Migrated legacy JavaScript codebase to React 18 with TypeScript and Material UI, boosting maintainability and performance.
  • Optimized Node.js backend with caching and efficient API design, improving platform speed by 15%.
  • Automated UI testing using TypeScript, Playwright, and Cucumber JS; reduced manual regression effort by 80%.
  • Fixed security vulnerabilities (SSDLC) while modernizing the frontend for compliance.
  • Designed and deployed an AWS Lambda function that automatically detects and terminates idle EC2 instances running EMR clusters, reducing cloud costs by $1,000 per month; integrated it with Datadog for real-time monitoring and Slack for automated notifications.
React TypeScript Node.js Mocha React Testing Library Material UI Jenkins Playwright AWS Lambda

Software Engineering Intern

Feb 2023 - Apr 2023
New Pro Data, Madurai, India
  • Built an NLP resume parser with 95% accuracy, extracting key resume data automatically.
  • Used MPNet embeddings and FAISS for semantic job-resume matching, boosting accuracy to 93%.
  • Developed a Django-based HR chatbot that automated onboarding and eliminated third-party platforms.
Python Django MPNet HuggingFace AWS

Data Analyst Intern

Aug 2021 - Sep 2021
The Sparks Foundation, Chennai, India
  • Conducted EDA with Matplotlib, Seaborn, and Plotly, optimizing inventory and reducing costs by 20%.
  • Evaluated historical sales data to identify key trends affecting item availability; proposed strategic adjustments that raised weekly sales from 120 units to an average of 180 units.
  • Analyzed sales and inventory data to identify pricing opportunities, addressing the top three margin loss contributors on high-demand products.
Python Pandas Matplotlib Seaborn EDA Tableau

Education

Master of Science - Computer Science

Indiana University Bloomington
Bloomington, Indiana, USA
Aug 2023 - May 2025

Coursework: Elements of Artificial Intelligence, Applied Machine Learning, Applied Algorithms, Computer Vision, Engineering Cloud Computing, Software Engineering, Applying Machine Learning Techniques for Natural Language Processing, Computer Networks, Information Visualization

Bachelor of Technology - Computer Science

Amrita Vishwa Vidyapeetham University
Chennai, Tamil Nadu, India
August 2019 - May 2023

Coursework: Data Structures and Algorithms, Operating Systems, Computer Networks, Database Management Systems, Computer Architecture, Software Engineering, Distributed Systems, Cloud Computing, Machine Learning, Security Computing, Natural Language Processing, Business Analytics, Digital Image Processing, Compiler Design, Web Development

Projects

Cookbook: AI Recipe Generator

React Firebase GroqCloud API LLM

Developed a web application that enables users to discover and generate personalized recipes using GroqCloud's LLM (Llama model).

  • Integrated AI to generate meal ideas tailored to user preferences
  • Supported filters for cuisine, nutrition goals, and allergies
  • Stored and retrieved user history with Firebase for seamless UX

Semantic Book Recommender

Python Hugging Face LangChain Qdrant Gradio

Built a semantic recommendation system using embeddings to help users discover relevant books based on content similarity and genre preferences.

  • Embedded 7,000+ book descriptions into 384-d vectors using all-MiniLM-L6-v2 for similarity search
  • Integrated Hugging Face and LangChain with Qdrant for fast cosine retrieval and metadata filtering
  • Deployed a Gradio app with a zero-shot classifier (∼80% accuracy) for fiction vs non-fiction labeling
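
As a rough illustration of the cosine retrieval and metadata filtering described above, here is a minimal sketch in plain Python. It is not the project's code: the toy 3-d vectors, the catalog entries, and the names cosine_similarity and top_k are assumptions for illustration only. The actual system uses 384-d all-MiniLM-L6-v2 embeddings stored in Qdrant.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k(query, catalog, k=2, category=None):
    """Return the k most similar titles, optionally restricted to one category."""
    scored = [
        (cosine_similarity(query, item["vector"]), item["title"])
        for item in catalog
        if category is None or item["category"] == category
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _, title in scored[:k]]


# Hypothetical toy catalog; real embeddings would have 384 dimensions.
CATALOG = [
    {"title": "Book A", "category": "fiction", "vector": [1.0, 0.0, 0.0]},
    {"title": "Book B", "category": "fiction", "vector": [0.8, 0.6, 0.0]},
    {"title": "Book C", "category": "non-fiction", "vector": [0.0, 1.0, 0.0]},
]

if __name__ == "__main__":
    query = [1.0, 0.1, 0.0]
    print(top_k(query, CATALOG, k=2))
    print(top_k(query, CATALOG, k=2, category="non-fiction"))
```

A vector database performs the same ranking with an approximate index rather than a full scan, which is what keeps retrieval fast as the catalog grows.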

AI Agent for Flappy Bird Simulation

Python Pygame NEAT

Designed and trained an AI agent using NEAT to autonomously master the Flappy Bird game through neural network evolution.

  • Developed autonomous gameplay using NEAT to evolve neural networks without human input
  • Implemented collision detection, game mechanics, and dynamic pipe difficulty for realistic simulation
  • Achieved 100% survival rates by optimizing bird movement based on pipe positions and altitude

Football Analysis

Python YOLOv8 OpenCV PyTorch KMeans

Built an AI-powered system for real-time football analysis using computer vision techniques to extract insights.

  • Used YOLOv8 for object detection of players, referees, and the ball, achieving 79% mAP and 80% IoU
  • Clustered players by jersey color using KMeans with a silhouette score of 0.6; added optical flow to stabilize tracking with camera motion
  • Computed player speed and distance via perspective transformation for real-world accuracy
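
The speed and distance step can be sketched as follows, assuming the perspective transformation has already mapped each tracked position to pitch coordinates in metres. The function names, the sample positions, and the frame rate are hypothetical and are not taken from the project.

```python
import math


def total_distance_m(positions_m):
    """Sum of straight-line distances between consecutive (x, y) positions in metres."""
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(positions_m, positions_m[1:])
    )


def speeds_kmh(positions_m, fps, frame_step=1):
    """Speed between consecutive samples, in km/h.

    positions_m: (x, y) pitch coordinates in metres, one per sampled frame.
    fps:         video frame rate.
    frame_step:  number of video frames between consecutive samples.
    """
    dt = frame_step / fps  # seconds elapsed between samples
    return [
        math.hypot(x1 - x0, y1 - y0) / dt * 3.6  # m/s -> km/h
        for (x0, y0), (x1, y1) in zip(positions_m, positions_m[1:])
    ]


if __name__ == "__main__":
    # Hypothetical track sampled once per second from 24 fps video.
    track = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
    print(total_distance_m(track))
    print(speeds_kmh(track, fps=24, frame_step=24))
```

Sampling every few frames rather than every frame is a common way to smooth out detection jitter before computing speed.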

Netflix Dashboard

Tableau

Created a Tableau dashboard analyzing Netflix’s global catalog to uncover content strategies and market potential.

  • Identified high-demand regions like the U.S. and India, and untapped areas such as Africa and Eastern Europe
  • Revealed that 68% of the content consists of movies, highlighting Netflix’s focus on versatile viewing patterns
  • Surfaced trends in genres like Documentaries and Dramas, and growth in niche categories such as true crime

Automotive Sales ETL Pipeline

Azure Data Factory Databricks PySpark

Designed a robust ETL architecture for automotive sales data using industry-standard patterns and modern data engineering tools.

  • Implemented scalable pipelines with Medallion Architecture, Delta Lake, and Unity Catalog
  • Built a Star Schema with Fact and Dimension tables; automated incremental loads via stored procedures in ADF
  • Used SCD modeling in Databricks to ensure high data consistency and traceability
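
To illustrate the SCD idea, here is a minimal in-memory sketch of a Type 2 update in plain Python. The SCD type is not stated above, so Type 2 is an assumption, as are the column names (valid_from, valid_to, is_current), the dealer records, and the function name apply_scd2. In Databricks this logic would more typically be expressed as a Delta Lake merge.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, today):
    """Apply Type 2 changes in place: expire changed rows and append new versions."""
    current = {row[key]: row for row in dimension if row["is_current"]}
    for record in incoming:
        existing = current.get(record[key])
        if existing is not None and all(existing[c] == record[c] for c in tracked):
            continue  # tracked attributes unchanged, nothing to do
        if existing is not None:
            # Close out the old version instead of overwriting it.
            existing["valid_to"] = today
            existing["is_current"] = False
        new_row = {
            key: record[key],
            **{c: record[c] for c in tracked},
            "valid_from": today,
            "valid_to": None,
            "is_current": True,
        }
        dimension.append(new_row)
        current[record[key]] = new_row
    return dimension


if __name__ == "__main__":
    # Hypothetical dealer dimension with one existing row.
    dim = [{"dealer_id": "D1", "city": "Austin", "valid_from": date(2023, 1, 1),
            "valid_to": None, "is_current": True}]
    updates = [{"dealer_id": "D1", "city": "Dallas"},
               {"dealer_id": "D2", "city": "Reno"}]
    for row in apply_scd2(dim, updates, "dealer_id", ["city"], date(2024, 1, 1)):
        print(row)
```

Keeping the expired row with its validity dates is what provides traceability: any fact row can be joined to the dimension version that was current when the sale occurred.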

YouTube Trends Analysis

AWS S3 AWS Glue AWS Lambda Athena QuickSight

Built an AWS-based data pipeline to analyze regional YouTube video trends, leveraging serverless tools for transformation and visualization.

  • Ingested and transformed trending video data using AWS S3, Glue, and Lambda into Parquet format
  • Performed SQL-based analytics with Athena and created interactive dashboards in QuickSight
  • Used AWS Glue Catalog for automated schema detection and metadata management

Stock Market Real-Time Data Engineering

Apache Kafka AWS Glue Athena

Built a real-time data pipeline to simulate and process stock market data, leveraging Kafka and AWS analytics services for streaming insights.

  • Implemented producer-consumer pipelines in Kafka to ingest and stream real-time stock data
  • Simulated data using Python and Boto3, writing outputs to S3 for downstream processing
  • Used AWS Glue Crawlers to catalog datasets and Athena for on-demand SQL analysis

Lip Reader

Python TensorFlow Streamlit

Built a sentence-level lip-reading system using deep learning to reduce reliance on audio-based speech recognition.

  • Built a LipNet-inspired model using STCNNs, RNNs, and CTC loss for sequence prediction
  • Trained on the GRID dataset and developed a Streamlit app for real-time lip-reading
  • Integrated into an assistive communication system for improved accessibility

Video Generator from Blog Posts

React Node.js Express GPTScript Tools

Built a web application that converts blog URLs into summarized videos using GPTScript and FFmpeg pipelines.

  • Used GPTScript to summarize content and generate visual + audio media
  • Implemented Node.js backend to handle FFmpeg workflows and API requests
  • Developed a React frontend to present generated videos interactively
  • Enhanced content accessibility and engagement through automation

Movie Ticket Booking Application

MERN Stack Stripe Redux Toolkit JWT

Built a full-featured theatre management platform with real-time seat tracking and secure payment integration.

  • Developed role-based portals for Users, Admins, and Theatre Owners with secure access
  • Built with Ant Design, Redux Toolkit, JWT + BCrypt authentication, and deployed on Render
  • Integrated Stripe for payments and implemented show scheduling, seat selection, and ticket control

Algae Classification

Python TensorFlow

Built a deep learning system to classify microscopic algae images for environmental monitoring.

  • Developed a deep learning system using CNN, AlexNet, and ViT to classify algae images
  • Processed FlowCam DB images with data augmentation, achieving 98% top-5 accuracy
  • Deployed at the City of Bloomington’s office, integrating into a live preprocessing-to-display pipeline

Skills

Languages

Python, Java, C/C++, R, TypeScript, JavaScript, HTML/CSS, PySpark

Frameworks & Databases

Flask, Django, Express.js, React, SQL, MongoDB, Neo4j, PostgreSQL

Tools

Git, Docker, Power BI, Tableau, Jenkins, Terraform, Kafka, Airflow, Hadoop, Kubernetes, dbt

Cloud

AWS (EC2, DynamoDB, S3, Athena, Redshift, Lambda, Glue), Azure, Databricks, Snowflake

Libraries

Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, TensorFlow, PyTorch, RAG