Machine Learning Engineer · Algorithmic Systems Builder

Parv Patel

Building topology-aware generative models, GPU-accelerated ML pipelines, and interactive algorithmic platforms. From persistent homology to production-grade inference — I implement the math, not just the API call.

4th Inter IIT Rank · 13+ Repositories · 5 Systems Built · IIT Palakkad

🏆 COMPETITION ACHIEVEMENT

4th Rank — Inter IIT Tech Meet 14.0

Algorithmic Optimisation Challenge by Genuity IO. An integrated AI systems pipeline — topology-preserving synthetic data generation, multi-generator fusion, and hierarchical semantic retrieval under computational constraints.

Problem Space

Generate high-fidelity synthetic tabular data that preserves topological structure of real distributions while maintaining statistical similarity and downstream classification utility. Simultaneously build a retrieval system for hierarchical semantic search over large knowledge spaces.

What I Built

  • TopoGAN — persistent homology-regularized adversarial network for topology-preserving generation
  • Multi-generator synthetic fusion: VAE + CatBoost + ensemble modeling pipeline
  • GPU-accelerated validation and benchmarking engine
  • Hierarchical semantic tree retrieval via recursive K-Means with adaptive top-k routing

Technical Depth

  • Multi-loss optimization: adversarial + persistence diagram + statistical similarity
  • Structural metric benchmarking across Wasserstein distance, coverage, and density
  • Statistical fidelity + downstream utility evaluation pipeline
  • Computational tradeoff optimization: latency vs. topological fidelity

Outcome

  • Outperformed StandardGAN, CTGAN, and TVAE across all structural metrics
  • 25–32% improvement in statistical similarity over baseline generative models
  • 17–22% gain in downstream classification utility
  • 4th overall national rank among IITs

25–32% Statistical Similarity ↑ · 17–22% Downstream Utility ↑ · 3 Baseline Models Outperformed

Core Projects

Each project represents a distinct engineering axis — applied ML, algorithm design, systems architecture, or backend engineering.

ML Systems
NEXUS: E-Commerce Analytics Platform
End-to-end ML analytics pipeline processing 100K+ transactions with automated ETL, inference, and association mining.
Pipeline Architecture
  • Built automated ETL pipeline with data cleaning, feature engineering, and outlier handling for 100K+ transaction records
  • Implemented statistical inference engine: t-tests, Chi-square, ANOVA, and correlation analysis with automated significance detection
  • Developed Apriori-based market basket analysis computing Support, Confidence, and Lift metrics for item association rules
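
The three rule metrics named above can be illustrated with a minimal sketch. It assumes transactions are plain sets of item names; the helper names (`support`, `rule_metrics`) and the toy baskets are hypothetical and not taken from the NEXUS codebase, which uses MLxtend.

```python
from typing import FrozenSet, Iterable, List, Tuple


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    wanted = frozenset(items)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes both sides occur at least once, so no division by zero.
    """
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    both = support(transactions, antecedent | consequent)
    confidence = both / support(transactions, antecedent)
    lift = confidence / support(transactions, consequent)
    return both, confidence, lift


# Hypothetical toy baskets, not data from the project.
baskets = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
print(rule_metrics(baskets, {"bread"}, {"milk"}))
```

A lift below 1, as in this toy case, means the two items co-occur less often than independence would predict.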
ML Engineering
  • Customer segmentation using K-Means, DBSCAN, and Hierarchical clustering with PCA for dimensionality reduction
  • Interactive product association network graphs via PyVis — reduced manual analysis time by 80%
  • Deployed as a Streamlit application with interactive dashboards and real-time filtering
Python · Streamlit · Scikit-Learn · MLxtend · Plotly · SciPy · PyVis
ML
Human Activity Recognition
Optimized ML pipeline: PCA-based dimensionality reduction from 561 to 102 features, achieving 96.17% accuracy with SVM.
Optimization Reasoning
  • Applied PCA to reduce 561 sensor features to 102 components — 82% dimensionality reduction while preserving signal
  • Accuracy improved from 93.6% (raw features) to 96.17% (PCA-reduced), consistent with the projection discarding noisy, low-variance directions
  • Hyperparameter search across 360+ configurations using GridSearchCV on Logistic Regression, Random Forest, and SVM
Production Pipeline
  • Exported production-ready artifacts: scaler.pkl, pca.pkl, svm_pca.pkl for zero-configuration deployment
  • Comprehensive model comparison with cross-validated metrics: precision, recall, F1, confusion matrices
  • Feature importance analysis and explained variance curves for component selection justification
Python · Scikit-Learn · PCA · SVM · GridSearchCV · Joblib
Algorithms
Max-Flow Hub: Network Flow Visualizer
Three network flow algorithms implemented from scratch — Ford-Fulkerson, Edmonds-Karp, Push-Relabel — with D3.js visualization engine.
From-Scratch Implementations
  • Ford-Fulkerson: DFS-based augmenting path search with capacity updates and backtracking on the residual graph
  • Edmonds-Karp: BFS-based augmenting paths with explicit residual graph construction — O(VE²) worst-case guarantee
  • Push-Relabel: Height labeling, excess flow handling, and relabel operations; O(V²E) time for the generic algorithm
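
As an illustration of the BFS-based variant, here is a minimal Edmonds-Karp sketch. It is written in Python rather than the project's JavaScript, and the function name, the edge-list input format, and the toy graph are assumptions made for this example.

```python
from collections import deque
from typing import Dict, List, Tuple


def edmonds_karp(n: int, edges: List[Tuple[int, int, int]], s: int, t: int) -> int:
    """Max flow from s to t on nodes 0..n-1, given (u, v, capacity) edges."""
    if s == t:
        raise ValueError("source and sink must differ")
    # Residual capacities; each edge gets a reverse entry starting at 0.
    residual: List[Dict[int, int]] = [dict() for _ in range(n)]
    for u, v, cap in edges:
        residual[u][v] = residual[u].get(v, 0) + cap
        residual[v].setdefault(u, 0)

    flow = 0
    while True:
        # BFS finds a shortest augmenting path in the residual graph.
        parent = {s: s}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left

        # Bottleneck capacity along the discovered path.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow: lower forward capacities, raise reverse capacities.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


toy_edges = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
print(edmonds_karp(4, toy_edges, s=0, t=3))
```

Choosing the shortest augmenting path by BFS is what bounds the number of augmentations and gives the O(VE²) guarantee; a DFS-based Ford-Fulkerson has no such bound in general.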
Visualization Engine
  • Built state-driven D3.js animation engine to step through algorithm execution: path discovery, capacity updates, residual modifications
  • Demonstrates Max-Flow Min-Cut theorem interactively — visualizes both the max flow and the minimum cut
  • Complexity analysis display for each algorithm variant alongside execution
JavaScript · D3.js · Tailwind CSS · HTML
Backend Systems
FinTrack: Investment Management System
Full-stack portfolio management with normalized PostgreSQL schema, ACID transactions, and role-based access control.
Database Architecture
  • Normalized PostgreSQL schema design with foreign key constraints, check constraints, and referential integrity
  • Stored functions and triggers for automated portfolio calculations and transaction validation
  • Index optimization on frequently queried columns — covering indexes for common join patterns
  • ACID-compliant transaction handling with proper isolation levels for concurrent portfolio operations
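
The interplay between a check constraint and an atomic transaction can be sketched as follows. This example uses Python's built-in `sqlite3` in place of PostgreSQL so that it runs without a server; the `holdings` table, its columns, and the `transfer` helper are hypothetical and do not reflect the FinTrack schema. Isolation levels are not covered here.

```python
import sqlite3

# Hypothetical single-table schema; SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE holdings ("
    " id INTEGER PRIMARY KEY,"
    " cash INTEGER NOT NULL CHECK (cash >= 0))"
)
conn.executemany("INSERT INTO holdings (id, cash) VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(db: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with db:  # commits if the block succeeds, rolls back on an exception
            db.execute("UPDATE holdings SET cash = cash + ? WHERE id = ?", (amount, dst))
            db.execute("UPDATE holdings SET cash = cash - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraw; the credit above was rolled back.
        return False


ok = transfer(conn, 1, 2, 30)         # within balance
rejected = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = conn.execute("SELECT id, cash FROM holdings ORDER BY id").fetchall()
print(ok, rejected, balances)
```

The second transfer credits the destination before the debit fails, so the unchanged balances afterwards show that the whole unit of work was undone rather than half-applied.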
Backend Engineering
  • Flask REST API with bcrypt password hashing and session-based authentication
  • Role-based dashboards: Users, Admins, and Advisors with scoped data access
  • Real-time portfolio tracking with advisor assignment and goal monitoring
Flask · PostgreSQL · bcrypt · JavaScript · HTML/CSS

Academic Engineering

Implemented foundational ML, AI, and distributed systems algorithms as part of rigorous academic engineering work at IIT Palakkad.

Machine Learning

Regression, classification, and clustering implementations from mathematical foundations. Feature engineering, model evaluation, and ensemble methods — built without framework abstractions where possible.

Linear Regression Logistic Regression SVM Random Forest K-Means PCA Ensemble Methods

Artificial Intelligence

Search algorithms, adversarial game trees, constraint satisfaction, and probabilistic reasoning. BFS, DFS, A* implemented from scratch. Minimax with alpha-beta pruning. SAT-based Sudoku solver.
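
As one concrete instance of the search algorithms listed here, the following is a minimal A* sketch on a 4-connected grid. The grid encoding (`#` for walls), the Manhattan heuristic, and the function name are assumptions chosen for illustration, not details of the coursework.

```python
import heapq
from typing import List, Optional, Tuple

Cell = Tuple[int, int]


def astar(grid: List[str], start: Cell, goal: Cell) -> Optional[int]:
    """Length of a shortest 4-connected path on `grid` ('#' = wall), or None."""
    rows, cols = len(grid), len(grid[0])

    def h(cell: Cell) -> int:
        # Manhattan distance never overestimates with unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best = {start: 0}                  # cheapest known cost to each cell
    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None


maze = [
    "..#.",
    "..#.",
    "....",
]
print(astar(maze, (0, 0), (0, 3)))
```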

BFS / DFS / A* Minimax Alpha-Beta SAT Solver HMMs

Deep Learning & NLP

Neural network architectures implemented from scratch with explicit backpropagation. FNN, CNN, RNN, LSTM, GRU, GAN, and Transformer implementations. Word2Vec, N-gram language models, and BERT-based experiments.

FNN CNN RNN / LSTM / GRU GANs Transformers Word2Vec BERT

Data Analytics & Big Data

Statistical inference, hypothesis testing, clustering analysis, and dimensionality reduction. Hadoop MapReduce, Spark ML, and RDD internals for distributed computation at scale.

ANOVA Hypothesis Testing Hadoop MapReduce Spark / PySpark HDFS RDD

Algorithms & Systems Thinking

Not a library consumer. I implement core algorithms, reason about complexity, and design systems from the mathematical foundations up.

Graph & Network Flow

Ford-Fulkerson, Edmonds-Karp O(VE²), Push-Relabel O(V²E). Residual graph construction. Max-Flow Min-Cut duality.

Persistent Homology

Topological data analysis integrated into GAN training. Persistence diagram losses for structural regularization of synthetic distributions.
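
To make the notion of a persistence diagram concrete, here is a minimal sketch restricted to 0-dimensional homology (connected components) of a Vietoris-Rips filtration. In that special case every component is born at scale 0 and the finite death times are the edge lengths of a minimum spanning tree, so a union-find pass over sorted edges is enough. This is not the TopoGAN loss: which homology dimensions, library, and differentiable formulation the project uses are not stated here, and the function name and the convention that an edge enters at scale equal to its length are assumptions.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def h0_persistence(points: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Finite (birth, death) pairs of 0-dimensional classes in a Rips filtration."""
    n = len(points)
    parent = list(range(n))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by the scale at which they enter the filtration.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )
    pairs = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge here, so one class born at 0 dies at `dist`.
            parent[ri] = rj
            pairs.append((0.0, dist))
    return pairs  # the one class that never dies is omitted


cloud = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]  # hypothetical toy point cloud
print(h0_persistence(cloud))
```

A topological regularizer would compare diagrams like this one between real and generated batches; higher-dimensional features such as loops need a full boundary-matrix reduction, which is omitted here.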

Recursive Clustering Trees

Hierarchical K-Means with adaptive top-k routing for semantic retrieval. Sub-linear search over structured knowledge spaces.
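
A minimal sketch of the idea, under stated assumptions: the tree is built by recursively applying plain Lloyd's K-Means, and a query is routed by keeping only the `beam` nearest centroids at each level. The fixed beam width stands in for the adaptive top-k rule, whose details are not given here; all function names, parameters, and the toy points are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def mean(vectors: List[Vector]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans(vectors: List[Vector], k: int, iters: int = 10, seed: int = 0) -> List[List[Vector]]:
    """Plain Lloyd's algorithm; returns the non-empty clusters."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(list(vectors), min(k, len(vectors)))]
    groups: List[List[Vector]] = []
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for v in vectors:
            nearest = min(range(len(centroids)), key=lambda c: math.dist(v, centroids[c]))
            groups[nearest].append(v)
        centroids = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
    return [g for g in groups if g]


def build_tree(vectors: List[Vector], k: int = 2, leaf_size: int = 2) -> Dict:
    """Recursively split until a node holds at most `leaf_size` vectors."""
    node = {"centroid": mean(vectors), "children": [], "items": []}
    groups = kmeans(vectors, k) if len(vectors) > leaf_size else []
    if len(groups) < 2:  # small enough, or cannot be split (e.g. duplicates)
        node["items"] = list(vectors)
    else:
        node["children"] = [build_tree(g, k, leaf_size) for g in groups]
    return node


def search(root: Dict, query: Vector, beam: int = 2, top_k: int = 1) -> List[Vector]:
    """Descend level by level, following only the `beam` closest centroids."""
    frontier, candidates = [root], []
    while frontier:
        next_level = []
        for node in frontier:
            if node["children"]:
                next_level.extend(node["children"])
            else:
                candidates.extend(node["items"])
        next_level.sort(key=lambda n: math.dist(query, n["centroid"]))
        frontier = next_level[:beam]
    candidates.sort(key=lambda v: math.dist(query, v))
    return candidates[:top_k]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
tree = build_tree(points)
print(search(tree, (10.2, 10.1), beam=1))
```

Only the vectors in the visited leaves are compared against the query, which is where the savings over a full scan come from. A narrow beam can miss the true nearest neighbour when it sits in a branch that was pruned, so the beam width trades recall for latency.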

Dimensionality Reduction

PCA from eigenvalue decomposition. Variance-explained analysis for component selection. 82% feature reduction with accuracy improvement.
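
Variance-explained component selection can be sketched as picking the smallest number of leading eigenvalues whose cumulative share of the total reaches a target. The function name, the 0.95 default, and the toy spectrum are assumptions; the threshold that led to 102 components is not stated here.

```python
from typing import Sequence


def components_for_variance(eigenvalues: Sequence[float], target: float = 0.95) -> int:
    """Smallest k whose top-k eigenvalues explain at least `target` of the variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Hypothetical covariance spectrum: cumulative shares are 0.4, 0.7, 0.9, 1.0.
spectrum = [4.0, 3.0, 2.0, 1.0]
print(components_for_variance(spectrum, target=0.85))

# The reduction quoted above, as a percentage of the original 561 features.
reduction_pct = round(100 * (1 - 102 / 561))
print(reduction_pct)
```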

Multi-Loss GAN Training

Adversarial + topological + statistical similarity losses. Balancing discriminator/generator dynamics with structural constraints.

Distributed Computation

MapReduce, Spark ML, RDD transformations. Understood from the internals — not just the API surface.
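
The map, shuffle, and reduce phases can be illustrated in a few lines of single-process Python. This is a toy model of the programming pattern only, with hypothetical function names; it does not use Hadoop or Spark and says nothing about partitioning, fault tolerance, or RDD lineage.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input record."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


print(word_count(["a b a", "b c"]))
```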

Search & Adversarial

A*, BFS, DFS from scratch. Minimax with alpha-beta pruning. Constraint satisfaction via SAT reduction.

Association Mining

Apriori algorithm for market basket analysis. Support/Confidence/Lift computation with configurable thresholds and pruning.
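
The level-wise search with threshold-based pruning can be sketched as below. It is a minimal illustration assuming transactions are sets of item names; the function name, the `min_support` value, and the toy baskets are hypothetical, and rule generation is left out.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def apriori(transactions: List[FrozenSet[str]], min_support: float) -> Dict[FrozenSet[str], float]:
    """Map every frequent itemset to its support, growing itemsets one item at a time."""
    n = len(transactions)

    def support(itemset: FrozenSet[str]) -> float:
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]
    frequent: Dict[FrozenSet[str], float] = {}
    size = 1
    while level:
        # Keep only the candidates that meet the support threshold.
        kept = {}
        for candidate in level:
            s = support(candidate)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        joined = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: every subset one item smaller must itself be frequent.
        level = [
            c for c in joined
            if all(frozenset(sub) in kept for sub in combinations(c, size - 1))
        ]
    return frequent


# Hypothetical toy baskets, not data from the project.
toy = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
found = apriori(toy, min_support=0.5)
print(sorted((sorted(k), v) for k, v in found.items()))
```

The prune step relies on the fact that a superset can never be more frequent than any of its subsets, so the three-item candidate is discarded here without scanning the transactions.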

Engineering Philosophy

01
I prioritize mathematical rigor, algorithmic clarity, and system-level design. Every abstraction I use is one I can rebuild from scratch.
02
Understanding complexity and trade-offs is central to how I build ML systems. I don't just optimize metrics — I reason about why a particular architecture works.
03
I build core components before abstracting them. The distance between theory and implementation is where real engineering happens.
04
Research informs my systems thinking. Production constraints sharpen my research intuition. I operate at the intersection of both.

Let's build something.

Open to research collaborations, competitive programming teams, and systems engineering roles. If you value algorithmic depth over buzzwords — let's talk.