
Selected work.

A few things I've built at the intersection of AI and product — each shipped end-to-end.

Computer Vision · Research

CoreSight

86.5% recognition accuracy
FastAPI · OpenCV · face_recognition · SQLite · Python

An AI-powered attendance system built on CNN-based 128-dimensional facial encoding. The system recognizes students in real time from a classroom camera feed, writes attendance records to SQLite, and exposes a FastAPI service for integration with existing academic systems. Benchmarked across varied lighting and pose conditions, it achieved 86.5% recognition accuracy in real-time operation.

  • 128-D facial embedding pipeline with cosine-distance matching
  • Real-time inference on commodity hardware (~30 FPS)
  • Published as a research paper at the ICSEAIS 2026 conference
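
To illustrate the cosine-distance matching step, here is a minimal sketch in plain Python. The function names, the enrolled-embedding dictionary, and the 0.3 threshold are assumptions made for this example, not the project's actual code or tuned values.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two equal-length vectors (0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)

def match_face(probe, enrolled, threshold=0.3):
    """Return the closest enrolled student ID, or None if nobody is within threshold.

    `enrolled` maps a student ID to a stored embedding; `threshold` is a
    hypothetical cutoff that would need tuning on real data.
    """
    best_id, best_dist = None, float("inf")
    for student_id, embedding in enrolled.items():
        dist = cosine_distance(probe, embedding)
        if dist < best_dist:
            best_id, best_dist = student_id, dist
    return best_id if best_dist <= threshold else None
```

In practice the vectors would be 128-D embeddings; the logic is the same for any dimension. Returning None for distant probes keeps unknown faces from being logged as present.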

Generative AI · Legal Tech

AI Legal Risk Analyzer

Automated risk & compliance analysis
Python · LLMs · OCR · FastAPI · RBAC

An AI-powered platform that automates legal contract processing, risk identification, and compliance analysis. It uses a microservices-based architecture in Python that combines LLMs and OCR for automated clause extraction and legal insights. Enterprise-grade security features, including RBAC, encryption, and audit logging, protect data integrity and enforce tenant isolation.

  • LLM + OCR pipeline for automated clause extraction and legal insights
  • Microservices architecture with enterprise-grade RBAC and encryption
  • Comprehensive audit logging and tenant isolation for data integrity
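
As a rough illustration of how RBAC and tenant isolation can combine in a single access check, consider the sketch below. The role names, permission strings, and data classes are all hypothetical and chosen only for the example; the platform's real model may differ.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"contract:read"},
    "analyst": {"contract:read", "contract:analyze"},
    "admin": {"contract:read", "contract:analyze", "contract:delete"},
}

@dataclass(frozen=True)
class User:
    user_id: str
    tenant_id: str
    role: str

@dataclass(frozen=True)
class Contract:
    contract_id: str
    tenant_id: str

def is_allowed(user, contract, permission):
    """Allow an action only within the user's own tenant and role permissions."""
    # Tenant isolation: reject any cross-tenant access before checking the role.
    if user.tenant_id != contract.tenant_id:
        return False
    # RBAC: unknown roles get an empty permission set.
    return permission in ROLE_PERMISSIONS.get(user.role, set())
```

Checking the tenant first means that even an admin role cannot reach another tenant's contracts.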

Machine Learning · Analytics

Gaming & Academic Performance Prediction

Cross-validated grade prediction
Python · scikit-learn · Pandas · NumPy · Matplotlib

A machine learning pipeline using scikit-learn to analyze lifestyle factors affecting student grades. The project covers the end-to-end workflow: data preprocessing, feature scaling, model training with Linear Regression, and cross-validation to ensure robust, generalizable predictions.

  • End-to-end ML pipeline from raw data to validated predictions
  • Feature scaling and preprocessing for reliable model inputs
  • Linear Regression with cross-validation for robust evaluation
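
The scale, fit, and cross-validate loop can be sketched without dependencies as follows. This is a simplified single-feature stand-in written with the standard library only, not the project's scikit-learn code; the function names and the choice of mean absolute error as the metric are assumptions for the example.

```python
from statistics import mean, pstdev

def fit_scaled_linear(xs, ys):
    """Standardize xs using training statistics, then fit a least-squares line."""
    mu = mean(xs)
    sigma = pstdev(xs) or 1.0  # guard against a constant feature
    zs = [(x - mu) / sigma for x in xs]
    y_bar = mean(ys)
    # The scaled feature has zero mean, so the intercept is simply mean(y).
    denom = sum(z * z for z in zs) or 1.0
    slope = sum(z * (y - y_bar) for z, y in zip(zs, ys)) / denom
    return lambda x: slope * ((x - mu) / sigma) + y_bar

def k_fold_mae(xs, ys, k=5):
    """Return the mean absolute error of each of k held-out folds."""
    n = len(xs)
    fold_scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        # Scaling statistics come from the training fold only, avoiding leakage.
        predict = fit_scaled_linear(train_x, train_y)
        fold_scores.append(mean(abs(predict(xs[i]) - ys[i]) for i in test_idx))
    return fold_scores
```

Fitting the scaler inside each fold, rather than once on the full dataset, is what keeps the held-out error an honest estimate of generalization.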

Data Analytics

RepoAI

Composite repo health score
Python · Pandas · GitHub REST API · Matplotlib

A Python-based tool for automated GitHub repository data extraction and quality analysis. It ingests any public GitHub repository and produces a structured quality report — commit cadence, contributor distribution, issue health, documentation coverage, and maintainability heuristics — combined into a single composite score with actionable insights.

  • Pulls and normalizes 10+ repository signals via the GitHub API
  • Weighted scoring model with explainable per-metric breakdown
  • Generates a clean visual report for engineering managers
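
A weighted score with a per-metric breakdown could look roughly like the sketch below. The metric names, the weights, and the assumption that each signal is already normalized to the range 0 to 1 are illustrative, not RepoAI's actual configuration.

```python
def composite_score(metrics, weights):
    """Combine normalized metrics into a 0-100 score plus per-metric contributions.

    `metrics` maps a metric name to a value in [0, 1]; `weights` maps the same
    names to non-negative weights. Both dictionaries are hypothetical inputs.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    breakdown = {}
    for name, weight in weights.items():
        # Missing metrics count as 0; out-of-range values are clamped to [0, 1].
        value = min(max(metrics.get(name, 0.0), 0.0), 1.0)
        breakdown[name] = 100.0 * value * weight / total_weight
    return sum(breakdown.values()), breakdown
```

Because each entry in the breakdown is that metric's share of the final score, the contributions add up to the total, which keeps the result explainable.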

Currently building.

Active projects being expanded with new features, models, and architecture improvements.

Generative AI · Legal Tech

AI Legal Risk Analyzer

In Progress
Python · LLMs · OCR · FastAPI · RBAC · Redis

Expanding the AI Legal Risk Analyzer with advanced multi-document contract review, clause comparison across agreements, and a collaborative workspace for legal teams. Currently building out the fine-tuned legal-domain LLM pipeline and integrating real-time redline generation for contract negotiation workflows.

  • Multi-document contract comparison with semantic diff visualization
  • Fine-tuned legal-domain LLM for jurisdiction-specific clause interpretation
  • Real-time collaborative redline generation for negotiation teams
  • Enhanced RBAC with matter-level permissions and approval chains