Experience & Projects

Real engineering work — internships, research, and what I built and why.

87.1% ROC-AUC
500k+ Records Processed
90% Time Saved
2 Internships

Professional Experience

Internship
Data Engineering Intern
Kontemporary Konsulting Ltd. · Abuja
Aug – Sep 2025
85% Manual Entry Reduced
500k+ Records Processed
99.9% Data Accuracy
36 States Covered
  • Architected a Selenium-based extraction system navigating multi-layer dynamically rendered portals across all 36 Nigerian states — standard scrapers fail at portal depth 3+ due to session tokens.
  • Processed 500k+ records in a 4-week sprint, standardizing outputs to Parquet and loading into a queryable PostgreSQL database (99.9% accuracy vs INEC official source).
  • Benchmarked ViT-B/16 vs DINOv2-B on brain tumor MRI data with Grad-CAM — DINOv2 outperformed (93.8% vs 91.4%) with better spatial feature localization on tumor boundaries.
Selenium • PostgreSQL • Python • PyTorch • DINOv2 • Pandas
Internship
IT Summer Support Intern
Tertiary Education Trust Fund (TETFund) · Abuja
Mar – Sep 2024
90% Time Reduction
~40 hrs Saved per Cycle
12 Report Layouts
  • Built Python automation pipelines that replaced a 2-day manual Excel report cycle with a batch run under 4 hours — handling 12 different layouts across universities, polytechnics, and colleges of education.
  • Automated correction logging: every changed cell written to an audit trail so staff could verify and reproduce all transformations without developer involvement.
Python • Pandas • openpyxl • PostgreSQL
Education
B.Sc. Computer Science
Landmark University · Omu-Aran
2021 – 2025
  • Algorithms, data structures, machine learning, and software engineering.
  • Final year thesis on multimodal AI for medical diagnostics — 87.1% ROC-AUC via MedCLIP + TabNet early fusion architecture.
Computer Science • Machine Learning • Software Engineering

All Projects

Brain Tumor Classification
PyTorch • ViT • DINOv2 • Grad-CAM

Architecture benchmark on MRI data: Vision Transformer vs DINOv2 with Grad-CAM explainability. Built during Kontemporary internship.

ViT: 91.4% • DINOv2: 93.8% • Grad-CAM • MRI
  • DINOv2-B: 93.8% accuracy — faster convergence, better tumor-boundary localization on Grad-CAM. Winner.
  • ViT-B/16: 91.4% accuracy — more epochs required, diffuse attention patterns on scan.
  • Inference: DINOv2 ~12ms/image vs ViT ~18ms/image on CPU — relevant for clinical deployment.
Key insight
  • ViT's fixed-grid patch encoding conflicted with MRI's variable-scale features. DINOv2's self-supervised pretraining transferred better despite never seeing MRI data.

Repository private — organizational work. Details available on request.

Human Activity Recognition
PyTorch • Transformers • HAR-70

Real-time activity recognition via Transformer on wearable sensor time-series. 12 classes, 70 participants.

92.3% Accuracy • 12 Classes • 70 Participants
  • 92.3% test accuracy across 12 activity classes.
  • Multi-head self-attention over 6-axis IMU windows: 128 samples at 50 Hz (2.56 s per window).
  • Confusion concentrated on stair-up vs stair-down and standing vs slow-walking — physically ambiguous motion profiles.
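
The windowing step described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `sliding_windows` and the 50% overlap (stride of 64 samples) are assumptions, since the stride is not stated. Only the window length and sample rate come from the description.

```python
from typing import List, Sequence, Tuple

SAMPLE_RATE_HZ = 50   # from the project description
WINDOW_SIZE = 128     # samples per window, from the project description

Reading = Tuple[float, ...]  # one 6-axis IMU sample (e.g. 3 accel + 3 gyro values)


def sliding_windows(
    samples: Sequence[Reading],
    window: int = WINDOW_SIZE,
    stride: int = WINDOW_SIZE // 2,  # assumption: 50% overlap
) -> List[List[Reading]]:
    """Split a stream of readings into fixed-length windows.

    Trailing samples that do not fill a whole window are dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(samples) - window
    return [
        list(samples[start:start + window])
        for start in range(0, last_start + 1, stride)
    ]
```

Each window covers 128 / 50 = 2.56 seconds of motion, and each one would be a single input sequence for the Transformer.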
Heart Disease Prediction
Scikit-Learn • XGBoost • SMOTE

Clinical decision support optimized for Recall — minimizing false negatives is the right objective in medical screening.

0.91 Recall • 0.87 F1 • Class Imbalance
  • 0.91 Recall on the positive class — optimized to minimize false negatives, the more dangerous error in screening.
  • XGBoost beat Random Forest (F1 0.87 vs 0.83) after 5-fold stratified CV.
  • Compared class_weight='balanced' vs SMOTE — SMOTE yielded +0.03 Recall at +0.02 FPR; documented both with trade-off rationale.
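
For reference, the metrics quoted above (Recall, F1, FPR) all come from the same four confusion-matrix counts. The sketch below computes them in plain Python. The function name `screening_metrics` is hypothetical, and the project itself presumably used Scikit-Learn's metric functions instead.

```python
from typing import Dict, Sequence


def screening_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute recall, precision, F1 and FPR for binary labels (1 = disease)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1  # missed positive: the costly error in screening

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"recall": recall, "precision": precision, "f1": f1, "fpr": fpr}
```

Comparing two resampling strategies then amounts to calling this on each model's predictions and reading off the Recall gain against the FPR cost.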
Tech Salary Prediction
Python • XGBoost • Plotly

Regression model predicting tech salaries in USD. Experience outranks location as #1 compensation driver.

R² 0.81 • MAE ~$9,400 • Feature Engineering
  • R² = 0.81, MAE ≈ $9,400 — model explains 81% of salary variance.
  • Top predictors: experience (34%), remote ratio (21%), company size (17%) — location ranked 4th, not 1st as commonly assumed.
  • Collapsed 200+ job titles into 12 role families, improving MAE by ~$2,100 vs raw encoding.
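
One way to collapse raw job titles into role families is ordered keyword matching, sketched below. The family names and keywords are purely illustrative assumptions; the actual 12 families and the matching rules used in the project are not listed here.

```python
from typing import List, Tuple

# Hypothetical mapping: (family, keywords). Order matters: first match wins.
ROLE_FAMILY_KEYWORDS: List[Tuple[str, Tuple[str, ...]]] = [
    ("ml_engineering", ("machine learning", "ml engineer", "mlops")),
    ("data_science", ("data scientist", "research scientist")),
    ("data_engineering", ("data engineer", "analytics engineer")),
    ("data_analysis", ("analyst",)),
]


def role_family(title: str) -> str:
    """Map a free-text job title to a coarse role family."""
    # Lowercase and collapse repeated whitespace before matching.
    normalized = " ".join(title.lower().split())
    for family, keywords in ROLE_FAMILY_KEYWORDS:
        if any(keyword in normalized for keyword in keywords):
            return family
    return "other"  # fallback bucket for unmatched titles
```

Grouping this way trades title-level detail for far fewer, better-populated categories, which is the usual motivation for this kind of feature engineering.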
Nigeria Polling Unit Scraper
Python • Selenium • PostgreSQL • Click CLI

Large-scale electoral data infrastructure across 36 states. Multi-layer portal navigation, 500k+ records.

500k+ Records • 99.9% Accuracy • 36 States
System Architecture
CLI Entry (Click)
└── StateWorker(state_id)
    ├── navigate_portal(driver, depth=3)      # session token handling
    ├── retry_handler(max=5, backoff=exp)     # portal uptime
    ├── extract_polling_units(driver)
    └── validate_and_clean(rows)
            │
        bulk_insert(conn, rows)
            │
        PostgreSQL: polling_units
        ┌─────────────────────────────┐
        │ state | lga | ward          │
        │ pu_code | pu_name           │
        │ lat | lng | reg_voters      │
        └─────────────────────────────┘
  • Standard scrapers fail at depth 3+ due to session tokens — solved with explicit wait chains and session state management.
  • CLI with --resume flag for mid-run recovery without re-scraping completed states.
  • Outputs to Parquet for analysis + PostgreSQL queryable by state, LGA, and ward.
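
The retry and resume behaviour can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: the function names, the JSON checkpoint file, and the 1-second base delay are not taken from the private repository. Only the 5-attempt exponential backoff and the idea of skipping completed states come from the description above.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable, Set


def retry_with_backoff(func: Callable[[], object], max_attempts: int = 5,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep):
    """Call func, retrying on failure with delays of 1s, 2s, 4s, ..."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:  # a real scraper would catch narrower errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


def load_completed(checkpoint: Path) -> Set[str]:
    """Read the set of already-scraped state IDs, if a checkpoint exists."""
    if checkpoint.exists():
        return set(json.loads(checkpoint.read_text()))
    return set()


def run(state_ids: Iterable[str], scrape_state: Callable[[str], object],
        checkpoint_path: str, resume: bool = False,
        sleep: Callable[[float], None] = time.sleep) -> Set[str]:
    """Scrape each state once, recording progress after every success."""
    checkpoint = Path(checkpoint_path)
    completed = load_completed(checkpoint) if resume else set()
    for state_id in state_ids:
        if state_id in completed:
            continue  # --resume: skip states finished in an earlier run
        retry_with_backoff(lambda: scrape_state(state_id), sleep=sleep)
        completed.add(state_id)
        checkpoint.write_text(json.dumps(sorted(completed)))
    return completed
```

Writing the checkpoint after each state, rather than once at the end, is what lets an interrupted run pick up where it stopped.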

Repository private — organizational data. Architecture and sanitized schema available on request.

TETFund Report Automation
Python • Pandas • openpyxl • PostgreSQL

Pipeline replacing manual report processing at a federal agency — 2-day cycle cut to under 4 hours.

90% Time Saved • ~40 hrs/cycle • 12 Layouts
Pipeline Architecture
Input: /reports/*.xlsx (12 heterogeneous layouts)
    │
ingest_report(path)
├── detect_schema(df)            # 12 layouts + fallback
├── clean_missing_values(df)     # forward-fill + flag
├── fix_encoding_errors(df)      # cp1252 / utf-8 mixed
└── validate_funding_cols(df)
    │
export_to_postgres()
└── upsert on (institution_id, report_period)

Output: structured DB + full audit log
  • 12 different Excel layouts across institution types — each with different column names for the same underlying data.
  • Every changed cell written to an audit trail so staff could verify all modifications.
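
The two ideas above, mapping differently named columns onto one schema and logging every changed cell, might look something like this in miniature. The alias table, the `amount_disbursed` column, and the function names are hypothetical, since the real layouts and pipeline code are private.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical aliases: canonical column -> header spellings seen in reports.
COLUMN_ALIASES: Dict[str, set] = {
    "institution_id": {"institution id", "inst_id", "institution code"},
    "report_period": {"period", "report period", "reporting period"},
    "amount_disbursed": {"amount disbursed", "disbursement"},
}


def normalize_header(header: object) -> str:
    """Lowercase a header and collapse surrounding/repeated whitespace."""
    return " ".join(str(header).strip().lower().split())


def map_columns(headers: Iterable[str]) -> Dict[str, str]:
    """Return {original header: canonical name} for recognised headers."""
    mapping = {}
    for header in headers:
        key = normalize_header(header)
        for canonical, aliases in COLUMN_ALIASES.items():
            if key == canonical or key in aliases:
                mapping[header] = canonical
                break
    return mapping


def clean_amount(row: int, column: str, value: object,
                 audit_log: List[dict]) -> Optional[float]:
    """Parse an amount, appending an audit entry whenever the cell changes.

    Unparseable values become None. A string that parses to a number is
    also logged, because the stored value differs from the original cell.
    """
    text = str(value).replace(",", "").strip()
    try:
        new_value: Optional[float] = float(text)
    except ValueError:
        new_value = None
    if new_value != value:
        audit_log.append(
            {"row": row, "column": column, "old": value, "new": new_value}
        )
    return new_value
```

Because each audit entry records the row, column, old value, and new value, staff can trace any figure in the database back to the original spreadsheet cell.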

Repository private — institutional data. Pipeline design available on request.

Credit Card Fraud Detection
Scikit-Learn • Random Forest • EWMA

Recall-optimized classifier on highly imbalanced financial data. Behavioral drift captured via 7-day sliding-window EWMA features.

Recall-Optimized • EWMA Features • 0.17% Positive Rate
  • EWMA of transaction frequency over a 7-day window captures behavioral drift before fraud events.
  • Optimal threshold selection reduced false negatives by 18% vs default 0.5 at controlled FPR.
  • SMOTE + cost-sensitive learning for 0.17% positive class rate.
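
A rough sketch of the two techniques named above: an EWMA over daily transaction counts, and threshold selection under a false-positive-rate cap. Both functions are illustrative assumptions rather than the project's code. The smoothing factor uses the common span convention alpha = 2 / (span + 1), with span = 7 standing in for the 7-day window, and the FPR cap value is hypothetical.

```python
from typing import List, Optional, Sequence


def ewma(values: Sequence[float], span: int = 7) -> List[float]:
    """Exponentially weighted moving average, seeded with the first value."""
    alpha = 2.0 / (span + 1)
    smoothed: List[float] = []
    current: Optional[float] = None
    for value in values:
        if current is None:
            current = value
        else:
            current = alpha * value + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   max_fpr: float = 0.01) -> Optional[float]:
    """Return the lowest score threshold whose FPR stays within max_fpr.

    Lowering the threshold can only add predicted positives, so the lowest
    admissible threshold gives the highest recall under the FPR cap.
    Returns None if even the strictest threshold exceeds the cap.
    """
    negatives = sum(1 for label in labels if label == 0)
    best: Optional[float] = None
    for threshold in sorted(set(scores), reverse=True):
        false_pos = sum(
            1 for score, label in zip(scores, labels)
            if score >= threshold and label == 0
        )
        fpr = false_pos / negatives if negatives else 0.0
        if fpr > max_fpr:
            break
        best = threshold
    return best
```

A sudden gap between a day's raw transaction count and its EWMA is the kind of drift signal such a feature is meant to expose.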
Other Work
Bunmi Adelugba & Co. — Corporate Website
HTML • CSS • JavaScript • Google Apps Script

Multi-page corporate site for a Chartered Accounting firm. Responsive layout, service pages, contact form.

Responsive • Apps Script
  • Service pages: Audit, Taxation, Advisory. Fully responsive.
  • Debugged Google Apps Script form — fixed multipart/form-data vs application/x-www-form-urlencoded mismatch causing silent POST failures.

On GitHub

Public work — open source and available to review directly.

Available for ML engineering and data science roles

Remote · Abuja, Nigeria · Open to relocation