Atlanta, GA

Data Analyst

SQL · Python · ETL Pipelines · Power BI

M.S. CS, Emory University (4.0 GPA)

SQL · Python · ETL Pipelines · PostgreSQL · Power BI
4.0 M.S. GPA (Emory) · 2 IEEE Publications · 20M+ Records Processed

About Me

I'm a data analyst with a background in SQL optimization, ETL pipeline development, and Python-based data validation. Most recently, at ABM Industries, I migrated enterprise FP&A queries from Oracle Fusion to Azure SQL and validated 200K financial records during a live cloud migration. That work required both technical precision and an understanding of how data quality affects downstream reporting. I recently completed my M.S. in Computer Science at Emory University with a 4.0 GPA, and I'm looking for full-time Data Analyst roles in Atlanta or remotely across the US.

Work Experience

ABM Industries
Data Analyst Intern
June 2025 – August 2025
Atlanta, GA
  • Migrated AR Cash Received and AR Aging FP&A queries from Oracle Fusion to Azure SQL by converting PL/SQL to T-SQL and applying indexing optimization, reducing query runtime by 50%, saving 3 hours per close cycle, and enabling the data team to build downstream financial dashboards.
  • Validated 200K financial records during the Oracle-to-Azure SQL migration by developing Python validation scripts and SQL reconciliation logic, reducing data mismatches from 12K to 1.5K and improving accounts receivable reconciliation accuracy.
  • Built an Azure OpenAI-powered Q&A prototype over 30 SOP documents using Python parsing, reducing policy lookup time from 30 minutes to under 5 minutes in internal testing.
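The record validation described above can be illustrated with a minimal sketch. This is not the code used at ABM Industries: the `reconcile` helper, the `invoice_id` key, the `amount` field, and the sample rows are all hypothetical, chosen only to show the general idea of comparing a source extract against a migrated target by key.

```python
from decimal import Decimal


def reconcile(source_rows, target_rows, key="invoice_id", fields=("amount",)):
    """Compare two record sets by key and report where they disagree.

    Returns a dict listing keys missing from the target, keys that appear
    only in the target, and keys whose compared fields differ.
    """
    source = {row[key]: row for row in source_rows}
    target = {row[key]: row for row in target_rows}

    missing = sorted(source.keys() - target.keys())
    unexpected = sorted(target.keys() - source.keys())
    changed = sorted(
        k
        for k in source.keys() & target.keys()
        if any(source[k][f] != target[k][f] for f in fields)
    )
    return {"missing": missing, "unexpected": unexpected, "changed": changed}


# Hypothetical sample data standing in for the source and target extracts.
source_extract = [
    {"invoice_id": "INV-1", "amount": Decimal("100.00")},
    {"invoice_id": "INV-2", "amount": Decimal("250.50")},
    {"invoice_id": "INV-3", "amount": Decimal("75.25")},
]
target_extract = [
    {"invoice_id": "INV-1", "amount": Decimal("100.00")},
    {"invoice_id": "INV-2", "amount": Decimal("250.05")},
    {"invoice_id": "INV-4", "amount": Decimal("10.00")},
]

report = reconcile(source_extract, target_extract)
print(report)
```

Using `Decimal` rather than `float` for amounts avoids spurious mismatches from binary rounding, which matters when the goal is to count genuine discrepancies.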
Habib University
Research Assistant — Computer Science
July 2022 – July 2024
Karachi, Pakistan
  • Designed and validated lab curriculum for Database Systems across two semesters, covering relational modeling, query optimization, and schema design for 30+ students per cohort, including pre-solving all assignments to identify edge cases before distribution.
  • Developed assessments for Algorithms, Data Structures I/II, and Nature of Computation, requiring formal correctness validation and edge case analysis on each problem set prior to student release.
  • Built and evaluated coursework for Deep Learning (two semesters) and an inaugural LLM course, maintaining technical currency across transformer architectures and prompt engineering methods.
  • Contributed to research design and weekly code reviews across 10 Deep Learning projects, resulting in 2 IEEE conference acceptances including a Best Paper Award.
Parents Voice Association – UJALA Centre
Web Developer
June 2021 – May 2022
Karachi, Pakistan
  • Built and deployed a full-stack MERN application for a special education NGO, digitizing records for 30 students and 10 teachers with role-based authentication across admin, student, and finance users.
Ismail Industries Limited
IT Intern
July 2021 – August 2021
Karachi, Pakistan
  • Converted 100K+ daily Trend Micro security logs into structured CSVs using Python automation and loaded them into SQL Server via an ETL pipeline, reducing alert review time from 5 hours to 1 hour.
  • Built a Power BI dashboard from SQL Server data, reducing reporting time from 2 hours to 10 minutes and enabling IT leadership to implement timely policy updates.

Education

Emory University
Master of Science in Computer Science
GPA: 4.0/4.0 | August 2024 – December 2025
Relevant Coursework
Database Systems · Machine Learning · Information Visualization · Data Mining · Data Privacy & Security
Habib University
Bachelor of Science in Computer Science (Minor: Mathematics)
GPA: 3.85/4.0 | August 2018 – June 2022
Relevant Coursework
Database Systems · Artificial Intelligence · Deep Learning · Computer Vision · Data Science · Mathematics for Machine Learning · AstroStatistics · Scientific Methods

Publications

🏆 Best Paper Award
Deep Learning based Poet Attribution model for Punjabi Poetry
F. Tariq, R. Gopchandani, R. H. Nizamani, A. Samad, M. M. Anwar
2024 International Conference on Emerging Trends in Smart Technologies (ICETST)
Karachi, Pakistan · 2024 · pp. 1-6
A Deep Learning model for Poet Attribution for Punjabi poetry using Shahmukhi, Gurmukhi, and Roman scripts. Dataset consists of 830 poems from 11 poets. Using Multilingual DistilBERT with Bi-LSTM and Bi-GRU, achieved 91.57% accuracy on Roman script.
Python · PyTorch · DistilBERT · Bi-LSTM · Bi-GRU
A Deep Learning based Approach for Sindhi Poet Classification using Couplets
A. Samad, M. M. Anwar, R. K. Kataria, M. Murtaza, F. Ali
2024 International Conference on Emerging Trends in Smart Technologies (ICETST)
Karachi, Pakistan · 2024 · pp. 1-6
A Deep Learning model for automatic classification of Sindhi poets based on couplets. Dataset consists of 3000 couplets from five poets. Using Word2Vec, MuRIL, and 1D CNNs, achieved 87.2% test accuracy — the first study to focus on Sindhi poetry.
Python · PySpark · BERT · Keras · CNN

Technical Skills

💻

Languages

SQL, Python, JavaScript

🐍

Python Libraries

Pandas, NumPy, Matplotlib, Flask

🗄️

Databases

PostgreSQL, Azure SQL, Oracle Database, SQL Server, MongoDB

⚙️

Data Engineering

ETL Pipelines, Data Modeling, Data Warehousing, API Integration, Query Optimization, Data Quality

📊

Data Visualization

Tableau, D3.js

📈

Business Intelligence

Power BI, Excel (PivotTables, Power Query, XLOOKUP, INDEX/MATCH), Dashboard Building

🔎

Analytics

Statistical Analysis, A/B Testing

🛠️

Tools & Platforms

Git, GitHub, Jupyter Notebook, VS Code, Postman

My Projects

A showcase of projects demonstrating machine learning, visualization, and end-to-end pipeline development.

💳

Card Transaction Analysis Dashboard

March 2026 – April 2026

Built an Excel-based dashboard tracking $38.6M in spend across 55.7K transactions, using PivotTables, slicers, timeline filters, and KPI cards to surface fraud patterns, category performance, and weekly transaction trends.

Microsoft Excel · PivotTables · Power Query
🚕

NYC Taxi Analytics Platform

August 2025 – December 2025

Engineered a normalized PostgreSQL analytics platform ingesting 20M+ taxi records through Parquet ETL, a modular Flask REST API, and an interactive React dashboard, reducing fare and demand analysis from 7 hours to 15 minutes.

Python · PostgreSQL · Flask · React · Pandas · PyArrow
🎓

AI Learning Effectiveness Study

January 2025 – April 2025

Investigated ChatGPT and interactive visualizations as AI learning tools in a 36-participant study. Paired and independent-samples t-tests on pre/post-test scores showed that interactive visualization yielded 27% higher knowledge gains.

Python · Pandas · Matplotlib · SciPy · StatsModels
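For readers unfamiliar with the paired test mentioned in this project, here is a small sketch of how the t statistic is formed from pre/post scores. The scores below are made up for illustration and are not the study's data, and `paired_t_statistic` is a hypothetical helper. In practice a library routine such as SciPy's `scipy.stats.ttest_rel` would normally be used, since it also returns a p-value.

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for a paired-samples t-test.

    The test works on per-participant differences (post - pre):
    t = mean(diff) / (stdev(diff) / sqrt(n)), with n - 1 degrees of freedom.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    sd = statistics.stdev(diffs)  # sample standard deviation
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    n = len(diffs)
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up quiz scores for six hypothetical participants.
pre_scores = [4, 5, 6, 5, 7, 4]
post_scores = [6, 7, 7, 8, 9, 5]

t_stat, df = paired_t_statistic(pre_scores, post_scores)
print(f"t = {t_stat:.3f}, df = {df}")
```

The paired form suits within-group pre/post comparisons, while an independent-samples test would be the natural choice for comparing gains between two separate groups of participants.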
📊

Gender Wage Gap Analysis Platform

January 2025 – April 2025

Designed an interactive D3.js platform processing 344K wage records across 6 demographic dimensions spanning 32 years, reducing manual analysis from 1.5 hours to 15 minutes.

Python · Pandas · JavaScript · D3.js
🔗

Graph-Based Fraud Detection in Cryptocurrency Networks

January 2025 – April 2025

Benchmarked 8 traditional and graph-based ML models for illicit transaction detection on Elliptic (203K nodes) and Ethereum datasets. GraphSAGE achieved 98.57% accuracy and 93% Macro F1 on Ethereum while Random Forest achieved 97% accuracy and 88% Macro F1 on Elliptic.

Python · PyTorch · PyTorch Geometric · Scikit-Learn
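Accuracy and Macro F1 are reported side by side in this project because they can diverge on imbalanced fraud data. The sketch below shows why, using made-up toy labels (not drawn from the Elliptic or Ethereum datasets) and hypothetical helper names. In a real pipeline, scikit-learn's `f1_score(..., average="macro")` computes the same quantity.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Per-class F1 is 2*TP / (2*TP + FP + FN); a class with no true or
    predicted positives contributes 0 here.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: 9 licit transactions (0), 1 illicit (1),
# and a classifier that labels everything as licit.
y_true = [0] * 9 + [1]
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.2f}, macro F1 = {f1:.4f}")
```

A classifier that never flags fraud still scores 90% accuracy in this toy case, while Macro F1 falls below 50% because the illicit class contributes an F1 of zero.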
📰

Fake News Classifier

August 2024 – December 2024

Benchmarked Logistic Regression, LSTM, BERT, and RoBERTa on 44,898 ISOT articles. RoBERTa achieved 99.91% accuracy and 99.92% AUC, with systematic ablation across model families informing architecture selection.

Python · PyTorch · Scikit-Learn · Transformers
🔐

Membership Inference Attack on Personalized Differential Privacy Models

August 2024 – December 2024

Evaluated privacy risks of Individualized DP-SGD models using black-box membership inference attacks on CNNs trained on CIFAR-10, SVHN, and MNIST, analyzing ROC-AUC scores across privacy budget groups.

Python · PyTorch · Opacus
🦎

Camouflaged Animal Detection

January 2022 – May 2022

Merged MoCA, Chameleon, and COD10K datasets into a unified benchmark and trained YOLOv5 for camouflaged animal detection, establishing baseline results across the combined dataset.

Python · PyTorch · YOLOv5 · OpenCV
🧠

Compression-Based Perceiver

August 2021 – May 2022

Undergraduate capstone — generated latent embeddings for CIFAR-10, CIFAR-100, and ImageNet using Supervised Contrastive Learning and Autoencoders, and trained the Perceiver architecture on limited compute.

Python · PyTorch · React · Heroku
🧵

GAN-Based Textile Design Generation

August 2021 – December 2021

Built a custom dataset of Pakistani textile prints via web scraping and trained generative models including DCGAN, StyleGAN, and VAE to synthesize fabric patterns across 6 pattern categories.

Python · PyTorch · BeautifulSoup · Pillow
🔬

Neural Networks as Universal Function Approximators

January 2021 – May 2021

Analyzed universal approximation theory and exponential depth advantages in neural networks, validating theoretical results through experimental implementation.

Python · TensorFlow · NumPy · Matplotlib
📈

Madness of Markets

January 2021 – May 2021

Modeled human decision-making and cascading behavior using network models and game theory, applied to panic buying during COVID-19 in Pakistan and volatility in the Karachi Stock Exchange.

Python · Microsoft Excel · Matplotlib

Resume

4.0 GPA · 200K+ Records Validated · 50% Query Runtime Reduction

Get In Touch

Open to full-time Data Analyst roles in Atlanta or remotely across the US. Feel free to reach out to discuss opportunities or connect.

Contact Information

Phone

(943) 241-3640

Location

Atlanta, GA

Quick Response

I typically respond within 24 hours.

Best times to reach me: Mon–Fri, 9 AM – 6 PM.