All My Projects

Qanouny - AI Legal Assistant

  • Designed and implemented a production-ready RAG architecture tailored to Arabic legal language and high-risk decision environments, using the Groq API for LLM inference.
  • Developed an end-to-end multi-modal AI pipeline integrating OCR and speech-to-text to process text, audio, and images, and engineered a vector-based retrieval system using ChromaDB and multilingual embeddings.
  • Integrated risk-aware intelligence layers and legal term simplification into the responses, ensuring source-grounded, auditable, and transparent outputs suitable for critical legal contexts.
  • Collaborated closely with Data Engineering and Flutter teams to ensure seamless integration and deployment in a real-world production environment.
Python RAG Embeddings ChromaDB OCR Speech-to-Text LLMs Groq API
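
The source-grounded, auditable output described above could be sketched roughly as follows. This is a minimal illustration and not the project's code: `Passage`, `build_grounded_prompt`, and `cited_sources` are hypothetical names, the sample passages are placeholders, and the ChromaDB retrieval and Groq API calls are omitted.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # label of the retrieved document (hypothetical format)
    text: str


def build_grounded_prompt(question: str, passages: list[Passage]) -> str:
    """Number each retrieved passage so the model can cite it as [n]."""
    context = "\n".join(
        f"[{i}] ({p.source}) {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using ONLY the numbered sources below. "
        "Cite each claim as [n]. If the sources do not cover the question, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


def cited_sources(answer: str, passages: list[Passage]) -> list[str]:
    """Map [n] citations in an answer back to source labels for auditing."""
    indices = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    # Citations that point outside the retrieved set are dropped.
    return [passages[i - 1].source for i in indices if 1 <= i <= len(passages)]


# Placeholder passages standing in for retrieved legal text.
passages = [
    Passage("Placeholder Law A, Art. 1", "Placeholder text of the first article."),
    Passage("Placeholder Law B, Art. 2", "Placeholder text of the second article."),
]
prompt = build_grounded_prompt("What does the first article say?", passages)
audit = cited_sources("The first article states ... [1]", passages)
```

Numbering the sources in the prompt and resolving citations afterwards is one simple way to make each answer traceable to the passages it relied on.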

Telco Customer Churn Analysis and Prediction

  • Developed an end-to-end machine learning pipeline to predict customer churn, enabling proactive retention strategies for a telecommunications company.
  • Performed comprehensive Exploratory Data Analysis (EDA) using Seaborn and Matplotlib to uncover hidden patterns and identify key churn drivers, such as contract types and billing methods.
  • Evaluated and compared the performance of XGBoost, LightGBM, and Logistic Regression to identify the most reliable model for distinguishing between churners and loyal customers.
  • Tuned the best-performing models in two stages (RandomizedSearchCV to narrow the search space, followed by GridSearchCV to refine it), reaching a Recall of 81.3% and an AUC-ROC of 0.844.
Python Classification EDA RandomizedSearchCV Hyperparameter Tuning GridSearchCV Logistic Regression Random Forest XGBoost LightGBM
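
The two-stage tuning process above can be sketched with scikit-learn as follows. This is an illustrative sketch under stated assumptions: it uses synthetic data rather than the Telco dataset, tunes only the `C` parameter of Logistic Regression, and assumes recall as the scoring metric. The search ranges and the width of the fine grid are arbitrary choices, not the project's settings.

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

# Synthetic, imbalanced stand-in data (NOT the Telco dataset).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.7, 0.3], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000, class_weight="balanced")

# Stage 1: sample C broadly on a log scale to find a promising region.
coarse = RandomizedSearchCV(
    model,
    param_distributions={"C": loguniform(1e-3, 1e2)},
    n_iter=15,
    scoring="recall",  # assumed metric
    cv=3,
    random_state=0,
)
coarse.fit(X_train, y_train)
best_c = coarse.best_params_["C"]

# Stage 2: exhaustive grid in a narrow band around the stage-1 winner.
fine_grid = {"C": list(np.geomspace(best_c / 3, best_c * 3, num=5))}
fine = GridSearchCV(model, param_grid=fine_grid, scoring="recall", cv=3)
fine.fit(X_train, y_train)

# Evaluate the refitted best model on held-out data.
recall = recall_score(y_test, fine.predict(X_test))
auc = roc_auc_score(y_test, fine.predict_proba(X_test)[:, 1])
print(fine.best_params_, round(recall, 3), round(auc, 3))
```

The random stage covers a wide range cheaply, and the grid stage then spends its full budget on the region the first stage found most promising.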

Smart Study Companion – RAG-Based Interactive Document

  • Built a Retrieval-Augmented Generation (RAG) system using FastAPI to transform static PDF documents into interactive, conversational knowledge bases for students and researchers.
  • Implemented hybrid search combining semantic search (ChromaDB with multilingual embeddings) and keyword-based retrieval (BM25), so that queries retrieve both conceptually related context and exact technical terms.
  • Designed an intelligent query pipeline with automatic query rewriting to resolve conversational pronouns and context, ensuring accurate retrieval even for follow-up questions.
  • Applied a recursive text-splitting strategy to preserve context across chunks while keeping processing fast.
  • Preloaded the embedding model (Alibaba-NLP/gte-multilingual-base) through FastAPI lifespan events so it is loaded once at startup, and applied strict prompt hardening to keep responses grounded exclusively in document content.
Python RAG Langchain FastAPI OpenRouter BM25
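
The hybrid search described above can be sketched in plain Python as follows. This is a minimal sketch and not the project's implementation: BM25 is written out by hand, the semantic ranking is a hard-coded placeholder standing in for ChromaDB results, and reciprocal rank fusion (RRF) is an assumed fusion method, since the project description does not say how the two result lists are merged.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each document for the query."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def ranking(scores: list[float]) -> list[int]:
    """Indices of documents with a positive score, best first."""
    hits = [i for i, s in enumerate(scores) if s > 0]
    return sorted(hits, key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Merge ranked lists: each list contributes 1 / (k + rank) per document."""
    fused: dict[int, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


docs = [
    "gradient descent updates weights using the learning rate",
    "the transformer uses self-attention over tokens",
    "optimizers such as adam adapt the step size during training",
]
keyword_rank = ranking(bm25_scores("learning rate", docs))
semantic_rank = [2, 0, 1]  # placeholder for an embedding-based ranking
fused = reciprocal_rank_fusion([keyword_rank, semantic_rank])
```

RRF works on ranks rather than raw scores, so BM25 scores and embedding similarities never need to be put on a common scale.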

Loan Approval Prediction

  • Developed a machine learning workflow to automate loan approval decisions based on customer demographics and financial history.
  • Performed Exploratory Data Analysis (EDA) using Seaborn and Matplotlib to identify key correlations between applicant income and loan status.
  • Conducted a comparative analysis between Logistic Regression and Decision Tree classifiers to determine the optimal model for risk assessment.
  • Tuned hyperparameters with GridSearchCV, reaching a final Accuracy of 86% and an F1-Score of 91%.
Python Classification EDA GridSearchCV Logistic Regression Decision Tree
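
An F1-Score higher than Accuracy, as reported above, can occur when the positive class dominates the data and true negatives are few. The sketch below shows the arithmetic. The confusion-matrix counts are hypothetical values chosen only for illustration and are not results from this project.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; ignores true negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not project results).
tp, fp, fn, tn = 8, 2, 1, 1
acc = accuracy(tp, fp, fn, tn)  # 9 / 12 = 0.75
f1 = f1_score(tp, fp, fn)       # 16 / 19, roughly 0.842
```

Because F1 does not count true negatives, it is worth reading alongside Accuracy when classes are imbalanced.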