Youssif Ahmed Abdallah

Junior Data Scientist

Where logic meets creativity

About Me

I'm Youssif Ahmed Abdallah, a Junior Data Scientist with a strong focus on artificial intelligence and a background in Computer Science and Mathematics. I specialize in transforming raw data into actionable insights through end-to-end data analysis and machine learning workflows.

I recently graduated from the Digital Egypt Pioneers Initiative (DEPI) in AI and Data Science, where I developed solid hands-on experience in data preprocessing, exploratory data analysis (EDA), feature engineering, and building machine learning models using Python, Pandas, NumPy, Scikit-learn, and data visualization tools.

My experience includes real-world projects such as an AI-powered legal assistant built on a production-ready RAG architecture. I also have a strong foundation in database design, MySQL, and clean object-oriented programming.

I’m driven by curiosity, continuous learning, and a passion for building intelligent, data-driven solutions that support informed decision-making and real business impact.

Let’s connect and explore how we can create meaningful solutions together.

Education

Bachelor's Degree in Computer Science and Mathematics

Faculty of Science, Helwan University

Grade: 73.24% – Good (Equivalent to GPA: 2.93/4.0)

2025

Training

DEPI | AI and Data Science

Digital Egypt Pioneers Initiative

06/2025 – 12/2025

  • Developed strong skills in Python programming for data analysis and machine learning.
  • Performed exploratory data analysis (EDA) with Pandas, NumPy, Matplotlib, and Seaborn to identify trends and patterns.
  • Applied data preprocessing techniques, including cleaning, encoding, scaling, and feature selection, to prepare data for modeling.
  • Built and evaluated supervised learning models with Scikit-learn, focusing on performance metrics and optimization.
  • Practiced data visualization and storytelling to communicate insights effectively.

ITI (summer training) | Full-Stack Web Development

Information Technology Institute

07/2024 – 08/2024

  • Implemented backend logic using Laravel and PHP.
  • Applied database design principles to build dynamic systems with MySQL.
  • Worked with database integration, routing, the MVC structure, and clean code practices.
  • Collaborated with a team in a simulated real-world project environment.

Skills & Technologies

Programming Languages

Python
SQL
HTML5
CSS3

Frameworks & Libraries

FastAPI
LangChain
NumPy
Pandas
Scikit-learn
Matplotlib
Seaborn
Plotly
Dash
MLflow
Beautiful Soup

Software Engineering Concepts

Object-Oriented Programming (OOP)
Algorithms & Data Structures
Testing
Agile Methodology
Waterfall Model
Prototyping

Databases

MySQL
Database Design
Data Modeling
Normalization

Version Control (Basics)

Git
GitHub

Soft Skills

Problem Solving
Quick and Autonomous Learning
Attention to Detail
Teamwork and Collaboration

Languages

Arabic: Native
English: B2 (Upper-Intermediate)

Featured Projects

Qanouny - AI Legal Assistant

  • Designed and implemented a production-ready RAG architecture tailored to Arabic legal language and high-risk decision environments, using the Groq API for LLM inference.
  • Developed an end-to-end multi-modal AI pipeline integrating OCR and speech-to-text to process text, audio, and images, and engineered a vector-based retrieval system using ChromaDB and multilingual embeddings.
  • Integrated risk-aware intelligence layers and legal term simplification into the responses, ensuring source-grounded, auditable, and transparent outputs suitable for critical legal contexts.
  • Collaborated closely with Data Engineering and Flutter teams to ensure seamless integration and deployment in a real-world production environment.
Python, RAG, Embeddings, ChromaDB, OCR, Speech-to-Text, LLMs, Groq API

Telco Customer Churn Analysis and Prediction

  • Developed an end-to-end machine learning pipeline to predict customer churn, enabling proactive retention strategies for a telecommunications company.
  • Performed comprehensive Exploratory Data Analysis (EDA) using Seaborn and Matplotlib to uncover hidden patterns and identify key churn drivers, such as contract types and billing methods.
  • Evaluated and compared the performance of XGBoost, LightGBM, and Logistic Regression to identify the most reliable model for distinguishing between churners and loyal customers.
  • Tuned the best-performing models in two stages (RandomizedSearchCV followed by GridSearchCV), reaching a recall of 81.3% and an AUC-ROC of 84.4%.
Python, Classification, EDA, RandomizedSearchCV, Hyperparameter Tuning, GridSearchCV, Logistic Regression, Random Forest, XGBoost, LightGBM

Get In Touch

Let's work together!

I'm always interested in new opportunities and exciting projects. Feel free to reach out!

yoossifahmed66@gmail.com
+201143095568
Giza, Egypt