Zain Zia
Karachi, PK
zainzia0341@gmail.com
(+92) 341-2356573

Passionate Generative AI Developer with a strong background in building intelligent, context-aware applications using pre-trained models and AI techniques. Specialized in creating chatbots, automated assistants, and multimodal AI models that integrate seamlessly into various platforms.

Education


NED University of Engineering and Technology
December 2016
 — 
October 2020
BS in Applied Physics
  • Electronics
  • Python
  • Mathematical Physics

Experience


Generative AI Developer
February 2024
 — 
Present
GeeksVisor

Integrated pre-trained and fine-tuned models into a variety of applications, including conversational agents, language processing, and image generation tasks.

  • Designed and implemented chatbots with fine-tuned conversational abilities using GPT, Gemini, Llama, and other licensed and open-source models.
  • Created multimodal AI solutions integrating voice, text, and image processing for enhanced user interactions.
  • Worked on optimizing and fine-tuning large language models for specific use cases, improving efficiency and performance.

Projects


Chatbot with PostgreSQL & ChromaDB
September 2024
 — 
October 2024

An AI-powered chatbot using PostgreSQL for persistent memory and ChromaDB for vector storage, featuring advanced LangChain and LangGraph tools for efficient response generation.

  • Built an interactive Streamlit interface, enabling users to create, manage, and delete conversations.
  • Utilized PostgreSQL to store conversation history, preserving context for continuity.
  • Integrated ChromaDB for fast, accurate data retrieval and vector storage.
  • Implemented AI agents in LangGraph that decide when to retrieve context before generating a response.
Chatbot with MongoDB & Pinecone
August 2024
 — 
September 2024

A chatbot with MongoDB for memory persistence and Pinecone for vector storage, using LangChain for structured prompt management and response generation.

  • Developed a Streamlit interface for creating, managing, and deleting conversations.
  • Configured MongoDB for persistent storage, ensuring a seamless chat history experience.
  • Integrated Pinecone for vector storage, facilitating efficient and relevant data retrieval.
  • Utilized LangChain for optimized prompt management and efficient query handling.
Multimodal Chatbot
July 2024
 — 
August 2024

Interactive chatbot that engages users through both text and image inputs, leveraging AWS Bedrock to process multimodal data for contextually relevant responses.

  • Designed an intuitive interface with Streamlit, allowing seamless text and image inputs.
  • Integrated AWS Bedrock to process multimodal inputs and generate relevant responses.
  • Enabled session state management to maintain chat history for a cohesive user experience.
  • Achieved efficient data handling and processing of multimodal inputs, enhancing interaction quality.
News-to-Jokes Generator
July 2024
 — 
July 2024

An application that creates jokes from news headlines using Hugging Face’s FLAN-T5 model, with feedback mechanisms for continuous model improvement.

  • Developed a user-friendly interface with Streamlit for news input and joke generation.
  • Integrated Hugging Face’s FLAN-T5 model to produce humorous responses based on news headlines.
  • Enabled feedback collection and stored it for further fine-tuning to improve joke relevance and quality.
  • Applied parameter-efficient fine-tuning (PEFT) with LoRA to optimize model performance with minimal training parameters.
Hate Speech Minimization
June 2024
 — 
June 2024

A fine-tuned FLAN-T5 model optimized to generate less toxic responses using reinforcement learning and Meta AI's hate speech reward model.

  • Set up model training with the DialogSum dataset and FLAN-T5 for contextually aware summarization.
  • Integrated Meta AI’s RoBERTa-based hate speech model to guide the detoxification process.
  • Applied Proximal Policy Optimization (PPO) with parameter-efficient fine-tuning (PEFT) to reduce toxicity in generated content.
  • Achieved significant improvements in toxicity control and response quality.
Invoice Analyzer
May 2024
 — 
June 2024

AI-based tool that extracts and analyzes data from invoice images using Google’s Gemini Pro Vision model, with prompt-based query capabilities for in-depth insights.

  • Created a responsive interface with Streamlit for image uploads and prompt handling.
  • Integrated Google’s Gemini Pro Vision model for image analysis and extraction of invoice details.
  • Implemented functionality for managing image uploads and displaying analysis results.
  • Enabled prompt-based query handling, providing specific responses based on user queries.
Object Detection with YOLOv8
April 2024
 — 
May 2024

A custom object detection application built on a fine-tuned YOLOv8 model, enabling users to detect objects within images through an interactive Streamlit interface.

  • Developed a user-friendly interface with Streamlit for easy image uploads and object detection.
  • Fine-tuned a YOLOv8 model to improve detection accuracy for specific objects.
  • Implemented efficient image processing to ensure quick and accurate detection results.
  • Achieved high-quality object recognition for diverse image types and scenarios.
Voice Chat with Whisper & GPT-4
April 2024
 — 
April 2024

Voice-enabled chatbot allowing users to communicate through microphone input, displaying transcriptions on-screen and providing GPT-4-generated responses.

  • Created a conversational interface with Streamlit, enabling voice input for real-time interaction.
  • Integrated Whisper for speech-to-text conversion, displaying user voice inputs as text.
  • Leveraged GPT-4 to generate contextually relevant responses based on user input.
  • Enabled seamless voice interaction, enhancing user engagement and accessibility.
Clothing Suggestions Based on Weather
March 2024
 — 
April 2024

An intelligent application providing clothing recommendations based on weather conditions, leveraging LangChain for decision-making and weather tool integration.

  • Developed an interactive Streamlit UI for users to inquire about clothing suggestions.
  • Integrated a weather tool, with an AI agent determining when to call the tool based on user queries.
  • Used GPT-4o to analyze weather data and recommend suitable clothing.
  • Enhanced user experience with accurate, context-aware clothing suggestions.
Image Manipulation Suite
February 2024
 — 
March 2024

Suite of AI-powered tools for advanced image processing tasks, utilizing Amazon Titan Image Generator for features like background replacement and image extension.

  • Designed interactive interfaces in Streamlit for uploading and processing images.
  • Integrated Amazon Titan Image Generator for tasks like background replacement, image masking, and outpainting.
  • Provided customizable parameters for various image processing features.
  • Implemented efficient image data handling, enhancing the application’s usability and functionality.
RAG Chatbot
January 2024
 — 
January 2024

A chatbot that combines Retrieval-Augmented Generation (RAG) with FAISS and Anthropic's Claude 3 Sonnet model, delivering accurate and context-aware responses.

  • Built an interactive chat interface with Streamlit for real-time conversation.
  • Integrated FAISS for in-memory vector storage to enable fast, accurate data retrieval.
  • Configured the Claude 3 Sonnet model through AWS Bedrock for generating contextually relevant responses.
  • Utilized LangChain to manage the LLM and the vector database.
Medical Chatbot with Llama 2
December 2023
 — 
January 2024

A medical chatbot that provides detailed answers to healthcare queries, utilizing Llama 2 and Pinecone for vector storage of medical data.

  • Developed a user-friendly interface with Streamlit for handling medical queries.
  • Configured Pinecone for vector storage to enable efficient and accurate information retrieval.
  • Integrated Llama 2 for generating detailed responses to medical questions.
  • Implemented document processing and text chunking with LangChain to efficiently handle medical data in PDF files for the vector database.

Languages


English:
Fluent
Urdu:
Native speaker

Skills


Generative AI:
Agentic AI, Natural Language Processing, Chatbot, Image Generation, Multimodal Chatbot, Fine-tuning, LangChain, LangGraph, Hugging Face, AWS Bedrock, AWS SageMaker, GPT, Gemini, Llama, Claude, Amazon Titan, Stable Diffusion, and other open-source models

Interests


AI Research and Development:
Deep Learning, Machine Learning, Agentic AI

References


Hasnain Khan, COO & Co-Founder at GeeksVisor

Zain is an exceptionally talented AI Developer with impressive problem-solving skills. His cooperative and humble nature makes him a pleasure to work with, consistently creating innovative and highly functional AI solutions.