Implemented Q-learning and Value Iteration agents for policy optimization and exploration.
Experimented with discount factor, noise, and living reward settings to analyze agent behavior in environments such as Pacman and bridge crossing. Developed epsilon-greedy exploration and approximate Q-learning for scalability.
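For illustration, the tabular Q-learning update with epsilon-greedy action selection can be sketched as follows. This is a minimal sketch and not the project's actual code: the class name `QLearningAgent`, its method names, and the default values for `alpha`, `gamma`, and `epsilon` are all assumptions made for this example.

```python
import random
from collections import defaultdict


class QLearningAgent:
    """Hypothetical tabular Q-learning agent (illustrative names and defaults)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # probability of taking a random action
        self.q = defaultdict(float)  # Q-values default to 0.0
        self.rng = random.Random(seed)

    def best_value(self, state):
        # Highest Q-value over all actions available in this state.
        return max(self.q[(state, a)] for a in self.actions)

    def choose_action(self, state):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        best = self.best_value(state)
        # Break ties between equally good actions at random.
        return self.rng.choice(
            [a for a in self.actions if self.q[(state, a)] == best]
        )

    def update(self, state, action, reward, next_state, done=False):
        # Move Q(s, a) toward the one-step target r + gamma * max_a' Q(s', a').
        target = reward
        if not done:
            target += self.gamma * self.best_value(next_state)
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

With `epsilon=0.0` the agent always picks a highest-valued action, so raising `epsilon` trades some immediate reward for more exploration, while `gamma` controls how strongly future rewards count in the target.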
A production-grade Retrieval-Augmented Generation (RAG) chatbot with multi-query retrieval, evidence...
An evidence-based AI system using RAG to generate ATS-optimized resumes from GitHub and research dat...
Time series forecasting of Bitcoin prices comparing multiple neural network architectures including ...