I am a PhD student in the Department of Computing Science, University of Alberta, Canada, advised by Prof. Michael Bowling. My research focuses on how AI systems learn when feedback is not always observable. How do we design agents that learn to avoid harming humans or other agents, or damaging the environment or themselves, without human supervision or prior knowledge?

Before starting my PhD, I was an AI engineer at SonyAI for three years, where I designed a highly dynamic multi-agent environment in a physics simulator, trained robust agents cooperatively and competitively with self-play and goal-conditioned RL, transferred learned policies to the real world, and integrated them with vision systems and robot control methods.

Work Experience:

AI Engineer, SonyAI
Designed a highly dynamic multi-agent environment in a physics simulator, trained robust agents cooperatively and competitively with self-play and goal-conditioned RL, transferred learned policies to the real world, and integrated them with vision systems and robot control methods. Direct manager: Peter Duerr.
Tokyo, Japan
June 2020 - October 2022 (remotely June 2020 - October 2020)

Research Associate, University of Alberta
Developed an algorithm that automatically behaves cautiously in novel circumstances using robust optimization. Supervised by Prof. Michael Bowling.
Edmonton, Canada
February 2020 - December 2020

Machine Learning Intern, Sony
Developed reinforcement learning algorithms for skill acquisition via imitation learning in cooking robots with visual feedback. Direct manager: Michael Spranger.
Tokyo, Japan
July 2019 - January 2020

Publications:

Learning to Be Cautious Using Counterfactual Regret Minimization
Montaser Mohammedalamen, Dustin Morrill, Alexander Sieusahai, Yash Satsangi, and Michael Bowling
Under review, 2025
arXiv | code | bibtex | Talk

Generalization in Monitored Markov Decision Processes
Montaser Mohammedalamen and Michael Bowling
Under review, 2025
arXiv | code | bibtex

Monitored Markov Decision Processes
Simone Parisi, Montaser Mohammedalamen, Alireza Kazemipour, Matthew E. Taylor, and Michael Bowling
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Auckland, New Zealand, 2024
arXiv | code | bibtex | Talk

Transfer Learning for Prosthetics Using Imitation Learning
D. Khamies, Montaser Mohammedalamen, and Benjamin Rosman
Black in AI workshop, NeurIPS, 2018
arXiv | code | bibtex

Education:

PhD student
Department of Computing Science, University of Alberta, Canada
January 2022 - September 2027 (expected)

M.Sc. Machine Intelligence
African Master in Machine Intelligence (AMMI), African Institute for Mathematical Sciences (AIMS), Rwanda
September 2018 - September 2019

B.Sc. Honors, Electronics & Computer Engineering
First Class Honors, University of Khartoum, Sudan (ranked 3rd in the Electronics department, CGPA 7.5/10)
August 2013 - September 2018

Scholarships & Awards:

Graduate Student Engagement Scholarship (GSES)
University of Alberta
10,000 CAD
September 2023 - August 2024

Graduate Student Engagement Scholarship (GSES)
University of Alberta
10,000 CAD
September 2022 - August 2023

African Master in Machine Intelligence (AMMI)
African Institute for Mathematical Sciences (AIMS), Rwanda.
4% acceptance rate Africa-wide
15,000 USD
August 2018 - September 2019

Best Undergraduate Research Project
University of Khartoum
500 USD
October 2018

Undergraduate research prize, "Reinforcement Learning controller for Prosthetics"
University of Khartoum
2,000 USD
December 2017

Audience Prize, "Wheelchair Robot Controlled by Brain Signal"
Falling Walls Lab finals
Berlin, Germany
1,000 USD
November 2017

Ericsson Scholarship
ICT Professional Foundation Ericsson Middle East University Program
November 2016

Best Undergraduate Project
Electrical and Electronics Engineering Students Exhibition (EEESE)
500 USD
May 2016 & May 2015

Patent #4245: Wheelchair Robot Controlled by Brain Signal
Patent #3334: Smart Farm
Patent Office, Ministry of Justice, Sudan
May 2016 (both patents)

Projects:

Implementation of the Behavior Cloning (BC) and Behavior Cloning from Observation (BCO) algorithms in PyTorch
Imitation learning algorithms.
GitHub repo

Wheelchair Robot Controlled by Brain Signal
Developed a wheelchair robot that helps people with disabilities move using EEG signals from the brain.
GitHub repo

NeurIPS2018 Challenge: RL for Prosthetics
Built an RL prosthetics controller that learns to walk like a human and uses imitation learning to accelerate training.
GitHub repo

Monitored Markov Decision Processes
RL assumes that the reward is observable to the agent at all times, but this assumption often does not hold in real-world problems. Mon-MDPs are a framework that addresses this setting and examines its theoretical and practical consequences.
GitHub repo

Learning to be Cautious
An algorithm that automatically behaves cautiously and safely in novel circumstances, without prior knowledge or human supervision.
GitHub repo

Benchmarking the ChainerRL Library in OpenAI Gym Environments
Benchmarking RL algorithms: Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO).
GitHub repo