Hand Imitation
Reinforcement Learning for an Imitating Robotic Arm.
Code and Details Abstract
RL-based learning for a robotic arm to imitate a given hand in an image feed, with handOfJustice as our environment.
To Setup
pip install gym-handOfJustice
or, to install from source:
git clone https://github.com/hex-plex/gym-handOfJustice
cd gym-handOfJustice
pip install -e .
To Train
We used the Actor-Critic technique to update a CNN. Examples are RL-train.py and RL-Test.py:
- In RL-train, we built the actor and critic models using TensorFlow.
- In RL-Test, we used the stable-baselines SAC model with the LnCnnPolicy policy.
- The dataset we used consists of 50,000 images, which provides content for 50,000 episodes.
To use the same dataset, download it from the drive link
or
wget --no-check-certificate -r 'https://docs.google.com/uc?export=download&id=1YeJecxl8LDR_r3JAWfSbDP4X_klVQfrO' -O dataset.7z
pacman -Sy p7zip-full # Or any package manager you like
7z e dataset.7z
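As a rough illustration of the Actor-Critic technique mentioned above, here is a minimal one-step TD actor-critic update on a toy problem. This is not the code from RL-train.py: it uses small tables instead of a TensorFlow CNN, discrete actions, and a made-up single-state task. All names (`actor_critic_step`, `train_toy`, the learning rates) are hypothetical.

```python
import math
import random

GAMMA = 0.99  # discount factor (assumed value, for illustration only)


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(prefs, values, state, action, reward, next_state, done,
                      actor_lr=0.1, critic_lr=0.1):
    """Apply one TD actor-critic update in place and return the TD error."""
    # TD target: r + gamma * V(s'), with no bootstrap on terminal steps.
    target = reward + (0.0 if done else GAMMA * values[next_state])
    td_error = target - values[state]

    # Critic: move V(s) toward the TD target.
    values[state] += critic_lr * td_error

    # Actor: policy-gradient step weighted by the TD error.
    # For a softmax policy, d log pi(a|s) / d pref[b] = 1[a == b] - pi(b|s).
    probs = softmax(prefs[state])
    for b in range(len(probs)):
        grad_log = (1.0 if b == action else 0.0) - probs[b]
        prefs[state][b] += actor_lr * td_error * grad_log
    return td_error


def train_toy(steps=2000, seed=0):
    """Toy task: one state, two actions, only action 1 is rewarded."""
    rng = random.Random(seed)
    prefs = [[0.0, 0.0]]  # actor parameters: one row of preferences per state
    values = [0.0]        # critic parameters: one value per state
    for _ in range(steps):
        probs = softmax(prefs[0])
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if action == 1 else 0.0
        actor_critic_step(prefs, values, 0, action, reward, 0, done=True)
    return softmax(prefs[0]), values[0]


if __name__ == "__main__":
    probs, value = train_toy()
    print("policy:", probs, "value estimate:", value)
```

In the real setup the tables would be replaced by the CNN-based actor and critic, and the same TD error would drive both losses.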
Training Metrics
The training was completed (on only ~50% of the dataset) over a span of 36 days. Special thanks to the Center for Computing and Information Services, IIT (BHU) Varanasi, for providing the computational power, i.e., the Compute Cluster. It ran for about 20,960 episodes, making up to 20 million steps. The following are the logs of all the metrics.
Actor Loss | Critic Loss | Reward | Cumulative Reward |
---|---|---|---|
![]() | ![]() | ![]() | ![]() |
These results may look decent, but keep in mind that what we have tried to build is an end-to-end model for a very complex task. Having only a few layers after a MobileNet is surely not sufficient for learning the forward kinematics of a robotic arm and estimating the pose of the arm in the image.
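For a rough sense of the scale of this run, the figures quoted above can be turned into per-episode and per-second averages. This is only back-of-the-envelope arithmetic: it assumes the job ran continuously for the full 36 days, treats 20 million as the actual step count, and assumes one image per episode. The function name `run_summary` is hypothetical.

```python
EPISODES = 20_960          # episodes reported above
TOTAL_STEPS = 20_000_000   # "up to 20 million steps" (treated as exact here)
DAYS = 36                  # wall-clock span of the training run
DATASET_IMAGES = 50_000    # one image per episode, as stated for the dataset


def run_summary(episodes=EPISODES, total_steps=TOTAL_STEPS,
                days=DAYS, dataset_images=DATASET_IMAGES):
    """Derive average rates from the reported totals."""
    seconds = days * 24 * 60 * 60
    return {
        "steps_per_episode": total_steps / episodes,
        "steps_per_second": total_steps / seconds,
        "episodes_per_day": episodes / days,
        "dataset_fraction": episodes / dataset_images,
    }


if __name__ == "__main__":
    for name, value in run_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the run averages roughly 954 steps per episode and about 6.4 environment steps per second, and covers about 42% of the dataset, in the same range as the ~50% quoted above.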
Output
These are the best results after training for a limited amount of time.
Note: the clips come from different versions of the trained model and environment. In gym-handOfJustice==0.0.6, a flip was added to the environment to make the robotic hand feel more mirror-like, which can be spotted in the GIF files.
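To illustrate what a mirror-like flip does to a frame, here is a minimal sketch. It assumes the flip is horizontal (left-right), which the note above does not state explicitly, and it is not taken from the gym-handOfJustice source. The helper name `mirror_frame` is hypothetical.

```python
def mirror_frame(frame):
    """Return a left-right mirrored copy of a frame given as a list of pixel rows."""
    # Reversing each row swaps left and right while keeping the row order,
    # which is what a mirror does to an image.
    return [list(reversed(row)) for row in frame]


if __name__ == "__main__":
    frame = [
        [1, 2, 3],
        [4, 5, 6],
    ]
    print(mirror_frame(frame))  # [[3, 2, 1], [6, 5, 4]]
```

With NumPy arrays the same operation is `frame[:, ::-1]`, and with OpenCV it is `cv2.flip(frame, 1)`.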
The End
That's all from our side.
The Team
Somnath Sendhil Kumar | Yash Garg | L N Saaswath | Atul Kumar |