VIVID can be used to learn many computer vision tasks. The first application is semantic segmentation. Since we have all the physical information about objects in the virtual world, we can use it to learn semantic segmentation. We also have depth information for the world, so we can use it to learn depth prediction. Another important application is autonomous navigation. To learn autonomous navigation, we need to do trial and error millions of times. Since we don't have the money to crash vehicles that many times, we need to do reinforcement learning in virtual reality. Another application is human action recognition, which is a unique feature of VIVID.
We apply the human skeleton system of Unreal and can simulate actions like running, jumping, and shooting. The simulated actions can be used to learn action recognition or to simulate real-life events. We made a table to compare VIVID with other state-of-the-art VR simulators. In a nutshell, we try to include the advantages of other simulation environments, such as ease of use, flexibility, and photorealistic rendering. In addition, our environment supports human action recognition. For more details, please refer to our paper. So what are the advantages of VIVID? First, VIVID is easy to use. We hide the details of the complicated 3D technology and provide a simplified API for users. Second, we designed specific APIs for deep reinforcement learning, such as random object generation, teleport, and map reloading.
Other functions include multiple agent control and human action recognition. We put all Python deep learning examples on our GitHub. Third, VIVID supports distributed learning over Ethernet. We use remote procedure calls to communicate with external programs and support many programming languages. With distributed learning, we can run the simulation and the learning process on different machines, which can speed up training. Fourth, VIVID simulates real-life events. It supports human action simulation such as jumping, running, and gun shooting, and by combining these actions with other objects we can simulate real-life events. Last but not least, VIVID is equipped with large-scale indoor and outdoor scenes, which can provide diversified training data. This is the architecture of VIVID.
VIVID is based on Unreal Engine, one of the most advanced 3D engines in the world.
We support three types of vehicles: drone, robot, and car. For the underlying API protocol, we use the Microsoft RPC library. We include the AirSim plugin and support hardware-in-the-loop simulation.
VIVID: Virtual Environment for Visual Deep Learning
In this video, Prof. Lai introduces VIVID: Virtual Environment for Visual Deep Learning. VIVID can be applied in different areas: semantic segmentation, depth prediction, autonomous navigation, and action recognition. Prof. Lai explains each of them in detail.
VIVID has several advantages:
- It is easy to use
- Specific API for deep learning
- Distributed learning through TCP/IP
- Simulate real-life events with human actions
- Large-scale indoor and outdoor scenes
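
To make the second and third advantages more concrete, here is a minimal sketch of a learner driving a simulator through remote procedure calls over TCP/IP. It is not the VIVID API: the ToySimulator class and its methods (reload_map, spawn_random_objects, teleport, step) are hypothetical names chosen to mirror the functions mentioned in the video, and Python's standard xmlrpc module stands in for the RPC library that VIVID actually uses. Both endpoints run on one machine here; on two machines, the address would simply point to the simulation host.

```python
import random
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class ToySimulator:
    """Hypothetical stand-in for a simulator (NOT the VIVID API)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.position = [0.0, 0.0]
        self.obstacles = []

    def reload_map(self):
        # Reset the world to an empty map with the agent at the origin.
        self.position = [0.0, 0.0]
        self.obstacles = []
        return True

    def spawn_random_objects(self, count):
        # Place `count` obstacles on random integer grid cells.
        self.obstacles = [
            [self._rng.randint(-5, 5), self._rng.randint(-5, 5)]
            for _ in range(count)
        ]
        return self.obstacles

    def teleport(self, x, y):
        # Move the agent directly to (x, y).
        self.position = [float(x), float(y)]
        return self.position

    def step(self, dx, dy):
        # Move the agent and report whether it now sits on an obstacle.
        self.position[0] += dx
        self.position[1] += dy
        cell = [round(self.position[0]), round(self.position[1])]
        return {"position": self.position, "crashed": cell in self.obstacles}


def start_server(simulator, host="127.0.0.1"):
    """Serve the simulator over RPC in a background thread."""
    server = SimpleXMLRPCServer((host, 0), logRequests=False)  # port 0: any free port
    server.register_instance(simulator)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]


def run_episode(client, max_steps=20, seed=0):
    """Run one random-policy episode; return the steps survived before a crash."""
    rng = random.Random(seed)
    client.reload_map()
    client.spawn_random_objects(3)
    client.teleport(0, 0)
    for t in range(max_steps):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        if client.step(dx, dy)["crashed"]:
            return t
    return max_steps


if __name__ == "__main__":
    server, port = start_server(ToySimulator(seed=42))
    client = ServerProxy(f"http://127.0.0.1:{port}")
    print("steps survived:", run_episode(client))
    server.shutdown()
    server.server_close()
```

In a real training setup, the random policy in run_episode would be replaced by the learning agent, and crashes in the simulator would cost nothing but compute time.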