## Current Openings

If you are interested in exploring AI for VR/AR with us and would like to write your Bachelor's or Master's thesis on one of the topics below, please contact Dr. Cristian Axenie (axenie@thi.de) or Prof. Thomas Grauschopf (thomas.grauschopf@thi.de).

**Neural Network Predictive Tracking System for VR Systems**

Problem description

Tracking devices are the main components of VR systems. They interact with the system’s processing unit, which computes the orientation of the user’s viewpoint. Solutions for full-body tracking / motion capture need many cameras and are therefore expensive. Furthermore, the calibration process prior to use is non-trivial and highly time-consuming. The tracking data are updated at a certain frequency, so prediction is crucial to ensure smooth movements. Despite the multitude of algorithms for such a predictive tracking scenario, performance is dictated by the sampling rate of the underlying tracking system and by the noise and variance in the captured data. Moreover, the type of motion the user performs, whether head or hand (controller) motion, plays an important role in determining which algorithm to use. The project proposes a neural network approach to predictive tracking. Such a neural network predictive tracker learns a generic relationship between motion and appearance. This approach does not require developing a complex mathematical model of the problem, i.e. the projection of 18D body motion coordinates (three translational coordinates which define the position and three rotational coordinates which define the orientation, for each of the head and the two hand VR controllers) to 3D world coordinates. The mapping is performed by the neural network, which behaves like a universal function approximator.
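To make the prediction step concrete, here is a minimal sketch of a classical baseline that a learned predictor could be compared against: constant-velocity extrapolation of each of the 18 coordinates. It is not the neural network approach proposed above. The function name `predict_pose`, the 90 Hz sampling interval, the sample values, and the treatment of the rotational coordinates as Euler angles are all assumptions made for illustration.

```python
from typing import List, Sequence


def predict_pose(prev: Sequence[float], curr: Sequence[float],
                 dt_sample: float, dt_ahead: float) -> List[float]:
    """Extrapolate every coordinate linearly, assuming constant velocity.

    prev, curr: two consecutive tracking samples of equal length.
    dt_sample:  time between the two samples, in seconds.
    dt_ahead:   how far into the future to predict, in seconds.
    """
    if dt_sample <= 0:
        raise ValueError("dt_sample must be positive")
    if len(prev) != len(curr):
        raise ValueError("samples must have the same length")
    return [c + (c - p) / dt_sample * dt_ahead for p, c in zip(prev, curr)]


# Hypothetical 18-value samples: (x, y, z, roll, pitch, yaw) for the head,
# the left hand, and the right hand. The values are made up for illustration.
prev_sample = [0.0] * 18
curr_sample = [0.01 * i for i in range(18)]

# Assumed 90 Hz tracker, predicting one sample interval ahead.
predicted = predict_pose(prev_sample, curr_sample,
                         dt_sample=1 / 90, dt_ahead=1 / 90)
```

Such a baseline ignores noise and angle wrap-around, and its error grows with the prediction horizon, which is where a learned model may help.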

Tasks

- Program a data acquisition interface that logs the 18 degrees of freedom (DOF) provided by the VR tracking system: six DOF each for the head and the two hands.
- Investigate the design of deep neural networks for regression in multidimensional problems.
- Design and develop a novel position calibration method based on VR controller data and deep artificial neural networks.
- Design a prediction model which uses the neural network calibration for tracking.
- Test and evaluate the learnt mapping against ground truth (i.e. camera tracking system).

Required skills

Strong programming experience, good mathematical skills, basic knowledge of VR technologies, machine learning and algorithms.

Preferred field of study

BA/MA Computer Science, BA/MA Mechatronics (Robotics)

**Learning Inverse Kinematics for VR Avatars**

Problem description

In VR systems, head and hand controllers are critical for motion estimation in reliable avatar construction. Inverse kinematics calculations make it possible to compute poses for the arm joints and upper body from the controllers’ positions. This data could be used for an improved avatar display. Learning inverse kinematics for VR avatar interactions is useful when the kinematics of the head or controllers are not accurately available, when Cartesian information is not available from camera coordinates, or when the computational complexity of analytical solutions becomes too high. The major obstacle in learning inverse kinematics is that the problem has an infinite solution space. Thus the learning algorithm has to converge to a particular inverse and ensure that this inverse is a valid solution. The project proposes a neural network approach for learning the inverse kinematics mapping (i.e. the Jacobian). To learn this non-linear mapping from the combined position and joint angles to changes in joint angles (i.e. angular velocities), we investigate a multi-layer deep neural network with a dedicated architecture capable of avoiding kinematic singularities. The network is trained on tracking data, which is always physically correct, so it will not demand impossible postures such as can result from an ill-conditioned matrix inversion.
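The ill-conditioned inversion mentioned above can be illustrated with a classical analytical baseline, which is not the learning approach proposed here. The sketch below applies damped least-squares Jacobian inverse kinematics to a planar two-link arm; the damping term keeps the update finite even when the arm is fully stretched and the Jacobian is singular. The link lengths, damping factor, gain, start pose, and target are assumed values chosen for illustration.

```python
import math

# Hypothetical upper-arm and forearm lengths in metres.
L1, L2 = 0.3, 0.3


def forward(q1, q2):
    """Hand position of a planar two-link arm for joint angles q1, q2."""
    return (L1 * math.cos(q1) + L2 * math.cos(q1 + q2),
            L1 * math.sin(q1) + L2 * math.sin(q1 + q2))


def jacobian(q1, q2):
    """2x2 Jacobian of the hand position with respect to (q1, q2)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [L1 * c1 + L2 * c12, L2 * c12]]


def dls_step(q1, q2, target, damping=0.05, gain=0.5):
    """One update dq = gain * J^T (J J^T + damping^2 I)^-1 e."""
    x, y = forward(q1, q2)
    ex, ey = target[0] - x, target[1] - y
    J = jacobian(q1, q2)
    # A = J J^T + damping^2 I is symmetric positive definite,
    # so its determinant never vanishes, even at a singular pose.
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b
    vx = (d * ex - b * ey) / det
    vy = (-b * ex + a * ey) / det
    dq1 = J[0][0] * vx + J[1][0] * vy
    dq2 = J[0][1] * vx + J[1][1] * vy
    return q1 + gain * dq1, q2 + gain * dq2


def solve_ik(target, q1=0.2, q2=1.2, iterations=200):
    """Iterate damped least-squares steps from an assumed start pose."""
    for _ in range(iterations):
        q1, q2 = dls_step(q1, q2, target)
    return q1, q2


q1, q2 = solve_ik((0.4, 0.2))
x, y = forward(q1, q2)

# At q2 = 0 the arm is fully stretched and the plain Jacobian is singular;
# the damped step still returns finite joint angles.
s1, s2 = dls_step(0.0, 0.0, (0.4, 0.2))
```

The solver returns only one of the possible elbow configurations, which reflects the point made above that a learned inverse has to commit to a particular valid solution.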

Tasks

- Investigate the basics of inverse kinematics calculation in VR systems.
- Investigate the capabilities of neural networks for function approximation.
- Design and implement a neural network learning system for Jacobian estimation.
- Test and evaluate the learnt mapping against ground truth (i.e. camera tracking system).

Required skills

Strong programming experience, mechanics and kinematics knowledge, basic knowledge of VR technologies, machine learning and algorithms.

Preferred field of study

BA/MA Computer Science, BA/MA Mechatronics (Robotics)

**Neural Network Avatar Reconstruction in Remote VR Systems**

Problem description

In collaborative VR scenarios with remote sites, data must be transferred over a network. The amount of data is therefore limited by the available bandwidth. The data transfer is also subject to network latency, which is induced by a variety of factors such as signal speed and processing / buffering time in network nodes. The larger the amount of data to transfer, the higher the risk of network congestion, which induces additional latency. It is therefore preferable to limit the amount of transferred data as far as possible. One approach capable of overcoming such problems is compressive sensing, which exploits the fact that many signals can be represented using a few coefficients in a suitable representation, and which can be combined with deep learning. The project proposes the development of a system capable of learning the inverse transformation from measurement vectors to signals (i.e. generating an image out of tracking data) using a neural network. Such an approach will allow learning both a representation for the signals / data being transmitted and an inverse mapping approximating a recovery. Deep networks are a good candidate for such a task; Generative Adversarial Networks (GANs) in particular have been shown to be quite adept at synthesizing novel data based on training samples, especially from noisy and small amounts of data. Such a network is forced to represent the training data efficiently, making it more effective at generating data similar to the training data. The system will be employed both locally and remotely in the avatar reconstruction to allow more accurate rendering.
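The premise that a signal can be represented by a few coefficients in a suitable basis can be shown with a small sketch. It uses a fixed discrete cosine transform rather than a learned representation, and it is not full compressive sensing, since there are no random measurements and no recovery algorithm. The test signal is deliberately constructed to be exactly sparse in this basis; the signal, its length, and the function names are assumptions made for illustration.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a list of samples."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        coeffs.append(scale * sum(
            x * math.cos(math.pi / n * (i + 0.5) * k)
            for i, x in enumerate(signal)))
    return coeffs


def idct(coeffs):
    """Inverse of the orthonormal DCT-II above."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi / n * (i + 0.5) * k)
                for k, c in enumerate(coeffs))
            for i in range(n)]


def keep_largest(coeffs, k):
    """Zero all but the k coefficients with the largest magnitude."""
    order = sorted(range(len(coeffs)),
                   key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


# Hypothetical 32-sample coordinate trace: a constant offset plus a single
# cosine component, so only two DCT coefficients are non-zero.
N = 32
signal = [1.0 + 0.2 * math.cos(math.pi / N * (i + 0.5) * 3) for i in range(N)]

sent = keep_largest(dct(signal), 2)      # only 2 of 32 values need sending
recovered = idct(sent)
max_error = max(abs(a - b) for a, b in zip(signal, recovered))
```

Real tracking data is only approximately sparse, so truncating the coefficients would introduce some reconstruction error.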

Tasks

- Study the basics of data bandwidth impact on remote VR avatar reconstruction.
- Investigate deep neural networks for data compression and recovery.
- Design and implement a neural network learning system using GANs.
- Test and evaluate the data reconstruction against ground truth (i.e. camera tracking system).

Required skills

Strong programming experience, machine learning and algorithms, signal processing.

Preferred field of study

BA/MA Computer Science, BA/MA Mechatronics (Robotics)

**Multimodal Avatar Augmentation using Event-based and Depth Cameras**

Problem description

Remote VR has enormous potential to allow physically separated users to collaborate in an immersive virtual environment. These users and their actions are represented by avatars in the virtual environment. It has been shown that the appearance of those avatars influences interaction. Moreover, a one-to-one mapping of the user’s movements to the avatar’s movements might have advantages compared to pre-defined avatar animations. In this context, the project proposes a multimodal augmentation of typical avatars (i.e. built using head and hand controller tracking). Using new modalities (i.e. event-based vision and depth sensors), the avatar augmentation can be twofold: motion can be improved through faster motion detection and estimates from the event-based camera; localization can be improved by calculating the distance to objects using depth information. The event-based camera (Dynamic Vision Sensor, DVS) is a novel technology in which each pixel individually and asynchronously detects changes in the perceived illumination and fires pixel location events when the change exceeds a certain threshold. Thus events are mainly generated at salient image features such as edges, which arise for example from geometry or texture. The depth information is obtained from an active depth-sensing camera sensor (e.g. Kinect, ASUS Xtion, PrimeSense). The combination of the two results in a sparse stream of 3D point events in camera coordinates, which directly give the 3D position of salient edges in the scene. The combination of DVS and depth sensors is a promising opportunity for a new type of visual processing based on a sparse stream of 3D point events which captures only dynamic and salient information, invaluable for precise avatar construction.
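As a rough illustration of how pixel events and depth could be combined into 3D point events, the sketch below back-projects each event through a pinhole camera model using the depth value at the same pixel. It assumes the DVS and the depth image are already registered to a common pixel grid and that depth is given in metres with 0 marking invalid pixels. The type names, field names, and intrinsic parameters (`fx`, `fy`, `cx`, `cy`) are hypothetical and do not correspond to any particular sensor SDK.

```python
from typing import Iterable, List, NamedTuple, Sequence


class Event(NamedTuple):
    u: int             # pixel column
    v: int             # pixel row
    timestamp_us: int  # event time in microseconds
    polarity: int      # +1 brighter, -1 darker


class PointEvent(NamedTuple):
    x: float
    y: float
    z: float
    timestamp_us: int
    polarity: int


def fuse_events_with_depth(events: Iterable[Event],
                           depth_m: Sequence[Sequence[float]],
                           fx: float, fy: float,
                           cx: float, cy: float) -> List[PointEvent]:
    """Back-project events to camera coordinates using a pinhole model."""
    height = len(depth_m)
    width = len(depth_m[0]) if height else 0
    points = []
    for ev in events:
        if not (0 <= ev.v < height and 0 <= ev.u < width):
            continue  # event falls outside the depth image
        z = depth_m[ev.v][ev.u]
        if z <= 0.0:
            continue  # no valid depth measured at this pixel
        x = (ev.u - cx) * z / fx
        y = (ev.v - cy) * z / fy
        points.append(PointEvent(x, y, z, ev.timestamp_us, ev.polarity))
    return points


# Tiny made-up example: a 2x2 depth image and two events.
depth = [[0.0, 4.0],
         [4.0, 4.0]]
events = [Event(1, 1, 100, 1), Event(0, 0, 120, -1)]
points = fuse_events_with_depth(events, depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

In practice the two sensors have different resolutions, viewpoints, and timing, so calibration and temporal alignment would have to come before a step like this.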

Tasks

- Get familiar with the DVS and depth sensor.
- Investigate depth sensor technologies (i.e. resolution, detection, interfacing).
- Program an interface to acquire data from DVS and depth sensor.
- Program a fusion mechanism for data from DVS and depth sensor.
- Program an interface from DVS and depth sensor to VR system.

Technology

http://xtionprolive.com/primesense-carmine-1.0

Required skills

Strong programming experience, computer vision and algorithms.

Preferred field of study

BA/MA Mechatronics (Robotics), BA/MA Computer Science