With continued innovation in 3D avatar reconstruction, which makes live capture of humans easier, and with more viable human tracking methods for animating avatars, traditional 2D video-based tele-conference systems are evolving into immersive and interactive 3D mixed reality (MR) systems, in which one can communicate with a teleported remote participant as if both were present, moving and interacting naturally in the same location. Compared with traditional 2D tele-conference systems, which offer only flat 2D upper-body imagery, a mostly fixed viewpoint, and inconsistent gaze directions, a shared MR tele-conference system situated in an augmented space with 3D teleported avatars would be more natural and realistic, and can provide an enhanced, immersive communication experience.
In such MR development, building tele-conference systems that feel as natural as talking to real people requires considering two aspects of the 3D teleported avatar: how it is represented, so as to maximize the quality of conversation with the participants facing it, and where it is placed at a specific location in the remote site.
One important quality of a tele-collaboration system is therefore how well it improves the tele-conference experience and collaboration through the sense of co-presence and trust that participating users feel toward the teleported avatar. Another central concern is the need to resolve the differences in the physical environments of the two sites and the resulting motion anomalies of the teleported avatar between the remote site and the local one. For instance, the human controller may be interacting at one site (e.g., sitting on a low chair) while the teleported avatar is displayed at the other (e.g., augmented onto a high chair).
In this thesis, we first experimentally investigated the effects of the teleported 3D avatar’s visual form (photo-realistically reconstructed vs. pre-built 3D avatar) and the form of the background (traditional 2D video vs. 3D VR vs. 3D MR) on the sense of co-presence, using tele-conference prototypes. The video background on a flat 2D screen was included as a reference condition. Second, we carried out a preliminary study comparing the level of trust, with respect to communication performance, in 3D MR tele-conference systems.
In addition, so that the teleported avatar can be presented despite environmental differences between the remote sites, we present a novel method that first establishes a spatial and object-level match between the remote and local sites, and a method that aligns the position and adapts the motion of the teleported avatar to the physical configuration of the other site according to the matched information. The adaptation technique is based on preserving a spatial property between the avatar and its interaction objects across the two sites.
In the experiments, the comparison of background types showed that participants generally exhibited a higher sense of co-presence when situated in an MR environment with the real background, and the comparison of avatar visual forms showed that they had greater confidence/trust when interacting with a realistically reconstructed avatar.
To adapt the teleported avatar’s position and motion, we developed a test prototype that demonstrates our approach using Kinect-based human tracking and a video see-through head-mounted display, together with techniques for scene matching and for adapting the teleported avatar’s motions to the remote site configuration in real time. The spatial relationship between the two sites is pre-established between the important joint positions of the user/avatar and carefully selected points on the environment interaction objects. The motions of the user transmitted to the other site are then modified in real time, taking into account the “changed” environment object and preserving the spatial relationship as much as possible.
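As a rough illustration of the idea of preserving a joint-to-object spatial relationship, the following minimal sketch keeps each paired joint at the same offset from its matched anchor point when the anchor moves from the source site to the destination site. It is not the prototype's implementation: the function `adapt_pose`, the joint and anchor names, and the coordinate values are all hypothetical, and the rule used here (a plain translation of the stored offset) is only one simple way such a relationship could be preserved.

```python
# Hypothetical sketch: keep each paired joint at the same offset from its
# matched anchor point on the interaction object at the destination site.
Vec3 = tuple[float, float, float]


def adapt_pose(
    pose: dict[str, Vec3],
    src_anchors: dict[str, Vec3],
    dst_anchors: dict[str, Vec3],
    pairs: dict[str, str],
) -> dict[str, Vec3]:
    """Return a copy of `pose` with paired joints re-anchored.

    `pairs` maps a joint name to the name of its matched anchor point
    (an assumed, pre-established joint-to-object correspondence).
    """
    adapted = dict(pose)
    for joint, anchor in pairs.items():
        # Offset of the joint from the anchor at the source site.
        offset = tuple(j - a for j, a in zip(pose[joint], src_anchors[anchor]))
        # Apply the same offset to the matched anchor at the destination site.
        adapted[joint] = tuple(a + o for a, o in zip(dst_anchors[anchor], offset))
    return adapted


# Assumed example: a low chair seat (y = 0.5) matched to a high chair seat
# (y = 0.75) that also stands 1.0 further along x.
pose = {"pelvis": (0.0, 0.625, 0.0), "head": (0.0, 1.25, 0.0)}
src = {"seat": (0.0, 0.5, 0.0)}
dst = {"seat": (1.0, 0.75, 0.0)}
adapted = adapt_pose(pose, src, dst, {"pelvis": "seat"})
```

In this sketch only the paired joint (the pelvis) is moved, and unpaired joints such as the head are left at their transmitted positions. A complete system would presumably also need to propagate the change through the rest of the skeleton, for example with inverse kinematics, which is outside the scope of this example.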
In the near future, we expect our results to help in the design of more effective tele-conference and collaborative systems, in which participants can see a natural-looking, spatially correct rendering of the remote user in the augmented space, providing a significantly improved tele-conference experience and communication performance.