VR Implementation: A Jam Session in a VR Headset

Platform: Unity 5.6.4+

Hardware: HTC Vive; Sennheiser AMBEO microphone; Giroptic 360 camera

Related packages: VRTK; MIDIJack;


  • Ambisonic signal playback in Unity & Spatialiser testing
  • 360 Video implemented
  • Jamming and Mixing with virtual musicians.


A virtual environment in which one can interact with the sound of an ensemble is likely to appeal to musicians, who often want to create fresh soundscapes for their audience. Systems of this kind have been largely unexplored, so this one taps into the constant need for new sounds and for new methods of creating them. The method described here is distinctive in that it uses a pre-recorded ensemble as the foundation for a new musical piece, whereas artists more often work from scratch. While this falls into the musical category of “remixing,” the system is further distinguished by its implementation on the HTC Vive VR headset, which allows the user to watch the ensemble play while modifying its sound in real time.
This system may also be used in an entirely different way, with little to no modification to its structure. By muting any one instrument in the scene, a musician could play along with the recorded ensemble, replacing the muted instrument. Alternatively, the musician could play on top of the ensemble sound without muting any instruments. Given a suitably immersive audiovisual experience, this would allow the player to truly feel as though they were playing with the recorded musicians. Such an experience would let musicians play with experts without requiring the experts to be physically present. This could help them learn how to play with higher-caliber musicians, as well as overcome any fear of doing so. For example, a jazz musician who fears soloing at jam sessions could ease that fear by simulating the jamming environment before experiencing the real thing.


The first step in this project was to create a virtual environment with 360-degree video and immersive audio. This step primarily employed techniques in recording and production to capture audio signals that possess certain qualities, like object and reverberant information. Using a combination of different software plugins, DAWs, and gaming engines, the next steps included signal processing and virtual reality (VR) implementation. The final product can be described as interactive mixing. This VR application will give the user the ability to point to any instrument or player in a 360-degree video scene and apply different audio effects. The audio effects, whose parameters will appear on a graphical user interface (GUI), will include volume control, reverb, filtering, and equalization (EQ).

(Sennheiser Ambeo Mic)
  • Recording

In accordance with the guidelines offered by Frank, Zotter & Sontacchi, we decided to use a series of spot microphones together with Sennheiser’s AMBEO first-order ambisonics (FOA) microphone. We arranged the instruments in a hexagon, equidistant from the AMBEO microphone and the 360-degree camera, to make each sound source easier for the listener to localize in the VR environment.


  • Ambisonic format conversion & Unity setup

After the recording session, we had multiple tracks containing the signals from the spot microphones and the ambisonic microphone. The first task was to sync the recorded video with these tracks. This was aided by one of the performers clapping their hands before each piece as a cue that recording was rolling. The two kinds of signal were then processed differently. The spot microphone tracks were simply bounced out. For the ambisonic signals, the AMBEO plugin was used to convert the four mono A-format signals into a single four-channel B-format file. The plugin settings were: microphone rotation at the default 0 degrees, upright position, no ambisonics correction filter, low-cut filter applied, and ambiX output format, which is the format suggested by previous research and the official Unity documentation.
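The core of the A-format to B-format step can be illustrated with the textbook sum-and-difference matrix for a tetrahedral microphone. This is only a minimal sketch, not what the AMBEO plugin actually does internally: the real plugin also applies capsule-correction filtering, and the capsule order (FLU, FRD, BLD, BRU), the function name, and the 0.5 gain factor are all assumptions made for illustration. The output is arranged in ACN channel order (W, Y, Z, X), as ambiX expects.

```python
def a_to_b_acn(flu, frd, bld, bru, gain=0.5):
    """Convert one frame of tetrahedral A-format samples to first-order B-format.

    Assumed capsule order: front-left-up, front-right-down,
    back-left-down, back-right-up. The gain value is illustrative only.
    Returns the frame in ACN channel order: [W, Y, Z, X].
    """
    w = gain * (flu + frd + bld + bru)  # omnidirectional pressure
    x = gain * (flu + frd - bld - bru)  # front minus back
    y = gain * (flu - frd + bld - bru)  # left minus right
    z = gain * (flu - frd - bld + bru)  # up minus down
    return [w, y, z, x]


def convert_frames(frames):
    """Apply the conversion to a sequence of (flu, frd, bld, bru) frames."""
    return [a_to_b_acn(*frame) for frame in frames]


if __name__ == "__main__":
    # A sound picked up equally by the two front capsules only
    # should show energy in W and X and none in Y or Z.
    print(a_to_b_acn(1.0, 1.0, 0.0, 0.0))
```

Note the channel order in the return value: a file written as W, X, Y, Z would play back with its axes swapped in an engine that expects ACN ordering.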

(Editing in Pro Tools)
(…And sometimes, Reaper is a really good choice to handle the multichannel signal track…)
  • VR Implementation in Unity

Once the audio files were prepared, Unity 5.6.4 was selected as the game engine because it met our requirements for mounting 360-degree video and processing both the object-based and ambisonic signals. A sphere-shaped game object was built as the carrier of the 360-degree video, with the camera located at the center of the sphere and the video rendered on its inner surface. Six audio sources were placed around the circumference to carry the signals from the different spot microphones, with each audio source’s position corresponding to a player’s position in the video. According to the official Unity documentation, the ambisonic file should be a single B-format WAV file with ACN component ordering and SN3D normalization.
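The placement of the six audio sources described above can be sketched as points evenly spaced on a horizontal circle around the camera. The snippet below is a hypothetical Python illustration rather than the project's Unity script: the radius, the starting angle, and the function name are assumptions, and in practice each position would be adjusted to match where the player appears in the video. It follows Unity's axis convention (x right, y up, z forward).

```python
import math


def circle_positions(radius=2.0, count=6, height=0.0, start_deg=0.0):
    """Return (x, y, z) positions evenly spaced on a horizontal circle.

    Azimuth is measured clockwise from the forward (z) axis when seen
    from above. All default values are illustrative assumptions.
    """
    positions = []
    for i in range(count):
        azimuth = math.radians(start_deg + i * 360.0 / count)
        x = radius * math.sin(azimuth)  # right of the listener
        z = radius * math.cos(azimuth)  # in front of the listener
        positions.append((round(x, 3), height, round(z, 3)))
    return positions


if __name__ == "__main__":
    for index, position in enumerate(circle_positions()):
        print(f"audio source {index}: {position}")
```

With six sources the spacing is 60 degrees, which matches the hexagonal arrangement used during recording.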

To run this project in virtual reality, the HTC Vive was selected as the target platform for playback. SteamVR and VRTK provide the scripts and tools that connect the VR equipment to the Unity game engine. Since live mixing is an interactive operation, it was necessary to create a virtual interface that lets the user modify audio values in Unity. A graphical user interface (GUI) mixing panel was built and attached to the VR scene so that the user can see the faders that control the volume of each instrument. VRTK includes a controller event script that can assign a function to each button on the controllers. In this project, users hold the grip button to activate the mixing panel, then use the pointer’s laser to select a fader. Finally, the user moves the fader while holding the trigger button.
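The post does not say how a fader position is translated into a volume value, so the following is only one plausible sketch. It assumes the fader reports a position between 0 and 1 and maps it onto a decibel scale before converting to the linear 0 to 1 gain that an audio source volume setting typically expects. The function name and the -60 dB floor are hypothetical choices.

```python
def fader_to_gain(position, min_db=-60.0, max_db=0.0):
    """Map a fader position in [0, 1] to a linear gain in [0, 1].

    The position is clamped, a fully lowered fader is treated as
    silence, and everything in between is interpolated in decibels.
    The dB range is an assumption, not a value from the project.
    """
    position = min(1.0, max(0.0, position))
    if position == 0.0:
        return 0.0  # fader at the bottom mutes the instrument
    db = min_db + position * (max_db - min_db)
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    for pos in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"fader {pos:.2f} -> gain {fader_to_gain(pos):.4f}")
```

Interpolating in decibels rather than linearly tends to make fader travel feel more even to the ear, which is why hardware and DAW faders are usually scaled this way. Returning 0.0 at the bottom of the travel also covers the play-along use case, where one instrument is muted entirely.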

(Implementing in Unity)
Related Background: Object-based Representation; Sound-field Representation
Altman, M., Krauss, K., Susal, J., & Tsingos, N. (2016). Immersive Audio for VR. Audio Engineering Society Conference Paper, Presented at the Conference on Audio for Virtual and Augmented Reality, Los Angeles, CA, 1-8.
Frank, M., Zotter, F., & Sontacchi, A. (2015). Producing 3D Audio in Ambisonics. Audio Engineering Society 57th International Conference, Hollywood, CA, pp.1-8.
Meyer, J., & Elko, G. W. (2004). Spherical Microphone Arrays for 3D Sound Recording, Audio Signal Processing for next-generation multimedia communication systems, New York: Springer Science Business Media, 2.
Nachbar, C., Zotter, F., Deleflie, E., & Sontacchi, A. (2011, June). AmbiX - a suggested ambisonics format. In Ambisonics Symposium, Lexington.
