ASPOD: Acoustic Source Position Overlay Device

Sound is one of the most important modes of communication for animals living in the ocean: light travels only several tens of meters underwater, even in good visibility, whereas sound can travel thousands of kilometers, depending on its frequency. Many marine animals, such as dolphins, whales and several species of fish, therefore use sound to forage, to communicate with conspecifics and, more generally, to sense their environment. To study interactions between animals of the same or of different species, it is essential to identify which individual is vocalizing and which is responding, and when. This becomes particularly difficult with larger groups, such as a pod of 50 or more dolphins in which many animals vocalize at the same time. Researchers observing animal behavior in the ocean normally film the animals with underwater cameras and analyze the recorded video later. These cameras are usually equipped with only one underwater microphone (hydrophone). Apart from visual cues that sometimes accompany vocalization, such as strings of bubbles from blowholes, such recordings offer no reliable way to identify who said what at what time.

To address this problem, ARL has developed a recording device that captures synchronized video and three channels of audio. The system is contained in an underwater housing and can be operated by a snorkeler or a diver while observing the animals. The three hydrophones are arranged in a triangular array at fixed separations of about 60 cm. Because the audio channels are synchronized, any sound arrives at the three hydrophones at slightly different times. These arrival-time differences can be used to calculate the direction from which the sound arrives relative to the hydrophone positions, expressed as an azimuth and an elevation angle. The camera is placed in the center of the array plane, and its field of view is known. The camera is also synchronized to the audio channels, so for each video frame we can take the corresponding audio segment, look for sound events in it, calculate the direction of origin of each, and place a marker at that location in the frame if it falls within the camera's field of view. The individual frames are then reassembled into a video with the sound locations overlaid. The system can track several vocalizing animals at the same time (for example, echolocating dolphins) and assign the recorded vocalizations to the correct individual.
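
The chain from arrival-time differences to a marker in the video frame can be illustrated with a short Python sketch. It is not the ASPOD implementation, and it rests on several assumptions that the description above does not state: the hydrophones form an equilateral triangle centered on the camera, the source is far enough away that the sound arrives as a plane wave, the speed of sound is a nominal 1500 m/s, delays are estimated by cross-correlation, and the camera behaves like an ideal pinhole camera. All function names and parameters are hypothetical. A planar array also cannot tell front from back, so the sketch assumes the source is in front of the camera.

```python
import numpy as np

SPEED_OF_SOUND = 1500.0  # m/s, assumed nominal value for seawater
SIDE = 0.60              # m, hydrophone separation quoted in the text

# Assumed layout: equilateral triangle in the x-y plane (x right, y up),
# centered on the camera, which looks along +z.
R = SIDE / np.sqrt(3.0)  # circumradius of the triangle
HYDROPHONES = np.array([
    [0.0, R],
    [-SIDE / 2.0, -R / 2.0],
    [SIDE / 2.0, -R / 2.0],
])


def estimate_delay(a, b, sample_rate):
    """Delay of channel a relative to channel b, in seconds.

    Positive means the sound reached a later than b.
    """
    corr = np.correlate(a, b, mode="full")
    lag = int(np.argmax(corr)) - (len(b) - 1)
    return lag / sample_rate


def direction_from_delays(tau_10, tau_20):
    """Unit vector toward the source.

    tau_10 and tau_20 are the delays of hydrophones 1 and 2 relative to
    hydrophone 0. Plane-wave model: tau_i0 = -((p_i - p_0) . u) / c.
    """
    baselines = HYDROPHONES[1:] - HYDROPHONES[0]          # shape (2, 2)
    rhs = -SPEED_OF_SOUND * np.array([tau_10, tau_20])
    ux, uy = np.linalg.solve(baselines, rhs)
    # Noise can push ux^2 + uy^2 slightly above 1, so clamp at zero.
    uz = np.sqrt(max(0.0, 1.0 - ux**2 - uy**2))
    return np.array([ux, uy, uz])


def angles_deg(u):
    """Azimuth (positive to the right) and elevation (positive up), in degrees."""
    azimuth = np.degrees(np.arctan2(u[0], u[2]))
    elevation = np.degrees(np.arcsin(np.clip(u[1], -1.0, 1.0)))
    return azimuth, elevation


def to_pixel(u, width, height, hfov_deg):
    """Pixel position of direction u, or None if it is outside the frame."""
    if u[2] <= 0.0:
        return None
    focal = (width / 2.0) / np.tan(np.radians(hfov_deg) / 2.0)
    x = width / 2.0 + focal * u[0] / u[2]
    y = height / 2.0 - focal * u[1] / u[2]   # image y grows downward
    if 0.0 <= x < width and 0.0 <= y < height:
        return x, y
    return None
```

For each video frame, one would call estimate_delay on the matching audio segment for two hydrophone pairs, pass the two delays to direction_from_delays, and draw a marker at the position returned by to_pixel whenever it is not None. In this model a source straight ahead of the camera produces zero delays and maps to the center of the frame.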

The new technique therefore makes it possible, for the first time, to reliably identify sound-producing animals, opening the door to a much larger area of behavioral research and observation that was previously very difficult to access.