Signal detection¶
In this package, signal detection refers to the general act of ‘recognising’ the signal of interest within a given channel.
Expected Inputs¶
The input to all the signal detection algorithms is the raw multichannel audio data. The most important requirement is that all the channels are synchronised!
Outputs to expect¶
Each algorithm will at least provide the list of detected signals, and may optionally produce additional information too (e.g. confidence intervals, prediction probabilities, etc.). All detected signals are output in a list with sublists holding the candidate signal regions for each channel. The sublists are in the same order as the audio files. For example:
[ [(0.1,0.3), (0.5,0.8), (0.9,1.2)],
[(0.15,0.35), (0.23, 0.45) ],
[(0.2,0.9), (.9,1.2), (1.23,1.26)],
[(0.1,0.7), (1.,1.1), (1.29,1.38)]
]
This corresponds to a 4-channel audio input. There are 3, 2, 3 and 3 candidate signal regions detected in the 1st to 4th channels respectively.
Each candidate signal region consists of a tuple with the start and end time in the audio channel. For example:
# let's take the same signal detections above, but only look at the 1st channel:
[ [(0.1,0.3), (0.5,0.8), (0.9,1.2)],
.......
]
The candidate region (0.1,0.3) has a start time of 0.1 s and an end time of 0.3 s. This means the signal of interest
is located between these two time stamps in the first channel. In all, there are three candidate regions in the first channel,
starting at 0.1, 0.5 and 0.9 seconds.
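As a small illustration of how this nested output can be walked through, the sketch below counts the candidate regions and computes their durations per channel. The helper name `summarise_detections` is made up for this example and is not part of the package.

```python
# Example output for a 4-channel recording (values copied from the example above).
all_detections = [
    [(0.1, 0.3), (0.5, 0.8), (0.9, 1.2)],
    [(0.15, 0.35), (0.23, 0.45)],
    [(0.2, 0.9), (0.9, 1.2), (1.23, 1.26)],
    [(0.1, 0.7), (1.0, 1.1), (1.29, 1.38)],
]


def summarise_detections(detections):
    """Return (channel index, number of regions, durations in s) for each channel.

    Hypothetical helper, written only to illustrate the output format.
    """
    summary = []
    for channel_index, regions in enumerate(detections):
        # each region is a (start, stop) tuple in seconds
        durations = [round(stop - start, 6) for start, stop in regions]
        summary.append((channel_index, len(regions), durations))
    return summary


summary = summarise_detections(all_detections)
for channel_index, n_regions, durations in summary:
    print(f"channel {channel_index}: {n_regions} regions, durations (s): {durations}")
```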
The simplest method: threshold based detection¶
Whenever the audio goes beyond a particular level (in RMS), a signal is considered detected. This is a common (and relatively robust) algorithm that’s used in many programs.
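The idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the package's own implementation: the function name `threshold_regions`, the centred moving-average window and the default values are all assumptions made for the example.

```python
import numpy as np


def threshold_regions(audio, fs, threshold_dbrms=-50.0, window_s=0.001):
    """Return (start, stop) times in seconds where the moving dB rms exceeds a threshold.

    Illustrative sketch only; names and defaults are assumptions.
    """
    window = max(1, int(round(window_s * fs)))
    # moving mean of the squared samples, then square root -> moving rms
    kernel = np.ones(window) / window
    mean_square = np.convolve(audio ** 2, kernel, mode="same")
    rms = np.sqrt(mean_square)
    dbrms = 20 * np.log10(rms + 1e-12)  # small offset avoids log10(0)
    above = dbrms > threshold_dbrms
    # rising and falling edges of the boolean mask mark region starts and stops
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [(float(s) / fs, float(e) / fs) for s, e in zip(starts, stops)]


# A silent clip with one 2 kHz tone burst between 0.03 s and 0.05 s
fs = 10000
t = np.arange(1000) / fs
audio = np.zeros(1000)
audio[300:500] = 0.5 * np.sin(2 * np.pi * 2000 * t[300:500])
regions = threshold_regions(audio, fs, threshold_dbrms=-20.0, window_s=0.001)
print(regions)
```

The detected region is slightly wider than the burst itself, because the moving window already overlaps the burst a few samples before its onset and after its end.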
One thing to consider with the threshold method is that the signal may not be received with the same intensity on all channels (e.g. because of microphone directionality). A fixed threshold is then crossed at different points of the signal in each channel, so the ‘wrong parts’ of a signal end up being cross-correlated, leading to poor localisations. How to overcome this problem?
Less simple methods¶
Threshold-based methods can lead to false positives when there are similar non-target sounds in the same frequency range (e.g. two species calling at the same time). Here there are a bunch of options.
Signal Detection API¶
Deals with the actual detection of signals in multichannel audio files. There are two problems that need to be solved while detecting a signal of interest:
within-channel signal detection
across-channel correspondence matching
Within-channel signal detection¶
This task involves locally checking if there are any signals of interest in one channel at a time. The exact method used for within-channel detection can be set by the user, though the simplest is of course a basic threshold-type detector. Whenever the signal goes beyond a particular threshold, a signal is considered to be in that region.
Built-in detection routines¶
The detection module has a few simple detection routines. More advanced routines are unlikely to form a core part of the package, and need to be written by the user.
1. dBrms_detector : Calculates the moving dB rms profile of an audio clip. The user needs to define the size of the moving window and the threshold in dB rms.
2. envelope_detector : Generates the Hilbert envelope of the audio clip. Regions above the set threshold in dB peak amplitude are defined as detections. This method is faster than the dBrms_detector.
-
batracker.signal_detection.detection.cross_channel_threshold_detector(multichannel, fs, **kwargs)¶ - Parameters
multichannel (np.array) – Msamples x Nchannels audio data
fs (float >0) – Sampling rate of the audio in Hz.
detector_function (function, optional) – The function used to detect the start and end of a signal. Any custom detector function can be given. Its compulsory inputs are the audio np.array and the sample rate, and it should accept keyword arguments (even if it doesn’t use them). Defaults to dBrms_detector.
- Returns
all_detections – A list with sublists containing start-stop times of the detections in each channel. Each sublist contains the detections in one channel.
- Return type
list
Notes
For further keyword arguments see the threshold_detector function
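To illustrate the contract described for detector_function (audio array, sample rate, and keyword arguments), here is a hedged sketch of a custom detector. Both `peak_fraction_detector` and `detect_per_channel` are hypothetical names: the latter is only a stand-in for the per-channel loop, so that the example runs without the package installed, and is not the package's implementation.

```python
import numpy as np


def peak_fraction_detector(one_channel, fs, **kwargs):
    """Hypothetical custom detector: flags samples at or above a fraction of the channel peak."""
    fraction = kwargs.get("peak_fraction", 0.5)  # made-up keyword for this example
    rectified = np.abs(one_channel)
    peak = rectified.max()
    if peak == 0:
        return []  # a silent channel has no detections
    above = rectified >= fraction * peak
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [(float(s) / fs, float(e) / fs) for s, e in zip(starts, stops)]


def detect_per_channel(multichannel, fs, detector_function, **kwargs):
    """Stand-in for a per-channel loop: one sublist of detections per column."""
    return [
        detector_function(multichannel[:, channel], fs, **kwargs)
        for channel in range(multichannel.shape[1])
    ]


# Msamples x Nchannels test data with one rectangular pulse per channel
fs = 1000
multichannel = np.zeros((1000, 2))
multichannel[100:200, 0] = 1.0
multichannel[400:450, 1] = 0.2
detections = detect_per_channel(multichannel, fs, peak_fraction_detector, peak_fraction=0.5)
print(detections)
```

With the package installed, a function of this shape would presumably be passed through the detector_function keyword, e.g. `cross_channel_threshold_detector(multichannel, fs, detector_function=peak_fraction_detector)`.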
See also
-
batracker.signal_detection.detection.dBrms_detector(one_channel, fs, **kwargs)¶ Calculates the dB rms profile of the input audio and selects regions where the profile is above the threshold.
- Parameters
one_channel –
fs –
dbrms_threshold (float, optional) – Defaults to -50 dB rms
dbrms_window (float, optional) – The window which is used to calculate the dB rms profile in seconds. Defaults to 0.001 seconds.
- Returns
detections – Each tuple corresponds to a candidate signal region
- Return type
list with tuples
-
batracker.signal_detection.detection.envelope_detector(audio, fs, **kwargs)¶ Generates the Hilbert envelope of the audio. Signals are detected wherever the envelope goes beyond a user-defined threshold value.
The two main options are to segment loud signals with reference to dB peak or with reference to dB above floor level.
- Parameters
audio –
fs –
- Keyword Arguments
threshold_db_floor (float, optional) – The threshold for signal detection in dB above the floor level. The 5th percentile of the whole envelope is chosen as the floor level. If not specified, then threshold_dbpeak is used to segment signals.
threshold_dbpeak (float, optional) – The value beyond which a signal is considered to start. Used only if relative_to_baseline is True.
lowpass_durn (float, optional) – The highest time-resolution of envelope fluctuation to keep. This effectively performs a low-pass at 1/lowpass_durn Hz on the raw envelope signal.
- Returns
- Return type
regions_above_timestamps
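The floor-referenced option can be sketched as follows. This is an illustration under stated assumptions rather than the package's code: the function names are invented, the analytic signal is built with a plain FFT (the same construction that scipy.signal.hilbert uses), and a simple moving average of length `smooth_s` stands in for the low-pass step described for lowpass_durn.

```python
import numpy as np


def hilbert_envelope(x):
    """Magnitude of the analytic signal, built by zeroing the negative frequencies."""
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1
    if n % 2 == 0:
        weights[n // 2] = 1
        weights[1:n // 2] = 2
    else:
        weights[1:(n + 1) // 2] = 2
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))


def envelope_regions(audio, fs, threshold_db_floor=30.0, smooth_s=0.001):
    """Return (start, stop) times where the envelope is threshold_db_floor dB above the floor."""
    envelope = hilbert_envelope(audio)
    window = max(1, int(round(smooth_s * fs)))
    # moving average as a stand-in for the low-pass filter (assumption)
    envelope = np.convolve(envelope, np.ones(window) / window, mode="same")
    floor = max(np.percentile(envelope, 5), 1e-12)  # 5th percentile as floor level
    db_above_floor = 20 * np.log10(np.maximum(envelope, 1e-12) / floor)
    above = db_above_floor > threshold_db_floor
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [(float(s) / fs, float(e) / fs) for s, e in zip(starts, stops)]


# Low-level noise with a Hann-windowed 1 kHz burst between 0.4 s and 0.6 s
rng = np.random.default_rng(0)
fs = 10000
t = np.arange(fs) / fs
audio = 0.001 * rng.standard_normal(fs)
burst = slice(4000, 6000)
audio[burst] += 0.5 * np.hanning(2000) * np.sin(2 * np.pi * 1000 * t[burst])
regions = envelope_regions(audio, fs, threshold_db_floor=30.0)
print(regions)
```

Because the floor is estimated from the recording itself, the same threshold value adapts to channels with different background noise levels. Near the burst edges, where the envelope sits close to the threshold, noise may split the detection into more than one region.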
-
batracker.signal_detection.detection.get_start_stop_times(findobjects_tuple, fs)¶
-
batracker.signal_detection.detection.moving_rms(X, **kwargs)¶ Calculates moving rms of a signal with given window size. Outputs np.array of same size as X. The rms of the last few samples <= window_size away from the end are assigned to last full-window rms calculated.
- Parameters
X (np.array) – Signal of interest.
window_size (int, optional) – Defaults to 125 samples.
- Returns
all_rms – Moving rms of the signal.
- Return type
np.array
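The edge handling described above can be sketched as follows. This is one possible reading, not the package's code: the name `moving_rms_sketch` is hypothetical, and the choice that each window starts at the sample it is assigned to is an assumption.

```python
import numpy as np


def moving_rms_sketch(X, window_size=125):
    """Moving rms with the same length as X; the tail reuses the last full-window value.

    Illustrative sketch: window i is assumed to cover samples i .. i + window_size - 1.
    """
    X = np.asarray(X, dtype=float)
    if X.size < window_size:
        raise ValueError("signal is shorter than the window")
    # cumulative sum of squares gives every full-window sum in one pass
    cumulative = np.concatenate(([0.0], np.cumsum(X ** 2)))
    mean_square = (cumulative[window_size:] - cumulative[:-window_size]) / window_size
    rms_full = np.sqrt(np.maximum(mean_square, 0.0))  # guard against tiny negatives
    all_rms = np.empty(X.size)
    all_rms[:rms_full.size] = rms_full
    # samples too close to the end for a full window get the last full-window rms
    all_rms[rms_full.size:] = rms_full[-1]
    return all_rms


result = moving_rms_sketch(np.array([0, 0, 0, 0, 2, 2, 2, 2]), window_size=4)
print(result)
```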