The effect of anger on automated forensic speaker identification processes based on long-term speech spectra


  • Manuel Ortega-Rodríguez Escuela de Física y Centro de Investigaciones Geofísicas, Universidad de Costa Rica https://orcid.org/0000-0003-3070-5530
  • Hugo Solís-Sánchez Escuela de Física y Centro de Investigaciones Geofísicas, Universidad de Costa Rica https://orcid.org/0000-0001-8465-3786
  • Diana Valverde-Méndez Department of Physics, Princeton University
  • Ariadna Venegas-Li Physics Department, University of California at Davis https://orcid.org/0000-0002-8660-8513




automated speaker identification, long term spectra, forensic acoustics, emotional distortions, anger


Forensic speaker identification has traditionally regarded approaches based on long-term spectra (computed over a few tens of seconds of speech) as especially robust: they work well for short recordings, are insensitive to changes in the intensity of the sample, and continue to function in the presence of noise and a limited passband. For this reason, long-term spectrum analysis is one of the preferred tools for forensic speaker identification, alongside formant analysis, speech rate, and determination of the fundamental frequency. We find, however, that anger significantly distorts the acoustic signal as far as long-term spectrum analysis is concerned. Even moderate anger shifts speaker identification results by 33% in the direction of a different speaker altogether (in the space of sample correlations). Thus, caution should be exercised when applying this tool.
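
As an illustration of the kind of comparison described above, the following is a minimal sketch of a long-term spectrum and a correlation between two samples. It is not the authors' pipeline, which this abstract does not specify. The function names, the frame length, the hop size, the Hann window, and the use of a Pearson correlation on dB spectra are all assumptions made for the example.

```python
import numpy as np


def long_term_spectrum(signal, frame_len=1024, hop=512):
    """Average power spectrum in dB over all Hann-windowed frames.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    if n_frames < 1:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    power = np.zeros(frame_len // 2 + 1)
    for i in range(n_frames):
        start = i * hop
        frame = signal[start:start + frame_len] * window
        # Accumulate the power spectrum of each frame.
        power += np.abs(np.fft.rfft(frame)) ** 2
    power /= n_frames
    # Small floor avoids log(0) on silent bins.
    return 10.0 * np.log10(power + 1e-12)


def spectral_correlation(spec_a, spec_b):
    """Pearson correlation between two long-term spectra (assumed metric)."""
    return float(np.corrcoef(spec_a, spec_b)[0, 1])
```

Under these assumptions, scaling a recording by a constant gain only adds a constant offset to its dB spectrum, so the Pearson correlation with the original is essentially unchanged. This is consistent with the insensitivity to sample intensity mentioned above.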


