

This is the final post about the use of voices and AI. As we have seen in the prior posts, voice analytics can play a role in numerous areas. In criminal investigations, it can do much more. One of the leading scientists in the field of voice recognition is Rita Singh at Carnegie Mellon University’s Language Technologies Institute. Over more than 20 years, she and her team have developed techniques to extract a great deal of intelligence from a small sample of a person’s voice. Simon Brandon, reporting from the World Economic Forum in 2018, explained,

The techniques developed by Singh and her colleagues at Carnegie Mellon analyse and compare tiny differences, imperceptible to the human ear, in how individuals articulate speech. They then break recorded speech down into tiny snippets of audio, milliseconds in duration, and use AI techniques to comb through these snippets looking for unique identifiers. Your voice can give away plenty of environmental information, too. For example, the technology can guess the size of the room in which someone is speaking, whether it has windows and even what its walls are made of. Even more impressively, perhaps, the AI can detect signatures left in the recording by fluctuations in the local electrical grid and can then match these to specific databases to give a very good idea of the caller’s physical location and the exact time of day they picked up the phone.[i]
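
The first step Brandon describes, breaking recorded speech into snippets only milliseconds long, can be illustrated with a short Python sketch. This is not Singh's actual method, which the sources do not detail. It is a minimal illustration of the general idea, and the function names (`split_into_frames`, `frame_energy`), the 20 ms frame length, the 10 ms step, and the synthetic tone standing in for a recording are all assumptions made for the example.

```python
import math


def split_into_frames(samples, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Split audio samples into short, overlapping frames.

    frame_ms and hop_ms are illustrative defaults, not values from the article.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame and hop must each cover at least one sample")
    frames = []
    # Step through the recording, keeping only complete frames.
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames


def frame_energy(frame):
    """Average squared amplitude of one frame, a very simple per-snippet feature."""
    return sum(x * x for x in frame) / len(frame)


# A synthetic one-second, 220 Hz tone sampled at 8 kHz stands in for a real recording.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate)]

frames = split_into_frames(signal, sample_rate)
energies = [frame_energy(f) for f in frames]
print(len(frames), "frames of", len(frames[0]), "samples each")
```

A real system would compute far richer features from each snippet than average energy. The point here is only that a single second of audio already yields many short snippets to analyze.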

In 2014, the U.S. Coast Guard was working on a case where an unknown person had made 28 false distress telephone calls. The emergency responses to the calls cost an estimated $500,000.[ii] The investigators had little to go on other than the recordings of the emergency calls. They reached out to Ms. Singh at Carnegie Mellon. Using nothing more than the voice recordings, she was able to determine the hoax caller’s age, height, and weight.[iii] The case is ongoing, but having this information provided valuable clues for the investigation.

Ms. Singh, through years of research, has learned that the voice carries information that predicts demographic, environmental, medical, physical, physiological, and other characteristics of the speaker. The tiny snippets of human voice patterns are called microsignatures, and they can be used to create profiles of people. Ms. Singh acknowledges the technology is not perfect. For example, age can only be predicted to within a three-year range. However, research is improving the predictive capability and opening up new areas. McCormick reported,

Ms. Singh and her team recently demonstrated a system that could reconstruct 60% to 70% of a person’s face just from their voice, she says. Ms. Singh says voice-analysis technology still has a long way to go, but its potential is enormous. “It would enable machines to understand humans a lot better than perhaps even humans can,” she says.[iv]

Summary

AI and related technologies have remarkable power to uncover a great deal from human voices, including who we are and what our emotions are at a given moment. The tools can empower others to identify and profile us. The concerns about privacy are obvious. On the other hand, we have seen in these posts how voice analytics can enhance mental health treatment, keep drivers awake, fight heart disease, improve the call center experience, improve job recruiting, combat fraud, and help investigate crimes. The challenge facing technologists and policymakers will be to strike a balance between these benefits and the privacy risks that the public can accept.

[i] Simon Brandon, “How to Catch a Criminal Using Only Milliseconds of Audio,” World Economic Forum (2018), https://www.weforum.org/agenda/2018/01/catch-criminal-milliseconds-audio-rita-singh-carnegie/
[ii] McCormick, “What AI Can Tell from Listening to You”.
[iii] Brandon, “How to Catch a Criminal Using Only Milliseconds of Audio”.
[iv] McCormick, “What AI Can Tell from Listening to You”.