What Does a Cough Look Like? – Some Examples
Okay, so sound is pretty to look at, and it contains a lot of information in its visual form. But what does that have to do with coughing and what does a cough look like?
A lot, actually. Converting the sound of a cough to a visual representation allows for the use of advanced image recognition techniques. This has applications not only in differentiating between a cough and background noise, but also potentially in using each cough’s acoustic signature to better understand the health of the person who coughs.
By teaching AI how to distinguish between a cough and ambient noise, we can track cough frequency over time and space. And since visualized audio offers so much data (which we over-efficient humans don’t even notice), we can also teach AI to recognize patterns in coughs. Whereas a human might categorize a cough in simplistic terms like “wet” or “dry”, a machine can generate categories which are imperceptible to humans, unbounded by our linguistic limitations, far beyond our abilities.
But to get there, we have to turn sounds into pictures. Let’s have a look.
Keyboard Clickety-Clackety
Here is some clickety-clackety on a keyboard:
You’re probably used to seeing the waveform of a sound (on the left), and not the spectrogram (right). In both, you can clearly distinguish the 4 clicks. But volume alone is not sufficient for high-quality classification. We also need frequency (pitch). A waveform plot only shows amplitude over time. A spectrogram plots frequency against time and uses color as a third dimension, showing how loud each frequency is at each moment.
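For the curious, here is a minimal sketch of how a spectrogram can be computed from a waveform, using nothing but NumPy. It is not the code behind the images in this post. The "clicks" are synthetic noise bursts standing in for a real recording, and the function name, frame length, and hop size are all illustrative assumptions.

```python
import numpy as np

def spectrogram_db(signal, sample_rate, frame_len=512, hop=128):
    """Short-time FFT of a 1-D signal: returns (times, freqs, power in dB)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    power_db = 10.0 * np.log10(power + 1e-12)  # small offset avoids log(0)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, power_db.T  # rows = frequency, columns = time

# Synthetic stand-in for four keyboard clicks (assumed, not real audio):
# one second of silence with four 10 ms decaying noise bursts.
sample_rate = 16000
rng = np.random.default_rng(0)
signal = np.zeros(sample_rate)
for start_s in (0.1, 0.3, 0.5, 0.7):
    i = int(start_s * sample_rate)
    n = int(0.01 * sample_rate)
    signal[i : i + n] += rng.standard_normal(n) * np.exp(-np.linspace(0, 5, n))

times, freqs, spec_db = spectrogram_db(signal, sample_rate)
```

Each column of `spec_db` is one moment in time and each row is one frequency, so passing it to an image-plotting function (with color mapped to the dB values) gives the kind of picture shown on the right.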
A Baby’s Squeal
Let’s have a look at another sound, a baby’s squeal:
In terms of decibels only, it looks like this:
But the primary differentiator between a baby’s squeal and an adult’s (do adults squeal?) is not volume, but pitch. Thus, the utility of the spectrogram.
Coughs
Let’s have a look at 3 coughs, from 3 different people. You’ll note some similarities, both audible and visible, across all coughs. They start with an explosive increase in volume, and fade out more slowly than they fade in.
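That "fast in, slow out" shape can be put into numbers. The sketch below measures how long a sound takes to rise to its loudest point and how long it takes to die away again. The burst is a synthetic, cough-shaped piece of noise (an assumption for illustration, not a real cough), and the function names and the 10% threshold are made up for this example.

```python
import numpy as np

def rms_envelope(signal, frame_len=400, hop=200):
    """Loudness over time: root-mean-square amplitude of overlapping frames."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.sqrt(np.mean(signal[i * hop : i * hop + frame_len] ** 2))
        for i in range(n_frames)
    ])

def attack_and_decay_seconds(envelope, hop, sample_rate, threshold=0.1):
    """Seconds from first crossing threshold*peak to the peak, and from the peak to the last crossing."""
    peak = int(np.argmax(envelope))
    above = np.flatnonzero(envelope >= threshold * envelope[peak])
    onset, offset = above[0], above[-1]
    frame_s = hop / sample_rate
    return (peak - onset) * frame_s, (offset - peak) * frame_s

# Synthetic cough-shaped burst (assumed): noise that ramps up over 20 ms,
# then decays exponentially, with 0.1 s of silence on either side.
sample_rate = 16000
rng = np.random.default_rng(0)
t = np.arange(int(0.5 * sample_rate)) / sample_rate
shape = np.where(t < 0.02, t / 0.02, np.exp(-(t - 0.02) / 0.08))
burst = rng.standard_normal(len(t)) * shape
silence = np.zeros(int(0.1 * sample_rate))
signal = np.concatenate([silence, burst, silence])

envelope = rms_envelope(signal)
attack, decay = attack_and_decay_seconds(envelope, hop=200, sample_rate=sample_rate)
print(f"attack: {attack * 1000:.0f} ms, decay: {decay * 1000:.0f} ms")
```

For a sound with this shape, the decay time comes out several times longer than the attack time, which is the asymmetry described above.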
Cough 1 (below) is a prolonged cough with a slight uptick in volume towards the end, with one final expiratory contraction from the diaphragm.
Cough 2 (below) is more archetypical: a steep, abrupt explosion of sound followed by a diminuendo.
Cough 3 (below) is also fairly typical, albeit less pronounced than 2, both in terms of duration and decibel variation.
What’s notable about coughs 2 and 3 is how similar their decibel profiles are, and how different their frequency content is.
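To see how two sounds can match in loudness yet differ in frequency, here is a small sketch with two synthetic tones (assumed stand-ins, not the coughs above). Both share exactly the same fading envelope, but one is low-pitched and one is high-pitched. The "spectral centroid" used here is a common rough summary of where a sound's frequency content sits; the function names are illustrative.

```python
import numpy as np

def rms_db(signal):
    """Overall level of a signal in decibels (relative to an amplitude of 1)."""
    return 20.0 * np.log10(np.sqrt(np.mean(signal ** 2)) + 1e-12)

def spectral_centroid_hz(signal, sample_rate):
    """Magnitude-weighted average frequency of the whole signal."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

# Two synthetic sounds (assumed): same decaying envelope, different pitch.
sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate  # half a second
envelope = np.exp(-t / 0.1)
low = envelope * np.sin(2 * np.pi * 300 * t)
high = envelope * np.sin(2 * np.pi * 2000 * t)

level_low, level_high = rms_db(low), rms_db(high)
centroid_low = spectral_centroid_hz(low, sample_rate)
centroid_high = spectral_centroid_hz(high, sample_rate)
print(f"level:    {level_low:.1f} dB vs {level_high:.1f} dB")
print(f"centroid: {centroid_low:.0f} Hz vs {centroid_high:.0f} Hz")
```

The two levels come out nearly identical while the centroids are far apart. A decibel-only view cannot tell these sounds apart, but a view that includes frequency can.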
Wrap-up
Maybe by now the novelty of seeing sounds as spectrograms has worn off. But that feeling you have – that desire to go do something else rather than keep staring at spectrograms – computers don’t get that feeling. And that’s why we use computers, and not humans, to do deep learning. They can look at thousands and thousands of images of sounds and detect patterns in them, patterns which we are neither patient nor detail-oriented enough to perceive. Once they’ve seen enough examples, they can “predict” on images (made from sounds) they’ve never seen. Just as a child can hear a bark from a dog breed they’ve never encountered and still say “that’s a bark”, a computer can be trained to detect a cough and, with time and sufficient examples, perhaps differentiate between different kinds of coughs. There are a lot of practical applications to this, ranging from diagnostics, to medication adherence, to public health surveillance. But it all starts with turning sound into pictures.