ElroyJetson
Getting tired of all the stupidity.
It's interesting that the very nature of IMBE-family codecs actually makes the task of breaking encryption a great deal easier, at least in theory.
The known silence attack is workable. Not cheap in computing power terms, but workable.
Any codec that works like IMBE has a finite list of sound symbols, and all speech is composed of whichever symbols are the "best fit" to the original audio. Send that to encryption and you get a relatively limited range of possible output data compared to, say, linear PCM audio going into the same encryption scheme.
I don't know the actual numbers, but presume for a moment that the latest version of IMBE audio has 2400 "building blocks" from which all audio is synthesized. That's a lot of blocks to correlate, but it's a lot less than what you'd face if the analog audio were sampled as linear PCM and encrypted directly.
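To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The 2400 figure is the presumed one from above, not a real IMBE parameter, and the PCM settings (8 kHz, 16-bit, 20 ms frames) are assumptions picked only for illustration.

```python
import math

codebook_size = 2400      # presumed figure from the post, not a real IMBE parameter
sample_rate_hz = 8000     # assumption: telephone-grade sampling rate
bits_per_sample = 16      # assumption: 16-bit linear PCM
frame_ms = 20             # assumption: one frame covers 20 ms of audio

# Bits needed to pick one block out of the codebook.
codec_bits = math.log2(codebook_size)

# Bits in the same stretch of raw PCM audio.
pcm_bits = sample_rate_hz * frame_ms // 1000 * bits_per_sample

print(f"codec frame: about {codec_bits:.1f} bits of choice")
print(f"PCM frame:   {pcm_bits} bits of choice")
```

Under those assumptions the codec picks from a few thousand possibilities per frame, while the PCM frame could be any of 2^2560 bit patterns.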
Take that to extremes in a theoretical sense. Say that you have a codec that only outputs TWO possible data values. Then encrypt that output. It'll be blatantly obvious by looking at the encrypted output that there are only two values encrypted. What the specific data strings of those output values are does not matter. What matters is that you can differentiate them. Try a billion different encryption keys and the output will STILL consist of two clearly differentiated values, and no more and no less.
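A minimal sketch of the two-value thought experiment, under one big assumption: every frame is encrypted on its own and deterministically, with no per-frame IV or running keystream. `toy_encrypt` is a made-up stand-in (a keyed shuffle of the 256 byte values), not any real radio cipher.

```python
import random

def toy_encrypt(frames, key):
    """Hypothetical deterministic per-frame cipher: a keyed permutation of byte values."""
    perm = list(range(256))
    random.Random(key).shuffle(perm)   # the key decides the permutation
    return [perm[f] for f in frames]

# A "codec" that only ever emits two values, here 0x00 and 0xFF.
source = random.Random(0)
frames = [0x00, 0xFF] + [source.choice((0x00, 0xFF)) for _ in range(1000)]

# Encrypt the same stream under 1000 different keys and count how many
# distinct ciphertext values show up each time.
counts = {len(set(toy_encrypt(frames, key))) for key in range(1000)}
print(counts)
```

Every key gives exactly two distinct ciphertext values, because a permutation can rename the two symbols but can't merge or split them. If the cipher mixes in a per-frame IV or keystream, this stops being true.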
You may not HAVE to fully decrypt. In some cases, you only have to figure out the patterns in the output that repeat. Then you can start correlating the different output data strings to such things as common human speech patterns. Given a long enough sample time, you should be able to know what's being said by processing the output even WITHOUT having the decryption key.
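The correlation idea can be sketched as a plain frequency analysis. Everything here is assumed for illustration: a six-symbol codec, made-up usage statistics that the attacker is presumed to know, and the same kind of hypothetical deterministic per-frame cipher (`toy_encrypt`, a keyed shuffle of byte values). Real speech statistics are nowhere near this lopsided.

```python
import random
from collections import Counter

SYMBOLS = [10, 20, 30, 40, 50, 60]   # hypothetical codec frame values, most common first
WEIGHTS = [243, 81, 27, 9, 3, 1]     # assumed usage statistics known to the attacker

def toy_encrypt(frames, key):
    """Hypothetical deterministic per-frame cipher: a keyed permutation of byte values."""
    perm = list(range(256))
    random.Random(key).shuffle(perm)
    return [perm[f] for f in frames]

def recover_by_frequency(ciphertext, symbols_by_rank):
    """Guess the plaintext by matching ciphertext values to symbols in order of frequency."""
    ranked = [value for value, _ in Counter(ciphertext).most_common()]
    guess = dict(zip(ranked, symbols_by_rank))
    return [guess[c] for c in ciphertext]

source = random.Random(1)
frames = source.choices(SYMBOLS, weights=WEIGHTS, k=20000)
ciphertext = toy_encrypt(frames, key=987654321)

# The attacker sees only `ciphertext` and the assumed statistics, never the key.
recovered = recover_by_frequency(ciphertext, SYMBOLS)
accuracy = sum(a == b for a, b in zip(frames, recovered)) / len(frames)
print(f"frames recovered without the key: {accuracy:.1%}")
```

The longer the sample, the more reliable the frequency ranking gets, which is the "long enough sample time" part.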