A solution would have to be along the lines of using a virtual audio cable to pipe the UniTrunker output to a program that listened for the end of message signal. It would then mute the audio until UniTrunker issued a new SDRSHARPTRUNKING.log file, which is where the voice channel activity is printed.
No. The low speed datastream and disconnect words are not present in that recording, i.e. Unitrunker has already filtered them out. If that filtering can be disabled, then yes, one could take that approach.
It would be better to do this in the decoder itself. While the disconnect words are being sent there is no other audio present apart from noise (no user is talking over them), so only minimal noise filtering is required (and should be done anyway for audio sent to the speakers). The processing is simple, and audio samples can be converted directly to bits.

The data pattern being sent is a repeating 10101100 sequence at 300 bps. To detect it, take the noise-filtered audio stream and store it in a short circular buffer. Scanning the buffer yields the highest and lowest sample values. Their mean forms an artificial zero line; this negates the effects of tuning errors, which create a DC shift in the FM-demodulated audio. Walk through the buffer in steps that match the number of audio samples per low speed data bit, so the step size depends on the audio sampling rate in use (at 48 kHz, for example, it is 48000 / 300 = 160 samples). For each audio sample stepped on, use the artificial zero line to convert the sample to a zero or one bit and shift that bit into a 32-bit shift register.

After 32 bits have been shifted in, if you are processing disconnect words, and if the first bit you shifted in was at the beginning of a 10101100 sequence, your shift register will contain four copies of 00110101 (bit-reversed, because the first bit shifted in ends up in the least significant position), i.e. the value 0x35353535. If your shift register ever holds 0x35353535, the call is over: mute it.

You also need to deal with misalignment. You may not be stepping on the bit centers, and you may not have started on the first bit of a disconnect word, so you have to keep checking as you add new audio samples to the buffer. If you're concerned about false positives (I wouldn't be), use a 64-bit shift register on twice the data. Some efficiencies can be had: use a weighted sum to constantly adjust your zero line, and use it to buffer bits rather than audio samples.
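To make the buffer scan concrete, here is a rough Python sketch of the steps described above. It is only an illustration under my own assumptions, not code from any decoder: the names `detect_disconnect` and `lsd_test_signal` are made up, the audio is assumed to be already noise filtered, and new bits are assumed to enter at the top of the register so that the first bit ends up in the least significant position (which is what produces 0x35353535). Misalignment is handled by brute force, trying every starting offset within one bit period.

```python
from collections import deque

LSD_BIT_RATE = 300            # low speed data rate from the post, bits per second
DISCONNECT_WORD = 0x35353535  # four copies of 10101100, first bit in the LSB


def detect_disconnect(buffer, sample_rate):
    """Return True if the buffered audio contains four disconnect words."""
    data = list(buffer)
    if not data:
        return False
    # Artificial zero line: midpoint of the extremes, which cancels a DC shift.
    zero = (max(data) + min(data)) / 2.0
    step = sample_rate / LSD_BIT_RATE  # audio samples per data bit
    # Try every starting offset within one bit period so that at least one
    # pass samples away from the bit edges.
    for offset in range(int(step)):
        reg = 0
        k = 0
        while offset + int(k * step) < len(data):
            bit = 1 if data[offset + int(k * step)] > zero else 0
            # New bit enters at the top; older bits move toward the LSB.
            reg = (reg >> 1) | (bit << 31)
            k += 1
            # Checking after every bit covers not starting on a word boundary.
            if k >= 32 and reg == DISCONNECT_WORD:
                return True
    return False


def lsd_test_signal(bits, n_bits, sample_rate, dc=0.0, start=0):
    """Hypothetical clean test signal: +/-0.5 per bit plus a DC offset."""
    n_samples = int(n_bits * sample_rate / LSD_BIT_RATE)
    out = []
    for i in range(n_samples):
        b = bits[(start + int(i * LSD_BIT_RATE / sample_rate)) % len(bits)]
        out.append((0.5 if b else -0.5) + dc)
    return out


# Example: a circular buffer holding about 48 bit periods at 8 kHz,
# filled with a synthetic disconnect pattern that has a DC shift.
buf = deque(lsd_test_signal([1, 0, 1, 0, 1, 1, 0, 0], 48, 8000, dc=0.3),
            maxlen=1280)
muted = detect_disconnect(buf, 8000)
```

The buffer has to cover at least 39 bit periods (32 bits plus up to 7 bits of word misalignment) for a match to be guaranteed, which is why the example uses 48. In a real decoder you would call the check each time a block of new samples has been appended.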
And instead of a circular buffer, one could implement multiple shift registers (4 are probably enough), each working with the data at a different offset so that one of them will be hitting the bit centers. With them, no circular buffer is required: just stuff incoming bits directly into the shift registers. If any shift register hits 0x35353535, mute.
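A streaming version of the multiple shift register idea might look something like the sketch below. Again this is only my own illustration: the class name `DisconnectDetector`, the smoothing factor `alpha`, and the helper `lsd_test_signal` are assumptions, not anything taken from an existing decoder. The zero line is a weighted running average of the samples, which should sit near the midpoint because the 10101100 pattern has equal numbers of ones and zeros. Each register takes its bit at a different offset within the bit period.

```python
LSD_BIT_RATE = 300            # low speed data rate from the post, bits per second
DISCONNECT_WORD = 0x35353535  # four copies of 10101100, first bit in the LSB


class DisconnectDetector:
    """Feed audio samples one at a time; feed() returns True on a match."""

    def __init__(self, sample_rate, phases=4, alpha=0.002):
        self.step = sample_rate / LSD_BIT_RATE  # audio samples per data bit
        self.alpha = alpha                      # assumed smoothing factor
        self.zero = None                        # running zero line
        self.count = 0                          # samples seen so far
        self.regs = [0] * phases
        # Each register samples at a different offset within the bit period.
        self.due = [p * self.step / phases for p in range(phases)]

    def feed(self, sample):
        # Weighted sum that keeps adjusting the zero line to follow DC shift.
        if self.zero is None:
            self.zero = float(sample)
        else:
            self.zero += self.alpha * (sample - self.zero)
        bit = 1 if sample > self.zero else 0
        hit = False
        for p in range(len(self.regs)):
            if self.count >= self.due[p]:
                # New bit enters at the top; older bits move toward the LSB.
                self.regs[p] = (self.regs[p] >> 1) | (bit << 31)
                self.due[p] += self.step
                if self.regs[p] == DISCONNECT_WORD:
                    hit = True
        self.count += 1
        return hit


def lsd_test_signal(bits, n_bits, sample_rate, dc=0.0, start=0):
    """Hypothetical clean test signal: +/-0.5 per bit plus a DC offset."""
    n_samples = int(n_bits * sample_rate / LSD_BIT_RATE)
    out = []
    for i in range(n_samples):
        b = bits[(start + int(i * LSD_BIT_RATE / sample_rate)) % len(bits)]
        out.append((0.5 if b else -0.5) + dc)
    return out


# Example: stream a synthetic, DC-shifted disconnect pattern at 8 kHz.
detector = DisconnectDetector(8000)
hits = [detector.feed(s)
        for s in lsd_test_signal([1, 0, 1, 0, 1, 1, 0, 0], 64, 8000,
                                 dc=0.3, start=5)]
```

The running average takes a little while to settle, so the first bit or two after a change in DC level may be decoded wrongly. That only delays the match slightly, since the pattern keeps repeating.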