Decoding TV Teddy – Part Four: The 2018 Hifi Version?

Nick Lansley
Jan 9, 2019

In this ‘Decoding TV Teddy’ series of articles, I’ve found that the best audio bandwidth available is no better than that of an AM radio.

But now it’s 2018 and I’m using software to do this, could I improve on that bandwidth?

Here’s a challenging audio source which I have created from two music tracks available under Creative Commons licences, plus a pure sinusoidal waveform sweeping from 1 Hz to 20 KHz over 20 seconds. Pragmatically I’ve changed this source to mono – I can try for stereo later on, but the idea is that you can compare the output quality at each stage with this one at the top of this article. This MP3 is encoded at 320 Kbps mono and retains all of the information in the original edit, including the full sweep up to 20 KHz at its original amplitude. Download and import it into Audacity to see for yourself!

00:00 to 01:06 – Piano – Excerpt from ‘Ravel: Sonatine’ by Nico de Napoli licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license
A complex piano track with subtlety and quiet passages which will be sensitive to any noise being introduced.
01:06 to 02:19 – Electronic/Jazz – Excerpt from ‘Galaxies’ by Split Phase licensed under an Attribution-Noncommercial-Share Alike 3.0 United States License.
A loud electronic jazz track with clean & precise bass and percussion, sensitive to any muddying or phase distortion.
02:19 to 02:39 – Linear sinusoidal audio sweep tone from 1Hz to 20KHz at 0.5 amplitude generated by Audacity.
A pure tone that will test the bandwidth limits, and will be sensitive to harmonic disturbances caused by imperfections in the encoding / decoding process.
Sweeping 20 KHz over 20 seconds means the frequency rises by 1 KHz every second, so you can simply count seconds to read off the frequency (e.g. count 5 seconds and the frequency will be about 5 KHz).
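As a rough sketch of that counting rule, here is how a linear sweep maps elapsed time to frequency. The function and constant names are my own for illustration, and this is not how Audacity generates its chirp internally, just the same arithmetic written out.

```python
import math

SWEEP_SECONDS = 20.0   # length of the sweep in the test file
START_HZ = 1.0         # sweep start frequency
END_HZ = 20_000.0      # sweep end frequency


def sweep_frequency(t):
    """Instantaneous frequency in Hz, t seconds after the sweep starts."""
    if not 0.0 <= t <= SWEEP_SECONDS:
        raise ValueError("t is outside the sweep")
    return START_HZ + (END_HZ - START_HZ) * t / SWEEP_SECONDS


def sweep_sample(t, amplitude=0.5):
    """Sample value of the sweep at time t.

    The phase is the integral of the instantaneous frequency,
    which is what makes the pitch rise linearly.
    """
    rate = (END_HZ - START_HZ) / SWEEP_SECONDS  # Hz gained per second
    phase = 2.0 * math.pi * (START_HZ * t + 0.5 * rate * t * t)
    return amplitude * math.sin(phase)


for seconds in (5, 10, 15):
    print(seconds, "s ->", round(sweep_frequency(seconds) / 1000, 1), "KHz")
```

So if the decoded sweep goes silent a certain number of seconds in, that count is a decent estimate of the bandwidth ceiling in KHz.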

https://lansley.com/wp-content/uploads/2018/03/tvt2018_320kbps44100mono.mp3

Let’s take this step by step, starting with TV Teddy quality and moving on.

1 – Use the original TV Teddy method

TV Teddy’s designers tried to retain higher frequencies by halving the pitch in the encoding, then doubling it again when decoding. This is a helpful method for retaining ‘higher than AM quality’ frequencies for voice, but it would not be any use for music. This is because halving the pitch removes detail from the higher frequencies. Simply doubling the halved pitch again restores those higher frequencies but also introduces acoustic noise that sounds like a flanging, phasing distortion.

I’ll use the code I wrote in Part Three, so I’m:

1. Halving the pitch
2. Reducing the audio sample rate from 44.1 KHz to 11988 Hz
3. Saving as an 8-bit unsigned PCM WAV file
4. Feeding it through the encoder
5. Extracting it with the decoder
6. Increasing the sample rate back to 44.1 KHz*
7. Doubling the pitch

* Careful how you do this in Audacity. Changing the project rate on the imported 11988 Hz WAV file isn’t enough to resample the audio. You need to change the project rate to 44.1 KHz, add a new mono track, then select all of the existing track and copy its contents to the new track. Delete the old track, then apply the pitch doubling.

…and here’s the result, which I’ve encoded to an MP3 with a bitrate that way exceeds any level that would itself affect the output:

https://lansley.com/wp-content/uploads/2018/03/testresultsout.mp3

There’s a large amount of background hiss, and the pitch halving-then-doubling has played havoc with some of the more complex parts of the piano music (and most of the jazz!) through the phase distortions it creates. The highest loud frequency in the sweep is 11 KHz, after which it quickly fades to zero amplitude, so the usable bandwidth is about 11 KHz too.
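One likely contributor to that hiss is the 8-bit sample depth, although other parts of the chain (video compression, for one) may well add more. The sketch below is my own illustration rather than the Part Three code: it quantises a full-scale 1 KHz test tone to unsigned integers and measures the signal-to-noise ratio. The helper names and the choice of test tone are assumptions.

```python
import math


def quantise_unsigned(sample, bits):
    """Map a float in [-1.0, 1.0] to an unsigned integer of the given depth."""
    levels = (1 << bits) - 1
    return round((sample + 1.0) / 2.0 * levels)


def dequantise_unsigned(value, bits):
    """Map an unsigned integer back to a float in [-1.0, 1.0]."""
    levels = (1 << bits) - 1
    return value / levels * 2.0 - 1.0


def snr_db(bits, sample_rate=11988, tone_hz=1000.0, n=11988):
    """SNR in dB of a full-scale sine after a quantise/dequantise round trip."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = math.sin(2.0 * math.pi * tone_hz * i / sample_rate)
        y = dequantise_unsigned(quantise_unsigned(x, bits), bits)
        signal_power += x * x
        noise_power += (x - y) ** 2
    return 10.0 * math.log10(signal_power / noise_power)


print("8-bit :", round(snr_db(8), 1), "dB")
print("16-bit:", round(snr_db(16), 1), "dB")
```

The measured figures should land close to the textbook 6.02N + 1.76 dB rule for an N-bit full-scale sine, which gives roughly 50 dB for 8 bits and 98 dB for 16 bits. A quieter source, such as the 0.5 amplitude sweep, sits correspondingly closer to the noise floor.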

Let’s repeat the test but leave out the halving and doubling pitch (leaving out steps 1 and 7 above). Here’s the output:

https://lansley.com/wp-content/uploads/2018/03/Test-1A-output.mp3

This sounds much like listening to an AM broadcast: the treble is lost to a bandwidth ceiling of 5.7 KHz (which is where the sweep suddenly drops from 0.5 to zero amplitude). However, there are no weird phasing effects and the audio is more listenable, to my ears anyway.
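Both ceilings are about what sampling theory predicts. Assuming the 11988 figure is the number of samples per second, the highest frequency that can be represented is half of that, and halving the pitch before sampling doubles the range of original frequencies that fit underneath. A small worked example (the function names are mine):

```python
SAMPLE_RATE_HZ = 11988  # assumed to be samples per second


def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def effective_limit(sample_rate_hz, pitch_factor=1.0):
    """Highest original frequency that survives when the pitch is divided
    by pitch_factor before sampling and multiplied back afterwards."""
    return nyquist_limit(sample_rate_hz) * pitch_factor


print(nyquist_limit(SAMPLE_RATE_HZ))         # 5994.0
print(effective_limit(SAMPLE_RATE_HZ, 2.0))  # 11988.0
```

The measured 5.7 KHz and 11 KHz sit just under these theoretical limits, which would be consistent with a filter rolling off a little below the limit during resampling, though I haven’t verified that.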

2 – Introduce colour rather than greyscale mapping

Something may have occurred to you by now if you’ve been analysing the Python code used to generate the TV Teddy audio track in the video picture:

For each pixel I’m setting each of the R, G, and B values to the same value, as identical values for red, green and blue always create a grey (or black at 0, 0, 0, or white at 255, 255, 255). Suppose I used these values differently? After all, 256 x 256 x 256 = 16,777,216 different values (16 million colours). Surely we would get better quality audio if we mapped the audio onto the RGB values differently? Even if we just used two of the three channels we could store 16-bit audio samples, which would improve the sound quality even if the sample rate was unchanged.

Hold that thought.

The TV Teddy designers used greyscale (that is, brightness or luma) mapping to the audio waveform for a very good reason: If you think that VHS recorders really soften the detail in a TV picture, they all but wipe away colour definition! The best way to describe VHS colour is “a gentle wash of basic colour over the picture that generally conforms to what the eye expects”. Even analogue broadcast TV behaved like this, but the tight bandwidth restrictions meant that the colour or chroma parts of the signal had to be filtered even more. Fortunately the human eye finds this acceptable and your brain uses the basic wash of colour to rebuild the picture in your mind, as long as you are sitting across the room from your 26″ 4:3 tube TV screen.

Here’s an example (source: Wikimedia Commons) which shows that the buttons on the screen are just being ‘washed’ with their yellow colour, which spills out around them into the blue background. This effect has been worsened by ‘dot crawl’, where the chroma part of the video signal has mixed in with parts of the luma (brightness) signal, generating interference patterns. If the TV Teddy box was aiming to get a colour signal, what hope of accuracy would there be? None.

Still holding that thought? Good, let’s try it anyway! After all we are keeping our testing in the digital domain with MP4 files, so why don’t we store 16-bit audio samples in the R and G values of the pixel – 8 bits in R and 8 bits in G? Our plan is now that we are:

1. Halving the pitch
2. Reducing the audio sample rate from 44.1 KHz to 11988 Hz
3. Saving as a 16-bit signed PCM WAV file (I can’t find a way to get Audacity to save as unsigned)
4. Feeding it through the encoder with these changes:
   - Add 32768 to each sample to make it ‘unsigned’
   - Split the 16-bit number into two 8-bit numbers
   - Assign the first 8-bit number (value 0 to 255) to ‘R’ and the second to ‘G’. ‘B’ will be 127 (or any number as it’s unused, but 127 is fine)
5. Extracting it with the decoder with these changes:
   - Combine the two 8-bit values together to make a 16-bit value
   - Subtract 32768 from the resulting number to return it to a ‘signed’ 16-bit sample
6. Increasing the sample rate back to 44.1 KHz
7. Doubling the pitch
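The encoder and decoder changes in that plan can be sketched as a pair of small functions. This is a minimal illustration, not the actual Part Three code: the function names are hypothetical, and I’m assuming the ‘first’ 8-bit number is the high byte.

```python
def sample_to_rgb(sample, blue=127):
    """Pack one signed 16-bit sample into an (R, G, B) pixel."""
    if not -32768 <= sample <= 32767:
        raise ValueError("sample is not a signed 16-bit value")
    unsigned = sample + 32768   # shift into the range 0..65535
    red = unsigned >> 8         # high byte (assumed to go in R)
    green = unsigned & 0xFF     # low byte (assumed to go in G)
    return red, green, blue     # B is unused, so any constant will do


def rgb_to_sample(rgb):
    """Recover the signed 16-bit sample from an (R, G, B) pixel."""
    red, green, _blue = rgb
    return ((red << 8) | green) - 32768


pixel = sample_to_rgb(-1234)
print(pixel, "->", rgb_to_sample(pixel))
```

In the purely digital case the round trip is exact. Note, though, that with this byte order a one-step error in R moves the decoded sample by 256, so any smearing of colour would hit the audio hard, which is worth remembering alongside that thought you’re still holding.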

*** TIME PASSES ***

Well, my experimenting continues, although I did take a break when work had to intrude, but I’ve had some great feedback from others trying to make this work.

I’ll leave this entry here with a continuation marker:

TODO: Keep Going!
