Why this website is called Pseudo Voice

Pseudo voice: just an idea for a radically different approach to digital voice over limited channels.

One of the reasons I started this blog and gave this website its name is that I want to share an idea I have for HAM radio (and other radio communication). The starting point is that on HF, ‘voice’ or ‘phone’ needs a relatively large bandwidth compared to many digimodes. Sometimes you just need the excellent SNR handling and/or narrow bandwidth of digital modes to get the message through. Some digimodes can do this. Voice cannot.

There are experiments to encode voice and send it digitally (like FreeDV), but that is still a bandwidth-hungry mode compared to, say, PSK31 or even MT63-500.

This is where my idea of ‘pseudo voice’ steps in. Pseudo voice is (no surprise!) not real voice; in essence it is a series of code-labelled words. That’s why it has the potential to be used in narrow-bandwidth channels. The basic idea is that you just transmit short codes that represent whole words or even whole sentences. How? First let a computer algorithm (AI??) convert speech to ‘words’, and then label each word (or standard sentence) with a unique code that is as short as possible (a so-called ‘label’). Then transmit only the sequence of labels. After the labels are received, they can be converted back to words/sentences (and speech) by using a special lookup table: a dictionary. This concept only works if both sides use the same ‘label dictionary’, one that associates the same labels with the same words.
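To make the label idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the tiny dictionary, the labels and the function names are made up for illustration, and the speech-to-text step is assumed to have already produced plain text. The encoder tries the longest phrase first, so a standard sentence gets its own label before the single words are considered.

```python
# Hypothetical toy dictionary: every phrase and label here is invented for illustration.
DICTIONARY = {
    "good morning": "A1",
    "my name is": "A2",
    "your signal is": "A3",
    "strong": "B1",
    "weak": "B2",
    "thank you": "C1",
}
# Both stations must hold the same mapping; the receiver uses it in reverse.
REVERSE = {label: phrase for phrase, label in DICTIONARY.items()}
MAX_PHRASE_WORDS = max(len(phrase.split()) for phrase in DICTIONARY)


def encode(text):
    """Turn recognised text into labels, preferring the longest matching phrase."""
    words = text.lower().split()
    labels = []
    i = 0
    while i < len(words):
        for n in range(min(MAX_PHRASE_WORDS, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in DICTIONARY:
                labels.append(DICTIONARY[phrase])
                i += n
                break
        else:
            # A limited dictionary means some words simply cannot be sent.
            raise KeyError(f"not in dictionary: {words[i]!r}")
    return labels


def decode(labels):
    """Turn received labels back into text using the shared dictionary."""
    return " ".join(REVERSE[label] for label in labels)


sent = encode("Good morning your signal is strong thank you")
print(sent)
print(decode(sent))
```

A real system would also need a rule for words that are not in the dictionary (spelling them out, for example); this sketch just raises an error.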

For each type of communication you can develop a separate tailor-made dictionary: a dictionary for EMCOM, a rag-chewing dictionary, et cetera. It’s all about standardisation.

The basic idea behind pseudo voice and its use of dedicated dictionaries is that a meaningful basic conversation in a normal language only needs a vocabulary of about a thousand words (the so-called ‘language level A1’). If the conversation is limited to just a few topics (like EMCOM), even fewer words are needed. Higher language levels need more words and more complicated grammar, and therefore larger dictionaries. By the way: a normal conversational speech speed at level A1 is about 120 words per minute, and some of these words form standard sentences that can also have their own label.
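How short could the labels be for a dictionary of that size? The sketch below is only a rough estimate under stated assumptions: labels have a fixed length, and they are drawn from a 36-symbol alphabet (A-Z and 0-9). Both the alphabet and the function name are illustrative choices, not part of any existing digimode.

```python
def min_label_length(entries, alphabet_size=36):
    """Smallest fixed label length that gives every dictionary entry a unique label."""
    if entries < 1 or alphabet_size < 2:
        raise ValueError("need at least 1 entry and an alphabet of 2 or more symbols")
    length = 1
    capacity = alphabet_size          # number of distinct labels of this length
    while capacity < entries:
        length += 1
        capacity *= alphabet_size
    return length


# About a thousand entries (level A1), assumed alphabet of A-Z plus 0-9.
print(min_label_length(1000))
# Same dictionary size with letters only (26 symbols).
print(min_label_length(1000, 26))
```

Variable-length labels, with the shortest ones given to the most frequent words, could do better on average, but they need a separator or a prefix-free code so the receiver knows where one label ends.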

Pseudo voice could work ‘like real-time’ voice if we use a digimode that does 120 labels per minute. If a label consists of at most 5 characters, we would need (net) at most 600 characters per minute = at most 10 characters per second. Which narrow-band digimodes can do that? A lot!
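The arithmetic above can be written as a small helper, so other label lengths and speech speeds are easy to try. This is just a sketch; the function names are made up, and it counts net characters only (no separators, no protocol overhead).

```python
def required_cps(labels_per_minute, chars_per_label):
    """Net characters per second needed to keep up with the given label rate."""
    return labels_per_minute * chars_per_label / 60.0


def sustainable_labels_per_minute(mode_cps, chars_per_label):
    """Labels per minute a mode can carry at a given net character rate."""
    return mode_cps * 60.0 / chars_per_label


# The numbers from the text: 120 labels per minute, at most 5 characters each.
print(required_cps(120, 5))
# The other way round: what a 10 cps channel carries with 5-character labels.
print(sustainable_labels_per_minute(10, 5))
```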

More, older and less mature musings:

So: what if each word (and each regularly used sentence) has its own short and unique digital ‘label’? And this label is, obviously, shorter than the word itself. You could pair words and labels in a specially written dictionary that consists of carefully selected words and sentences. Let’s say a ‘dictionary HAMQSO’, or a ‘dictionary EMCOM’. Then you could use a speech-to-label engine (STL-e, a variant of the speech-to-text engine) to code speech, and vice versa a label-to-speech engine (LTS-e) to decode it. Speech at 120 wpm could then be converted to a much lower-rate data stream of ‘labels’.

For use in radio we could, instead of sending real voice, use an STL engine loaded with a certain dictionary and send a series of short labels (characters) using a narrow digimode, say MT63-500 (speed: 5 cps). Received labels can be converted back to words (or sentences) and speech using a ‘label-to-speech’ engine loaded with the same dictionary. Hence ‘pseudo voice’.
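As a rough illustration of what this buys at 5 cps, the sketch below compares the time on air for a sentence typed out in full with the same content sent as labels. The labels are hypothetical, and one space between labels is assumed as a separator; real modes add their own overhead, which is ignored here.

```python
def airtime_seconds(message, chars_per_second=5.0):
    """Seconds needed to send a message at a given net character rate."""
    return len(message) / chars_per_second


plain = "good morning your signal is strong thank you"
labels = ["A1", "A3", "B1", "C1"]     # hypothetical labels for the same content
labelled = " ".join(labels)           # assumed: one space between labels

print(len(plain), "characters as text:", airtime_seconds(plain), "s")
print(len(labelled), "characters as labels:", airtime_seconds(labelled), "s")
```

With these made-up labels the sentence shrinks from 44 characters to 11, so it takes roughly a quarter of the time on air.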

NB: The sound quality of the pseudo speech is independent of noise in the analog radio channel (as long as the labels are decoded properly): the sound quality depends solely on the quality of the sound clips in the LTS engine. It could be anything from low-quality computer voices to hi-fi speech clips. There are many possibilities to optimise the system to one’s needs. One could even have QSOs between different languages, as long as the labels refer to the same content in a dictionary in your own language. ;-) You could use the voice you like (male/female) on the receiving end.
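The cross-language idea can be sketched with two dictionaries that share the same labels but carry the content in different languages. The labels and both word lists are invented for illustration.

```python
# Hypothetical: the same labels, with the content stored in two languages.
LABELS_EN = {"A1": "good morning", "A3": "your signal is", "B1": "strong", "C1": "thank you"}
LABELS_NL = {"A1": "goedemorgen", "A3": "je signaal is", "B1": "sterk", "C1": "dank je wel"}


def decode(labels, dictionary):
    """Look up each received label in the receiver's own-language dictionary."""
    return " ".join(dictionary[label] for label in labels)


received = ["A1", "A3", "B1", "C1"]
print(decode(received, LABELS_EN))
print(decode(received, LABELS_NL))
```

This probably works best when labels stand for whole standard sentences, because word order differs between languages.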

One could use pseudo voice in circumstances where normal HF voice communication is difficult or even impossible (like situations with low power, compromise antennas, bad SNR, bad propagation, QRM, etc.) and where dedicated digital modes still work OK.

In short: the essence is not that the voice itself is digitized and then transmitted, but that a dedicated dictionary labels words (or whole standard sentences) and only the corresponding short codes are transmitted. And vice versa on the receiving side.

Purpose-built dictionaries (aimed at specialized communications) could be developed: for HAM, emergency, tactical engagements, etc.

Sending only labels (i.e. maximally shortened codes) would greatly reduce the HF bandwidth needed (albeit at the cost of a limited dictionary). It doesn’t produce a real conversation, but it comes close (pseudo 😉). And it is better than nothing!

That’s why I called this website ‘Pseudo Voice’.

By the way, I may have coined the idea in this blog, but that doesn’t mean I am building or developing it. I am not an engineer or programmer by any means, so I am not able to bring this idea to life in the real world. But it would be nice to hear what you think about this idea, and even how to bring it forward in the HAM community. Please post your comments!
