Telefol is a fast, multi-threaded keyword spotting engine that listens to live audio streams and triggers events when it recognizes configured combinations of words or phrases. It can process many audio streams using less CPU and memory than full transcription requires.
Automatic speech recognition (ASR) systems such as Cubic include a complex language model that predicts which word sequences are most likely, in order to differentiate between similar-sounding phrases. Telefol instead builds its language model dynamically from the keywords of the events it is configured to recognize, and rebuilds the model at run time whenever events are added or modified. Because this model does not have to predict the probability of every possible word sequence and can focus on the most relevant words, it needs less processing time to keep up with a real-time stream.
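The rebuild-on-change idea can be pictured with a small sketch. This is not Telefol's implementation or API: the class `KeywordVocabulary` and its methods are hypothetical names chosen for illustration, and a plain set of words stands in for the language model, which in a real engine would also carry probabilities and pronunciation information.

```python
from typing import Dict, FrozenSet, Iterable, Tuple


class KeywordVocabulary:
    """Hypothetical stand-in for a keyword-driven language model.

    It only tracks which words matter, derived from the phrases of the
    currently configured events, and is rebuilt whenever they change.
    """

    def __init__(self) -> None:
        self._events: Dict[str, Tuple[str, ...]] = {}
        self.words: FrozenSet[str] = frozenset()

    def set_event(self, name: str, phrases: Iterable[str]) -> None:
        # Adding and modifying an event are the same operation here.
        self._events[name] = tuple(phrases)
        self._rebuild()

    def remove_event(self, name: str) -> None:
        self._events.pop(name, None)
        self._rebuild()

    def _rebuild(self) -> None:
        # Derive the vocabulary from scratch from the configured phrases.
        self.words = frozenset(
            word
            for phrases in self._events.values()
            for phrase in phrases
            for word in phrase.lower().split()
        )


vocab = KeywordVocabulary()
vocab.set_event("cancel_account", ["cancel my account"])
vocab.set_event("refund", ["refund", "money back"])
print(sorted(vocab.words))
```

The point of the sketch is only that the vocabulary is small and follows the event configuration, rather than covering every word a general-purpose transcription model must handle.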
Telefol goes beyond a simple search for a word. It supports complex events that combine multiple phrases, specify the time segment of the audio to consider, or require that phrases occur a given number of times or within a given number of seconds of each other. Events can also be chained together to support sophisticated application logic.
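To make these kinds of conditions concrete, here is a minimal sketch of how such an event rule could be evaluated over timestamped phrase detections. It is an illustration under stated assumptions, not Telefol's configuration format or API: `Hit`, `EventRule`, `event_fires`, and all field names are hypothetical, and event chaining is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hit:
    phrase: str
    time: float  # seconds from the start of the stream


@dataclass
class EventRule:  # hypothetical rule shape, for illustration only
    name: str
    phrases: Tuple[str, ...]                 # every phrase must be detected
    min_count: int = 1                       # minimum detections per phrase
    within_seconds: Optional[float] = None   # all detections inside one window
    segment: Optional[Tuple[float, float]] = None  # only consider [start, end]


def _satisfied(rule: EventRule, hits: List[Hit]) -> bool:
    # True if every phrase appears at least min_count times in hits.
    return all(
        sum(1 for h in hits if h.phrase == p) >= rule.min_count
        for p in rule.phrases
    )


def event_fires(rule: EventRule, hits: List[Hit]) -> bool:
    if rule.segment is not None:
        start, end = rule.segment
        hits = [h for h in hits if start <= h.time <= end]
    relevant = sorted(
        (h for h in hits if h.phrase in rule.phrases), key=lambda h: h.time
    )
    if rule.within_seconds is None:
        return _satisfied(rule, relevant)
    # Try each detection as the start of a window of the allowed length.
    for i, first in enumerate(relevant):
        window = [
            h for h in relevant[i:] if h.time - first.time <= rule.within_seconds
        ]
        if _satisfied(rule, window):
            return True
    return False


hits = [Hit("cancel", 3.0), Hit("my account", 5.5), Hit("cancel", 40.0)]
rule = EventRule("cancel_account", ("cancel", "my account"), within_seconds=5.0)
print(event_fires(rule, hits))
```

Here the rule fires because "cancel" and "my account" are detected 2.5 seconds apart, which falls inside the 5-second window. Tightening `within_seconds` to 2.0 would prevent it from firing.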