If there's a good use-case for overriding `Event._matches_condition`
with a logic that also parses the event arguments, then those arguments
should be accessed directly from the event object, not from the match
result.
Initializing `EventMatchResult` with the arguments from the event means
that, if `EventMatchResult.parsed_args` are populated with custom
extracted arguments, then the upstream event arguments will also be
modified.
If the event is matched against multiple conditions, this will result in
the extracted tokens getting modified by each `matches_condition`
iteration.
Instead of being a list, the hooks in the hook processor should be
backed by by-name and by-value maps.
Don't insert a hook if its exact backing method has already been
inserted. This is actually very common when hooks are defined as Python
snippets imported in other scripts too.
The assistant object now runs in its own thread and leverages an
external `SpeechProcessor` that uses two threads to scan for both
intents and speech in parallel on audio frames.
We shouldn't overwrite `event._set` and `event._clear` if those values
have already been set.
Those attributes hold the original references to `Event.set` and
`Event.clear` respectively, and the `OrEvent` logic overwrites them with
a callback-based logic.
This shouldn't happen if those attributes are already present.
- Added `intent_model_path` parameter.
- Always apply `expanduser` to configuration paths.
- Better logic to infer the fallback model path.
- The Picovoice Leonardo object should always be removed after
`assistant.picovoice.transcribe` is called.
The plugin should leverage `AssistantPlugin._on_mute_changed` to handle
the boilerplate state managent on mute/unmute actions instead of
re-implementing the same logic.
- The `Responding` state should be modelled as an extra event/binary
flag, not as an assistant state. The assistant may be listening for
hotwords even while the `tts` plugin is responding, and we don't want
the two states to interfere with each either - neither to build a more
complex state machine that also needs to take concurrent states into
account.
- Stop any responses being rendered upon the `tts` plugin when a new
hotword audio is detected. If e.g. I say "Ok Google", I should always
be able to trigger the assistant and stop any concurrent audio
process.
- `SpeechRecognizedEvent` should be emitted even if `cheetah`'s latest
audio frame results weren't marked as final, and the speech detection
window timed out. Cheetah's `is_final` detection seems to be quite
buggy sometimes, and it may not properly detect the end of utterances,
especially with non-native accents. The workaround is to flush out
whatever text is available (if at least some speech was detected) into
a `SpeechRecognizedEvent` upon timeout.
- Added wiring between `assistant.picovoice` and `tts.picovoice`.
- Added `RESPONDING` status to the assistant.
- Added ability to override the default speech model upon
`start_conversation`.
- Better handling of conversation timeouts.
- Cache Cheetah objects in a `model -> object` map - at least the
default model should be pre-loaded, since model loading at runtime
seems to take a while, and that could impact the ability to detect the
speech in the first seconds after a hotword is detected.
`AssistantEvent.assistant` is now modelled as an opaque object that
behaves the following way:
- The underlying plugin name is saved under `event.args['_assistant']`.
- `event.assistant` is a property that returns the assistant instance
via `get_plugin`.
- `event.assistant` is reported as a string (plugin qualified name) upon
event dump.
This allows event hooks to easily use `event.assistant` to interact with
the underlying assistant and easily modify the conversation flow, while
event hook conditions can still be easily modelled as equality
operations between strings.
Explicitly use a `CastBrowser` object initialized at plugin boot instead
of relying on blocking calls to `pychromecast.get_chromecasts`.
1. It enables better event handling via callbacks instead of
synchronously waiting for scan batches.
2. It optimizes resources - only one Zeroconf and one CastBrowser object
will be created in the plugin, and destroyed upon stop.
3. No need for separate `get_chromecast`/`_refresh_chromecasts` methods:
all the scanning is run continuously, so we can just return the
results from the maps.
- `pychromecast.get_chromecasts` returns both a list of devices and a
browser object. Since the Chromecast plugin is the most likely culprit
of the excessive number of open MDNS sockets, it seems that we may
need to explicitly stop discovery on the browser and close the
ZeroConf object after the discovery is done.
- I was still using an ancient version of pychromecast on my RPi4, and I
didn't notice that more recent versions implemented several breaking
changes. Adapted the code to cope with those changes.
It seems that the process keeps a lot of open connections to Chromecast
devices during playback.
The most likely culprit is the `_refresh_chromecasts` logic.
We should start a `cast` object and register a status listener only if a
Chromecast with the same identifier isn't already registered in the
plugin.
The project hasn't seen a commit in three years and it's probably been
abandoned by Mozilla.
New and better maintained speech-to-text integrations will be
investigated.