Compare commits

...

7 Commits

Author SHA1 Message Date
Fabio Manganiello 2b287b569f
[assistant.picovoice] Conversation flow improvements.
- The `Responding` state should be modelled as an extra event/binary
  flag, not as an assistant state. The assistant may be listening for
  hotwords even while the `tts` plugin is responding, and we don't want
  the two states to interfere with each other - nor do we want to build
  a more complex state machine that also has to take concurrent states
  into account.

- Stop any response currently being rendered on the `tts` plugin when
  new hotword audio is detected. If e.g. I say "Ok Google", I should
  always be able to trigger the assistant and stop any concurrent audio
  process.

- A `SpeechRecognizedEvent` should be emitted even if `cheetah`'s latest
  audio frame results weren't marked as final but the speech detection
  window timed out. Cheetah's `is_final` detection seems to be quite
  buggy at times, and it may not properly detect the end of utterances,
  especially with non-native accents. The workaround is to flush
  whatever text is available (if at least some speech was detected) into
  a `SpeechRecognizedEvent` upon timeout - see the sketch below.
2024-04-13 20:01:21 +02:00
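
A minimal sketch of the flush-on-timeout workaround described in the last
point; the function name and signature are illustrative, the actual
implementation lives in `_assistant.py` (see the diff further down):

    from typing import Optional

    def close_utterance(
        transcript: str, flushed: Optional[str], is_final: bool, timed_out: bool
    ) -> Optional[str]:
        # Return the phrase to publish as a SpeechRecognizedEvent, or None if
        # the window closed without any recognized speech (in which case a
        # ConversationTimeoutEvent would be emitted instead).
        if not (is_final or timed_out):
            return None  # Keep listening

        # Even without an is_final marker, flush whatever Cheetah has buffered
        phrase = (transcript + (flushed or '')).strip()
        return phrase or None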
Fabio Manganiello 24e93ad160
Added more default imports under the `platypush` module root.
These objects can now also be imported in scripts through
`from platypush import <name>` (see the usage sketch below):

- `Variable`
- `cron`
- `hook`
- `procedure`
2024-04-10 23:33:48 +02:00
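
A short usage sketch of the new root-level imports in a user script; the
event type, hook body and variable name are only illustrative:

    from platypush import Variable, hook, procedure
    from platypush.message.event.assistant import SpeechRecognizedEvent

    @procedure
    def remember_phrase(phrase: str, **_):
        # Variable wraps Platypush's persistent key/value store
        Variable('last_phrase').set(phrase)

    @hook(SpeechRecognizedEvent)
    def on_speech(event, **_):
        remember_phrase(phrase=event.args.get('phrase', ''))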
Fabio Manganiello 3b73b22db9
[assistant.picovoice] More features.
- Added wiring between `assistant.picovoice` and `tts.picovoice`.

- Added `RESPONDING` status to the assistant.

- Added ability to override the default speech model upon
  `start_conversation` (see the usage sketch below).

- Better handling of conversation timeouts.

- Cache Cheetah objects in a `model -> object` map - at least the
  default model should be pre-loaded, since loading a model at runtime
  seems to take a while, and that could impact the ability to detect
  speech in the first seconds after a hotword is detected.
2024-04-10 22:26:45 +02:00
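
For example, a conversation can now be started with a language-specific
speech model (hedged sketch - the model path below is hypothetical):

    from platypush import run

    run(
        'assistant.picovoice.start_conversation',
        model_file='~/models/cheetah_params_it.pv',
    )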
Fabio Manganiello 9761cc2eef
Added `tts.picovoice` plugin. 2024-04-10 20:32:32 +02:00
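
Hedged usage sketch of the new plugin's `say` action (the text is
arbitrary):

    from platypush import run

    run('tts.picovoice.say', text='Hello from Platypush!')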
Fabio Manganiello 6bd20bfcf6
Added ffmpeg requirement for `assistant.picovoice`. 2024-04-10 20:31:38 +02:00
Fabio Manganiello 8702eaa25b
s/partial_transcript/transcript/g 2024-04-09 00:19:51 +02:00
Fabio Manganiello 6feb824c04
Refactored `AssistantEvent`.
`AssistantEvent.assistant` is now modelled as an opaque object that
behaves in the following way:

- The underlying plugin name is saved under `event.args['_assistant']`.

- `event.assistant` is a property that returns the assistant instance
  via `get_plugin`.

- `event.assistant` is reported as a string (plugin qualified name) upon
  event dump.

This allows event hooks to easily use `event.assistant` to interact with
the underlying assistant and modify the conversation flow, while event
hook conditions can still be modelled as simple equality checks between
strings (see the sketch below).
2024-04-09 00:15:51 +02:00
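
Hedged sketch of an event hook relying on the new behaviour - the phrase
in the condition is arbitrary:

    from platypush import hook
    from platypush.message.event.assistant import SpeechRecognizedEvent

    @hook(SpeechRecognizedEvent, phrase='stop listening')
    def on_stop_listening(event, **_):
        # event.assistant resolves the emitting plugin instance via get_plugin
        event.assistant.stop_conversation()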
11 changed files with 404 additions and 75 deletions

View File

@@ -7,10 +7,13 @@ Platypush
 from .app import Application
 from .config import Config
-from .context import get_backend, get_bus, get_plugin
+from .context import Variable, get_backend, get_bus, get_plugin
+from .cron import cron
+from .event.hook import hook
 from .message.event import Event
 from .message.request import Request
 from .message.response import Response
+from .procedure import procedure
 from .runner import main
 from .utils import run
@@ -19,14 +22,18 @@ __author__ = 'Fabio Manganiello <fabio@manganiello.tech>'
 __version__ = '0.50.3'

 __all__ = [
     'Application',
+    'Variable',
     'Config',
     'Event',
     'Request',
     'Response',
+    'cron',
     'get_backend',
     'get_bus',
     'get_plugin',
+    'hook',
     'main',
+    'procedure',
     'run',
 ]

View File

@@ -257,26 +257,29 @@ class Event(Message):
         return result

+    def as_dict(self):
+        """
+        Converts the event into a dictionary
+        """
+        args = copy.deepcopy(self.args)
+        flatten(args)
+        return {
+            'type': 'event',
+            'target': self.target,
+            'origin': self.origin if hasattr(self, 'origin') else None,
+            'id': self.id if hasattr(self, 'id') else None,
+            '_timestamp': self.timestamp,
+            'args': {'type': self.type, **args},
+        }
+
     def __str__(self):
         """
         Overrides the str() operator and converts
         the message into a UTF-8 JSON string
         """
         args = copy.deepcopy(self.args)
         flatten(args)
-        return json.dumps(
-            {
-                'type': 'event',
-                'target': self.target,
-                'origin': self.origin if hasattr(self, 'origin') else None,
-                'id': self.id if hasattr(self, 'id') else None,
-                '_timestamp': self.timestamp,
-                'args': {'type': self.type, **args},
-            },
-            cls=self.Encoder,
-        )
+        return json.dumps(self.as_dict(), cls=self.Encoder)


 @dataclass

View File

@@ -1,27 +1,53 @@
 import re
 import sys
-from typing import Optional
+from typing import Optional, Union

 from platypush.context import get_plugin
 from platypush.message.event import Event
+from platypush.plugins.assistant import AssistantPlugin
+from platypush.utils import get_plugin_name_by_class


 class AssistantEvent(Event):
     """Base class for assistant events"""

-    def __init__(self, *args, assistant: Optional[str] = None, **kwargs):
+    def __init__(
+        self, *args, assistant: Optional[Union[str, AssistantPlugin]] = None, **kwargs
+    ):
         """
         :param assistant: Name of the assistant plugin that triggered the event.
         """
-        super().__init__(*args, assistant=assistant, **kwargs)
+        assistant = assistant or kwargs.get('assistant')
+        if assistant:
+            assistant = (
+                assistant
+                if isinstance(assistant, str)
+                else get_plugin_name_by_class(assistant.__class__)
+            )
+
+            kwargs['_assistant'] = assistant
+
+        super().__init__(*args, **kwargs)

     @property
-    def _assistant(self):
-        return (
-            get_plugin(self.args.get('assistant'))
-            if self.args.get('assistant')
-            else None
-        )
+    def assistant(self) -> Optional[AssistantPlugin]:
+        assistant = self.args.get('_assistant')
+        if not assistant:
+            return None
+
+        return get_plugin(assistant)
+
+    def as_dict(self):
+        evt_dict = super().as_dict()
+        evt_args = {**evt_dict['args']}
+        assistant = evt_args.pop('_assistant', None)
+
+        if assistant:
+            evt_args['assistant'] = assistant
+
+        return {
+            **evt_dict,
+            'args': evt_args,
+        }


 class ConversationStartEvent(AssistantEvent):
@@ -95,8 +121,8 @@ class SpeechRecognizedEvent(AssistantEvent):
         """
         result = super().matches_condition(condition)
-        if result.is_match and self._assistant and 'phrase' in condition.args:
-            self._assistant.stop_conversation()
+        if result.is_match and self.assistant and 'phrase' in condition.args:
+            self.assistant.stop_conversation()

         return result

View File

@@ -122,6 +122,12 @@ class Plugin(EventGenerator, ExtensionWithManifest):  # lgtm [py/missing-call-to
         assert entities, 'entities plugin not initialized'
         return entities

+    def __str__(self):
+        """
+        :return: The qualified name of the plugin.
+        """
+        return get_plugin_name_by_class(self.__class__)
+
     def run(self, method, *args, **kwargs):
         assert (
             method in self.registered_actions

View File

@@ -8,24 +8,7 @@ from typing import Any, Collection, Dict, Optional, Type
 from platypush.context import get_bus, get_plugin
 from platypush.entities.assistants import Assistant
 from platypush.entities.managers.assistants import AssistantEntityManager
-from platypush.message.event.assistant import (
-    AlarmEndEvent,
-    AlarmStartedEvent,
-    AlertEndEvent,
-    AlertStartedEvent,
-    AssistantEvent,
-    ConversationEndEvent,
-    ConversationStartEvent,
-    ConversationTimeoutEvent,
-    HotwordDetectedEvent,
-    MicMutedEvent,
-    MicUnmutedEvent,
-    NoResponseEvent,
-    ResponseEvent,
-    SpeechRecognizedEvent,
-    TimerEndEvent,
-    TimerStartedEvent,
-)
+from platypush.message.event import Event as AppEvent
 from platypush.plugins import Plugin, action
 from platypush.utils import get_plugin_name_by_class
@@ -182,6 +165,17 @@ class AssistantPlugin(Plugin, AssistantEntityManager, ABC):
         self.publish_entities([self])
         return asdict(self._state)

+    @action
+    def render_response(self, text: str, *_, **__):
+        """
+        Render a response text as audio over the configured TTS plugin.
+
+        :param text: Text to render.
+        """
+        self._on_response_render_start(text)
+        self._render_response(text)
+        self._on_response_render_end()
+
     def _get_tts_plugin(self):
         if not self.tts_plugin:
             return None
@@ -201,11 +195,13 @@ class AssistantPlugin(Plugin, AssistantEntityManager, ABC):
             audio.play(self._conversation_start_sound)

-    def _send_event(self, event_type: Type[AssistantEvent], **kwargs):
+    def _send_event(self, event_type: Type[AppEvent], **kwargs):
         self.publish_entities([self])
         get_bus().post(event_type(assistant=self._plugin_name, **kwargs))

     def _on_conversation_start(self):
+        from platypush.message.event.assistant import ConversationStartEvent
+
         self._last_response = None
         self._last_query = None
         self._conversation_running.set()
@@ -213,66 +209,98 @@ class AssistantPlugin(Plugin, AssistantEntityManager, ABC):
         self._play_conversation_start_sound()

     def _on_conversation_end(self):
+        from platypush.message.event.assistant import ConversationEndEvent
+
         self._conversation_running.clear()
         self._send_event(ConversationEndEvent)

     def _on_conversation_timeout(self):
+        from platypush.message.event.assistant import ConversationTimeoutEvent
+
         self._last_response = None
         self._last_query = None
         self._conversation_running.clear()
         self._send_event(ConversationTimeoutEvent)

     def _on_no_response(self):
+        from platypush.message.event.assistant import NoResponseEvent
+
         self._last_response = None
         self._conversation_running.clear()
         self._send_event(NoResponseEvent)

-    def _on_reponse_rendered(self, text: Optional[str]):
+    def _on_response_render_start(self, text: Optional[str]):
+        from platypush.message.event.assistant import ResponseEvent
+
         self._last_response = text
         self._send_event(ResponseEvent, response_text=text)
-        tts = self._get_tts_plugin()

+    def _render_response(self, text: Optional[str]):
+        tts = self._get_tts_plugin()
         if tts and text:
             self.stop_conversation()
             tts.say(text=text, **self.tts_plugin_args)

+    def _on_response_render_end(self):
+        pass
+
     def _on_hotword_detected(self, hotword: Optional[str]):
+        from platypush.message.event.assistant import HotwordDetectedEvent
+
         self._send_event(HotwordDetectedEvent, hotword=hotword)

     def _on_speech_recognized(self, phrase: Optional[str]):
+        from platypush.message.event.assistant import SpeechRecognizedEvent
+
         phrase = (phrase or '').lower().strip()
         self._last_query = phrase
         self._send_event(SpeechRecognizedEvent, phrase=phrase)

     def _on_alarm_start(self):
+        from platypush.message.event.assistant import AlarmStartedEvent
+
         self._cur_alert_type = AlertType.ALARM
         self._send_event(AlarmStartedEvent)

     def _on_alarm_end(self):
+        from platypush.message.event.assistant import AlarmEndEvent
+
         self._cur_alert_type = None
         self._send_event(AlarmEndEvent)

     def _on_timer_start(self):
+        from platypush.message.event.assistant import TimerStartedEvent
+
         self._cur_alert_type = AlertType.TIMER
         self._send_event(TimerStartedEvent)

     def _on_timer_end(self):
+        from platypush.message.event.assistant import TimerEndEvent
+
         self._cur_alert_type = None
         self._send_event(TimerEndEvent)

     def _on_alert_start(self):
+        from platypush.message.event.assistant import AlertStartedEvent
+
         self._cur_alert_type = AlertType.ALERT
         self._send_event(AlertStartedEvent)

     def _on_alert_end(self):
+        from platypush.message.event.assistant import AlertEndEvent
+
         self._cur_alert_type = None
         self._send_event(AlertEndEvent)

     def _on_mute(self):
+        from platypush.message.event.assistant import MicMutedEvent
+
         self._is_muted = True
         self._send_event(MicMutedEvent)

     def _on_unmute(self):
+        from platypush.message.event.assistant import MicUnmutedEvent
+
         self._is_muted = False
         self._send_event(MicUnmutedEvent)

View File

@@ -1,7 +1,10 @@
+import os
 from typing import Optional, Sequence

+from platypush.context import get_plugin
 from platypush.plugins import RunnablePlugin, action
 from platypush.plugins.assistant import AssistantPlugin
+from platypush.plugins.tts.picovoice import TtsPicovoicePlugin

 from ._assistant import Assistant
 from ._state import AssistantState
@@ -96,7 +99,12 @@ class AssistantPicovoicePlugin(AssistantPlugin, RunnablePlugin):
             using a language other than English, you can provide the path to the
             model file for that language. Model files are available for all the
             supported languages through the `Picovoice repository
-            <https://github.com/Picovoice/porcupine/tree/master/lib/common>`_.
+            <https://github.com/Picovoice/cheetah/tree/master/lib/common>`_.
+            You can also use the `Picovoice console
+            <https://console.picovoice.ai/cat>`_
+            to train your custom models. You can use a base model and fine-tune
+            it by boosting the detection of your own words and phrases and edit
+            the phonetic representation of the words you want to detect.
         :param endpoint_duration: If set, the assistant will stop listening when
             no speech is detected for the specified duration (in seconds) after
             the end of an utterance.
@@ -146,15 +154,43 @@ class AssistantPicovoicePlugin(AssistantPlugin, RunnablePlugin):
             'on_hotword_detected': self._on_hotword_detected,
         }

+    @property
+    def tts(self) -> TtsPicovoicePlugin:
+        p = get_plugin('tts.picovoice')
+        assert p, 'Picovoice TTS plugin not configured/found'
+        return p
+
+    def _get_tts_plugin(self) -> TtsPicovoicePlugin:
+        return self.tts
+
+    def _on_response_render_start(self, text: Optional[str]):
+        if self._assistant:
+            self._assistant.set_responding(True)
+        return super()._on_response_render_start(text)
+
+    def _on_response_render_end(self):
+        if self._assistant:
+            self._assistant.set_responding(False)
+        return super()._on_response_render_end()
+
     @action
-    def start_conversation(self, *_, **__):
+    def start_conversation(self, *_, model_file: Optional[str] = None, **__):
         """
-        Programmatically start a conversation with the assistant
+        Programmatically start a conversation with the assistant.
+
+        :param model_file: Override the model file to be used to detect speech
+            in this conversation. If not set, the configured
+            ``speech_model_path`` will be used.
         """
         if not self._assistant:
             self.logger.warning('Assistant not initialized')
             return

+        if model_file:
+            model_file = os.path.expanduser(model_file)
+
+        self._assistant.override_speech_model(model_file)
         self._assistant.state = AssistantState.DETECTING_SPEECH

     @action
@@ -166,6 +202,8 @@ class AssistantPicovoicePlugin(AssistantPlugin, RunnablePlugin):
             self.logger.warning('Assistant not initialized')
             return

+        self._assistant.override_speech_model(None)
+
         if self._assistant.hotword_enabled:
             self._assistant.state = AssistantState.DETECTING_HOTWORD
         else:
@@ -215,7 +253,8 @@ class AssistantPicovoicePlugin(AssistantPlugin, RunnablePlugin):
         with Assistant(**self._assistant_args) as self._assistant:
             try:
                 for event in self._assistant:
-                    self.logger.debug('Picovoice assistant event: %s', event)
+                    if event is not None:
+                        self.logger.debug('Picovoice assistant event: %s', event)
             except KeyboardInterrupt:
                 break
             except Exception as e:

View File

@@ -9,11 +9,13 @@ import pvleopard
 import pvporcupine
 import pvrhino

+from platypush.context import get_plugin
 from platypush.message.event.assistant import (
     ConversationTimeoutEvent,
     HotwordDetectedEvent,
     SpeechRecognizedEvent,
 )
+from platypush.plugins.tts.picovoice import TtsPicovoicePlugin

 from ._context import ConversationContext
 from ._recorder import AudioRecorder
@@ -25,6 +27,7 @@ class Assistant:
     A facade class that wraps the Picovoice engines under an assistant API.
     """

+    @staticmethod
     def _default_callback(*_, **__):
         pass

@@ -60,12 +63,14 @@ class Assistant:
         self.keywords = list(keywords or [])
         self.keyword_paths = None
         self.keyword_model_path = None
+        self._responding = Event()
         self.frame_expiration = frame_expiration
-        self.speech_model_path = speech_model_path
         self.endpoint_duration = endpoint_duration
         self.enable_automatic_punctuation = enable_automatic_punctuation
         self.start_conversation_on_hotword = start_conversation_on_hotword
         self.audio_queue_size = audio_queue_size
+        self._speech_model_path = speech_model_path
+        self._speech_model_path_override = None

         self._on_conversation_start = on_conversation_start
         self._on_conversation_end = on_conversation_end
@@ -103,11 +108,32 @@ class Assistant:
             self.keyword_model_path = keyword_model_path

-        self._cheetah: Optional[pvcheetah.Cheetah] = None
+        # Model path -> model instance cache
+        self._cheetah = {}
         self._leopard: Optional[pvleopard.Leopard] = None
         self._porcupine: Optional[pvporcupine.Porcupine] = None
         self._rhino: Optional[pvrhino.Rhino] = None

+    @property
+    def is_responding(self):
+        return self._responding.is_set()
+
+    @property
+    def speech_model_path(self):
+        return self._speech_model_path_override or self._speech_model_path
+
+    @property
+    def tts(self) -> TtsPicovoicePlugin:
+        p = get_plugin('tts.picovoice')
+        assert p, 'Picovoice TTS plugin not configured/found'
+        return p
+
+    def set_responding(self, responding: bool):
+        if responding:
+            self._responding.set()
+        else:
+            self._responding.clear()
+
     def should_stop(self):
         return self._stop_event.is_set()

@@ -130,12 +156,18 @@ class Assistant:
             return

         if prev_state == AssistantState.DETECTING_SPEECH:
+            self.tts.stop()
             self._ctx.stop()
+            self._speech_model_path_override = None
             self._on_conversation_end()
         elif new_state == AssistantState.DETECTING_SPEECH:
             self._ctx.start()
             self._on_conversation_start()

+        if new_state == AssistantState.DETECTING_HOTWORD:
+            self.tts.stop()
+            self._ctx.reset()
+
     @property
     def porcupine(self) -> Optional[pvporcupine.Porcupine]:
         if not self.hotword_enabled:
@@ -159,7 +191,7 @@ class Assistant:
         if not self.stt_enabled:
             return None

-        if not self._cheetah:
+        if not self._cheetah.get(self.speech_model_path):
             args: Dict[str, Any] = {'access_key': self._access_key}
             if self.speech_model_path:
                 args['model_path'] = self.speech_model_path
@@ -168,20 +200,22 @@ class Assistant:
             if self.enable_automatic_punctuation:
                 args['enable_automatic_punctuation'] = self.enable_automatic_punctuation

-            self._cheetah = pvcheetah.create(**args)
+            self._cheetah[self.speech_model_path] = pvcheetah.create(**args)

-        return self._cheetah
+        return self._cheetah[self.speech_model_path]

     def __enter__(self):
+        """
+        Get the assistant ready to start processing audio frames.
+        """
         if self.should_stop():
             return self

         if self._recorder:
             self.logger.info('A recording stream already exists')
-        elif self.porcupine or self.cheetah:
+        elif self.hotword_enabled or self.stt_enabled:
             sample_rate = (self.porcupine or self.cheetah).sample_rate  # type: ignore
             frame_length = (self.porcupine or self.cheetah).frame_length  # type: ignore
             self._recorder = AudioRecorder(
                 stop_event=self._stop_event,
                 sample_rate=sample_rate,
@@ -190,6 +224,9 @@ class Assistant:
                 channels=1,
             )

+            if self.stt_enabled:
+                self._cheetah[self.speech_model_path] = self.cheetah
+
             self._recorder.__enter__()

         if self.porcupine:
@@ -200,15 +237,18 @@ class Assistant:
         return self

     def __exit__(self, *_):
+        """
+        Stop the assistant and release all resources.
+        """
         if self._recorder:
             self._recorder.__exit__(*_)
             self._recorder = None

         self.state = AssistantState.IDLE

-        if self._cheetah:
-            self._cheetah.delete()
-            self._cheetah = None
+        for model in [*self._cheetah.keys()]:
+            cheetah = self._cheetah.pop(model, None)
+            if cheetah:
+                cheetah.delete()

         if self._leopard:
             self._leopard.delete()
@@ -223,9 +263,15 @@ class Assistant:
             self._rhino = None

     def __iter__(self):
+        """
+        Iterate over processed assistant events.
+        """
         return self

     def __next__(self):
+        """
+        Process the next audio frame and return the corresponding event.
+        """
         has_data = False
         if self.should_stop() or not self._recorder:
             raise StopIteration
@@ -242,10 +288,10 @@ class Assistant:
                )
                continue  # The audio frame is too old

            if self.porcupine and self.state == AssistantState.DETECTING_HOTWORD:
                return self._process_hotword(frame)

            if self.cheetah and self.state == AssistantState.DETECTING_SPEECH:
                return self._process_speech(frame)

        raise StopIteration
@@ -262,6 +308,7 @@ class Assistant:
             if self.start_conversation_on_hotword:
                 self.state = AssistantState.DETECTING_SPEECH

+            self.tts.stop()
             self._on_hotword_detected(hotword=self.keywords[keyword_index])
             return HotwordDetectedEvent(hotword=self.keywords[keyword_index])

@@ -275,23 +322,20 @@ class Assistant:
         partial_transcript, self._ctx.is_final = self.cheetah.process(frame)
         if partial_transcript:
-            self._ctx.partial_transcript += partial_transcript
+            self._ctx.transcript += partial_transcript
             self.logger.info(
                 'Partial transcript: %s, is_final: %s',
-                self._ctx.partial_transcript,
+                self._ctx.transcript,
                 self._ctx.is_final,
             )

         if self._ctx.is_final or self._ctx.timed_out:
-            phrase = ''
-            if self.cheetah:
-                phrase = self.cheetah.flush()
-
-            self._ctx.partial_transcript += phrase
-            phrase = self._ctx.partial_transcript
+            phrase = self.cheetah.flush() or ''
+            self._ctx.transcript += phrase
+            phrase = self._ctx.transcript
             phrase = phrase[:1].lower() + phrase[1:]

-            if self._ctx.is_final or phrase:
+            if phrase:
                 event = SpeechRecognizedEvent(phrase=phrase)
                 self._on_speech_recognized(phrase=phrase)
             else:
@@ -304,5 +348,8 @@ class Assistant:

         return event

+    def override_speech_model(self, model_path: Optional[str]):
+        self._speech_model_path_override = model_path
+

 # vim:sw=4:ts=4:et:

View File

@@ -9,7 +9,7 @@ class ConversationContext:
     Context of the conversation process.
     """

-    partial_transcript: str = ''
+    transcript: str = ''
     is_final: bool = False
     timeout: Optional[float] = None
     t_start: Optional[float] = None
@@ -24,7 +24,7 @@ class ConversationContext:
         self.t_end = time()

     def reset(self):
-        self.partial_transcript = ''
+        self.transcript = ''
         self.is_final = False
         self.t_start = None
         self.t_end = None
@@ -32,11 +32,17 @@ class ConversationContext:
     @property
     def timed_out(self):
         return (
-            not self.partial_transcript
+            not self.transcript
             and not self.is_final
             and self.timeout
             and self.t_start
             and time() - self.t_start > self.timeout
+        ) or (
+            self.transcript
+            and not self.is_final
+            and self.timeout
+            and self.t_start
+            and time() - self.t_start > self.timeout * 2
         )

View File

@@ -12,7 +12,14 @@ manifest:
       - platypush.message.event.assistant.ResponseEvent
       - platypush.message.event.assistant.SpeechRecognizedEvent
   install:
+    apk:
+      - ffmpeg
+    apt:
+      - ffmpeg
+    dnf:
+      - ffmpeg
     pacman:
+      - ffmpeg
       - python-sounddevice
     pip:
       - pvcheetah

View File

@@ -0,0 +1,138 @@
import os
from threading import RLock
from typing import Optional

import numpy as np
import pvorca
import sounddevice as sd

from platypush.config import Config
from platypush.plugins import action
from platypush.plugins.tts import TtsPlugin


class TtsPicovoicePlugin(TtsPlugin):
    """
    This TTS plugin enables you to render text as audio using `Picovoice
    <https://picovoice.ai>`_'s (still experimental) `Orca TTS engine
    <https://github.com/Picovoice/orca>`_.

    Take a look at
    :class:`platypush.plugins.assistant.picovoice.AssistantPicovoicePlugin`
    for details on how to sign up for a Picovoice account and get the API key.

    Also note that using the TTS features requires you to select Orca from the
    list of products available for your account on the `Picovoice console
    <https://console.picovoice.ai>`_.
    """

    def __init__(
        self,
        access_key: Optional[str] = None,
        model_path: Optional[str] = None,
        **kwargs,
    ):
        """
        :param access_key: Picovoice access key. If it's not specified here,
            then it must be specified on the configuration of
            :class:`platypush.plugins.assistant.picovoice.AssistantPicovoicePlugin`.
        :param model_path: Path of the TTS model file (default: use the default
            English model).
        """
        super().__init__(**kwargs)

        if not access_key:
            access_key = Config.get('assistant.picovoice', {}).get('access_key')
            assert (
                access_key
            ), 'No access key specified and no assistant.picovoice plugin found'

        self.model_path = model_path
        self.access_key = access_key
        if model_path:
            model_path = os.path.expanduser(model_path)

        self._stream: Optional[sd.OutputStream] = None
        self._stream_lock = RLock()

    def _play_audio(self, orca: pvorca.Orca, pcm: np.ndarray):
        with self._stream_lock:
            self.stop()
            self._stream = sd.OutputStream(
                samplerate=orca.sample_rate,
                channels=1,
                dtype='int16',
            )

            try:
                self._stream.start()
                self._stream.write(pcm)
            except Exception as e:
                self.logger.warning('Error playing audio: %s: %s', type(e), str(e))
            finally:
                try:
                    self.stop()
                    self._stream.close()
                except Exception as e:
                    self.logger.warning(
                        'Error stopping audio stream: %s: %s', type(e), str(e)
                    )
                finally:
                    if self._stream:
                        self._stream = None

    def get_orca(self, model_path: Optional[str] = None):
        if not model_path:
            model_path = self.model_path
        if model_path:
            model_path = os.path.expanduser(model_path)

        return pvorca.create(access_key=self.access_key, model_path=model_path)

    @action
    def say(
        self,
        text: str,
        *_,
        output_file: Optional[str] = None,
        speech_rate: Optional[float] = None,
        model_path: Optional[str] = None,
        **__,
    ):
        """
        Say some text.

        :param text: Text to say.
        :param output_file: If set, save the audio to the specified file.
            Otherwise play it.
        :param speech_rate: Speech rate (default: None).
        :param model_path: Path of the TTS model file (default: use the default
            configured model).
        """
        orca = self.get_orca(model_path=model_path)
        if output_file:
            orca.synthesize_to_file(
                text, os.path.expanduser(output_file), speech_rate=speech_rate
            )
            return

        self._play_audio(
            orca=orca,
            pcm=np.array(
                orca.synthesize(text, speech_rate=speech_rate),
                dtype='int16',
            ),
        )

    @action
    def stop(self):
        """
        Stop the currently playing audio.
        """
        with self._stream_lock:
            if not self._stream:
                return

            self._stream.stop()


# vim:sw=4:ts=4:et:

View File

@@ -0,0 +1,22 @@
manifest:
  events: {}
  install:
    apk:
      - ffmpeg
      - py3-numpy
    apt:
      - ffmpeg
      - python3-numpy
    dnf:
      - ffmpeg
      - python-numpy
    pacman:
      - ffmpeg
      - python-numpy
      - python-sounddevice
    pip:
      - numpy
      - pvorca
      - sounddevice
  package: platypush.plugins.tts.picovoice
  type: plugin