Offline speech recorder

8/17/2023

This is a utility that activates speech to text on Linux. While it could use any speech recognition system, it currently uses the VOSK-API.

Specific features include:

Numbers as Digits: Optional conversion from numbers to digits, so "three million five hundred and sixty second" becomes 3,000,562nd. A series of numbers (such as reciting a phone number) is also supported.

Time Out: Optionally end speech to text early when no speech is detected for a given number of seconds, without the explicit call to end that is otherwise required.

Output Type: Output can simulate keystroke events (default) or simply print to the standard output.

User Configuration Script: User configuration is just a Python script, which can be used to manipulate text using Python's full feature set.

Suspend/Resume: Initial load time can be an issue for users on slower systems or with some of the larger language models. In this case suspend/resume can be useful. While suspended, all data is kept in memory and the process is stopped. Audio recording is stopped and restarted on resume.

See nerd-dictation begin --help for details on how to access these options.

Two external utilities are needed.

An audio recording utility (parec by default). You may select one of the following tools:
parec: command for recording from pulse-audio.
sox: command as an alternative, see the guide: Using sox with nerd-dictation.

An input simulation utility (xdotool by default). You may select one of the following input simulation utilities:
xdotool: command to simulate input in X11.
ydotool: command to simulate input anywhere (X11/Wayland/TTYs). See the setup guide: Using ydotool with nerd-dictation.
dotool: command to simulate input anywhere (X11/Wayland/TTYs).

The user configuration lives in ~/.config/nerd-dictation/nerd-dictation.py and defines the function nerd_dictation_process(text). A more comprehensive configuration is included in the examples/ directory.

The processing function can be used to implement your own actions using keywords of your choice. Simply return a blank string if you have implemented your own text handling. Context sensitive actions can be implemented using command line utilities to access the active window.

Paths

Local Configuration: ~/.config/nerd-dictation/nerd-dictation.py
Language Model: note that --vosk-model-dir=PATH can be used to override the default.

Creating the directory used to store internal data allows other commands, such as sync, to be performed.

The command line options cover the following:

Cookie location: Location for writing a temporary cookie (this file is monitored to begin/end dictation).

User configuration file: Override the file used for the user configuration. Use an empty string to prevent the user's configuration being read.

Grammar: This restricts the phrases recognized by VOSK for better accuracy. See vosk_recognizer_new_grm in the API reference.

Recording device: The name of the pulse-audio device to use for recording. See the output of "pactl list sources" to find device names (using the identifier following "Name:").

Sample rate: The sample rate to use for recording (in Hz).

Deferred output (--defer-output): When enabled, output is deferred until exiting. This prevents text being typed during speech (implied with --output=STDOUT).

Long-running sessions: Enable this option when you intend to keep the dictation process enabled for extended periods of time. Without this enabled, the entirety of the dictation session will be processed on every update. Only used when --defer-output is disabled.

Time out: Time out recording when no speech is processed for the given time in seconds. This can be used to avoid having to explicitly exit (zero disables).
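As a rough illustration of the user configuration script, the sketch below shows what a nerd_dictation_process(text) function might look like. Only the file location and the function name come from the text above. The replacement table, the "open terminal" keyword, and the HANDLED_ACTIONS list are hypothetical and exist purely to show the two behaviours described: returning modified text, and returning a blank string when the script has handled the phrase itself.

```python
# ~/.config/nerd-dictation/nerd-dictation.py
# Minimal sketch; the tables and keyword below are illustrative assumptions.

# Hypothetical spoken-phrase to punctuation table (leading space is
# included so the punctuation attaches to the previous word).
REPLACEMENTS = {
    " comma": ",",
    " full stop": ".",
}

# Hypothetical keyword that the script handles on its own.
ACTION_KEYWORD = "open terminal"

# Stand-in for a real side effect (such as launching a program),
# kept as a list so the sketch has no external dependencies.
HANDLED_ACTIONS = []


def nerd_dictation_process(text):
    # If the phrase is one of our own keywords, perform the action
    # and return a blank string so nothing is typed.
    if text.strip() == ACTION_KEYWORD:
        HANDLED_ACTIONS.append(ACTION_KEYWORD)
        return ""

    # Otherwise apply simple text replacements and return the result.
    for spoken, written in REPLACEMENTS.items():
        text = text.replace(spoken, written)
    return text
```

A real configuration would likely replace the HANDLED_ACTIONS list with an actual action; the more comprehensive configuration in the examples/ directory is the better reference for that.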

For details on how this can be used, see: nerd-dictation --help and nerd-dictation begin --help.
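When dictation is started from a script or a key binding, it can help to assemble the command line in one place. The sketch below builds an argument list for nerd-dictation begin using only the options named in this post (--vosk-model-dir=PATH, --output=STDOUT and --defer-output). The helper name build_begin_command is hypothetical, the model path is a placeholder, and treating --defer-output as a plain switch is an assumption, so check nerd-dictation begin --help before relying on it.

```python
import shlex


def build_begin_command(model_dir=None, to_stdout=False, defer_output=False):
    """Return an argv list for starting dictation (hypothetical helper)."""
    argv = ["nerd-dictation", "begin"]

    # Override the default language model location if requested.
    if model_dir is not None:
        argv.append("--vosk-model-dir=" + model_dir)

    # Printing to standard output already implies deferred output,
    # so --defer-output is only added for simulated keystrokes.
    if to_stdout:
        argv.append("--output=STDOUT")
    elif defer_output:
        argv.append("--defer-output")

    return argv


if __name__ == "__main__":
    # Show the command as a shell-quoted string without running it.
    print(shlex.join(build_begin_command(model_dir="/path/to/model", to_stdout=True)))
```

The returned list can be passed to subprocess.run as-is; the sketch deliberately stops short of executing anything.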
