
License: MIT License

Internet access required: No, but services such as weather and Wikipedia will not work without it

Supported Languages: en, de + partial support for 15 more (beta)

Default Wakeword: Hey SEPIA, customizable

Default TTS Engine: eSpeak, picoTTS, MBROLA, MaryTTS + compatible APIs

Operating System: Windows, Linux, Mac OS X, Raspberry Pi

Programming Language: Java (Assist-Server), HTML/JavaScript (Client), Python (STT-Server)

Latest Version: 2.6.2

Latest Update: 2022-05-08

Introduction: S.E.P.I.A. Open Assistant Framework

S.E.P.I.A. stands for self-hosted, extendable, personal, intelligent assistant. It is a modular, open-source framework that provides the tools needed to build your own full-fledged digital voice assistant, including speech recognition (STT), wake-word detection, text-to-speech (TTS), natural-language understanding (NLU), dialog management, SDK(s), a cross-platform client app and more.

The framework consists of several highly customizable microservices that work together to form the SEPIA Open Assistant. It follows the client-server principle: a lightweight Java server with an Elasticsearch database acts as the "brain", and a JavaScript-based client can serve, for example, as a smart speaker, a smart display or a full-fledged digital assistant app.

All components work on Linux, Windows and Mac and have been optimized to run smoothly even on a Raspberry Pi.

Out of the box, SEPIA currently has smart-services for: news, music (radio), timers, alarms, reminders, to-do and shopping lists, smart home (e.g. using open-source tools like openHAB), navigation, places, weather, Wikipedia, web search, soccer results (Bundesliga), a bit of small talk and more. To build custom services, there is a Java SDK and a code editor integrated into the SEPIA Control HUB web app. The client can be extended with custom HTML widgets.[1]
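
As a rough illustration of how modular smart-services can plug into an assistant, the following Python sketch registers handlers under trigger keywords and routes an utterance to the first match. It is not the SEPIA Java SDK: `ServiceRegistry`, `register`, `dispatch` and the keyword matching are hypothetical names and simplifications chosen for this example, and a real NLU component does far more than look for a keyword.

```python
from typing import Callable, Dict, Optional

# A handler takes the user's utterance and returns the assistant's answer.
ServiceHandler = Callable[[str], str]


class ServiceRegistry:
    """Maps trigger keywords to handlers (hypothetical, not the SEPIA SDK)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, ServiceHandler] = {}

    def register(self, keyword: str, handler: ServiceHandler) -> None:
        """Add or replace the handler for a keyword (stored in lowercase)."""
        self._handlers[keyword.lower()] = handler

    def dispatch(self, utterance: str) -> Optional[str]:
        """Call the first registered handler whose keyword is a word in the utterance."""
        # Naive stand-in for NLU: whitespace tokenization, exact word match.
        words = utterance.lower().split()
        for keyword, handler in self._handlers.items():
            if keyword in words:
                return handler(utterance)
        return None  # No service felt responsible for this utterance.


registry = ServiceRegistry()
registry.register("timer", lambda utterance: "Timer service called")
registry.register("weather", lambda utterance: "Weather service called")

timer_answer = registry.dispatch("Set a timer for five minutes")
unknown_answer = registry.dispatch("Tell me a joke")
```

In this sketch `timer_answer` holds the timer handler's reply, while `unknown_answer` is `None` because no keyword matched. Adding a custom service amounts to one more `register` call, which is the kind of extensibility the paragraph above describes.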

Components

Some of the core SEPIA framework components:

  • SEPIA Assist-Server - The "brain" of SEPIA, responsible for user accounts, database integration, NLU, dialog management, open-source TTS, smart-services (weather, navigation, alarms, news etc.), remote actions and more[2].
  • SEPIA Chat-Server - A WebSocket-based chat server that handles real-time, asynchronous user-to-user and user-to-assistant communication[3].
  • SEPIA Cross-Platform Client - The primary SEPIA client, based on HTML5 web technology. It runs in all modern browsers (mobile and desktop) and communicates with other SEPIA components via HTTP or WebSocket connections[4]. It supports headless and kiosk mode via the CLEXI server for building independent smart-speaker or smart-display devices[5]. The client is also available as an Android app via the Play Store or direct download.
  • SEPIA STT-Server - A WebSocket-based, full-duplex Python server for real-time automatic speech recognition (ASR) that supports multiple open-source ASR engines. It receives a stream of audio chunks over a secure WebSocket connection and returns the transcribed text almost immediately as partial and final results[6].
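
The chunked streaming pattern described for the STT-Server can be sketched in a few lines of Python. This is a minimal, network-free illustration and not the server's actual protocol or code: `chunk_audio`, `FakeRecognizer`, `Transcript` and `transcribe_stream` are hypothetical names, and the "recognizer" simply replays one scripted word per chunk instead of decoding audio.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Transcript:
    text: str
    is_final: bool  # False for partial results, True for the final one


def chunk_audio(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split a raw audio buffer into fixed-size chunks (the last may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


class FakeRecognizer:
    """Stand-in for a real ASR engine: 'hears' one scripted word per chunk."""

    def __init__(self, scripted_words: List[str]) -> None:
        self._script = scripted_words
        self._heard: List[str] = []

    def accept_chunk(self, chunk: bytes) -> Transcript:
        # A real engine would decode the audio; here we only advance the script.
        if len(self._heard) < len(self._script):
            self._heard.append(self._script[len(self._heard)])
        return Transcript(" ".join(self._heard), is_final=False)

    def finish(self) -> Transcript:
        # Called once the audio stream has ended.
        return Transcript(" ".join(self._heard), is_final=True)


def transcribe_stream(
    chunks: Iterable[bytes], recognizer: FakeRecognizer
) -> Iterator[Transcript]:
    """Yield a partial result after each chunk, then one final result."""
    for chunk in chunks:
        yield recognizer.accept_chunk(chunk)
    yield recognizer.finish()


if __name__ == "__main__":
    audio = bytes(10)  # ten bytes of silence as placeholder audio
    recognizer = FakeRecognizer(["turn", "on", "lights"])
    for result in transcribe_stream(chunk_audio(audio, 4), recognizer):
        label = "final" if result.is_final else "partial"
        print(f"{label}: {result.text}")
```

Because results are yielded as soon as each chunk is processed, a client can show partial text while the user is still speaking and replace it once the final result arrives. In a real deployment the chunks and results would travel over the WebSocket connection rather than through local function calls.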

There are many more SEPIA components and tools to discover on the official GitHub project page.

Installation

The official documentation page has the most recent installation guides.