The Pony Preservation Project is an umbrella project by the My Little Pony fan community to make AI technologies available to content creators. Its focus is broad, encompassing dataset creation, speech generation, text generation, and animation generation. Project contributors have developed techniques for collecting and cleaning audio data, guidelines for publishing datasets in forms compatible with text-to-speech training tools, and scripts that let others train their own speech models using state-of-the-art techniques. The project's data, tools, and techniques have been used by numerous other fan groups and speech generation enthusiasts.

History

The Pony Preservation Project began on the 4chan /mlp/ board on April 5, 2019.[1] Its original purpose was to create a text-to-speech engine for the characters of the My Little Pony television show. The first rough prototypes, built with the DeepVoice3 text-to-speech engine, were shown 9 days after the project's formation and demonstrated that the task was feasible. At around the same time, the group substantially increased its efforts to collect and clean speech transcription data, drawing on source material in multiple languages, subtitles, and transcripts from fan wikis. The first dataset creation tools were written to align this data and to support interoperability with Audacity.

Techniques

Models

Datasets

Impact

External links

Pony Preservation Project - Quick Start Guide

References