HPR4188: Re: HPR4172 Comment by Ken Fallon
Hacker Public Radio - A podcast by Hacker Public Radio
Hello HPR, this is your host Archer72, for another episode of Hacker Public Radio. My subject today is Piper voice synthesis, continued.

This show is a response to Ken's comment on my show, hpr4172 Re: hpr4072 Piper voice synthesis, and it comes with a solution. I'm glad that Ken commented, because I had put the problem on the back burner and forgotten about it. For both of us, the make command for the Piper github repo failed at 22%. I ignored it for the time being and compiled Piper for the Raspberry Pi instead.

Now here is the comment that got me to figure out how to get Piper working on my own Fedora 40 laptop.

Comment #1 posted on 2024-07-24 14:04:27 by Ken Fallon

Fails on Fedora 40

    [ 22%] Linking C shared library libespeak-ng.so
    /usr/bin/ld: ../ucd-tools/libucd.a(case.c.o): relocation R_X86_64_32S against `.rodata' can not be used when making a shared object; recompile with -fPIC
    /usr/bin/ld: failed to set dynamic section sizes: bad value
    collect2: error: ld returned 1 exit status
    [snip]
    make: *** [Makefile:5: all] Error 2

Comment #2 posted on 2024-07-26 09:38:17 by Archer72

Re:Fails on Fedora 40

Hi Ken,

I get the same failure, and found that there was a release in 2023 in the Github Piper repo.

I put the downloaded piper directory in /opt, along with the piper-voices/en directory, and have a successful voice output.

    uname -r
    6.9.5-200.fc40.x86_64

    [piper] [info] Loaded voice in 0.33 second(s)
    [piper] [info] Initialized piper
    Output directory: /home/mark/./output.wav

End of comments

In further conversations with Ken, I found that a vital part of Piper voice synthesis had been forgotten. For example, if you are using Piper to convert a text file to .wav, the command needs to include the following:

- input.txt
- the piper executable location, i.e. /opt/piper/piper
- --model, and the model location, i.e. /opt/piper-voices/
  Note: the voice used needs to include the voice in .onnx format and also the voice configuration in .json format.
- --output_file output.wav

The final script is included in the show notes.

    #!/bin/bash
    # Read the text file given as the first argument and write the speech to output.wav
    cat "$1" | /opt/piper/piper --model /opt/piper-voices/en_US/kusal/medium/en_US-kusal-medium.onnx --output_file output.wav

That's it! Now if you had downloaded the voices from hpr4172, there should be a successful text to voice output.

To put the final touches on 'my' voice, which is the en_US-kusal-medium voice, I processed it just a bit with the sox program.

    #!/bin/bash
    # ~/bin/make-my-voice.sh
    # Add 2 seconds of silence to the beginning of the file
    sox "$1" output.wav pad 2
    # Reduce clipping
    sox output.wav output-mid.wav vol 0.99
    # Reduce the tempo by 12%
    sox output-mid.wav final_output.wav tempo 0.88

One last thing. It was brought to my attention that the Piper voice, Bryce, 'may' sound like William Shatner in the original Star Trek TV series. I will put that clip in here and see what the community thinks. Feel free to leave a comment saying 'yay' or 'nay' on this opinion.

If this tool works for you, feel free to leave comments on this show. Better yet, record a show of your own. Looking forward to hearing from the next host, whether it be by text to speech, or a microphone.

Remember to support free software and apps in the F-Droid store if you use Android.

This has been your host Archer72; Bye
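
Addendum to the notes: the text to .wav step described above depends on both the .onnx voice and its .json configuration being present. Here is a minimal sketch of how a script could check for both before calling Piper. The function names check_voice and text_to_wav and the PIPER_BIN variable are hypothetical and not part of the original script. The sketch also assumes the configuration sits next to the model under the same name with .json appended (for example en_US-kusal-medium.onnx.json), so adjust it if your files are laid out differently.

```shell
#!/bin/bash
# Sketch only: check_voice, text_to_wav and PIPER_BIN are illustrative names,
# not part of Piper or of the original show notes.

# Return 0 if both the voice model and its configuration exist,
# 1 if the model is missing, 2 if the configuration is missing.
check_voice() {
    local model="$1"
    # Assumption: the config is stored as "<model>.json" next to the model
    local config="${model}.json"
    if [ ! -f "$model" ]; then
        echo "missing voice model: $model" >&2
        return 1
    fi
    if [ ! -f "$config" ]; then
        echo "missing voice config: $config" >&2
        return 2
    fi
    return 0
}

# Convert a text file to .wav, using the same piper options as the show notes.
# Arguments: input text file, voice model, optional output file (default output.wav)
text_to_wav() {
    local input="$1"
    local model="$2"
    local output="${3:-output.wav}"
    if [ ! -r "$input" ]; then
        echo "cannot read input file: $input" >&2
        return 3
    fi
    # Stop here, passing on check_voice's status, if a voice file is missing
    check_voice "$model" || return
    # PIPER_BIN defaults to the install location used in this show
    "${PIPER_BIN:-/opt/piper/piper}" --model "$model" --output_file "$output" < "$input"
}
```

After sourcing this file, a call such as text_to_wav input.txt /opt/piper-voices/en_US/kusal/medium/en_US-kusal-medium.onnx output.wav would run the same conversion as the script in the notes, but stop with a message if either voice file is missing.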