This tutorial shows how to convert written text into human-like speech and save it as an audio file.
Sometimes we prefer listening to content instead of reading it, because listening lets us multitask while still following important material.
Python has several third-party libraries for converting text to speech.
Required packages
Install the required packages with pip.
Open a command prompt or terminal and run the following commands one by one:
pip install gtts
pip install playsound
pip install pyttsx3
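After installation, one way to confirm that the three packages can be found is to look them up without importing them. The following is a minimal sketch that uses only the standard library; the helper name check_packages is made up for this illustration.

```python
from importlib.util import find_spec

# Import names of the three packages installed above.
REQUIRED = ("gtts", "playsound", "pyttsx3")


def check_packages(names=REQUIRED):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: find_spec(name) is not None for name in names}


# Report the status of each package.
for name, found in check_packages().items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If a package is reported as missing, rerun the matching pip install command in the same Python environment.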
gTTS (Google Text-to-Speech) is a Python library that interfaces with the text-to-speech service of Google Translate.
It is easy to use and provides built-in functions for saving the spoken text as an mp3 file.
We don't need to build and train a neural network to convert text into speech, which would be hard to do well. Instead, we will use these libraries to complete the task.
gTTS can speak text in many languages, such as English, Hindi, German, Tamil, French, and many more. It can also generate the speech at normal or slow speed.
However, as of its latest update, the voice itself cannot be changed; it is generated by the service and is not configurable.
To convert text into speech without an internet connection, we will use another, offline library called pyttsx3.
Example:
Convert English text to speech and save it as an audio file.
import gtts
from playsound import playsound
# gTTS is an interface to the text-to-speech service of Google Translate, so this needs an internet connection.
# Create the gTTS object for the text we want to hear
t1 = gtts.gTTS("Convert text to speech and save as audio file.")
# Save the synthesized speech as welcome.mp3
t1.save("welcome.mp3")
# The file is saved in the current working directory; we can listen to it as follows:
# Play the audio file
playsound("welcome.mp3")
Make sure the system volume is turned on so that you can hear the text we saved earlier.
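The same approach can be applied to the contents of a text file. The sketch below is one possible way to do it, not the only one: load_text and text_file_to_mp3 are hypothetical helper names, and the file names in the usage note are placeholders. gTTS is imported inside the function so that the file-reading helper can be used on its own.

```python
from pathlib import Path


def load_text(path, encoding="utf-8"):
    """Read a text file and collapse all runs of whitespace into single spaces."""
    text = " ".join(Path(path).read_text(encoding=encoding).split())
    if not text:
        raise ValueError(f"{path} contains no text to speak")
    return text


def text_file_to_mp3(txt_path, mp3_path, lang="en"):
    """Convert the contents of txt_path to speech and save it at mp3_path."""
    # Imported here so that load_text() works even when gTTS is not installed.
    from gtts import gTTS

    # Needs an internet connection, like the example above.
    gTTS(text=load_text(txt_path), lang=lang).save(str(mp3_path))
```

Calling text_file_to_mp3("notes.txt", "notes.mp3") would then produce an mp3 file that can be played with playsound() as before.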
Now, let's write a complete Python program that converts text into speech.
Example: text to speech in English with an explicit language and speed, saved as an audio file.
# Import the gTTS module for text to speech conversion
from gtts import gTTS
# This module is imported so that we can play the converted audio
from playsound import playsound
# The text that we want to convert to audio
text_val = 'All the best for your exam.'
# The language of the speech; 'en' is English
language = 'en'
# Pass the text and language to gTTS. slow=False asks for speech at normal speed.
obj = gTTS(text=text_val, lang=language, slow=False)
# Save the generated audio in an mp3 file named exam.mp3
obj.save("exam.mp3")
# Play the exam.mp3 file
playsound("exam.mp3")
Explanation:
In the above code, we imported gTTS and called gTTS(), which here takes three arguments:
The first argument is the text that we want to convert into speech.
The second argument is the language of the speech. Many languages are supported.
The third argument controls the speed of the speech. We passed slow=False, which means the speech is generated at normal speed.
We saved the audio as exam.mp3, so it can be replayed at any time, and then used the playsound() function to listen to the file at runtime.
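To hear what the third argument changes, one option is to generate the same sentence twice, once with slow=False and once with slow=True, and compare the two files. This is a small sketch under that assumption; speed_variants and save_speed_variants are hypothetical names, and the output file names are only a suggested convention.

```python
def speed_variants(text, stem, lang="en"):
    """Return (filename, gTTS keyword arguments) pairs for normal and slow speech."""
    return [
        (f"{stem}_normal.mp3", {"text": text, "lang": lang, "slow": False}),
        (f"{stem}_slow.mp3", {"text": text, "lang": lang, "slow": True}),
    ]


def save_speed_variants(text, stem, lang="en"):
    """Save one mp3 file per speed setting and return the file names."""
    # Imported here so that speed_variants() can be used without gTTS installed.
    from gtts import gTTS

    filenames = []
    for filename, kwargs in speed_variants(text, stem, lang):
        gTTS(**kwargs).save(filename)  # needs an internet connection
        filenames.append(filename)
    return filenames
```

For example, save_speed_variants('All the best for your exam.', 'exam') would write exam_normal.mp3 and exam_slow.mp3, which can then be played one after the other with playsound().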
Available languages
To get the available languages, call gtts.lang.tts_langs(), which returns the language codes and their names:
Code Language
'af': 'Afrikaans',
'sq': 'Albanian',
'ar': 'Arabic',
'hy': 'Armenian',
'bn': 'Bengali',
'bs': 'Bosnian',
'ca': 'Catalan',
'hr': 'Croatian',
'cs': 'Czech',
'da': 'Danish',
'nl': 'Dutch',
'en': 'English',
'et': 'Estonian',
'tl': 'Filipino',
'fi': 'Finnish',
'fr': 'French',
'de': 'German',
'el': 'Greek',
'en-us': 'English (US)',
'gu': 'Gujarati',
'hi': 'Hindi',
'hu': 'Hungarian',
'is': 'Icelandic',
'id': 'Indonesian',
'it': 'Italian',
'ja': 'Japanese',
'en-ca': 'English (Canada)',
'jw': 'Javanese',
'kn': 'Kannada',
'km': 'Khmer',
'ko': 'Korean',
'la': 'Latin',
'lv': 'Latvian',
'mk': 'Macedonian',
'ml': 'Malayalam',
'mr': 'Marathi',
'en-in': 'English (India)'
We have listed some of the supported languages and their codes. Many other widely spoken languages are available in this library as well.
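Before passing a language code to gTTS, it can help to check it against a code-to-name mapping. The sketch below uses a handful of entries copied from the table above rather than the full list, and language_name is a hypothetical helper. The dictionary returned by gtts.lang.tts_langs() has the same code-to-name shape, so it could be passed as the languages argument instead.

```python
# A few entries copied from the table above; this is not the complete list.
LANGUAGES = {
    "en": "English",
    "fr": "French",
    "de": "German",
    "hi": "Hindi",
    "ja": "Japanese",
}


def language_name(code, languages=LANGUAGES):
    """Return the language name for a code, or raise ValueError if it is unknown."""
    key = code.strip().lower()
    if key not in languages:
        raise ValueError(f"Unsupported language code: {code!r}")
    return languages[key]


print(language_name("hi"))
```

Checking the code first gives a clear error message for a mistyped code before any request is made.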
Offline API
We have used the Google API, but what if we want to convert text to speech offline? For this there is the pyttsx3 library, which looks for TTS engines pre-installed on our platform.
Let's see how to use the pyttsx3 library:
Example
import pyttsx3
# initialize Text-to-speech engine
engine = pyttsx3.init()
# convert this text to speech
text = "Python is a great programming language"
engine.say(text)
# play the speech
engine.runAndWait()
# In the above code, the say() method takes the text as an argument and queues it to be spoken,
# while runAndWait() blocks until all queued commands have been processed.
# The engine also exposes properties that we can adjust as needed. Let's read the speaking rate:
# get details of speaking rate
rate = engine.getProperty("rate")
print(rate)
# We can change the speaking rate as needed:
# set a new voice rate (faster)
engine.setProperty("rate", 300)
engine.say(text)
engine.runAndWait()
# If we pass 100, the speech will be slower.
engine.setProperty("rate", 100)
engine.say(text)
engine.runAndWait()
# The engine can also speak with different voices.
# get details of all voices available
voices = engine.getProperty("voices")
print(voices)
Output:
200
[<pyttsx3.voice.Voice object at 0x000002D617F00A20>, <pyttsx3.voice.Voice object at 0x000002D617D7F898>, <pyttsx3.voice.Voice object at 0x000002D6182F8D30>]
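The printed Voice objects are not very readable on their own. Each pyttsx3 voice has id and name attributes, and the engine's "voice" property accepts a voice id, so a small helper can list the voices and switch between them. This is a sketch with hypothetical helper names (describe_voices, select_voice). It takes the engine as an argument, so it does not import pyttsx3 itself.

```python
def describe_voices(voices):
    """Return (index, id, name) tuples for objects that expose .id and .name."""
    return [(index, voice.id, voice.name) for index, voice in enumerate(voices)]


def select_voice(engine, index):
    """Switch the engine to the voice at the given index and return its name."""
    voices = engine.getProperty("voices")
    if not 0 <= index < len(voices):
        raise IndexError(f"voice index {index} is out of range")
    # The "voice" property expects the id of one of the available voices.
    engine.setProperty("voice", voices[index].id)
    return voices[index].name
```

With the engine created earlier, describe_voices(engine.getProperty("voices")) lists what is installed on the machine. A call such as select_voice(engine, 1) followed by engine.say(text) and engine.runAndWait() speaks with the second voice, provided at least two are available.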
In this tutorial, we discussed how to convert text into speech using the third-party gTTS library, and we also covered the offline pyttsx3 library. With these tools, we can build our own virtual assistant.