Convert Speech To Text using Python

Packages Needed:

To convert speech to text in Python with existing libraries, we need the following packages installed: SpeechRecognition, pipwin, and PyAudio. Note that we are going to implement this in a Jupyter Notebook, using prebuilt Python libraries and the Google speech-to-text API for the conversion.

Run the following commands one by one in the Anaconda Prompt, with administrator privileges, to install the required packages.

pip install SpeechRecognition
pip install pipwin
conda install -c anaconda pyaudio
pip install pyaudio
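
Once the commands finish, it can help to confirm that the packages are importable from the notebook's Python environment. The following is a minimal sketch rather than part of the original walkthrough: the helper name missing_packages is hypothetical, and it assumes the import names speech_recognition (for SpeechRecognition) and pyaudio (for PyAudio), which differ from the names used with pip.

```python
import importlib.util


def missing_packages(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names differ from the pip package names:
# SpeechRecognition -> speech_recognition, PyAudio -> pyaudio
REQUIRED = ["speech_recognition", "pyaudio"]

missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If a name is reported as not installed, rerun the matching install command in the same environment that the notebook kernel uses.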

About PyAudio

PyAudio is required to connect our microphone to the Jupyter Notebook. To read more about PyAudio, visit the links below:
https://anaconda.org/anaconda/pyaudio
https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio

To install PyAudio, run this in the Anaconda Prompt: conda install -c anaconda pyaudio

Prerequisite: installing PyAudio requires Microsoft Visual C++ 14.0 or greater. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

To run PyAudio on Colab, download the PyAudio .whl file to your local system and give that path to Colab for installation.

About Google Speech

A key feature of the Google Speech API is speech adaptation, meaning the API recognizes the domain of what is being said. For instance, currencies, addresses, and years are handled specifically during speech-to-text conversion: the algorithm defines domain-specific classes that detect these occurrences in the input speech. The API works with pre-recorded files, including on-premises ones, as well as with live recordings from the microphone in the current working environment. We will analyze live speech from microphone input in the next section.
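
The pre-recorded case mentioned above is not covered by this post, so here is a hedged sketch of how it might look with the same SpeechRecognition package. The function names transcribe_file and is_supported_audio are hypothetical, and the extension check is an assumption based on the WAV, AIFF, and FLAC formats that the package's AudioFile class is documented to read.

```python
import os

# File extensions for the formats AudioFile is documented to read (assumed list)
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".flac"}


def is_supported_audio(path):
    """Check the file extension before handing the file to AudioFile."""
    return os.path.splitext(path)[1].lower() in SUPPORTED_EXTENSIONS


def transcribe_file(path, language="en-US"):
    """Transcribe a pre-recorded audio file with the Google recognizer."""
    if not is_supported_audio(path):
        raise ValueError("Unsupported audio format: " + path)

    # Imported here so the extension check works even without the package
    import speech_recognition as speech_recog

    rec = speech_recog.Recognizer()
    with speech_recog.AudioFile(path) as source:
        audio = rec.record(source)  # read the whole file into memory
    return rec.recognize_google(audio, language=language)
```

A call such as transcribe_file("recording.wav") would return the transcribed text, where recording.wav is a placeholder for your own file. Like the live example, it needs an internet connection to reach the recognition service.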

Code:

import speech_recognition as speech_recog

# Create a recognizer object, which processes and transcribes the audio
rec = speech_recog.Recognizer()

# List the available microphones; a name's position in the list is its device index.
# Print mic_names (or run this line alone in a notebook cell) to inspect the list.
mic_names = speech_recog.Microphone.list_microphone_names()

# Capture voice input from the second microphone (device_index=1).
# Change the index to match your device, or omit it to use the system default.
# The recognizer first listens for 3 seconds to calibrate for ambient noise.
with speech_recog.Microphone(device_index=1) as source:
    rec.adjust_for_ambient_noise(source, duration=3)
    print("Reach the Microphone and say something!")
    audio = rec.listen(source)

# Use the recognize_google function to transcribe spoken words to text
try:
    txt = rec.recognize_google(audio, language='en-US')
    print("I think you said: \n" + txt)
except speech_recog.UnknownValueError:
    # Raised when the speech could not be understood
    print("Could not understand the audio")
except speech_recog.RequestError as e:
    # Raised when the recognition service cannot be reached
    print("Could not get results from the service: {}".format(e))

The second parameter of the recognize_google function is language. You can change it to your local language, for example ur-PK for Urdu (Pakistan), to transcribe audio spoken in that language. By default, it transcribes US English (en-US).

Output:

Reach the Microphone and say something!
I think you said:
Hello, Welcome to Code Seekers Website Thank you for reading my blog
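
The code above hard-codes device_index=1, which may point at a different device on another machine. As one possible alternative, the sketch below picks a microphone by searching the names returned by list_microphone_names(). The helpers find_microphone_index and open_microphone are hypothetical names introduced for illustration, not part of the SpeechRecognition package.

```python
def find_microphone_index(names, keyword):
    """Return the index of the first name containing keyword, or None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Skip entries without a usable name
        if name and keyword in name.lower():
            return index
    return None


def open_microphone(keyword):
    """Build a Microphone for the first matching device, else the default one."""
    # Imported here so the search helper can be used without the package
    import speech_recognition as speech_recog

    names = speech_recog.Microphone.list_microphone_names()
    # device_index=None makes Microphone use the system default device
    return speech_recog.Microphone(
        device_index=find_microphone_index(names, keyword))
```

You could then write with open_microphone("usb") as source: in place of the hard-coded index, where "usb" stands for any part of your own microphone's name.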
