#SpeechRecognition #VoiceBasedAI #DeepLearning #SpeechToText #Abzooba
I created a simple voice-based AI assistant using the SpeechRecognition library in Python.
It does the following:
1) Understands voice
2) Converts speech to text (my starting point, as I wanted an assistant who can help me write my stories)
3) Performs simple actions, such as:
3A) You ask for the current time and get it instantly.
3B) It understands the intent or context in the speech and performs a specific action. For example, when I ask Rudra (I have named my AI after Lord Shiva) to find the location of a place, it automatically opens Google Maps at that location. Or when I ask it to find a similar shirt by providing a picture of a blue shirt, it scans the image and then finds similar-looking shirts on #Amazon or #Myntra.
Interesting, isn't it?
Pray to Lord Rudra and get started!
Let's look into how I did it.
Some important stuff first:
For speech recognition, you need the SpeechRecognition library.
Install it with pip: pip install SpeechRecognition
PyAudio is also required, since it provides the microphone input.
I am using Keras (with the TensorFlow backend) here for further processing.
Google has a great Speech Recognition API. It converts spoken words (from the microphone) into written text (Python strings); in short, speech to text.
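A minimal sketch of the speech-to-text step, assuming the SpeechRecognition and PyAudio packages are installed and a microphone is available. The function name listen_once and the "en-US" default are my own choices for illustration, not something taken from Rudra's actual code.

```python
def listen_once(language="en-US"):
    """Record one phrase from the default microphone and return it as text."""
    # Imported lazily so this file still loads where no audio stack is installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate briefly against background noise before listening.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    try:
        # Sends the audio to Google's web speech service, so it needs internet.
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # the speech could not be understood
    except sr.RequestError:
        return ""  # the service could not be reached

# Usage (needs a microphone): text = listen_once()
```

Returning an empty string on failure keeps the calling code simple: the assistant can just ask the user to repeat when it gets nothing back.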
A text-to-speech (TTS) system converts normal language text into speech.
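The post does not say which TTS library Rudra uses, so the sketch below assumes pyttsx3, an offline text-to-speech package. The helper name speak and the rate value are illustrative only.

```python
def speak(text, rate=150):
    """Read `text` aloud through the system's speech engine."""
    # Assumption: pyttsx3 is used for TTS; the original post names no library.
    import pyttsx3

    engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # speaking speed in words per minute
    engine.say(text)
    engine.runAndWait()  # block until the phrase has been spoken

# Usage (needs speakers): speak("Hello, I am Rudra")
```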
Let's look into the workflow:
A) I ask, "What time is it?"
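One simple way to handle this request could look like the sketch below. It assumes the recognised speech arrives as a plain string and that a keyword check on "time" is enough to detect the intent; the function names wants_time and time_reply are hypothetical.

```python
import re
from datetime import datetime


def wants_time(command):
    """Return True if the recognised text looks like a request for the time."""
    return re.search(r"\btime\b", command.lower()) is not None


def time_reply(now=None):
    """Build the sentence the assistant answers with (24-hour clock)."""
    now = now or datetime.now()
    return "The time is " + now.strftime("%H:%M")


# Usage: the reply can be printed or passed to a text-to-speech engine.
if wants_time("what TIME is it"):
    print(time_reply())
```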
Good, let's move to more complex stuff!
B) I ask Rudra to find a location. Let's say I ask, "Where is Abzooba?"
It automatically opens the Chrome browser with the Google Maps location of that address.
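A hedged sketch of how this step could be wired up with only the standard library. It assumes the question follows a "where is ..." pattern and uses the Google Maps search URL format; the helper names are hypothetical. Note that webbrowser opens the system's default browser, which may or may not be Chrome.

```python
import re
import webbrowser
from urllib.parse import quote_plus


def extract_place(command):
    """Pull the place name out of a 'where is ...' style question."""
    match = re.search(r"\bwhere is\s+(.+?)\s*\??$", command.strip(), re.IGNORECASE)
    return match.group(1) if match else None


def maps_url(place):
    """Build a Google Maps search URL for the given place name."""
    return "https://www.google.com/maps/search/?api=1&query=" + quote_plus(place)


def open_location(command):
    """Open the map for the place in `command`; return False if none was found."""
    place = extract_place(command)
    if place is None:
        return False
    webbrowser.open(maps_url(place))
    return True

# Usage: open_location("Where is Abzooba ?")
```

Keeping the URL building separate from the browser call makes the text-handling part easy to check without opening any window.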
Great, now let's move to more complex things.
C) I show an image to Rudra. For now I have kept the image in a folder, but once you create a simple UI around it you can just upload the image.
I ask Rudra to search for a similar item on #Myntra.
Rudra scans the image and then automatically opens Myntra with the possible choices.
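The post does not describe the model behind the image scan, so the sketch below is only one possible approach. It assumes a pretrained ImageNet classifier (MobileNetV2 from Keras) to label the image, and it assumes that a Myntra search can be reached at myntra.com followed by the query, which should be verified before relying on it. All function names are hypothetical.

```python
import webbrowser


def classify_image(path):
    """Return the top ImageNet label for the image at `path`."""
    # Imported lazily: TensorFlow is heavy and only needed for this step.
    import numpy as np
    from tensorflow.keras.applications.mobilenet_v2 import (
        MobileNetV2, decode_predictions, preprocess_input)
    from tensorflow.keras.utils import img_to_array, load_img

    model = MobileNetV2(weights="imagenet")  # assumption: not the author's model
    img = load_img(path, target_size=(224, 224))
    batch = np.expand_dims(img_to_array(img), axis=0)
    predictions = model.predict(preprocess_input(batch))
    # decode_predictions yields (class_id, class_name, score) tuples.
    return decode_predictions(predictions, top=1)[0][0][1]


def search_url(label, template="https://www.myntra.com/{query}"):
    """Turn a label such as 'bow_tie' into a shopping search URL."""
    # Assumption: the site accepts hyphen-separated search terms in the path.
    query = label.strip().lower().replace("_", "-").replace(" ", "-")
    return template.format(query=query)


def find_similar(path):
    """Classify the image and open the matching search page in the browser."""
    webbrowser.open(search_url(classify_image(path)))

# Usage (needs TensorFlow and an image file): find_similar("images/shirt.jpg")
```

An ImageNet label only gives a generic category such as "jersey", so matching colour or pattern, like the blue shirt in my example, would need something more, for instance a colour detector or an image-similarity model.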