Speech Recognition with TensorFlow.js

Build a small speech recognition sample using TensorFlow.js

When we talk about AI, deep learning, or machine learning, we automatically think of Python, R, or C++, but what about JavaScript? It turns out that one of the most popular machine learning libraries for Python is available for JavaScript as well. We are talking about TensorFlow, and today we will take a short look at the library and build a fun project together.

What is TensorFlow.js and what can it be used for?

TensorFlow.js is a JavaScript library developed by Google for training and deploying machine learning models in the browser and in Node.js. It’s a companion library to TensorFlow, the popular ML library for Python.

TensorFlow.js is not just a toy library. Its performance is surprisingly good, especially when using hardware acceleration through WebGL. But should we train models with it? Perhaps not. Even though you can achieve great performance, its Python counterpart is even faster, and when working with Python you will find more libraries to support your code, like NumPy and Pandas. There is also less learning material for TensorFlow.js than there is for TensorFlow.

This doesn’t mean you shouldn’t use TensorFlow.js. On the contrary, I think it’s a great library for deploying and running ML models, and that is what we are going to focus on for the rest of the article.

Deploying a sample model with TensorFlow.js

As we said, TensorFlow.js is a powerful library, and we can use it for many different tasks, such as image classification, video manipulation, and speech recognition. For today I decided to work on a basic speech recognition example. Our code will listen through the microphone and identify what the user is saying, at least for a few words, since the sample model I’m using has some limitations.

But rather than explaining, I think it’s cooler to see it in action first. Please enable the microphone checkbox and authorize this site to access the microphone. Once the model has finished loading, speak one of the words below and see the magic happen.

Pretty cool? I know it can be a bit erratic, and it’s limited to a few words, but if you use the right model, the possibilities are endless. Enough talking, let’s start coding.

The first thing we need to do is install the library and get our model. There are a few options for installing TensorFlow.js; to keep it simple, we will import it from a CDN.

Then we use some HTML to show the list of words, with a “Microphone” checkbox and a “Loading…” indicator. So far nothing strange: we have our checkbox, a loading element, and a wrapper element that we will use to render the list of words. Let’s do that next:

    // wordList is assumed to hold the words the model can recognize.
    const wrapperElement = document.getElementById('sp-cmd-wrapper');
    for (const word of wordList) {
      // Append (+=) so each word is added instead of replacing the previous one.
      // A plain <span> per word is assumed here.
      wrapperElement.innerHTML += `<span>${word}</span>`;
    }

For the demo to start working, the user needs to click the Microphone checkbox, so let’s set an event listener there to trigger the loading and listening processes.

    // modelLoaded, loadModel, startListening, and stopListening are assumed to be defined elsewhere.
    document.getElementById('audio-switch').addEventListener('change', (event) => {
      if (event.target.checked) {
        if (modelLoaded) {
          startListening();
        } else {
          loadModel();
        }
      } else {
        stopListening();
      }
    });

When the checkbox changes its value, we have three different possibilities: the user enabled the checkbox and the model is not loaded, …
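
As a minimal sketch of the three cases the checkbox listener distinguishes, the decision can be pulled into a small pure function that is easy to check without a browser. Both `nextAction` and `topWord` are hypothetical helpers, not part of the original demo or of TensorFlow.js. `topWord` additionally assumes that the model reports one score per word, in the same order as the word list, which may not hold for every model.

```javascript
// Hypothetical helper: maps the checkbox state and the model state
// to the action the change listener should take.
function nextAction(checked, modelLoaded) {
  if (!checked) return 'stop';        // checkbox disabled: stop listening
  return modelLoaded ? 'start'        // enabled, model ready: start listening
                     : 'load';        // enabled, model not loaded: load it first
}

// Hypothetical helper: picks the word with the highest score.
// Assumes scores[i] is the model's score for words[i].
function topWord(scores, words) {
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return words[best];
}

console.log(nextAction(true, false));                          // "load"
console.log(topWord([0.1, 0.7, 0.2], ['up', 'down', 'left'])); // "down"
```

In the listener, the returned string would select between `loadModel()`, `startListening()`, and `stopListening()`, and a function like `topWord` could decide which word in the rendered list to highlight.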

This article first appeared on livecodestream.dev. If you'd like to keep reading, follow the white rabbit.
