ai, ml · beginner · 2m · 2024-11-12

AI Voice Chat for Customer Support Use Cases

This project uses current speech-augmented LLMs (SpeechLLMs) with a real-time voice modality to understand user issues and provide support and solutions.

Using real-time speech LLMs to assist with service requests

Provide solutions and raise service tickets.

Initial Architecture - Version 1

Link to the architecture diagram: use draw.io to render

Using the web app, capture audio input from the user via a record button in the UI
Pass the audio to a speech-to-text service (Google or another provider)
Send the transcribed text to an LLM (OpenAI or Gemini) and get its response
Use text-to-speech to convert the text generated by the large language model back into audio
Return the audio file to the web app
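
The steps above can be sketched as a small pipeline of three stages. This is only an illustrative outline, not the project's actual code: the class name VoicePipeline and the stand-in functions are hypothetical, and the real speech-to-text, LLM, and text-to-speech calls (Google, OpenAI, Gemini, or others) would replace the fakes.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is a plain function so a real provider can be swapped in later.
SpeechToText = Callable[[bytes], str]
ChatModel = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class VoicePipeline:
    """Hypothetical wrapper chaining the three V1 stages."""
    transcribe: SpeechToText
    generate_reply: ChatModel
    synthesize: TextToSpeech

    def handle(self, audio_in: bytes) -> bytes:
        text = self.transcribe(audio_in)     # audio -> text
        reply = self.generate_reply(text)    # text -> LLM response
        return self.synthesize(reply)        # response -> audio


# Stand-in stages (assumptions for illustration only): they treat the
# "audio" as UTF-8 text so the flow can run without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_reply(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_reply, fake_synthesize)
audio_out = pipeline.handle(b"my router is offline")
```

Keeping each stage behind a simple function signature means a provider can be changed without touching the web app code that calls handle.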

This is the high-level initial implementation for version 1. Future blog posts will introduce chat-like functionality, give the model context on the topic, and add the ability to raise service tickets.

Part 1: Recording audio from the user and generating an audio file

We used Replit Agent to create a voice-recording Flask app that we can build on

replit-voice-recorder
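
For readers who have not seen the generated app, the following is a minimal sketch of what the server side of such a recorder might look like. It is not the code from the replit-voice-recorder repo: the /upload route, the "audio" form field, and the recordings directory are all assumed names chosen for illustration.

```python
import uuid
from pathlib import Path

from flask import Flask, jsonify, request

# Hypothetical upload directory; the real app may store files elsewhere.
UPLOAD_DIR = Path("recordings")

app = Flask(__name__)


@app.route("/upload", methods=["POST"])  # hypothetical route name
def upload_audio():
    """Save the audio blob posted by the browser's record button."""
    audio = request.files.get("audio")  # hypothetical form field name
    if audio is None or not audio.filename:
        return jsonify(error="no audio file in request"), 400

    UPLOAD_DIR.mkdir(exist_ok=True)
    # Generate the file name on the server instead of trusting the client;
    # only the extension is taken from the uploaded name.
    suffix = Path(audio.filename).suffix or ".webm"
    target = UPLOAD_DIR / f"{uuid.uuid4().hex}{suffix}"
    audio.save(str(target))
    return jsonify(filename=target.name), 201


if __name__ == "__main__":
    app.run(debug=True)
```

The browser side would record with the MediaRecorder API and post the resulting blob to this route as multipart form data.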

Setting up the codebase locally to test it and deploy it as a web app

Pushing the code to the repo: link
Creating a web app and resource group in Azure to deploy and run this app: web app link
Enabling deployment and attaching it to the above repo, workflow yml: yml
Following the steps from the initial blog post to add the startup command in the app portal and configure the App Service to run the Flask application