Using Realtime speech LLMs to assist with service requests

Provide solutions and raise service tickets.

Initial Architecture - Version 1

Link to the architecture diagram: use draw.io to render it


In the web app, capture audio from the user with a record button in the UI
Pass the audio to a speech-to-text service (Google or another provider)
Send the transcribed text to an LLM (OpenAI or Gemini) and get a response
Convert the LLM's text response back to audio with a text-to-speech service
Return the audio file to the web app
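
The steps above can be sketched as a small pipeline in which each stage is an injected function. This is only an illustrative sketch, not the project's implementation: the class name `VoicePipeline` and the `fake_*` stand-ins are hypothetical, and the real stages would call a speech-to-text service, an LLM, and a text-to-speech service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains the three V1 stages; each stage is an injected callable (hypothetical design)."""
    speech_to_text: Callable[[bytes], str]
    generate_reply: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]

    def handle(self, audio: bytes) -> bytes:
        transcript = self.speech_to_text(audio)   # audio -> text
        reply = self.generate_reply(transcript)   # text -> LLM response
        return self.text_to_speech(reply)         # response -> audio


# Stand-in stages so the sketch runs without any cloud credentials.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_stt, fake_llm, fake_tts)
```

Keeping the stages as plain callables means a provider (Google, OpenAI, Gemini, Azure) can be swapped without touching the web app code that calls `pipeline.handle(...)`.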

This is the high-level initial implementation for version 1. Later we will add chat-like functionality, give the model context on the topic, and raise service tickets (in future blog posts).

Part 1: Recording audio from the user and generating an audio file

Using Replit Agent, I created a voice-recording Flask app that we can build on

replit-voice-recorder
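
As a rough illustration of the "generating an audio file" step, the server side could persist the uploaded bytes under a unique name. This is a minimal sketch under assumptions: the helper name `save_recording`, the `recordings` directory, and the `.webm` extension are all hypothetical, and the generated Replit app may store recordings differently.

```python
import uuid
from pathlib import Path


def save_recording(data: bytes, directory: str = "recordings",
                   extension: str = ".webm") -> Path:
    """Write uploaded audio bytes to a uniquely named file and return its path."""
    if not data:
        raise ValueError("empty recording")
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    # A random hex name avoids collisions between concurrent uploads.
    path = target_dir / f"{uuid.uuid4().hex}{extension}"
    path.write_bytes(data)
    return path
```

In a Flask route, the bytes would typically come from the uploaded file in the request, and the returned path would be handed to the speech-to-text step.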

Setting up the codebase locally to test it and deploy it as a web app


Pushing the code to the repo: link
Creating a web app and resource group to deploy and run this app in Azure: web app link
Enabling deployment and attaching it to the above repo; workflow YAML: yml
Following the steps from the initial blog post to add the startup command in the App Service portal and configure the App Service to run the Flask application:

gunicorn --bind 0.0.0.0:$PORT main:app
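
In the command above, `main:app` tells gunicorn to import the module `main` and serve the WSGI callable named `app`. A Flask application object is such a callable. The sketch below uses a plain standard-library WSGI function as a stand-in, only to show the shape gunicorn expects; it is not the project's actual `main.py`, and the response text is made up.

```python
# main.py (hypothetical stand-in for the Flask app's entry module)

def app(environ, start_response):
    """Minimal WSGI callable; `main:app` in the gunicorn command resolves to this name."""
    body = b"voice recorder is running"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the module or the variable has a different name, the startup command must change to match (for example `server:application`).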

Successfully deploying the app to App Service using GitHub Actions

deployment-success

Verified the deployment by opening the App Service link; the web server is running successfully

Link : https://techvistara-ai-voice.azurewebsites.net/

running-app

Flow diagram of the portion completed so far


Next steps

Generating text from the audio file recorded by the user

Exploring the Azure Speech to Text service
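
As a starting point for that exploration, a single-utterance transcription with the Azure Speech SDK for Python might look like the sketch below. It assumes the `azure-cognitiveservices-speech` package is installed, that the recording is available as a WAV file, and that a Speech resource key and region exist. The function name `transcribe_wav` is hypothetical, and nothing here has been wired into the app yet.

```python
def transcribe_wav(path: str, key: str, region: str) -> str:
    """Transcribe one utterance from a WAV file with Azure Speech to Text."""
    # Imported lazily so this module still loads where the SDK is not installed
    # (pip install azure-cognitiveservices-speech).
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename=path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    # recognize_once() returns after the first recognized utterance.
    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    raise RuntimeError(f"speech not recognized: {result.reason}")
```

Browser recordings are often not WAV, so a conversion step may be needed before this call; that part is left open here.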

Project Suspension Notice

Due to budget constraints and the need to prioritize other projects, I have decided to temporarily suspend the AI voice chat application. The web application will be shut down until the next steps are designed and finalized. All progress has been archived and can be accessed here: Project Archive