AI and Deep Learning Solutions by AWS to Streamline Operations and Enhance Customer Experience
AWS AI services bring deep learning technologies within reach of every developer. Whether you are getting started with AI or you are an expert in deep learning, this blog post will provide meaningful insight into the AI services of AWS and demonstrate their impressive functionality.
Amazon Rekognition, Amazon’s computer vision service, helps embed visual analysis into applications. It can search, verify, and organize tens of thousands of images and analyze video content. The service is built on highly scalable deep learning technology that detects objects, faces, and scenes, reads text in images, and flags inappropriate content. It can also compare faces.
Rekognition Image detects objects and scenes in images, whereas Rekognition Video tracks the movement of objects across frames. It can detect human activity even when the face is not visible, which enables use cases such as receiving a notification when a delivery person approaches your entry gate. Other widely implemented use cases include face recognition and sentiment analysis for Rekognition Image, and search indexes and explicit content filtering in online media for Rekognition Video.
The best part about this service is that you do not need expertise in building, maintaining, or upgrading deep learning pipelines. Achieving accuracy in a computer vision task normally requires a considerable amount of labelled ground truth data and Graphics Processing Units (GPUs) to handle the heavy computation of training. Rekognition takes care of all this automatically: it is pre-trained for recognition-related tasks and delivered as a fully managed deep learning pipeline. This lets you keep your focus on the design and development of the core application.
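To make the "no pipeline to build" point concrete, here is a minimal sketch of label detection on an image stored in S3. It assumes a boto3 Rekognition client is passed in; the function names, bucket, key, and confidence threshold are illustrative choices, not part of any project described in this post.

```python
def detect_labels_in_s3_image(rekognition, bucket, key,
                              min_confidence=80.0, max_labels=10):
    """Return (label name, confidence) pairs found in an S3 image.

    `rekognition` is expected to behave like boto3.client("rekognition").
    """
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    return [(label["Name"], label["Confidence"])
            for label in response.get("Labels", [])]


def person_detected(labels):
    """True if any returned label is 'Person' (e.g. someone at the gate)."""
    return any(name == "Person" for name, _ in labels)
```

In real use the client would be created with `boto3.client("rekognition")` and the result of `person_detected` could drive a notification. Passing the client in as an argument keeps the function easy to try out with a stub.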
The following figure shows the functions supported by the Rekognition service.
Fig. Comparison of top computer vision APIs by public cloud providers
The chatbot market is growing rapidly, and almost every kind of business is benefiting from it. Chatbots fall into three main types: rule-based bots, AI bots, and hybrid bots.
The rule-based/linguistic chatbots offer fine-tuned control and are highly flexible. They use if-else logical conditions to direct conversation flow using a linguistic model; hence, their interaction capabilities are very structured. You can easily come across these chatbots on e-commerce platforms and social networking sites.
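The if-else style of a rule-based bot can be sketched in a few lines. The keywords and replies below are hypothetical and only illustrate how fixed rules steer the conversation.

```python
import re


def rule_based_reply(message):
    """Pick a reply by checking fixed keyword rules in order."""
    # Lowercase and keep only alphabetic words so "Hello!" matches "hello".
    words = set(re.findall(r"[a-z]+", message.lower()))

    if "order" in words and "status" in words:
        return "Please share your order number so I can check its status."
    elif "refund" in words or "return" in words:
        return "I can help with that. Which item would you like to return?"
    elif words & {"hi", "hello", "hey"}:
        return "Hello! How can I help you today?"
    else:
        # No rule matched: a rule-based bot can only fall back.
        return "Sorry, I did not understand that. Could you rephrase?"
```

Every behaviour is spelled out in advance, which is where the tight control comes from; anything outside the rules ends up in the fallback branch.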
AI bots, though more complex, offer more lifelike conversations. Thanks to their machine learning capabilities, they learn from previous conversations and become contextually better over time.
Lastly, hybrid bots combine the best of both rule-based and AI approaches. They use ML integrations that go beyond linguistic rules.
Amazon Lex is an AWS offering for making conversational bots capable of interacting through voice and text and is backed by ASR (Automatic Speech Recognition). Using Lex, you can publish chatbots on various chat services and mobile devices.
The best part about using Lex is that developers do not need machine learning expertise. The language model is built automatically from the prompts given. There are also no bandwidth constraints, and Lex auto-scales as per your needs. It uses deep learning to get smarter with time. Lex use cases include information bots, bots that control devices, order-placing or travel bots, and self-service bots.
Lex has iOS and Android SDKs for mobile development, and you need not certify the bot before deployment. Currently, the maximum speech input time is 15 seconds for slot filling and the languages supported are US Spanish, Canadian French, British English, Australian English, German, US English, Latin American Spanish and French.
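Outside the mobile SDKs, a bot can also be called from backend code. The sketch below assumes a Lex V2 bot and a boto3 `lexv2-runtime` client passed in by the caller; the bot ID, alias ID, and session ID are placeholders you would supply.

```python
def ask_bot(lex_runtime, bot_id, bot_alias_id, session_id, text,
            locale_id="en_US"):
    """Send one text utterance to a Lex V2 bot and return its text replies.

    `lex_runtime` is expected to behave like boto3.client("lexv2-runtime").
    """
    response = lex_runtime.recognize_text(
        botId=bot_id,
        botAliasId=bot_alias_id,
        localeId=locale_id,      # must be a locale the bot was built for
        sessionId=session_id,    # reuse the same ID to keep conversation state
        text=text,
    )
    # Some messages (such as response cards) carry no plain "content" field.
    return [m["content"] for m in response.get("messages", [])
            if "content" in m]
```

Reusing the same `session_id` across calls is what lets the bot continue filling slots over several turns.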
Lex V2 has an additional streaming API where the bot will listen continuously and respond proactively. Messages like “Take your time to respond” can be further added to make the conversation human-like.
According to a 2019 Gartner report, the top 16 market vendors for conversational chatbots are-
Besides the vendors mentioned above, Python NLP libraries such as Gensim, TextBlob, PyNLPl, CoreNLP, and spaCy are also widely used to build AI chatbots.
Fig. Comparison of major chatbot building frameworks
With Amazon Lex, MIND helped develop a COVID-19 assessment tool in which a genie acts as a chatbot for voice- and text-based interactions. It leverages multiple AI capabilities, such as voice, Natural Language Processing, and decision making, to provide a solution that can scale up to serve many people. The solution interacts with users and, based on a set of related questions, suggests whether they are at risk of COVID-19. The bot is conversational and responds to the user’s voice and text replies in real time. It is accessible at the following link: https://master.dcjrsebf77vbs.amplifyapp.com/Chat
Fig. Architecture diagram
Fig. Deployed COVID Chatbot
Using Amazon Lex, MIND helped develop a virtual environment featuring an interactive chatbot for voice- and text-based interactions. Users converse with the bot in real time and can see captions of its responses, which further enhances the user experience. The scene includes a wall television that plays a video and adds to the aesthetics of the environment.
The host of the scene in Amazon Sumerian is synchronized with Amazon Lex, which is used to create the bot. Lex provides the conversational interface and builds the chatbot without any heavy lifting. Amazon Polly then turns the text response from Lex into speech. A Lambda function written in Python acts as the backend of the whole task, initializing and validating the user input.
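A validation Lambda of this kind might look like the sketch below. It is not the project's actual function: it assumes the Lex V2 event format and a hypothetical slot named `Age`, and simply re-prompts when the value is not a sensible number.

```python
def get_slot(intent, slot_name):
    """Return the interpreted value of a slot, or None if it is unfilled."""
    slot = (intent.get("slots") or {}).get(slot_name)
    if not slot or not slot.get("value"):
        return None
    return slot["value"].get("interpretedValue")


def delegate(intent):
    """Let Lex decide the next step of the conversation."""
    return {"sessionState": {"dialogAction": {"type": "Delegate"},
                             "intent": intent}}


def elicit_slot(intent, slot_name, message):
    """Clear a slot and ask the user for it again."""
    slots = dict(intent.get("slots") or {})
    slots[slot_name] = None
    return {
        "sessionState": {
            "dialogAction": {"type": "ElicitSlot", "slotToElicit": slot_name},
            "intent": {**intent, "slots": slots},
        },
        "messages": [{"contentType": "PlainText", "content": message}],
    }


def lambda_handler(event, context):
    intent = event["sessionState"]["intent"]
    age = get_slot(intent, "Age")  # "Age" is a hypothetical slot name
    if age is None:
        return delegate(intent)    # nothing to validate yet
    if not age.isdecimal() or not 0 < int(age) <= 120:
        return elicit_slot(intent, "Age",
                           "Please enter a valid age between 1 and 120.")
    return delegate(intent)
```

The handler only validates; fulfilment logic and the Polly call would sit elsewhere in the flow.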
Fig. A close view of the VR scene
In this project, the client required an attendance tracking system for its employees. MIND used the facial recognition capabilities of Amazon Rekognition along with DynamoDB and Lambda. A collection was created in Rekognition, comprising each employee’s training images.
A video frame is then captured and matched against this set of pictures to mark attendance. Amazon S3 is used to store the image collection, and the workflow comprises two sections: Indexing and Analysis.
Indexing populates the collection in Rekognition using the IndexFaces API, whereas Analysis runs queries against this collection using the SearchFacesByImage API. DynamoDB stores key-value pairs such as “emp_id-present” for later reference and UI display. A Lambda function written in Python acts as the backend for these service interactions.
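The two sections could be sketched roughly as follows. This is an illustration under assumptions, not the deployed code: the clients are passed in (a boto3 Rekognition client and a DynamoDB `Table` resource), and the item layout, threshold, and function names are hypothetical.

```python
def index_employee(rekognition, collection_id, bucket, key, emp_id):
    """Indexing: add one employee image from S3 to the face collection."""
    response = rekognition.index_faces(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        ExternalImageId=emp_id,   # lets a later match be traced to the employee
        MaxFaces=1,
    )
    return [rec["Face"]["FaceId"] for rec in response.get("FaceRecords", [])]


def mark_attendance(rekognition, table, collection_id, frame_bytes,
                    threshold=90.0):
    """Analysis: match a captured frame and record the employee as present.

    Returns the matched employee ID, or None when nothing matches.
    """
    response = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": frame_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    matches = response.get("FaceMatches", [])
    if not matches:
        return None
    emp_id = matches[0]["Face"]["ExternalImageId"]
    # Assumed item layout; the post only says an "emp_id-present" pair is kept.
    table.put_item(Item={"emp_id": emp_id, "status": "present"})
    return emp_id
```

In practice `rekognition` would be `boto3.client("rekognition")` and `table` something like `boto3.resource("dynamodb").Table(...)` with a table name of your choosing. Note that the real service raises an error when the frame contains no face, so a production handler would need to catch that case.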
Fig. Architecture diagram
If you want to learn more about Amazon AI solutions, request a demo or contact us here.
Prachi Gulihar – ML Engineer, AI / ML Practice
Rajat Dwivedi – ML Engineer, AI / ML Practice
Nishu Malik – ML Engineer, AI / ML Practice