NVIDIA

An explainer on how DefinedAI and NVIDIA train conversational AI to sound human.

Client

NVIDIA

Project

NVIDIA & DefinedAI

Industry

Software

Style

Runtime

4 minutes

The brief

DefinedAI and NVIDIA needed conversational AI explained in plain terms. The explainer shows how DefinedAI's data and NVIDIA's GPUs train voice AI to understand people and answer like one.

The script

For conversational AI to provide a seamless, natural, and human-like experience, it needs to be trained on substantial amounts of data representative of the problem the model is trying to solve.

The difficulty for machine learning teams is the scarcity of this high-quality, domain-specific data.

Companies are trying to solve this problem and accelerate the widespread adoption of conversational AI with innovative solutions that guarantee the scalability and internationality of models.

This video will show you how DefinedAI and NVIDIA integrate to create world-class conversational AI, easily, simply, and quickly.

STEP 1: DEVELOP THE MODEL

NeMo is a toolkit built by NVIDIA which makes it possible for you to quickly create and train complex, state-of-the-art neural network architectures with three lines of code.

The NeMo model is composed of modules which are simple to combine and re-use, making it easy to train, build, and manipulate AI models.

But one large and vital component is still missing from the pipeline.

Data.

STEP 2: GET THE DATA

No matter how good a model is, it will never reach its full potential if it is not trained on high-quality, domain-specific, language-relevant data.

It’s a well-known adage among machine learning teams: garbage in, garbage out.

Collecting, annotating, and validating this data, however, can take even longer than building a model from scratch.

It is also notoriously difficult to generate high-quality datasets that are suitably diverse and large.

This is where DefinedAI comes in.

DefinedAI leverages a global crowd of over 500,000 contributors to collect, annotate, and validate training data by dividing the work into a series of micro-tasks through our online platform, Neevo.

This data is then used to train conversational AI models in a variety of languages, accents, and domains.

Our quality metrics guarantee native speakers, domain accuracy, and a balanced gender and age distribution, all of which minimize bias in the corpus and in your model.

Developers building models with NeMo can integrate with DefinedAI to train their models with either off-the-shelf data or custom data.

DefinedAI’s off-the-shelf datasets are ready-to-use and available through an online catalog called DefinedData.

There are currently over 70 datasets available for instant download, in a variety of languages, dialects and domains.

On the other hand, if you need a custom speech dataset, DefinedWorkflows allows you to set the exact parameters of the data collection.

Simply use our enterprise portal to create your project by selecting the type of speech data you need:

The number of hours and contributors;

The speaker set-up, and style;

The environment conditions and communication band;

And the language and demographics of your contributors.

Then focus your time on other tasks while we do the work for you.

Follow the progress of your project from the portal.

And once complete, download your dataset.

Now it is time to train your model.

Preparation

In this step, also called data ingestion, we convert data into the format that NeMo will use.
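As a rough illustration of this conversion step, the sketch below writes a JSON-lines manifest of the kind NeMo's speech recognition collection reads, with one object per utterance holding `audio_filepath`, `duration`, and `text`. The helper names, the sample file paths, and the input tuple layout are assumptions made for this example; they are not part of DefinedAI's or NVIDIA's tooling, and a real dataset may need extra text normalization.

```python
import json


def to_manifest_lines(records):
    """Turn (audio_path, duration_seconds, transcript) tuples into JSON lines.

    The tuple layout is a hypothetical input format chosen for this sketch.
    """
    lines = []
    for audio_path, duration, transcript in records:
        entry = {
            "audio_filepath": audio_path,
            "duration": round(float(duration), 2),  # seconds
            "text": transcript.strip(),
        }
        lines.append(json.dumps(entry, ensure_ascii=False))
    return lines


def write_manifest(records, manifest_path):
    """Write one JSON object per line to manifest_path."""
    with open(manifest_path, "w", encoding="utf-8") as f:
        for line in to_manifest_lines(records):
            f.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical sample records; paths and transcripts are made up.
    sample = [
        ("clips/utt_0001.wav", 3.456, "turn on the kitchen lights"),
        ("clips/utt_0002.wav", 2.1, "what is the weather today"),
    ]
    for line in to_manifest_lines(sample):
        print(line)
```

Keeping the conversion in a small pure function makes it easy to check a handful of records by eye before writing a full manifest file.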

Model Training

NeMo lets you use best-in-class machine learning algorithms efficiently. By running multiple iterations, you can train a powerful neural network to learn patterns. The result? A robust model that can make speech recognition predictions on audio it has never heard.

Model Evaluation

After training, it is important to evaluate the performance of your model. To do this, measure the Word Error Rate (WER) on a held-out set of speech data to determine how many errors your model makes, and where you need to adjust.
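To make the metric concrete, here is a minimal sketch of a Word Error Rate calculation: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The function name and the whitespace tokenization are assumptions for this example; production evaluations usually normalize casing and punctuation first, and toolkits ship their own WER utilities.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the model outputs many more words than the reference contains.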

And then, you are ready to launch.

Build world-class conversational AI with NVIDIA’s NeMo and DefinedAI.

Fonts & colors

Typeface

Palette