How to Do Sentiment Analysis – Intro to Deep Learning #3

Hello world, it’s Siraj and today we’re
going to use machine learning to help us understand our emotions. Our emotional intelligence distinguishes
us from every other known living being on Earth. These emotions can be simple
like when you get so hyped, all you can hear is
Gasolina by Daddy Yankee. And we’ve invented language to
help us express them to others. But sometimes words are not enough; some emotions have no
direct English translation. For example, in German, Waldeinsamkeit is the feeling you experience when you’re alone in the woods, connecting with nature. In Japanese, mono no aware is the awareness of the impermanence of all things, and the gentle sadness at their passing. Emotions are hard to express,
let alone understand, but that’s where AI can help us. And AI can understand us better than we do by analyzing our emotional data to help us make optimal decisions for
goals that we specify, like a personal life coach slash
therapist slash Denzel Washington. But how would it do this? There are generally two main
approaches to Sentiment Analysis. The first one is
the Lexicon-Based Approach. We first want to split some given text into smaller tokens, be they words, phrases, or whole sentences. This process is called tokenization. Then we count the number of times each word shows up; the resulting tally is called the Bag of Words model. Next we look up the subjectivity of each word in an existing lexicon, which is a database of emotional values for words pre-recorded by researchers. Once we have those values, we can compute the overall subjectivity of our text.
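To make that concrete, here’s a minimal sketch of the lexicon approach in plain Python. The tiny lexicon and its scores below are made up for illustration; a real system would use a lexicon compiled by researchers with thousands of scored words.

```python
from collections import Counter

# Toy lexicon of emotional values; real lexicons are compiled by researchers
# and contain thousands of scored words.
LEXICON = {"love": 0.9, "great": 0.7, "fine": 0.2, "boring": -0.6, "terrible": -0.9}

def subjectivity(text):
    tokens = text.lower().split()   # tokenization (here, a simple word split)
    bag = Counter(tokens)           # bag of words: a tally of each token
    total = sum(LEXICON[word] * count for word, count in bag.items() if word in LEXICON)
    matched = sum(count for word, count in bag.items() if word in LEXICON)
    return total / matched if matched else 0.0  # overall subjectivity score

print(subjectivity("great movie I love it"))  # positive score
print(subjectivity("boring and terrible"))    # negative score
```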
The other approach uses machine learning. If we have a corpus of, say, tweets
that are labeled either positive or negative, we can train
a classifier on it, and then, given a new tweet, it will classify it as either positive or negative. So which approach is better? Don’t ask me. No yeah, totally ask me. Well, using a lexicon is easier, but
the learning approach is more accurate. There are subtleties in language that
lexicons are bad at, like sarcasm, which seems to say one thing but really means another. Deep neural nets can understand these subtleties because they don’t analyze text at face value. They create abstract representations
of what they learned. These generalizations are called vectors
and we can use them to classify data. Let’s learn more about vectors by
building a sentiment classifier for movie reviews, and I’ll show you how to run it in the cloud. The only dependency we’ll need is
tflearn, and I’m using it since it’s the easiest way to get started
building deep neural networks. We’ll import a couple of helper
functions that are built into it as well, and I’ll explain those when we get to them.
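For reference, the setup looks something like this; the helpers named here (pad_sequences, to_categorical, and the bundled IMDB data set module) are the ones we’ll meet later in the walkthrough.

```python
# TFLearn is the only dependency; the helper functions and the IMDB data set ship with it.
import tflearn
from tflearn.data_utils import to_categorical, pad_sequences
from tflearn.datasets import imdb
```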
The first step in our process is to collect our data set. tflearn has a bunch of pre-processed data sets we can use, and we’re going to use a data set of IMDB movie reviews. We’ll load it using the load_data function, which will download our data set from the web. We’ll name the path where we want to save it, the extension being pkl, which means it’s a byte stream. This makes it easier to convert to other Python objects, like lists or tuples, later. We want 10,000 words from the database, and we only want to use 10% of the data for our validation set, so we’ll set the last argument to 0.1. load_data will return our movie reviews split into a training and testing set. We can then further split those sets into reviews and labels and set them equal to X and Y values. Training data is the portion
our model learns from, and validation data is part of the training process. While training data helps us fit our weights, validation data helps prevent overfitting by letting us tune our hyperparameters accordingly. And testing data is what our model uses to test itself, by comparing its predicted labels to the actual labels. So test yourself before you wreck yourself.
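Put together, the loading and splitting step looks roughly like this; imdb.pkl is just the file name we choose for the downloaded byte stream.

```python
from tflearn.datasets import imdb

# Download the pre-processed IMDB data set: keep the 10,000 most frequent
# words and hold out 10% of the training data as a validation set.
train, test, _ = imdb.load_data(path='imdb.pkl', n_words=10000, valid_portion=0.1)

# Split each set into reviews (X) and sentiment labels (Y).
trainX, trainY = train
testX, testY = test
```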
Now that we have our data split into sets, let’s do some pre-processing. We can’t just feed text strings into a neural network directly; we have to vectorize our inputs. Neural nets are algorithms that essentially just apply a series of computations to matrices, so converting our text to numerical representations, or vectors, is necessary. The pad_sequences function will do that for our review text. It’ll convert each review
into a matrix and pad it. Padding is necessary to ensure
consistency in our inputs’ dimensionality. It will pad each sequence at the end with the value we specify, zero, until it reaches the max sequence length, which we’ll set to 100. We also want to convert our labels to vectors as well, and we can easily do that using the to_categorical function. These are binary vectors with two
classes: 1, which is positive, or 0, which is negative.
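Continuing from the loading snippet above (which defined trainX, trainY, testX, and testY), the pre-processing is two helper calls, one for the reviews and one for the labels.

```python
from tflearn.data_utils import to_categorical, pad_sequences

# Sequence padding: turn each review into a fixed-length vector of word IDs,
# padded with zeros up to a max length of 100.
trainX = pad_sequences(trainX, maxlen=100, value=0.)
testX = pad_sequences(testX, maxlen=100, value=0.)

# Convert the labels to binary class vectors (negative / positive).
trainY = to_categorical(trainY, nb_classes=2)
testY = to_categorical(testY, nb_classes=2)
```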
Yo hold up, vectors got me feeling like... We can intuitively define each layer in our network as its own line of code. First will be our input layer; this is where we feed data into our network. The only parameter we’ll specify is the input shape. The first element is the batch size, which we’ll set to None, and then the length, which is 100, since we set our max sequence length to 100.
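In code, that first layer is a single line; net is the running network definition we’ll keep passing into each later layer.

```python
# Input layer: batch size left as None, sequence length of 100.
net = tflearn.input_data([None, 100])
```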
Our next layer is our embedding layer. The first parameter is the output we receive from the previous layer, and by the way, for every layer we write, we’ll be using the previous layer’s outputs as inputs. This is how data flows through a neural network; at each layer it’s transformed, like a seven-layer dip of computation. We’ll set the input dimension to 10,000, since that’s how many words we loaded from our data set earlier, and the output dimension to 128, which is the number of dimensions in our resulting embeddings.
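Continuing the network definition, the embedding layer takes the previous layer’s output as its first argument.

```python
# Embedding layer: 10,000 possible word IDs in, 128-dimensional vectors out.
net = tflearn.embedding(net, input_dim=10000, output_dim=128)
```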
Next, we’ll feed those values to our LSTM layer. This layer allows our network to remember data from the beginning of the sequences, which will improve our prediction. We’ll set dropout to 0.8; dropout is a technique that helps prevent overfitting by randomly turning on and off different pathways in our network.
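In code, the LSTM layer is another single line. The narration doesn’t mention the number of units; the 128 used here is the size from TFLearn’s stock IMDB example, so treat it as an assumption.

```python
# LSTM layer (128 units assumed); dropout helps prevent overfitting.
net = tflearn.lstm(net, 128, dropout=0.8)
```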
Our next layer is fully connected, which means that every neuron in the previous layer is connected to every neuron in this layer. We have a set of learned feature vectors from previous layers, and adding a fully connected layer is a computationally cheap way of learning non-linear combinations of them. It’s got two units, and it’s using the softmax function as its activation function. This will take in a vector of values and squash it into a vector of output probabilities between 0 and 1 that sum to 1.
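That fully connected output layer looks like this.

```python
# Fully connected layer: 2 units (negative / positive) with softmax activation.
net = tflearn.fully_connected(net, 2, activation='softmax')
```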
We’ll use those values in our last layer, which is our regression layer. This will apply a regression operation to the input. We’re going to specify an optimizer method that will minimize a given loss function, as well as the learning rate, which specifies how fast we want our network to train. The optimizer we’ll use is adam, which performs gradient descent, and categorical cross-entropy is our loss; it helps find the difference between our predicted output and the expected output.
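And the regression layer looks like this. The narration doesn’t give a learning rate value, so the 0.001 here is just the default from TFLearn’s example.

```python
# Regression layer: adam optimizer, categorical cross-entropy loss,
# learning rate of 0.001 (assumed; not stated in the narration).
net = tflearn.regression(net, optimizer='adam', learning_rate=0.001,
                         loss='categorical_crossentropy')
```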
After building our neural network, we can go ahead and initialize it using tflearn’s deep neural net (DNN) function. Then we can call our model’s fit function, which will launch the training process for our given training and validation data. We’ll also set show_metric to True so we can view the log of accuracy during training.
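Finally, initializing and training the model looks something like this. The batch size and the use of the test split as the validation set aren’t spelled out in the narration; they follow TFLearn’s stock example.

```python
# Wrap the network in a model and train it, logging accuracy as we go.
model = tflearn.DNN(net, tensorboard_verbose=0)
model.fit(trainX, trainY, validation_set=(testX, testY),
          show_metric=True, batch_size=32)
```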
So to demo this, we’re going to run it in the cloud using AWS. What we’ll do is use a prebuilt
Amazon Machine Image. This AMI can be used to launch an
instance, and it’s got every dependency we need built in, including TensorFlow,
CUDA, Lil Wayne’s deposition video. If we click on the orange
Continue button, we can select the type
of instance we want. I’ll go for the smallest because
I’m poor still, but ideally we’d use a larger instance with GPUs. Then we can accept
the terms in one click. Next, we go to our AWS console by
clicking this button, and after a while, our instance will start running. We can copy and
paste the public DNS into our browser, followed by the port we specified for access. For the password,
we’ll use the Instance ID. Now we’re in our instance environment, built with our AMI, and we can play with a Jupyter Notebook hosted on AWS. We’ll create a new notebook,
and paste our code in there. And now we can run it and
it will start training just like that. So to break it down, there are two
main approaches to sentiment analysis: using a lexicon of pre-recorded sentiment, or using state-of-the-art but more computationally expensive deep learning to learn generalized vector representations of words. A feedforward net accepts fixed-size inputs, like binary numbers, but recurrent neural nets help us learn from sequential data, like text. And you can use AWS with a pre-built AMI to easily train your models in the cloud
without dealing with dependency issues. The Coding Challenge Winner
from last week is Ludo Bouan. Ludo architected his neural net so
that stacking layers was as easy as a line of code per layer. Wizard of the Week. And the Runner Up is See Jie Xun, who accurately modified my code to
reflect multilayer back propagation. The Coding Challenge for this week is to
use tflearn to train a neural network to recognize sentiment from a video
game review data set that I’ll provide. Details are in the README; post your GitHub link in the comments, and
I’ll announce the winner in one week. Please click that Subscribe button. If you want to see more videos like
this, check out this related video. And, for now, I gotta figure out what
the F PyTorch is, so thanks for watching.
