Natural Language Processing

In this quick tutorial, we go over the basics of Natural Language Processing: what it is and a few of its key applications. Machines can’t simply read and interpret language innately like we humans can. So how can machines understand sarcasm, tell whether a sentence is posed as a question, or even just find the main topic and recurring themes in the words? If you think machines learning from numbers is interesting, then machines learning from words is even more interesting.

Welcome to this short introduction to natural language processing. If you think machines learning from numbers is interesting, machines learning from words is even more interesting. Natural language processing is how computer programs are able to make sense of words in their surrounding context. For example, you could write a computer program to pick up on sarcasm such as “That’s funny… not.” Or to understand “The world will end!” as an exclamation versus “The world will end?” as a question.

But machines can’t simply read and interpret language innately like humans can. So how can machines understand sarcasm, or whether a sentence is posed as a question, or even just find the main topic and recurring themes in the words? Well, they do this through the best means they can: calculations. If there’s one thing machines do very well, it’s calculations. And so calculations on words and textual features are what allow machines to determine whether a piece of text contains sarcasm, or is more negative than positive in sentiment, or contains more rhetoric than factual statements, or is on this topic versus that topic. Counting the frequency of words, taking into account the surrounding context, and then doing calculations is the basis of how machines make sense of natural language.
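To make the idea of "calculations on words" concrete, here is a minimal Python sketch. It is only an illustration and not the method used in the video: the function names `word_frequencies` and `sentence_type` are made up for this example, and judging a sentence by its final punctuation mark is a deliberately naive rule.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, pull out the words, and count each one."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


def sentence_type(sentence):
    """Naive rule (an assumption for illustration): look only at the last character."""
    stripped = sentence.strip()
    if stripped.endswith("?"):
        return "question"
    if stripped.endswith("!"):
        return "exclamation"
    return "statement"


review = "The service was slow. The food was cold, and the staff was rude."
print(word_frequencies(review).most_common(2))  # [('the', 3), ('was', 3)]
print(sentence_type("The world will end!"))     # exclamation
print(sentence_type("The world will end?"))     # question
```

Real systems go well beyond this, but the principle is the same: the text is turned into something countable, and decisions are made from those counts.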
So, in order to count or calculate words and textual features, the raw text, or natural language itself, first needs to be processed into a more structured data format that machines can work with. This basically means cleaning up the text and then organizing it into tables of word counts across documents. It could also mean tabulating pairs of words that occur together, to take into account the surrounding context of the words. Cleaning up raw text and organizing it into a table is an absolutely essential step in natural language processing. The word “processing” should be emphasized in natural language processing. Without processing, you’re just left with natural language, which, as mentioned, machines cannot easily interpret like you and I can. The text needs to be processed first, and then calculations can be done on it.
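The processing step described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: "cleaning" here only means lowercasing and dropping punctuation, and the helper names (`clean_and_tokenize`, `document_term_table`, `word_pair_counts`) are hypothetical, not from any particular library.

```python
import re
from collections import Counter


def clean_and_tokenize(text):
    """Lowercase the text and keep only runs of letters as word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_table(documents):
    """Return (vocabulary, rows), with one row of word counts per document."""
    tokenized = [clean_and_tokenize(doc) for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


def word_pair_counts(text):
    """Count adjacent word pairs, a simple way to keep some surrounding context."""
    tokens = clean_and_tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


docs = ["The food was good.", "The food was not good!", "Good service, good food."]
vocab, table = document_term_table(docs)
print(vocab)  # ['food', 'good', 'not', 'service', 'the', 'was']
for row in table:
    print(row)
# [1, 1, 0, 0, 1, 1]
# [1, 1, 1, 0, 1, 1]
# [1, 2, 0, 1, 0, 0]
print(word_pair_counts(docs[1])[("not", "good")])  # 1
```

Notice that the first two documents look almost identical in the word-count table even though they mean opposite things. The word-pair count for ("not", "good") is what captures the difference, which is why surrounding context matters.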
Some key applications of natural language processing are categorizing texts into negative or positive sentiment to automatically distinguish unsatisfied customers from satisfied customers; categorizing texts into topics to recommend articles on the same topic; answering user questions by retrieving relevant information in the documentation; and summarizing the most important information in lengthy documents. All of these rely on a machine’s ability to understand words and textual features. And that quickly sums up natural language processing. Thanks for watching. If you found this video useful, give us a like. Or you can check out our other videos at tutorials.datasciencedojo.com
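As a small companion to the sentiment application mentioned in the transcript, here is a toy Python sketch that labels a text by counting positive and negative words. The word lists are hypothetical and chosen only for illustration, and the approach is an assumption of this example, not the technique used in the video. It does not handle negation ("not good") or sarcasm.

```python
import re

# Hypothetical, tiny word lists for illustration only.
POSITIVE_WORDS = {"good", "great", "helpful", "love", "fast"}
NEGATIVE_WORDS = {"bad", "slow", "rude", "broken", "hate"}


def sentiment_label(text):
    """Label text by comparing counts of positive and negative words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    if positives > negatives:
        return "positive"
    if negatives > positives:
        return "negative"
    return "neutral"


print(sentiment_label("Great product, fast delivery."))   # positive
print(sentiment_label("The support was slow and rude."))  # negative
print(sentiment_label("It arrived on Tuesday."))          # neutral
```

A production system would typically learn which words signal sentiment from labeled examples rather than rely on a hand-written list.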


Rebecca Merrett
About The Author
- Rebecca holds a bachelor’s degree of information and media from the University of Technology Sydney and a postgraduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for games dev and has written for tech publications.
