NLP (Natural Language Processing) with NLTK

NLP (Natural Language Processing) is a subfield of AI (Artificial Intelligence) that deals with understanding, interpreting, and manipulating human language using computers (machines).

NLP enables computers to interact with humans in a natural manner. It helps the computer understand human language and derive meaning from it.


Applications of NLP:

  • Machine Translation.
  • Speech Recognition.
  • Sentiment Analysis.
  • Summarization of Text.
  • Chatbots.
  • Intelligent Systems.
  • Text Classification.
  • Character Recognition.
  • Spell Checking.
  • Spam Detection.
  • Autocomplete.
  • Named Entity Recognition.
  • Predictive Typing.


Rule-based NLP vs. Statistical NLP:

Natural Language Processing is commonly separated into two different approaches:

Rule-based Natural Language Processing:

This approach relies on hand-written rules and common sense knowledge for processing tasks: for instance, that freezing temperatures can lead to death, or that hot coffee can burn people’s skin. However, writing such rules takes a lot of time and requires manual effort.
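
As a rough sketch of the rule-based style, a hand-written chunk grammar can be applied with NLTK's RegexpParser. The grammar and the hand-tagged example sentence below are made up purely for illustration:

#A minimal rule-based sketch: a hand-written noun-phrase rule
#applied with NLTK's RegexpParser
import nltk

grammar = "NP: {<DT>?<JJ>*<NN>}"   #rule: optional determiner, adjectives, then a noun
parser = nltk.RegexpParser(grammar)

#a small hand-tagged sentence, made up for this example
tagged = [("the", "DT"), ("hot", "JJ"), ("coffee", "NN"), ("burns", "VBZ"), ("skin", "NN")]
print(parser.parse(tagged))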

Statistical Natural Language Processing:

This approach uses large amounts of data and tries to derive conclusions from it. Statistical NLP uses machine learning algorithms to train NLP models; after successful training on large amounts of data, the trained model can make good predictions on new text.
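
As a rough sketch of the statistical style, NLTK's NaiveBayesClassifier can be trained on labelled examples. The tiny training set below is made up for illustration; a real model would need far more data:

#A minimal statistical sketch: training a Naive Bayes classifier
import nltk

#tiny made-up training data as (text, label) pairs
train = [("I love this movie", "pos"),
         ("What a great day", "pos"),
         ("I hate waiting", "neg"),
         ("This is terrible", "neg")]

def features(sentence):
    #bag-of-words features: each lowercased word maps to True
    return {word.lower(): True for word in sentence.split()}

train_set = [(features(text), label) for text, label in train]
classifier = nltk.NaiveBayesClassifier.train(train_set)

print(classifier.classify(features("I love a great movie")))   #expected: 'pos'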


Components of Natural Language Processing (NLP):

a. Lexical Analysis:

With lexical analysis, we divide a whole chunk of text into paragraphs, sentences, and words. It involves identifying and analyzing the structure of words.
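
For example, NLTK's tokenizers can split raw text into sentences and words. This sketch assumes the 'punkt' tokenizer models have been downloaded, and the sample text is our own:

#Lexical analysis with NLTK: sentence and word tokenization
import nltk
nltk.download('punkt')              #tokenizer models, downloaded once

from nltk.tokenize import sent_tokenize, word_tokenize

text = "NLP is fun. It helps computers understand human language."
print(sent_tokenize(text))          #list of sentences
print(word_tokenize(text))          #list of words and punctuation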

b. Syntactic Analysis:

Syntactic analysis checks the words in a sentence for grammar and arranges them in a manner that shows the relationships among the words. For instance, the sentence “The shop goes to the house” does not pass.
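
A common first step of syntactic analysis in NLTK is part-of-speech tagging, which labels each word with its grammatical category. The sketch below assumes the 'punkt' and 'averaged_perceptron_tagger' packages have been downloaded, and the tags shown in the comment are only indicative:

#Syntactic analysis with NLTK: part-of-speech tagging
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

from nltk import word_tokenize, pos_tag

tokens = word_tokenize("The shop goes to the house")
print(pos_tag(tokens))
#e.g. [('The', 'DT'), ('shop', 'NN'), ('goes', 'VBZ'), ('to', 'TO'), ('the', 'DT'), ('house', 'NN')]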

c. Semantic Analysis:

Semantic analysis draws the exact meaning of the words and checks the text for meaningfulness. Phrases such as “hot ice-cream” do not pass.
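
NLTK does not perform full semantic analysis on its own, but it does ship semantic resources such as WordNet, which groups words into synsets with definitions. A small look-up sketch, assuming the 'wordnet' corpus has been downloaded:

#A semantic resource in NLTK: WordNet synsets for the word 'hot'
import nltk
nltk.download('wordnet')

from nltk.corpus import wordnet

for synset in wordnet.synsets('hot')[:3]:
    print(synset.name(), '-', synset.definition())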

d. Discourse Integration:

Discourse integration takes into account the context of the text: the meaning of a sentence depends on the sentences that come before it. For example, in “He works at Google.”, the word “he” must refer to someone mentioned in an earlier sentence.

e. Pragmatic Analysis:

Pragmatic analysis deals with the overall communication and interpretation of language. It is concerned with deriving the meaningful use of language in various situations.


NLTK (Natural Language Toolkit)

The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing for English, written in the Python programming language.


Features:

  • Tokenization
  • Part-of-speech Tagging
  • Named-entity Recognition
  • Sentiment Analysis

Here we have used the Jupyter Notebook IDE.


#import nltk package
import nltk

#To get the version of nltk
print(nltk.__version__)
Output : 3.4.4

#Download nltk corpora (opens the interactive NLTK Downloader)

nltk.download()

Output :
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
True
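
Instead of going through the interactive downloader, a specific package can also be fetched directly by name; for example, the names corpus used below:

#Download only the names corpus
nltk.download('names')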



#Load the names corpus and list the names it contains
from nltk.corpus import names

names.words()
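
As a small follow-up, the corpus reader's methods can show which files the corpus contains and how many names each holds. The file names in the comment are what the standard names corpus ships with:

#Inspect the names corpus
print(names.fileids())                       #e.g. ['female.txt', 'male.txt']
for fileid in names.fileids():
    print(fileid, len(names.words(fileid)))  #number of names per file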


Note:

A corpus is a large collection of written or spoken texts used for the development of NLP tools. The plural of corpus is corpora.
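
Putting together the features listed earlier (tokenization, part-of-speech tagging, named-entity recognition, and sentiment analysis), here is a rough end-to-end sketch. The example sentences are made up, and the listed data packages are downloaded first:

#A combined sketch of the NLTK features listed above
import nltk
for pkg in ['punkt', 'averaged_perceptron_tagger',
            'maxent_ne_chunker', 'words', 'vader_lexicon']:
    nltk.download(pkg)

from nltk import word_tokenize, pos_tag, ne_chunk
from nltk.sentiment import SentimentIntensityAnalyzer

sentence = "Silan Software conducts Python training in India."

tokens = word_tokenize(sentence)        #Tokenization
tagged = pos_tag(tokens)                #Part-of-speech Tagging
print(ne_chunk(tagged))                 #Named-entity Recognition

sia = SentimentIntensityAnalyzer()      #Sentiment Analysis (VADER)
print(sia.polarity_scores("NLTK makes NLP easy and fun!"))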


About the Author



Silan Software is one of India's leading providers of offline & online training for Java, Python, AI (Machine Learning, Deep Learning), Data Science, Software Development & many more emerging technologies.

We provide Academic Training || Industrial Training || Corporate Training || Internship || Java || Python || AI using Python || Data Science etc




