NLP (Natural Language Processing) is a subfield of AI (Artificial Intelligence) that deals with understanding, interpreting, and manipulating human language using computers (machines).
NLP enables computers to interact with humans in a natural manner: it helps a computer understand human language and derive meaning from it.
Rule-based NLP vs. Statistical NLP:
Natural Language Processing is commonly divided into two different approaches:
Rule-based Natural Language Processing:
It uses common-sense reasoning, encoded as hand-written rules, for processing tasks: for instance, that a freezing temperature can lead to death, or that hot coffee can burn a person's skin. However, writing such rules takes much time and requires manual effort.
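As a toy illustration of the rule-based idea, we can encode a few common-sense rules as hand-written patterns and apply them to a sentence. The rules below are illustrative assumptions made up for this sketch, not part of any library:

```python
import re

# Hand-written rules mapping text patterns to conclusions.
# The rules themselves are illustrative assumptions for this sketch.
RULES = [
    (re.compile(r"\bfreezing\b.*\btemperature\b"), "risk: hypothermia"),
    (re.compile(r"\bhot\b.*\bcoffee\b"), "risk: burn"),
]

def apply_rules(sentence):
    """Return the conclusion of every rule whose pattern matches."""
    return [conclusion for pattern, conclusion in RULES
            if pattern.search(sentence.lower())]

print(apply_rules("Spilling hot coffee can hurt."))  # ['risk: burn']
```

Every new fact the system should know requires another hand-written rule, which is exactly the manual effort mentioned above.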
Statistical Natural Language Processing:
It uses large amounts of data and tries to derive conclusions from them. Statistical NLP uses machine learning algorithms to train NLP models; after successful training on large amounts of data, the trained model can draw accurate conclusions from new text.
Components of Natural Language Processing (NLP):
a. Lexical Analysis:
With lexical analysis, we divide a whole chunk of text into paragraphs, sentences, and words. It also involves identifying and analyzing the structure of individual words.
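A rough sketch of this splitting can be done with plain Python and regular expressions. Note that the sentence rule below is deliberately naive; real tokenizers (such as the ones in NLTK, introduced later) handle abbreviations and many other edge cases:

```python
import re

text = "NLP is fun. It has many uses!\n\nIt also has many tools."

# Paragraphs: split on blank lines.
paragraphs = [p for p in text.split("\n\n") if p.strip()]

# Sentences: naive split after ., ! or ? followed by whitespace.
sentences = re.split(r"(?<=[.!?])\s+", text.replace("\n\n", " "))

# Words: runs of letters (punctuation discarded).
words = re.findall(r"[A-Za-z]+", text)

print(len(paragraphs), len(sentences), len(words))  # 2 3 12
```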
b. Syntactic Analysis:
Syntactic analysis checks the words in a sentence for grammar and arranges them in a manner that shows the relationships among them. For instance, the sentence “The shop goes to the house” does not pass.
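The idea of checking word order can be sketched with a tiny hand-made lexicon and a single allowed sentence pattern. Both the lexicon and the pattern are illustrative assumptions; real syntactic parsers use full grammars rather than one fixed template:

```python
# Tiny hand-made part-of-speech lexicon (an illustrative assumption).
LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "house": "NOUN",
    "runs": "VERB", "goes": "VERB",
    "to": "PREP",
}

def tag(sentence):
    """Look up a part-of-speech tag for each word (UNK if unknown)."""
    return [LEXICON.get(w, "UNK") for w in sentence.lower().split()]

def is_grammatical(sentence):
    """Accept only the pattern DET NOUN VERB [PREP DET NOUN]."""
    tags = tag(sentence)
    return tags[:3] == ["DET", "NOUN", "VERB"] and (
        len(tags) == 3 or tags[3:] == ["PREP", "DET", "NOUN"])

print(is_grammatical("The dog runs"))           # True
print(is_grammatical("Dog the runs to house"))  # False
```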
c. Semantic Analysis:
Semantic analysis draws the exact meaning of the words and checks the text for meaningfulness. Phrases such as “hot ice-cream” do not pass.
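The “hot ice-cream” check can be mimicked with a tiny table of known word properties: a phrase fails when an adjective contradicts a property of the noun. The property table here is an illustrative assumption, not a real knowledge base:

```python
# Known properties of a few nouns -- an illustrative assumption.
PROPERTIES = {
    "ice-cream": {"temperature": "cold"},
    "coffee": {"temperature": "hot"},
}
# What each adjective asserts about its noun.
ADJECTIVES = {"hot": ("temperature", "hot"), "cold": ("temperature", "cold")}

def is_meaningful(adjective, noun):
    """Reject the phrase if the adjective contradicts a known property."""
    attr, value = ADJECTIVES[adjective]
    known = PROPERTIES.get(noun, {}).get(attr)
    return known is None or known == value

print(is_meaningful("hot", "ice-cream"))  # False
print(is_meaningful("hot", "coffee"))     # True
```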
d. Discourse Integration:
Discourse integration takes into account the context of the text: the meaning of a sentence can depend on the sentences that come before it. For example, in “He works at Google.”, the pronoun “he” must refer to a person mentioned in an earlier sentence.
e. Pragmatic Analysis:
Pragmatic analysis deals with overall communication and interpretation of language. It deals with deriving meaningful use of language in various situations.
The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing for English, written in the Python programming language.
Features:
Tokenization
Part-of-speech Tagging
Named-entity Recognition
Sentiment Analysis
Here we use the Jupyter Notebook IDE.
#import nltk package
import nltk
#print the installed version of nltk
print(nltk.__version__)
Output: 3.4.4 (your installed version may differ)
#open the NLTK downloader to fetch corpora and models
nltk.download()
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
True
#list the names in the built-in "names" corpus
from nltk.corpus import names
names.words()
Note:
A corpus is a large collection of written texts; the plural of corpus is corpora. A corpus is a body of written or spoken text used for the development of NLP tools.