Last Updated on November 7, 2020
Natural language processing (NLP) is a branch of artificial intelligence that helps computers understand and interpret human language. The field dates back to the 1950s: in his 1950 article “Computing Machinery and Intelligence”, Alan Turing proposed what we know today as the Turing test. Many of the devices and applications we use are built with NLP technology, such as the voice assistant in Chrome, Microsoft's Cortana, Siri, Alexa, chatbots, and Google Translate. These tools have made our lives much easier. For example, there is no need to learn multiple languages, because NLP can translate the language for you.
NLP is divided into two major components:
1. NLU (Natural language understanding)
NLU maps the given input in natural language (the language we speak or write) into a useful representation and analyzes the different aspects of the language. (See the processes described under “How does NLP work?”)
2. NLG (Natural language generation)
After NLU has understood the input, NLG produces meaningful phrases and sentences from the representation that NLU created. It involves:
- Text planning: It selects the relevant content from the representation.
- Sentence planning: It chooses the required words and forms meaningful phrases.
- Text realisation: Finally, it maps the sentence plan onto the full sentence structure.
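The three NLG steps above can be sketched in a few lines of Python. This is only a toy, template-based illustration and not how production NLG systems work. The dictionary representation, the function names (`plan_text`, `plan_sentence`, `realise_text`, `generate`) and the tiny `PAST_TENSE` table are all assumptions made up for this example.

```python
# Toy past-tense table; an assumption for this sketch, not a real resource.
PAST_TENSE = {"go": "went", "buy": "bought"}


def plan_text(representation):
    """Text planning: keep only the content relevant to the message."""
    relevant = ("agent", "action", "destination")
    return {key: representation[key] for key in relevant if key in representation}


def plan_sentence(content):
    """Sentence planning: choose the words and group them into phrases."""
    verb = PAST_TENSE.get(content["action"], content["action"] + "ed")
    return [content["agent"], verb, "to the " + content["destination"]]


def realise_text(phrases):
    """Text realisation: assemble the phrases into a full sentence."""
    sentence = " ".join(phrases)
    return sentence[0].upper() + sentence[1:] + "."


def generate(representation):
    """Run the three steps in order."""
    return realise_text(plan_sentence(plan_text(representation)))


# Hypothetical machine-readable meaning; "confidence" is irrelevant to the text.
meaning = {"agent": "Rahul", "action": "go", "destination": "market", "confidence": 0.93}
print(generate(meaning))  # Rahul went to the market.
```

Notice that text planning drops the `confidence` field, sentence planning picks “went” instead of “go”, and realisation adds the capital letter and the full stop.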
NLU is generally considered harder than NLG, because understanding a language takes a lot of time and processing, especially for machines.
How does NLP work?
NLP works through the following processes:
1. Tokenization
Tokenization divides the given sentence into words (tokens). Input: Students are going to school. Output: [‘Students’, ‘are’, ‘going’, ‘to’, ‘school’]
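As a minimal sketch, the example above can be reproduced with Python's built-in `re` module. The function name `tokenize` is just a name chosen for this illustration, and the pattern simply drops punctuation. Real tokenizers, such as NLTK's `word_tokenize`, handle many more cases and usually keep punctuation as separate tokens.

```python
import re


def tokenize(sentence):
    """Split a sentence into word tokens, dropping punctuation."""
    # \w+ matches runs of letters, digits and underscores.
    return re.findall(r"\w+", sentence)


print(tokenize("Students are going to school."))
# ['Students', 'are', 'going', 'to', 'school']
```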
2. Stemming
Stemming finds the root of a word by removing the end or the beginning of the word (its suffix or prefix). For example, the root ‘go’ has variations such as ‘gone’, ‘going’, and ‘goes’. The problem with stemming is that it does not always succeed.
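Here is a rough sketch of the idea: a toy stemmer that strips a few common suffixes. The suffix list and the name `toy_stem` are assumptions for this example only. Real stemmers, such as the Porter stemmer (available in NLTK as `PorterStemmer`), use far more rules.

```python
# A handful of suffixes, checked in this order (an assumption for this sketch).
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip the first matching suffix from the word."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least two characters so that "is" does not become "i".
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


for w in ["going", "goes", "worked", "gone", "went"]:
    print(w, "->", toy_stem(w))
# going -> go
# goes -> go
# worked -> work
# gone -> gone
# went -> went
```

The last two lines show where stemming fails: ‘gone’ and ‘went’ have no suffix to strip, so they are never reduced to ‘go’.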
3. Lemmatization
Stemming and lemmatization play almost the same role, but they work in different ways. To find the root of a word, stemming removes the beginning or the end of the word, as in going –> go and worked –> work. For an irregular form such as ‘went’, this does not work. Lemmatization handles these cases by mapping the word to its dictionary form (its lemma). For example: went –> go, drunk –> drink, brought –> bring.
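The difference can be illustrated with a small dictionary lookup. This is only a sketch: the `IRREGULAR_LEMMAS` table is hand-written for this example, and `toy_lemmatize` is a made-up name. Real lemmatizers, such as NLTK's `WordNetLemmatizer`, rely on a full vocabulary and on the word's part of speech.

```python
# Tiny hand-written table of irregular forms (an assumption for this sketch).
IRREGULAR_LEMMAS = {"went": "go", "gone": "go", "drunk": "drink", "brought": "bring"}


def toy_lemmatize(word):
    """Look the word up first; fall back to suffix stripping for regular forms."""
    word = word.lower()
    if word in IRREGULAR_LEMMAS:
        return IRREGULAR_LEMMAS[word]
    for suffix in ("ing", "ed"):
        # Keep at least two characters of the word.
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: -len(suffix)]
    return word


for w in ["went", "drunk", "brought", "going", "worked"]:
    print(w, "->", toy_lemmatize(w))
# went -> go
# drunk -> drink
# brought -> bring
# going -> go
# worked -> work
```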
4. POS Tags
POS stands for part of speech. POS tagging identifies the part of speech of each word. Example: Input: Rahul went to market. Output: [(‘Rahul’, ‘NNP’), (‘went’, ‘VBD’), (‘to’, ‘TO’), (‘market’, ‘NN’)]. Here NNP refers to a singular proper noun, VBD refers to a past-tense verb, TO marks the word “to”, and NN refers to a singular common noun.
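To show the shape of the input and output, here is a toy lookup tagger. It is a sketch under strong assumptions: the `LEXICON` table holds only the words from the example, and unknown words are guessed from capitalisation. Real taggers, such as NLTK's `pos_tag`, are trained on large annotated corpora and use the surrounding context.

```python
# Hand-written word-to-tag table (an assumption for this sketch).
LEXICON = {"went": "VBD", "to": "TO", "market": "NN"}


def toy_pos_tag(tokens):
    """Tag each token by lookup, with simple guesses for unknown words."""
    tagged = []
    for token in tokens:
        if token.lower() in LEXICON:
            tag = LEXICON[token.lower()]
        elif token[:1].isupper():
            tag = "NNP"  # unknown capitalised word: guess a proper noun
        else:
            tag = "NN"  # default guess: singular common noun
        tagged.append((token, tag))
    return tagged


print(toy_pos_tag(["Rahul", "went", "to", "market"]))
# [('Rahul', 'NNP'), ('went', 'VBD'), ('to', 'TO'), ('market', 'NN')]
```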
5. Named Entity Recognition
NER detects the named entities in a text and classifies each one: is it the name of a person, a company, a location, or a quantity? Context matters here. For example, in “Google something on the Internet”, Google is a verb, not a company. Similarly, in “Bat flies at night”, bat is a mammal, not the bat used in cricket. Named entity recognition uses this kind of context to decide whether a word is a named entity and, if so, of which type.
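One very simple way to sketch this is a lookup list of known names (a gazetteer) combined with POS tags, so that a word only counts as an entity when it is used as a proper noun. Everything here is an assumption for illustration: the `GAZETTEER` entries, the `toy_ner` name, and the hand-supplied tags. Real NER systems are trained statistical models.

```python
# Hand-written list of known names and their types (an assumption for this sketch).
GAZETTEER = {"Google": "ORGANIZATION", "Rahul": "PERSON", "Delhi": "LOCATION"}


def toy_ner(tagged_tokens):
    """Return (word, entity_type) pairs for proper nouns found in the gazetteer."""
    entities = []
    for word, tag in tagged_tokens:
        # Only proper nouns count, so "Google" used as a verb is skipped.
        if tag == "NNP" and word in GAZETTEER:
            entities.append((word, GAZETTEER[word]))
    return entities


verb_use = [("Google", "VB"), ("something", "NN"), ("on", "IN"), ("the", "DT"), ("Internet", "NN")]
name_use = [("Rahul", "NNP"), ("works", "VBZ"), ("at", "IN"), ("Google", "NNP")]

print(toy_ner(verb_use))  # []
print(toy_ner(name_use))  # [('Rahul', 'PERSON'), ('Google', 'ORGANIZATION')]
```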
6. Chunking
Chunking adds more structure to the sentence. It takes individual words and groups them into phrases, such as noun phrases, based on their parts of speech. For example, in “Farmers killed the snake”, chunking groups “the snake” into a single noun phrase. This structure helps in getting insights and meaningful information.
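As a minimal sketch, noun-phrase chunking can be imitated by grouping consecutive determiners, adjectives, and nouns in an already tagged sentence. The tag set in `NP_TAGS`, the function name `noun_phrase_chunks`, and the hand-supplied tags are assumptions for this example. NLTK offers a more general, grammar-based chunker called `RegexpParser`.

```python
# Tags allowed inside a noun phrase in this sketch (an assumption).
NP_TAGS = {"DT", "JJ", "NN", "NNS", "NNP", "NNPS"}


def noun_phrase_chunks(tagged_tokens):
    """Group consecutive determiner/adjective/noun tokens into noun phrases."""
    chunks, current = [], []
    for word, tag in tagged_tokens:
        if tag in NP_TAGS:
            current.append(word)
        elif current:
            # A non-noun-phrase tag closes the chunk built so far.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks


tagged = [("Farmers", "NNS"), ("killed", "VBD"), ("the", "DT"), ("snake", "NN")]
print(noun_phrase_chunks(tagged))
# ['Farmers', 'the snake']
```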
Which programming language is suitable for NLP?
Python is the leading programming language for AI and machine learning. It has a library called NLTK (Natural Language Toolkit) for natural language processing. Computer vision and NLP are two of the leading technologies of AI. For a machine, computer vision is the equivalent of eyes, and NLP is the equivalent of ears and a mouth.
Running NLP and other machine learning technologies can require a computer with a high-end configuration. Many people think that all the great ideas have already been taken by Galileo, Einstein, and the rest. But AI still has openings for several full-time Einsteins and Edisons.
I hope you like the article.
If you have any questions, please mention them in the comment box.