PREDICTING THE REVIEWS OF THE RESTAURANT USING NATURAL LANGUAGE PROCESSING TECHNIQUE python project






ABSTRACT
In the era of the web, a huge amount of information flows over the network. Because web content covers subjective opinion as well as objective information, people now commonly gather information online about the products and services they want to buy. However, a considerable amount of this information exists as text fragments without any numerical scale, so it is hard to classify the evaluations they express efficiently without reading the full text. Here we focus on extracting scored ratings from text fragments on the web and suggest various experiments to improve the quality of a classifier.

EXISTING SYSTEM
Many researchers have previously run experiments to classify customer sentiment on different datasets. Turney (2002) used a semantic orientation algorithm to classify reviews based on the number of positively oriented and negatively oriented phrases in each review. Pang et al. (2002) used machine learning tools such as Naïve Bayes, Maximum Entropy and Support Vector Machine (SVM) classifiers to classify movie reviews using a number of simple textual features.

DISADVANTAGES

·        These approaches handle only binary classification (positive or negative), which is not the case with restaurant reviews rated from 1 to 5 stars.

·        From a practical point of view, perhaps the most serious problem with SVMs is the high algorithmic complexity and extensive memory requirements of the quadratic programming they require in large-scale tasks.

·        If a categorical variable has a category in the test data set that was not observed in the training data set, a Naïve Bayes model will assign it a probability of 0 (zero) and will be unable to make a prediction. This is often known as the “Zero Frequency” problem.

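
The zero-frequency issue in the last point can be illustrated with a minimal sketch. The word lists, the helper name word_likelihood, and the use of add-one (Laplace) smoothing as a remedy are assumptions made for illustration; the source does not describe any smoothing.

```python
from collections import Counter

def word_likelihood(word, class_tokens, vocabulary, alpha=0.0):
    """Estimate P(word | class); alpha > 0 applies add-alpha (Laplace) smoothing."""
    counts = Counter(class_tokens)
    return (counts[word] + alpha) / (len(class_tokens) + alpha * len(vocabulary))

# Hypothetical training tokens for a "positive" class and a toy vocabulary.
positive_tokens = ["good", "tasty", "good", "friendly"]
vocabulary = {"good", "tasty", "friendly", "bland", "superb"}

# "superb" never occurs in the training tokens, so the raw estimate is zero,
# which would zero out the whole product of word likelihoods for a review.
unsmoothed = word_likelihood("superb", positive_tokens, vocabulary)

# With add-one smoothing the unseen word gets a small non-zero probability.
smoothed = word_likelihood("superb", positive_tokens, vocabulary, alpha=1.0)
```

Because Naïve Bayes multiplies the per-word likelihoods, a single zero makes the class score zero regardless of the other words in the review.
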
PROPOSED SYSTEM

Our proposed system applies natural language processing techniques to classify a set of restaurant reviews based on the number of stars that each review received. We develop a maximum entropy classifier to categorize each review from 1 star to 5 stars. We implement a set of features that we believe to be relevant to the sentiment expressed in reviews and analyze their effect on performance, providing insights into what works and why sentiment categorization can be so difficult. We analyze how a review’s conformance to a particular language model can be affected by the sentiment of the review. We experiment with different linguistically motivated models of sentiment expression, again using the results to improve the performance of our classifier. We also examine the effects of part-of-speech tagging on our ability to predict sentiment.

We experimented with different methods of preprocessing the data. Because the reviews are unstructured in terms of user input, a review can look like anything from a paragraph of well-formatted text to a jumble of seemingly unrelated words to a run-on sentence with no apparent regard for grammar or punctuation. Our initial pass over the data simply tokenized the reviews on whitespace and treated each token as a unigram, but we were able to improve performance by also removing punctuation and converting all letters to lowercase. In this way, we treat the occurrences of “good”, “Good”, and “good.” as the same token, which gives better predictive power for any test set review containing any of these three forms.

Before converting to unigrams, we also applied stemming, which reduces the various inflected forms of a word (for example, different tenses) to a single word. After the matrix is built, infrequent words are removed by setting a threshold in order to improve accuracy. Our matrix therefore includes the relevant unigrams and bigrams that occur more than the threshold number of times.
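
The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, the sample reviews, and the threshold value are hypothetical, and crude_stem is a toy suffix stripper standing in for a real stemmer such as the Porter stemmer.

```python
import string
from collections import Counter

def crude_stem(token):
    """Toy suffix stripper; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(review):
    """Lowercase, remove ASCII punctuation, split on whitespace, then stem."""
    cleaned = review.lower().translate(str.maketrans("", "", string.punctuation))
    return [crude_stem(tok) for tok in cleaned.split()]

def ngrams(tokens):
    """Return the unigrams followed by the bigrams (joined with a space)."""
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams

def build_matrix(reviews, threshold=1):
    """Count n-grams per review, keeping those seen more than `threshold` times overall."""
    per_review = [Counter(ngrams(preprocess(r))) for r in reviews]
    totals = Counter()
    for counts in per_review:
        totals.update(counts)
    vocab = sorted(term for term, n in totals.items() if n > threshold)
    matrix = [[counts[term] for term in vocab] for counts in per_review]
    return vocab, matrix

# Hypothetical sample reviews.
reviews = ["Good food.", "good service, Good prices", "The food was not good"]
vocab, matrix = build_matrix(reviews)
```

Each row of matrix corresponds to one review and each column to one retained unigram or bigram. On a realistic corpus the threshold would be tuned rather than fixed at 1.
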




ADVANTAGES

·        Good at pattern recognition problems
·        Data-driven, and performance is high in many problems
·        End-to-End training: little or no domain knowledge is needed in system construction
·        Learning of representations: cross-modal processing is possible
·        Gradient-based learning: learning algorithm is simple
·        Mainly supervised learning methods

ARCHITECTURE



ALGORITHMS

NATURAL LANGUAGE PROCESSING

Natural Language Processing (NLP) is a sub-field of Artificial Intelligence focused on enabling computers to understand and process human languages, bringing computers closer to a human-level understanding of language. Recent advances in Machine Learning (ML) have enabled computers to do many useful things with natural language. Deep Learning has made it possible to write programs for tasks such as language translation, semantic understanding, and text summarization.

Since text is the most unstructured form of all the available data, various types of noise are present in it, and the data is not readily analysable without pre-processing. The entire process of cleaning and standardizing text, making it noise-free and ready for analysis, is known as text pre-processing.

To analyse pre-processed data, it needs to be converted into features. Depending on the usage, text features can be constructed using assorted techniques: syntactical parsing, entities / N-grams / word-based features, statistical features, and word embeddings.

Text classification is a technique to systematically classify a text object (document or sentence) into one of a fixed set of categories. It is especially helpful when the amount of data is very large, for organizing, information filtering, and storage purposes.
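
As an illustration of text classification with the maximum entropy classifier named in the proposed system, the sketch below trains a multinomial logistic regression (the same model as a maximum entropy classifier) by batch gradient descent in plain Python. Everything specific here is an assumption: the function names, the learning rate and epoch count, and the toy data, whose three feature columns stand for hypothetical counts of "bad", "okay", and "great" and whose three classes stand for 1, 3, and 5 stars rather than the full 1 to 5 range.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, b, features):
    """Class probabilities for one feature vector."""
    scores = [sum(w * v for w, v in zip(W[k], features)) + b[k]
              for k in range(len(W))]
    return softmax(scores)

def predict(W, b, features):
    """Index of the most probable class."""
    probs = predict_proba(W, b, features)
    return probs.index(max(probs))

def train_maxent(X, y, n_classes, lr=0.5, epochs=200):
    """Fit weights by batch gradient descent on the average negative log-likelihood."""
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for features, label in zip(X, y):
            probs = predict_proba(W, b, features)
            for k in range(n_classes):
                # Gradient of the log-loss: predicted probability minus one-hot target.
                error = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += error
                for j, value in enumerate(features):
                    grad_W[k][j] += error * value
        for k in range(n_classes):
            b[k] -= lr * grad_b[k] / len(X)
            for j in range(n_features):
                W[k][j] -= lr * grad_W[k][j] / len(X)
    return W, b

# Toy data (assumption): counts of "bad", "okay", "great" per review.
STAR_LABELS = [1, 3, 5]
X = [[2, 0, 0], [1, 0, 0], [0, 2, 0], [0, 1, 0], [0, 0, 2], [0, 0, 1]]
y = [0, 0, 1, 1, 2, 2]  # indices into STAR_LABELS

W, b = train_maxent(X, y, n_classes=3)
predicted_stars = [STAR_LABELS[predict(W, b, x)] for x in X]
new_review_stars = STAR_LABELS[predict(W, b, [0, 0, 3])]
```

In practice the feature vectors would come from the unigram and bigram matrix rather than hand-written counts, and a library implementation would normally replace this hand-rolled training loop.
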

SYSTEM REQUIREMENTS

SOFTWARE REQUIREMENTS:
·        OS                                :       Windows
·        Python                            :       2.7.x and above
·        IDE                               :       PyCharm
·        setuptools and pip to be installed for Python 3.6.x and above




HARDWARE REQUIREMENTS:
·        RAM                               :      4 GB and higher
·        Processor                         :      Intel i3 and above
·        Hard Disk                         :      500 GB minimum






