This study leverages the syntactic, semantic and contextual features of online hotel and restaurant reviews to extract information aspects and summarize them into meaningful feature groups. We design a set of syntactic rules to extract aspects and their descriptors. We then test the precision of a modified algorithm for clustering aspects into closely related feature groups on a dataset provided by Yelp.com. Our method combines three semantic similarity measures (distributional similarity, co-occurrence, and knowledge-base similarity) and performs better than two state-of-the-art approaches. We show that opinion words, and the context they provide, are good features for measuring the semantic similarity and relatedness of the product features they describe. Our approach successfully generates thematic aspect groups for food quality, décor and service quality.
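As a rough illustration of how several similarity measures might be combined to group aspects, the sketch below averages three per-pair scores and merges aspects whose combined score passes a threshold. It is not the modified clustering algorithm evaluated in this study: the function names, the equal weights, the 0.5 threshold, the single-link merging strategy, and the toy scores are all assumptions made for the example.

```python
from itertools import combinations


def combined_similarity(scores, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of (distributional, co-occurrence, knowledge-base) scores.

    Equal weights are an assumption of this sketch, not a value from the study.
    """
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def cluster_aspects(aspects, pair_scores, threshold=0.5):
    """Single-link grouping: merge aspects whose combined score >= threshold."""
    parent = {a: a for a in aspects}

    def find(a):
        # Follow parent links to the group representative, compressing the path.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in combinations(aspects, 2):
        # Scores are stored once per unordered pair, so check both orders.
        scores = pair_scores.get((a, b)) or pair_scores.get((b, a))
        if scores is not None and combined_similarity(scores) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for a in aspects:
        groups.setdefault(find(a), []).append(a)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical aspects and made-up similarity scores, for illustration only.
aspects = ["pizza", "pasta", "waiter", "staff", "lighting"]
pair_scores = {
    ("pizza", "pasta"): (0.8, 0.7, 0.9),
    ("waiter", "staff"): (0.7, 0.6, 0.8),
    ("pizza", "waiter"): (0.2, 0.4, 0.1),
}
groups = cluster_aspects(aspects, pair_scores)
print(groups)
```

With these toy scores, the food terms and the service terms each form a group, and "lighting" stays on its own because no pair involving it has a score. In practice the three scores would come from a distributional model, corpus co-occurrence counts, and a knowledge base, respectively.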