Social networks and media are becoming increasingly important sources for learning people's opinions and sentiments on a wide variety of topics. The huge number of messages published daily in these media makes it impractical to analyze them without the help of natural language processing systems. This article presents an approach for clustering texts by similarity and identifying the sentiments expressed in the comments on them (positive, negative, and neutral, among others) in an integrated manner. Unlike most available studies, which focus on the English language and use Twitter as a data source, we address Brazilian Portuguese posts and comments published on Facebook. The proposed approach employs an unsupervised learning algorithm to group posts and a supervised algorithm to identify the sentiments expressed in comments on those posts. In an experimental evaluation, a system implementing the proposed approach achieved accuracy similar to that of human evaluators in the clustering and sentiment analysis tasks, while completing them in much less time.
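The two-stage idea described above (unsupervised grouping of posts, then supervised sentiment classification of comments) can be illustrated with a minimal sketch. The abstract does not name the algorithms used, so everything here is an assumption made for illustration: a greedy single-pass cosine-similarity clustering stands in for the unsupervised step, and a multinomial Naive Bayes classifier with add-one smoothing stands in for the supervised step. The function and class names, the `threshold` value, and the toy Portuguese examples are all hypothetical and are not taken from the article.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (accents preserved)."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_posts(posts, threshold=0.3):
    """Unsupervised step (illustrative stand-in): greedy single-pass clustering.

    Each post joins the first cluster whose first member (the "leader") is
    similar enough; otherwise it starts a new cluster.
    """
    clusters = []  # list of (leader_vector, [post indices])
    for i, post in enumerate(posts):
        vec = Counter(tokenize(post))
        for leader, members in clusters:
            if cosine(vec, leader) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


class NaiveBayesSentiment:
    """Supervised step (illustrative stand-in): multinomial Naive Bayes."""

    def fit(self, comments, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for comment, label in zip(comments, labels):
            self.word_counts[label].update(tokenize(comment))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, comment):
        tokens = tokenize(comment)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            # Add-one smoothing keeps unseen words from zeroing the score.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for this sketch.
posts = [
    "Novo hospital inaugurado na cidade",
    "Hospital da cidade recebe novos equipamentos",
    "Time vence o campeonato estadual",
]
print(cluster_posts(posts))  # [[0, 1], [2]]

model = NaiveBayesSentiment().fit(
    ["adorei a notícia", "muito bom parabéns", "péssimo serviço",
     "que horrível uma vergonha", "alguém sabe o horário"],
    ["positive", "positive", "negative", "negative", "neutral"],
)
print(model.predict("parabéns muito bom"))  # positive
```

In a realistic setting the classifier would be trained on a much larger labeled corpus of comments, and the clustering step would likely use weighted term vectors rather than raw counts; the sketch only shows how the two steps fit together.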
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyrights for articles published in IJIER journals are retained by the authors, with first publication rights granted to the journal. The journal/publisher is not responsible for subsequent uses of the work. It is the author's responsibility to bring an infringement action if so desired by the author.