Defining and Evaluating Classification Algorithm for High-Dimensional Data Based on Latent Topics

Author

Le Luo
Li Li

Abstract

Automatic text categorization is one of the key techniques in information retrieval and data mining. Classification is usually time-consuming when the training dataset is large and high-dimensional. Many methods have been proposed to solve this problem, but few achieve satisfactory efficiency. In this paper, we present a method that combines the Latent Dirichlet Allocation (LDA) algorithm with the Support Vector Machine (SVM). LDA is first used to generate a reduced-dimensional representation in which topics serve as the features of the vector space model (VSM). This reduces the number of features dramatically while keeping the necessary semantic information. The SVM is then employed to classify the data based on the generated features. We evaluate the algorithm on the 20 Newsgroups and Reuters-21578 datasets. The experimental results show that classification based on our proposed LDA+SVM model achieves high performance in terms of precision, recall and F1 measure, and does so in a much shorter time. Our process improves greatly upon previous work in this field and shows strong potential to streamline classification for a wide range of applications.
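The two-stage idea in the abstract (LDA topic proportions as low-dimensional features, then an SVM on those features) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn is available, uses a six-document toy corpus in place of 20 Newsgroups or Reuters-21578, and picks the topic count, the linear kernel and the random seed arbitrarily, since the abstract does not state them.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Toy corpus (an assumption for illustration; not the paper's data).
docs = [
    "the rocket launch reached orbit around the moon",
    "nasa plans a new satellite and rocket mission",
    "astronauts on the space station study the orbit",
    "the team won the hockey game in overtime",
    "the goalie made a save and the team scored",
    "hockey players skate fast during the game",
]
labels = ["space", "space", "space", "sport", "sport", "sport"]

N_TOPICS = 2  # assumed; the abstract does not give the number of topics

# Step 1: bag-of-words counts, the high-dimensional vector space model.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(docs)

# Step 2: LDA maps each document to a distribution over N_TOPICS topics,
# which replaces the word counts as the feature vector.
lda = LatentDirichletAllocation(n_components=N_TOPICS, random_state=0)
topic_features = lda.fit_transform(counts)

# Step 3: an SVM is trained on the topic features (linear kernel assumed).
classifier = SVC(kernel="linear")
classifier.fit(topic_features, labels)
predictions = classifier.predict(topic_features)

print("vocabulary size:", counts.shape[1])
print("topic feature size:", topic_features.shape[1])
print("predictions:", list(predictions))
```

On a real corpus the documents would be split into training and test sets, with the vectorizer and LDA model fitted on the training portion only and then applied to the test portion before computing precision, recall and F1.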

Suggested Citation

Le Luo & Li Li, 2014. "Defining and Evaluating Classification Algorithm for High-Dimensional Data Based on Latent Topics," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-9, January.

Handle: RePEc:plo:pone00:0082119
DOI: 10.1371/journal.pone.0082119

Citations

Cited by:

Chao Wei & Senlin Luo & Xincheng Ma & Hao Ren & Ji Zhang & Limin Pan, 2016. "Locally Embedding Autoencoders: A Semi-Supervised Manifold Learning Approach of Document Representation," PLOS ONE, Public Library of Science, vol. 11(1), pages 1-20, January.
Charemza, Wojciech & Makarova, Svetlana & Rybiński, Krzysztof, 2022. "Economic uncertainty and natural language processing; The case of Russia," Economic Analysis and Policy, Elsevier, vol. 73(C), pages 546-562.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Mohammad Hakkak & Hojjat Vahdati & Seyyed Hadi Mousavi Nejad, 2015. "Study the Role of Customer-Based Brand Equity in the Brand Personality Effect on Purchase Intention," International Journal of Asian Social Science, Asian Economic and Social Society, vol. 5(7), pages 369-381, July.
Thomas Clauss & Thomas Niemand & Sascha Kraus & Patrick Schnetzer & Alexander Brem, 2019. "Increasing Crowdfunding Success Through Social Media: The Importance Of Reach And Utilisation In Reward-Based Crowdfunding," International Journal of Innovation Management (ijim), World Scientific Publishing Co. Pte. Ltd., vol. 24(03), pages 1-30, May.
Hofman-Kohlmeyer Magdalena, 2021. "Brand-Related User-Generated Content in Simulation Video Games: Qualitative Research Among Polish Players," Journal of Management and Business Administration. Central Europe, Sciendo, vol. 29(1), pages 61-87, March.
Riccardo Rialti & Lamberto Zollo & Maria Carmen Laudano & Cristiano Ciappei, 2018. "Social media brand communities and brand value co-creation: Evidences from Italy," MERCATI & COMPETITIVITÀ, FrancoAngeli Editore, vol. 2018(3), pages 111-133.
Charitha Harshani Perera & Rajkishore Nayak & Long Thang Van Nguyen, 2019. "Role of social word-of-mouth on emotional brand attachment and brand choice intention: A study on private educational institutes in Vietnam," Proceedings of Business and Management Conferences 8611115, International Institute of Social and Economic Sciences.
Anna Lou Abatayo & John Lynham & Katerina Sherstyuk, 2020. "Communication, Expectations, and Trust: An Experiment with Three Media," Games, MDPI, vol. 11(4), pages 1-26, October.
- Anna Lou Abatayo & John Lynham & Katerina Sherstyuk, 2020. "Communication, Expectations and Trust: an Experiment with Three Media," Working Papers 202021, University of Hawaii at Manoa, Department of Economics.
Alberto Lopez & Eva Guerra & Beatriz Gonzalez & Sergio Madero, 2020. "Consumer sentiments toward brands: the interaction effect between brand personality and sentiments on electronic word of mouth," Journal of Marketing Analytics, Palgrave Macmillan, vol. 8(4), pages 203-223, December.
Widmar, Nicole Olynk & Bir, Courtney & Clifford, McKenna & Slipchenko, Natalya, 2020. "Social media sentiment as an additional performance measure? Examples from iconic theme park destinations," Journal of Retailing and Consumer Services, Elsevier, vol. 56(C).
Saeed M.Z.A. Tarabieh, 2017. "The Synergistic Impact of Social Media and Traditional Media on Purchase Decisions: The Mediating Role of Brand Loyalty," International Review of Management and Marketing, Econjournals, vol. 7(5), pages 51-62.
Janarthanan Balakrishnan & Pantea Foroudi, 2020. "Does Corporate Reputation Matter? Role of Social Media in Consumer Intention to Purchase Innovative Food Product," Corporate Reputation Review, Palgrave Macmillan, vol. 23(3), pages 181-200, August.
Agnieszka Izabela Baruk & Grzegorz Wesołowski, 2021. "The Effect of Using Social Media in the Modern Marketing Communication on the Shaping an External Employer’s Image," Energies, MDPI, vol. 14(14), pages 1-23, July.
Jalal Rajeh Hanaysha, 2021. "Impact of Price Promotion, Corporate Social Responsibility, and Social Media Marketing on Word of Mouth," Business Perspectives and Research, vol. 9(3), pages 446-461, September.
Li Zhao & Stacy H. Lee & Muzhen Li & Peng Sun, 2022. "The Use of Social Media to Promote Sustainable Fashion and Benefit Communications: A Data-Mining Approach," Sustainability, MDPI, vol. 14(3), pages 1-14, January.
Muhammad Naeem & Wilson Ozuem, 2021. "Understanding the social consumer fashion brand engagement journey: insights about reputed fashion brands," Journal of Brand Management, Palgrave Macmillan, vol. 28(5), pages 510-525, September.
Haeok Liz Kim & Sunghyup Sean Hyun, 2019. "The Relationships among Perceived Value, Intention to Use Hashtags, eWOM, and Brand Loyalty of Air Travelers," Sustainability, MDPI, vol. 11(22), pages 1-12, November.
Jalal Rajeh Hanaysha, 2017. "Impact of Social Media Marketing, Price Promotion, and Corporate Social Responsibility on Customer Satisfaction," Jindal Journal of Business Research, vol. 6(2), pages 132-145, December.
Zeynep Birce Ergor & Elif Akagun Ergin, 2016. "The Role of Social Media on Establishing Brand Value: A Content Analysis on Banks in Turkey," International Journal of Economics and Finance, Canadian Center of Science and Education, vol. 8(3), pages 97-102, March.
Dijkmans, Corné & Kerkhof, Peter & Beukeboom, Camiel J., 2015. "A stage to engage: Social media use and corporate reputation," Tourism Management, Elsevier, vol. 47(C), pages 58-67.
José Ramón Sarmiento-Guede & Arta Antonovica & Rebeca Antolín-Prieto, 2021. "The Green Image in the Spanish Hotel Sector: Analysis of Its Consequences from a Relational Perspective," Sustainability, MDPI, vol. 13(9), pages 1-17, April.
Iesha Khajuria & Rachna, 2017. "Impact of Social Media Brand Communications on Consumer-Based Brand Equity," Indian Journal of Commerce and Management Studies, Educational Research Multimedia & Publications,India, vol. 8(3), pages 124-131, September.
