Research on dimension reduction methods facing massive high-dimensional web text data based on cloud computing

Deng Hui

COMPUTER MODELLING & NEW TECHNOLOGIES 2013 17(5B) 76-79

Library of North Sichuan Medical College, Nanchong, Sichuan, China, 637000

The cloud model is introduced into the clustering-based dimension reduction of text data. To select suitable feature words, cloud model theory is used for text feature selection: an association cloud filter and a distinction cloud filter are each applied to every feature in the training set, which yields the cloud feature space. Adopting the cloud computing model allows the text information to be represented more rationally while keeping the vector dimension small enough that it does not impair machine learning performance. Applied to massive high-dimensional web text data, the cloud computing model both speeds up selection of the feature space and improves the dimension reduction effect.
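The abstract does not define the two filters, so the following is only a minimal sketch of how a cloud-model-based filter might look, not the paper's method. It assumes the standard normal cloud model, whose numerical characteristics (expectation Ex, entropy En, hyper-entropy He) can be estimated from samples with the backward cloud generator. The function `distinction_filter`, its `min_entropy` threshold, and the toy per-class frequencies are hypothetical: the sketch assumes that a term whose frequency varies strongly across classes (large En) is more useful for telling classes apart.

```python
import math


def backward_cloud(samples):
    """Estimate (Ex, En, He) of a normal cloud from numeric samples.

    Backward cloud generator without certainty degrees:
      Ex = sample mean
      En = sqrt(pi / 2) * mean absolute deviation from Ex
      He = sqrt(sample variance - En^2)
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    ex = sum(samples) / n
    en = math.sqrt(math.pi / 2) * sum(abs(x - ex) for x in samples) / n
    var = sum((x - ex) ** 2 for x in samples) / (n - 1)
    # Clamp at zero: var - En^2 can be slightly negative on small samples.
    he = math.sqrt(max(var - en ** 2, 0.0))
    return ex, en, he


def distinction_filter(term_class_freq, min_entropy):
    """Hypothetical filter (an assumption, not the paper's definition):
    keep terms whose per-class frequencies have entropy En >= min_entropy."""
    kept = []
    for term, freqs in term_class_freq.items():
        _, en, _ = backward_cloud(freqs)
        if en >= min_entropy:
            kept.append(term)
    return kept


# Made-up relative frequencies of each term in three document classes.
term_class_freq = {
    "the": [0.10, 0.10, 0.10],    # evenly spread, carries no class signal
    "tumor": [0.30, 0.00, 0.00],  # concentrated in one class
}
selected = distinction_filter(term_class_freq, min_entropy=0.05)
```

In this toy input, "the" has zero entropy and is dropped, while "tumor" is kept. An association filter could be sketched in the same style with a different scoring rule, but the abstract gives no detail on which to base one.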