A Review on Unstructured Data using k-Mean Algorithm
Unstructured data are data without an identifiable structure; audio, video, and images are a few examples. Clustering is one of the best techniques in the knowledge extraction process. It groups similar data items together to form clusters, so that items within a cluster are close to one another while the distance between items in different clusters is large. Many algorithms are used for clustering, and k-mean clustering is one of the most popular methods for cluster analysis. The main aim of the algorithm is to partition the dataset into k clusters based on a computed value, such as the distance between each data item and the cluster centers. The limitation of k-mean clustering stated here is that it can be applied to either structured or unstructured data, but not to a combination of both. This paper addresses that limitation by proposing a new k-mean algorithm that extracts hidden knowledge by forming clusters from a combination of unstructured datasets.
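As an illustration of the partitioning step described in the abstract, the following is a minimal sketch of the standard k-means (Lloyd's) procedure, not the new algorithm proposed in the paper. It assumes the data items have already been converted to numeric feature vectors and that Euclidean distance serves as the computed value; the function name `kmeans` and its parameters are hypothetical and chosen for this example only.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Standard (Lloyd's) k-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration; it is not the paper's algorithm.
    Returns (labels, centroids).
    """
    rng = random.Random(seed)
    # Initialise the centroids with k distinct data points.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance is an assumption of this sketch).
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])

        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return labels, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centroids = kmeans(data, k=2)
    print(labels)
    print(centroids)
```

On this toy dataset the three points near the origin end up in one cluster and the three points near (10, 10) in the other. Real audio, video, or image data would first need a feature-extraction step to produce such vectors, which this sketch does not cover.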
Copyright (c) 2020 Akhilesh Sharma
This work is licensed under a Creative Commons Attribution 4.0 International License.
IJOSCIENCE follows an Open Journal Access policy. Authors retain the copyright of the original work and grant the rights of publication to the publisher with the work simultaneously licensed under a Creative Commons CC BY License that allows others to distribute, remix, adapt, and build upon your work, even commercially, as long as they credit you for the original creation. Authors are permitted to post their work in institutional repositories, social media or other platforms.