Abstract

Author(s): Marupaka Nagaraju, Dr. Dhanalaxmi Vadlakonda

The massive growth in the scale of data is a key driver of big data, and data is now regarded as one of the most valuable assets; as enormous amounts of data accumulate, a new evolution of data is taking place. Big datasets are ubiquitous, but they are often notoriously difficult to analyze because of their size, heterogeneity, and quality. Big data comprises data sets whose size exceeds the capability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. In the information industry, whoever holds the most relevant data is considered the richest; however, collecting data alone is not enough, it must also be analyzed. This huge volume of data, termed big data, cannot be analyzed with traditional tools and techniques; it requires more advanced techniques that make data retrieval, management, and storage much faster. The rapid evolution and adoption of big data by industry has leapfrogged the discourse into popular outlets, even though the statistical methods in common practice were devised to infer from sample data. The heterogeneity, noise, and massive size of structured big data call for computationally efficient algorithms that avoid big data pitfalls such as spurious correlation. Addressing big data is therefore a challenging and time-demanding task that requires a large computational infrastructure to ensure successful data processing and analysis. In this paper, we present the basic concepts of the big data environment.