Data Collection for Machine Learning in Big Data