Online Social Networks (OSNs) have become highly popular, and users are increasingly lured into revealing their private information. To balance privacy and utility, many privacy-preserving approaches have been proposed, but they do not adequately meet users' personalized requirements. Most social-network data sources, such as Twitter and Facebook, hold unstructured data, and analytics or processing tools cannot work directly on such data. Users commonly lack data privacy and access control mechanisms that would remove the risk of disclosure. Thus, a privacy-preserving paradigm is required that automatically identifies sensitive attributes and reduces the risk of sensitive information leakage. In this paper, we present a Privacy Preserved Hadoop Environment (PPHE) that automatically detects sensitive attributes using data mining techniques. This work considers Twitter, which enables users to post messages. The content of posted tweets is wide-ranging and may include private information such as email addresses, mobile numbers, physical addresses, and dates of birth. In this context, the purpose of our work is fourfold. First, we authenticate each Twitter user using an integrated RSA and ElGamal algorithm. Second, we categorize tweets as private or non-private based on a Type-2 Fuzzy Logic System. Third, we apply a data suppression technique to private tweets. Fourth, we share users' content based on their similarity information, with content similarity evaluated using cosine similarity. Finally, we evaluate the system's performance in terms of accuracy, precision, recall, and F-measure.
Kumaran U. and Neelu Khare, “PPHE – Automatic Detection of Sensitive Attribute in Privacy Preserved Hadoop Environment using Data Mining Techniques”, International Journal of Computer Aided Engineering and Technology, (Accepted), 2018.
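The content-sharing step described in the abstract relies on cosine similarity. The following is a minimal sketch of that measure, assuming tweets are compared as bag-of-words term-frequency vectors; the abstract does not specify the representation, and the function names `term_counts` and `cosine_similarity` are illustrative rather than taken from the paper.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its alphanumeric word tokens.

    Assumption: a simple bag-of-words model; the paper may use another
    representation (for example TF-IDF weights).
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    a, b = term_counts(text_a), term_counts(text_b)
    # Only terms present in both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text with no tokens is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

For example, `cosine_similarity("a b", "a c")` is approximately 0.5, because the two texts share one of their two terms, while texts with no terms in common score 0.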