Abstract

Author(s): Aslam Khan, Rahul Sharma

Communication today is frequently carried out over digital channels. These channels are not very secure because of attackers and phishing, so securing communication is necessary to prevent social and financial losses. This work proposes and discusses a machine learning based data model for phishing URL classification. The proposed classification model is applied to the PhishTank dataset to find the features of phishing URLs, and these features are then used to decide whether a new URL is phishing or not. Two data mining techniques are employed to train the model. First, the PhishTank dataset is transformed into a binary dataset. Next, the C4.5 algorithm is applied to the transformed data to generate rules. These rules can be used to classify URLs, but the number of rules must be reduced to speed up classification. Therefore, a Bayesian classifier is implemented on top of the C4.5 decision tree algorithm: it prunes the C4.5-generated rules, and the pruned rules are then used to identify phishing URLs. The proposed technique is implemented in Java, and an Apriori algorithm based technique is used for the comparative study. The comparative performance study shows that the proposed method produces more efficient outcomes than the traditional method of phishing URL classification.
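
As an illustration of the first step, transforming URLs into a binary dataset, the following is a minimal Java sketch. The abstract does not list the features that were used, so the five features here (presence of an "@" symbol, an IP address as host, a length above 75 characters, more than three dots in the host, and a hyphen in the host) are assumptions chosen for illustration. The class and method names are hypothetical as well.

```java
import java.util.Arrays;

// Hypothetical sketch: the feature set below is assumed for illustration
// and is not taken from the paper.
class UrlBinaryFeatures {

    // Names of the assumed binary features, in the order toBinary() emits them.
    static final String[] FEATURE_NAMES = {
        "hasAtSymbol", "hasIpHost", "isLong", "hasManyDots", "hasHyphenInHost"
    };

    // Extracts the host with plain string operations, so that malformed
    // URLs (common among phishing samples) do not raise parsing exceptions.
    static String hostOf(String url) {
        String rest = url;
        int scheme = rest.indexOf("://");
        if (scheme >= 0) {
            rest = rest.substring(scheme + 3);
        }
        // The authority part ends at the first '/', '?' or '#'.
        int end = rest.length();
        for (char c : new char[] {'/', '?', '#'}) {
            int i = rest.indexOf(c);
            if (i >= 0 && i < end) {
                end = i;
            }
        }
        rest = rest.substring(0, end);
        // Drop any "user:password@" prefix, then any ":port" suffix.
        int at = rest.lastIndexOf('@');
        if (at >= 0) {
            rest = rest.substring(at + 1);
        }
        int colon = rest.indexOf(':');
        if (colon >= 0) {
            rest = rest.substring(0, colon);
        }
        return rest;
    }

    // Maps a URL to a 0/1 vector, one entry per assumed feature.
    static int[] toBinary(String url) {
        String host = hostOf(url);
        boolean hasAt = url.indexOf('@') >= 0;
        boolean ipHost = host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
        boolean isLong = url.length() > 75; // threshold is an assumption
        boolean manyDots = host.chars().filter(c -> c == '.').count() > 3;
        boolean hyphen = host.indexOf('-') >= 0;
        return new int[] {
            bit(hasAt), bit(ipHost), bit(isLong), bit(manyDots), bit(hyphen)
        };
    }

    private static int bit(boolean value) {
        return value ? 1 : 0;
    }

    public static void main(String[] args) {
        int[] row = toBinary("http://192.168.0.1/login@secure");
        System.out.println(Arrays.toString(FEATURE_NAMES));
        System.out.println(Arrays.toString(row));
    }
}
```

Each row produced this way, together with its phishing or legitimate label, could then be passed to a decision tree learner such as C4.5. The rule generation and the Bayesian pruning step are not sketched here because the abstract does not describe them in enough detail.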