Cloud based privacy preserving data mining model using hybrid k-anonymity and partial homomorphic encryption

dc.contributor.authorMansour Osman, Huda Osman
dc.date.accessioned2024-01-15T04:49:50Z
dc.date.available2024-01-15T04:49:50Z
dc.date.issued2022
dc.descriptionThesis (PhD. (Computer Science))
dc.description.abstractThe evolution of information and communication technologies has encouraged numerous organizations to outsource their business and data to the cloud for data mining and other data processing operations. Despite its great benefits, the cloud poses real problems for the security and privacy of data. Many studies have reported that attackers often obtain information from third-party services or third-party clouds. When data owners outsource their data to the cloud, especially under the SaaS cloud model, it is difficult to preserve the confidentiality and integrity of the data. Privacy-Preserving Data Mining (PPDM) aims to accomplish data mining operations while protecting the owner's data from violation. Current PPDM models have some limitations: they suffer from data disclosure caused by identity and attribute disclosure, in which private information is revealed and different types of attacks can succeed. In addition, existing solutions have poor data utility and high computational overhead. Therefore, this research aims to design and develop a Hybrid Anonymization Cryptography PPDM (HAC-PPDM) model that improves the privacy-preserving level by reducing data disclosure before data are outsourced for mining over the cloud, while maintaining data utility. The proposed HAC-PPDM model further aims to reduce the computational overhead to improve efficiency. The Quasi-Identifiers Recognition (QIR) algorithm is defined and designed based on attribute classification and determination of the Quasi-Identifier dimension, to overcome the identity disclosure caused by Quasi-Identifier linking and thereby reduce privacy leakage.
An Enhanced Homomorphic Scheme is designed by hybridizing the Cloud-RSA encryption scheme, the Extended Euclidean algorithm (EE), the Fast Modular Exponentiation algorithm (FME), and the Chinese Remainder Theorem (CRT) to minimize computational time complexity while reducing attribute disclosure. The proposed QIR algorithm, the Enhanced Homomorphic Scheme, and the k-anonymity privacy model have been hybridized to obtain optimal data privacy preservation before the data are outsourced to the cloud, while maintaining data utility that meets the needs of mining with good efficiency. Real-world datasets have been used to evaluate the proposed algorithms and model. The experimental results show that the proposed QIR algorithm improved the data privacy-preserving percentage by 23% while maintaining the same or slightly better data utility. Meanwhile, the proposed Enhanced Homomorphic Scheme is more efficient than related works in terms of time complexity as expressed in Big O notation. Moreover, it reduced the computational time of encryption, decryption, and key generation. Finally, the proposed HAC-PPDM model successfully reduced data disclosure and improved the privacy-preserving level while preserving data utility, as it reduced information loss. In short, it improved privacy preservation and data mining (classification) accuracy by 7.59% and 0.11%, respectively.
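The abstract names the building blocks of the Enhanced Homomorphic Scheme but does not give its construction. As a rough illustration only, the sketch below shows how those standard pieces usually fit together in textbook (unpadded) RSA, which is multiplicatively homomorphic: the Extended Euclidean algorithm yields the private exponent, square-and-multiply performs modular exponentiation, and the Chinese Remainder Theorem speeds up decryption. The function names, the toy primes 61 and 53, and the exponent 17 are assumptions chosen for readability; this is not the thesis's scheme and is not secure for real use.

```python
def extended_euclid(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_euclid(b, a % b)
    return g, y, x - (a // b) * y


def mod_inverse(a, m):
    """Modular inverse of a modulo m via the Extended Euclidean algorithm."""
    g, x, _ = extended_euclid(a, m)
    if g != 1:
        raise ValueError("a has no inverse modulo m")
    return x % m


def fast_mod_exp(base, exp, mod):
    """Square-and-multiply modular exponentiation."""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:
            result = result * base % mod
        base = base * base % mod
        exp >>= 1
    return result


def keygen(p, q, e):
    """Build a toy RSA key pair from two primes (hypothetical helper)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = mod_inverse(e, phi)
    return (n, e), (p, q, d)


def encrypt(public_key, m):
    n, e = public_key
    return fast_mod_exp(m, e, n)


def decrypt_crt(private_key, c):
    """Decrypt with two small exponentiations, recombined via CRT (Garner)."""
    p, q, d = private_key
    m_p = fast_mod_exp(c, d % (p - 1), p)
    m_q = fast_mod_exp(c, d % (q - 1), q)
    h = mod_inverse(q, p) * (m_p - m_q) % p
    return m_q + h * q


# Toy parameters (assumed for illustration, far too small for real use).
public_key, private_key = keygen(61, 53, 17)
n = public_key[0]

# Multiplying ciphertexts corresponds to multiplying plaintexts modulo n.
c_product = encrypt(public_key, 7) * encrypt(public_key, 6) % n
product = decrypt_crt(private_key, c_product)  # equals (7 * 6) mod n
```

Because the homomorphism holds only for multiplication, this kind of scheme is "partially" homomorphic, which matches the wording of the title.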
dc.description.sponsorshipFaculty of Engineering - School of Computing
dc.identifier.urihttp://openscience.utm.my/handle/123456789/963
dc.language.isoen
dc.publisherUniversiti Teknologi Malaysia
dc.subjectData mining—Research
dc.subjectCloud computing—Security measures
dc.subjectData privacy
dc.titleCloud based privacy preserving data mining model using hybrid k-anonymity and partial homomorphic encryption
dc.typeThesis
dc.typeDataset
Files
Original bundle
Name:
HudaOsmanMansourOsmanPSC2022_B.pdf
Size:
213.69 KB
Format:
Adobe Portable Document Format
Description:
Excerpt of raw datasets (Bank and Adult)
Name:
HudaOsmanMansourOsmanPSC2022_C.pdf
Size:
215.96 KB
Format:
Adobe Portable Document Format
Description:
Excerpt of datasets after Objective 1 (used for Objective 2)
Name:
HudaOsmanMansourOsmanPSC2022_D.pdf
Size:
229.22 KB
Format:
Adobe Portable Document Format
Description:
Excerpt of datasets after Objective 2 (used for Objective 3)
Name:
HudaOsmanMansourOsmanPSC2022_E.pdf
Size:
265.94 KB
Format:
Adobe Portable Document Format
Description:
Excerpt of datasets after Objective 3
Name:
HudaOsmanMansourOsmanPSC2022_F.pdf
Size:
82.77 KB
Format:
Adobe Portable Document Format
Description:
Results of Computational Time of Enhanced Homomorphic Scheme on Adult and Bank Datasets
License bundle
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed to upon submission