Privacy-Preserving Data Mining

Format: Hardcover
Pub. Date: 2008-07-01
Publisher(s): Springer-Verlag New York Inc
List Price: $230.99

Summary

Advances in hardware technology have increased the capability to store and record personal data about consumers and individuals. This has raised concerns that personal data may be used for a variety of intrusive or malicious purposes. Privacy Preserving Data Mining: Models and Algorithms proposes a number of techniques for performing data mining tasks in a privacy-preserving way. These techniques generally fall into the following categories: data modification techniques, cryptographic methods and protocols for data sharing, statistical techniques for disclosure and inference control, query auditing methods, and randomization and perturbation-based techniques.

This edited volume contains surveys by distinguished researchers in the privacy field. Each survey covers the key research content and the future research directions of a particular topic in privacy. The book is designed for researchers, professors, and advanced-level students in computer science, and is also suitable for practitioners in industry.

Table of Contents

Preface
List of Figures
List of Tables
An Introduction to Privacy-Preserving Data Mining
  Introduction
  Privacy-Preserving Data Mining Algorithms
  Conclusions and Summary
  References
A General Survey of Privacy-Preserving Data Mining Models and Algorithms
  Introduction
  The Randomization Method
  Privacy Quantification
  Adversarial Attacks on Randomization
  Randomization Methods for Data Streams
  Multiplicative Perturbations
  Data Swapping
  Group Based Anonymization
  The k-Anonymity Framework
  Personalized Privacy-Preservation
  Utility Based Privacy Preservation
  Sequential Releases
  The l-diversity Method
  The t-closeness Model
  Models for Text, Binary and String Data
  Distributed Privacy-Preserving Data Mining
  Distributed Algorithms over Horizontally Partitioned Data Sets
  Distributed Algorithms over Vertically Partitioned Data
  Distributed Algorithms for k-Anonymity
  Privacy-Preservation of Application Results
  Association Rule Hiding
  Downgrading Classifier Effectiveness
  Query Auditing and Inference Control
  Limitations of Privacy: The Curse of Dimensionality
  Applications of Privacy-Preserving Data Mining
  Medical Databases: The Scrub and Datafly Systems
  Bioterrorism Applications
  Homeland Security Applications
  Genomic Privacy
  Summary
  References
A Survey of Inference Control Methods for Privacy-Preserving Data Mining
  Introduction
  A Classification of Microdata Protection Methods
  Perturbative Masking Methods
  Additive Noise
  Microaggregation
  Data Swapping and Rank Swapping
  Rounding
  Resampling
  PRAM
  MASSC
  Non-perturbative Masking Methods
  Sampling
  Global Recoding
  Top and Bottom Coding
  Local Suppression
  Synthetic Microdata Generation
  Synthetic Data by Multiple Imputation
  Synthetic Data by Bootstrap
  Synthetic Data by Latin Hypercube Sampling
  Partially Synthetic Data by Cholesky Decomposition
  Other Partially Synthetic and Hybrid Microdata Approaches
  Pros and Cons of Synthetic Microdata
  Trading off Information Loss and Disclosure Risk
  Score Construction
  R-U Maps
  k-anonymity
  Conclusions and Research Directions
  References
Measures of Anonymity
  Introduction
  What is Privacy?
  Data Anonymization Methods
  A Classification of Methods
  Statistical Measures of Anonymity
  Query Restriction
  Anonymity via Variance
  Anonymity via Multiplicity
  Probabilistic Measures of Anonymity
  Measures Based on Random Perturbation
  Measures Based on Generalization
  Utility vs Privacy
  Computational Measures of Anonymity
  Anonymity via Isolation
  Conclusions and New Directions
  New Directions
  References
k-Anonymous Data Mining: A Survey
  Introduction
  k-Anonymity
  Algorithms for Enforcing k-Anonymity
  k-Anonymity Threats from Data Mining
  Association Rules
  Classification Mining
  k-Anonymity in Data Mining
  Anonymize-and-Mine
  Mine-and-Anonymize
  Enforcing k-Anonymity on Association Rules
  Enforcing k-Anonymity on Decision Trees
  Conclusions
  Acknowledgments
  References
A Survey of Randomization Methods for Privacy-Preserving Data Mining
  Introduction
  Reconstruction Methods for Randomization
  The Bayes Reconstruction Method
  The EM Reconstruction Method
  Utility and Optimality of Randomization Models
  Applications of Randomization
  Privacy-Preserving Classification with Randomization
  Privacy-Preserving OLAP
  Collaborative Filtering
  The Privacy-Information Loss Tradeoff
  Vulnerabilities of the Randomization Method
  Randomization of Time Series Data Streams
  Multiplicative Noise for Randomization
  Vulnerabilities of Multiplicative Randomization
  Sketch Based Randomization
  Conclusions and Summary
  References
A Survey of Multiplicative Perturbation for Privacy-Preserving Data Mining
  Introduction
  Data Privacy vs. Data Utility
  Outline
  Definition of Multiplicative Perturbation
  Notations
  Rotation Perturbation
  Projection Perturbation
  Sketch-based Approach
  Geometric Perturbation
  Transformation Invariant Data Mining Models
  Definition of Transformation Invariant Models
  Transformation-Invariant Classification Models
  Transformation-Invariant Clustering Models
  Privacy Evaluation for Multiplicative Perturbation
  A Conceptual Multidimensional Privacy Evaluation Model
  Variance of Difference as Column Privacy Metric
  Incorporating Attack Evaluation
  Other Metrics
  Attack Resilient Multiplicative Perturbations
  Naive Estimation to Rotation Perturbation
  ICA-Based Attacks
  Distance-Inference Attacks
  Attacks with More Prior Knowledge
  Finding Attack-Resilient Perturbations
  Conclusion
  Acknowledgment
  References
A Survey of Quantification of Privacy Preserving Data Mining Algorithms
  Introduction
  Metrics for Quantifying Privacy Level
  Data Privacy
  Result Privacy
  Metrics for Quantifying Hiding Failure
  Metrics for Quantifying Data Quality
  Quality of the Data Resulting from the PPDM Process
  Quality of the Data Mining Results
  Complexity Metrics
  How to Select a Proper Metric
  Conclusion and Research Directions
  References
A Survey of Utility-based Privacy-Preserving Data Transformation Methods
  Introduction
  What is Utility-based Privacy Preservation?
  Types of Utility-based Privacy Preservation Methods
  Privacy Models
  Utility Measures
  Summary of the Utility-Based Privacy Preserving Methods
  Utility-Based Anonymization Using Local Recoding
  Global Recoding and Local Recoding
  Utility Measure
  Anonymization Methods
  Summary and Discussion
  The Utility-based Privacy Preserving Methods in Classification Problems
  The Top-Down Specialization Method
  The Progressive Disclosure Algorithm
  Summary and Discussion
  Anonymized Marginal: Injecting Utility into Anonymized Data Sets
  Anonymized Marginal
  Utility Measure
  Injecting Utility Using Anonymized Marginals
  Summary and Discussion
  Summary
  Acknowledgments
  References
Mining Association Rules under Privacy Constraints
  Introduction
  Problem Framework
  Database Model
  Mining Objective
  Privacy Mechanisms
  Privacy Metric
  Accuracy Metric
  Evolution of the Literature
  The FRAPP Framework
  Reconstruction Model
  Estimation Error
  Randomizing the Perturbation Matrix
  Efficient Perturbation
  Integration with Association Rule Mining
  Sample Results
  Closing Remarks
  Acknowledgments
  References
A Survey of Association Rule Hiding Methods for Privacy
  Introduction
  Terminology and Preliminaries
  Taxonomy of Association Rule Hiding Algorithms
  Classes of Association Rule Algorithms
  Heuristic Approaches
  Border-based Approaches
  Exact Approaches
  Other Hiding Approaches
  Metrics and Performance Analysis
  Discussion and Future Trends
  Conclusions
  References
A Survey of Statistical Approaches to Preserving Confidentiality of Contingency Table Entries
  Introduction
  The Statistical Approach Privacy Protection
  Datamining Algorithms, Association Rules, and Disclosure Limitation
  Estimation and Disclosure Limitation for Multi-way Contingency Tables
  Two Illustrative Examples
  Example 1: Data from a Randomized Clinical Trial
  Example 2: Data from the 1993 U.S. Current Population Survey
  Conclusions
  Acknowledgments
  References
A Survey of Privacy-Preserving Methods Across Horizontally Partitioned Data
  Introduction
  Basic Cryptographic Techniques for Privacy-Preserving Distributed Data Mining
  Common Secure Sub-protocols Used in Privacy-Preserving Distributed Data Mining
  Privacy-preserving Distributed Data Mining on Horizontally Partitioned Data
  Comparison to Vertically Partitioned Data Model
  Extension to Malicious Parties
  Limitations of the Cryptographic Techniques Used in Privacy-Preserving Distributed Data Mining
  Privacy Issues Related to Data Mining Results
  Conclusion
  References
A Survey of Privacy-Preserving Methods Across Vertically Partitioned Data
  Introduction
  Classification
  Naïve Bayes Classification
  Bayesian Network Structure Learning
  Decision Tree Classification
  Clustering
  Association Rule Mining
  Outlier detection
  Algorithm
  Security Analysis
  Computation and Communication Analysis
  Challenges and Research Directions
  References
A Survey of Attack Techniques on Privacy-Preserving Data Perturbation Methods
  Introduction
  Definitions and Notation
  Attacking Additive Data Perturbation
  Eigen-Analysis and PCA Preliminaries
  Spectral Filtering
  SVD Filtering
  PCA Filtering
  MAP Estimation Attack
  Distribution Analysis Attack
  Summary
  Attacking Matrix Multiplicative Data Perturbation
  Known I/O Attacks
  Known Sample Attack
  Other Attacks Based on ICA
  Summary
  Attacking k-Anonymization
  Conclusion
  Acknowledgments
  References
Private Data Analysis via Output Perturbation
  Introduction
  The Abstract Model - Statistical Databases, Queries, and Sanitizers
  Privacy
  Interpreting the Privacy Definition
  The Basic Technique: Calibrating Noise to Sensitivity
  Applications: Functions with Low Global Sensitivity
  Constructing Sanitizers for Complex Functionalities
  k-Means Clustering
  SVD and PCA
  Learning in the Statistical Queries Model
  Beyond the Basics
  Instance Based Noise and Smooth Sensitivity
  The Sample-Aggregate Framework
  A General Sanitization Mechanism
  Related Work and Bibliographic Notes
  Acknowledgments
  References
A Survey of Query Auditing Techniques for Data Privacy
  Introduction
  Auditing Aggregate Queries
  Offline Auditing
  Online Auditing
  Auditing Select-Project-Join Queries
  Challenges in Auditing
  Reading
  References
Privacy and the Dimensionality Curse
  Introduction
  The Dimensionality Curse and the k-anonymity Method
  The Dimensionality Curse and Condensation
  The Dimensionality Curse and the Randomization Method
  Effects of Public Information
  Effects of High Dimensionality
  Gaussian Perturbing Distribution
  Uniform Perturbing Distribution
  The Dimensionality Curse and l-diversity
  Conclusions and Research Directions
  References
Personalized Privacy Preservation
  Introduction
  Formalization of Personalized Anonymity
  Personal Privacy Requirements
  Generalization
  Combinatorial Process of Privacy Attack
  Primary Case
  Non-primary Case
  Theoretical Foundation
  Notations and Basic Properties
  Derivation of the Breach Probability
  Generalization Algorithm
  The Greedy Framework
  Optimal SA-generalization
  Alternative Forms of Personalized Privacy Preservation
  Extension of k-anonymity
  Personalization in Location Privacy Protection
  Summary and Future Work
  References
Privacy-Preserving Data Stream Classification
  Introduction
  Motivating Example
  Contributions and Paper Outline
  Related Works
  Problem Statement
  Secure Join Stream Classification
  Naive Bayesian Classifiers
  Our Approach
  Initialization
  Bottom-Up Propagation
  Top-Down Propagation
  Using NBC
  Algorithm Analysis
  Empirical Studies
  Real-life Datasets
  Synthetic Datasets
  Discussion
  Conclusions
  References
Index
Table of Contents provided by Publisher. All Rights Reserved.
