Online negotiation for privacy preserving data publishing

Authors

Pilalidou, Alexandra

Publisher

University of Ioannina. School of Sciences. Department of Computer Science & Engineering

Description

The problem of privacy preserving data publishing is defined as the problem of publicly presenting a data set with structured records about the activities or transactions of a set of persons, so as to accommodate two antagonistic goals: (a) allow a set of well-intended knowledge workers to execute data mining algorithms over the public data set in order to extract useful statistical information from it, and (b) prevent a malicious attacker from combining these publicly available data with background knowledge (personal knowledge of the attacker, other publicly available data sets, etc.) in order to link a specific person in the real world (and in particular sensitive information about this person) with the corresponding record in the public data set. The main technique that data curators employ is the anonymization of data, which involves transforming the data (in one of the many ways that the research community has come up with) before presenting them for public use. In our setting, we focus on the global recoding approach, a method for data anonymization that offers (a) high utility for the data mining tools of the well-intended users and (b) faster times than the alternative methods (although not fast enough for an online environment), but also suffers from (c) the problem of having to delete (a.k.a. suppress) outlier groups to attain an acceptable level of generalization. In this thesis we pursue the following goals, not previously explored by the research community. The first goal of this thesis is to study the interplay of suppression, generalization and privacy criterion and to record how changes to one of these parameters affect the other two.
The main goal of this thesis, however, is to provide the means to negotiate the configuration of the anonymization of a data set, by allowing a target group of known well-meaning users and the data curator who is responsible for the anonymization to agree online on (a) the level of data generalization (and thus the information loss incurred by the well-meaning users), (b) the number of tuples that can be omitted from the published data set, and (c) the privacy criterion that the data curator imposes. Our first approach involves precomputing suitable histograms for all the different anonymization schemes that a global recoding method can follow. This allows computing exact answers extremely fast (on the order of a few milliseconds). By exploiting these histograms, we provide both exact answers, if they exist, and suggestions for approximate answers. However, this approach requires a pre-processing time on the order of a few dozen minutes; whenever this is not feasible, alternative approaches must be explored. To this end, we propose a method that precomputes only a small subset of the histograms in order to reduce the pre-processing time. Our experiments indicate a linear speedup along with very good or acceptable values for the quality of the proposed solutions, depending on the type of answer. Finally, to alleviate the deviations from the optimal solution observed for two cases of approximation suggestions, we introduce a third variant, where the histogram of the top acceptable node (in terms of the height constraint) is also computed at runtime. This method pays a price of 0.1-0.3 seconds to gain excellent solution quality for all kinds of answers. This way, the data curator is equipped with alternative tools that he can use depending on the constraints in terms of user time and quality of solution.
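The first approach described above (precompute one histogram per anonymization scheme, then answer negotiation requests by lookup) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy records, the two generalization hierarchies, the use of k-anonymity as the privacy criterion, the choice of a group-size histogram, and all function names.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: (age, zip_code) quasi-identifier pairs.
RECORDS = [
    (23, "45110"), (27, "45110"), (29, "45221"),
    (31, "45221"), (35, "45221"), (38, "45332"),
    (52, "45332"), (67, "48100"),
]

# Assumed maximum generalization level per attribute (age, zip_code).
MAX_LEVELS = (2, 5)


def generalize_age(age, level):
    """Level 0: exact age, level 1: decade bucket, level 2: fully masked."""
    if level == 0:
        return str(age)
    if level == 1:
        low = age // 10 * 10
        return f"{low}-{low + 9}"
    return "*"


def generalize_zip(zip_code, level):
    """Level n masks the last n digits of the zip code."""
    keep = len(zip_code) - level
    return zip_code[:keep] + "*" * level


def group_size_histogram(records, scheme):
    """Map group size -> number of groups of that size under one scheme."""
    age_level, zip_level = scheme
    groups = Counter(
        (generalize_age(age, age_level), generalize_zip(zip_code, zip_level))
        for age, zip_code in records
    )
    return Counter(groups.values())


def precompute(records):
    """Build one histogram per scheme (global recoding: one level per attribute)."""
    schemes = product(range(MAX_LEVELS[0] + 1), range(MAX_LEVELS[1] + 1))
    return {scheme: group_size_histogram(records, scheme) for scheme in schemes}


def tuples_to_suppress(histogram, k):
    """Tuples in groups smaller than k must be suppressed to reach k-anonymity."""
    return sum(size * count for size, count in histogram.items() if size < k)


def answer(histograms, k, max_height, max_suppressed):
    """Return (height, scheme, suppressed) for every scheme meeting all constraints."""
    hits = []
    for scheme, histogram in histograms.items():
        height = sum(scheme)
        suppressed = tuples_to_suppress(histogram, k)
        if height <= max_height and suppressed <= max_suppressed:
            hits.append((height, scheme, suppressed))
    return sorted(hits)  # least generalized schemes first


histograms = precompute(RECORDS)
for height, scheme, suppressed in answer(histograms, k=3, max_height=4, max_suppressed=2):
    print(f"scheme={scheme} height={height} suppressed={suppressed}")
```

In this sketch the expensive step (grouping the records under every scheme) happens once in `precompute`, so each request only scans small histograms. An empty result from `answer` corresponds to the case where no exact answer exists and an approximate suggestion would have to be offered instead.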

Link

Μ.Ε. ΠΗΛ 2010

Language

en

Issuing department/division

University of Ioannina. School of Sciences. Department of Computer Science & Engineering

Institution and School/Department of the submitter

University of Ioannina. School of Sciences. Department of Computer Science & Engineering

Bibliographic citation

Bibliography: pp. 193-195

Number of pages

196 pp.
