Although digital libraries offer great convenience to users, their increasingly untrusted server sides raise serious concerns about personal privacy. In fact, users' privacy concerns have become one of the major obstacles to the development and application of digital libraries. In digital libraries, user privacy can be divided into data privacy and behavior privacy. Unlike data privacy, behavior privacy cannot be protected with traditional privacy protection methods, because the existing information services in digital libraries must remain unchanged. Protecting users' behavior privacy in digital libraries is therefore more challenging.
This paper addresses the various kinds of online behaviors (i.e., service requests) issued by users in a digital library. We aim to construct a unified framework and model for behavior privacy protection that overcomes the limitations of traditional privacy protection methods when applied to digital libraries. That is, the framework should ensure the security of various kinds of behavior privacy on the untrusted server side, without changing the existing platform architecture and service algorithms of the digital library, and without compromising the accuracy and efficiency of the information services it supplies.
In this paper, we first design a basic framework for user behavior privacy protection in a digital library. The basic idea is to place a middleware, which runs at the trusted client and implements a privacy protection algorithm, between the library user interface (running at the trusted client) and the library services (running at the untrusted server). For each service request (i.e., user behavior) issued by a user, the privacy algorithm constructs a group of high-quality dummy behaviors and submits them together with the user behavior to the untrusted server side, so as to cover up the sensitive preferences behind the user's behaviors. Based on the framework, we then present a behavior privacy model, which formulates the constraints that ideal dummy behaviors should satisfy and serves as a reference for the client-side privacy algorithm when it constructs dummy behaviors. Finally, we discuss the design and implementation of the privacy algorithm under the framework and model of user behavior privacy protection.
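The client-side middleware idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the catalog, the function names (`build_dummies`, `private_request`), the parameter `k`, and the `server` callable are all hypothetical. "Semantically irrelevant" is approximated here by simply drawing dummies from categories other than the user's, and the feature-distribution constraint of the privacy model is not modeled at all.

```python
import random
from typing import Callable, Dict, List, Optional

# Hypothetical catalog: category -> example query terms (illustrative only).
CATALOG: Dict[str, List[str]] = {
    "medicine": ["diabetes treatment", "oncology nursing"],
    "history": ["ming dynasty trade", "roman roads"],
    "physics": ["quantum optics", "fluid dynamics"],
    "art": ["baroque painting", "ink wash technique"],
}


def build_dummies(user_category: str, k: int, rng: random.Random) -> List[str]:
    """Pick k dummy queries, each from a distinct category other than the user's."""
    other = [c for c in CATALOG if c != user_category]
    if k > len(other):
        raise ValueError("not enough unrelated categories for k dummies")
    return [rng.choice(CATALOG[c]) for c in rng.sample(other, k)]


def private_request(
    query: str,
    category: str,
    server: Callable[[str], str],
    k: int = 2,
    rng: Optional[random.Random] = None,
) -> str:
    """Send the real query mixed with k dummies; return only the real result."""
    if rng is None:
        rng = random.Random()
    batch = build_dummies(category, k, rng) + [query]
    rng.shuffle(batch)  # the server should not learn the real query from its position
    results = {q: server(q) for q in batch}  # the untrusted server sees every request
    return results[query]  # dummy results are discarded on the trusted client
```

Because the dummies are generated and discarded entirely on the client, the server interface is untouched and the result returned to the user is exactly what an unprotected request would have produced; the cost is k extra requests per user behavior.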
Both theoretical analysis and experimental evaluation demonstrate the feasibility of the proposed framework and model. By constructing dummy behaviors from semantically irrelevant categories, the significance of users' sensitive preferences on the untrusted server side can be reduced effectively (resulting in a good cover-up effect). By constructing dummy behaviors whose feature distributions are highly similar to those of user behaviors, it becomes difficult for attackers to rule out the dummy behaviors (resulting in a good mix-up effect).
This paper is the first research attempt at protecting user behavior privacy in a digital library. The proposed privacy framework can ensure the security of user behaviors on the untrusted server side without compromising the availability, accuracy, and efficiency of information services in a digital library, which is of positive significance to the development of privacy-preserving digital libraries. However, this paper only describes the privacy framework at a high level of abstraction. In a digital library, behavior privacy takes various forms (such as recommendation behavior and retrieval behavior). In future work, we need to further study how to design and implement a corresponding privacy protection algorithm for each kind of user behavior. 5 figs. 1 tab. 25 refs.