Anonymity Enabled Secure Multi-Party Computation For Indian BPO
D. K. Mishra, Department of Computer Sc. and Engineering, Acropolis Institute of Technology and Research, RGPV, Indore, Madhya Pradesh, Pin 453771, INDIA, email: mishra_research@rediffmail.com
M. Chandwani, Institute of Engineering & Technology, Devi Ahilya University, Khandwa Road, Indore, MP 452017, INDIA, email: chandwanim1@rediffmail.com

consider a scenario where a number of banks want to analyze the effect of changes in the tax structure on the borrowing patterns of their customers. To compute the results they need a large set of customer data. Even though each bank has its own data, results computed on a larger data set will be more accurate than those computed on a smaller one. Thus, the banks want to analyze the data jointly to find the patterns. The problem is that none of the banks wants to share individual records with the other banks, as this would be a breach of their customers' privacy. This problem is an instance of the SMC problem. We can formally define the SMC problem as follows: let there be n parties with private inputs x1, x2, ..., xn respectively, who want to compute the value of the public function y = f(x1, x2, ..., xn) such that, after the computation completes, no party has any information about the inputs of the other parties apart from what is revealed by the computed result. The general approach to solving this problem is to use a trusted third party to do the computation and then announce the results publicly. This approach can be supported by a web-service-based architecture. According to [2], the major problem with this approach is that it is difficult to find a third party that is trusted by all the parties providing inputs. We consider an application to Business Process Outsourcing (BPO): India is one of the fastest growing markets for BPO, and security is a major concern in this sector.
The most common examples of BPO are call centers, human resources, accounting and payroll outsourcing. The global BPO industry is estimated to be worth 120-150 billion dollars. Of this, offshore BPO is estimated at some 11.4 billion dollars. India has
1. ABSTRACT
In this paper, we propose a new framework for secure multi-party computation that uses ambiguous user identities. Large organizations hold huge amounts of potentially sensitive data that need to be mined. The increased usage of data mining tools in both the public and private sectors raises concerns for the people involved. The utility to be gained from widespread data mining seems to come into direct conflict with an individual's need and right to privacy. A privacy-preserving data mining solution aims at achieving the somewhat paradoxical property of enabling a data-mining algorithm to use data without ever actually seeing it. Thus, the benefits of data mining can be enjoyed without compromising the privacy of the individuals concerned. The sheer volume of data handled during mining calls for such a framework with built-in security. We propose the framework in the form of a protocol for secure multi-party computation during data mining for BPO applications. We also provide a suitable architecture that is tuned for this protocol.
Keywords: Anonymity, Security, Privacy, Trusted Third Party, Secure Multi-party Computation.
2. INTRODUCTION
With the increased availability of Internet and grid-enabled computing architectures, innumerable opportunities for cooperative computation exist. Even parties that do not trust one another want to leverage these facilities to compute results by sharing data. The most important consideration for a party when sharing data is its need and right to privacy. No party wants to expose confidential data that could be exploited by its competitors. In this process, all the parties depend on a third party, known as the trusted third party (TTP). To understand secure multi-party computation (SMC) in a pragmatic sense,
revenues of 6.4 billion dollars from offshore BPO and 36 billion dollars from IT and total BPO. India thus has some 5-6% share of the total industry but a commanding 63% share of the offshore component. This 63% is a drop from the 70% offshore share that India enjoyed last year, despite the industry growing 38% in India last year. This was because of security breaches by employees in Indian call centers. In this paper we introduce a protocol that maintains the privacy of organizations that share their data with call centers, which can help India regain its position in the BPO market and keep increasing its share of the pie. In our protocol, the companies or clients involved in the BPO process are known as parties; they outsource their business processes to a BPO provider, which acts as the TTP. For example, a hardware company outsources its customer care process to a call center. In this case the company has to give its database to the call center so that it can handle customer queries. If the same call center handles the same process for another hardware company and colludes with one company to misuse the data of the other, this leads to a data security breach for the victim company. Another example of BPO is accounting outsourcing. This is where our protocol comes into the picture. We introduce an anonymizer layer between the client and the BPO provider, which prevents direct interaction between the BPO provider and the clients who outsource their processes. This layer can be a process, human interaction, hardware or software. This provides an almost complete data security solution for clients from different domains. In this paper, we propose a new protocol for a secure multi-party computing architecture that gives each input-providing party the ability to choose its own trusted party (let us call it the anonymizer).
The protocol carries out the computation with the help of a third party, which need not be trusted by all the participating parties.

3. BACKGROUND AND OUR PROPOSAL
open problems for exploration. They investigate how various computational geometry problems can be solved in a cooperative environment, where two parties need to solve a geometric problem based on their joint data, but neither wants to disclose its private data to the other party. The problems discussed include point inclusion, intersection, closest pair and convex hull [1, 3]. Existing protocols such as the circuit evaluation protocol, the 1-out-of-N oblivious transfer protocol, homomorphic encryption schemes and the protocol for Yao's millionaire problem are explained in [4]. Security is defined relative to an ideal-world specification involving a trusted party: anything the adversary can achieve in the real world (where the protocol is executed) it can also achieve in the ideal model [10, 11]. Damgård et al. present a constant-round multi-party computation protocol that makes black-box use of a pseudo-random generator. This protocol withstands an active, adaptive adversary corrupting a minority of the parties [15]. The first general constant-round protocol for secure two-party computation is due to Yao [16]. Yao's original protocol considered only the case of semi-honest parties; Lindell [17] provided an extension to the case of malicious parties (equivalently, an active adversary). While Yao's original protocol makes black-box use of the underlying primitives (a pseudo-random generator and oblivious transfer), the protocol from [17] relies on the methodology of Goldreich et al. [6] and thus makes non-black-box use of these primitives. U. Maurer considers such general SMC protocols in [7], where general means that any given specification involving a trusted party can be computed securely without the trusted party. The literature generally relies on a TTP to solve SMC problems, but nowadays a party cannot always trust a TTP, so the third party should carry out only the computation while privacy is taken care of by the parties themselves.
Our approach uses an anonymizer so that a party can hide its own identity. To this end, we make the following assumptions.
1. TTP is unbiased and does rational computation.
2. TTP computes the result of the function y=f(x1, x2, ..., xn) correctly.
3. TTP has the ability to announce the result of the computation publicly.
4. Each party having an input can communicate with a trusted anonymizer/proxy. A trusted anonymizer (Ai) is a system that acts as an intermediary between the party having the input and the TTP, which will carry out the computation. Thus, Ai hides the identity of Pi (party) from the TTP.
Yao introduces the SMC problem in [8]. Maurer defines the different types of security in databases in the context of SMC [5], and also describes some applications of SMC. Goldreich et al. show a consistent solution to SMC problems [6]. Even though this solution follows a simple approach, it is very complex in practice because the size of the protocol depends on the number of parties involved in the computation. Privacy-preserving data mining is an area in which SMC has many applications. Kargupta et al. describe the work that has been done in privacy-preserving data mining using SMC [9, 14]. Du et al. review the various SMC problems and list some of the
5. The communication channels used by the input-providing parties to communicate with the anonymizers are secure; that is, no intruder can intercept the data transferred between them.
6. The anonymizer will not, under any condition, disclose the identity of the data source whose data it is forwarding to the third party.

4. ARCHITECTURE AND PROTOCOL

different Ai; two or more Pi can share a single Ai if they trust that anonymizer. Ai then sends the appended data di to the UTP, and the UTP announces the result publicly.

[Figure 1, top layer: Third Party (TP) computing y=f(x1, x2, ..., xn)]
In this section, we propose the required SMC architecture that leverages the anonymity services provided by the anonymizers to carry out SMC using an un-trusted third party (UTP). Figure 1 shows the proposed architecture. It has three layers, whereas the previously available architecture had only two [2]. The topmost layer is the computation layer, where the computation is done by the UTP. The middle layer is the secure layer, which protects the inputs provided by the parties for the computation. This layer has anonymizers A1...An, depending on the number of parties involved in the computation. An anonymizer hides the identity of the parties and, after the random number has been appended, transfers the input received from a party to the computation layer. The bottom layer is the input layer. This layer consists of a set of parties P1...Pn, where each party provides its input for the computation to a randomly selected trusted anonymizer. In every situation, the connection between anonymizer and party is secure. An anonymizer may be associated with one or many parties, but it has no information about parties that are not connected to it, and it hides the information of the parties that are directly connected to it. Thus anonymizers cannot send the input of one party to others.

4.1 Informal Description of Protocol
Our SMC protocol has three layers, namely the computation layer, the secure layer and the input layer. Each layer has its own function. The computation layer computes results through the UTP. The middle layer hides the identity of the parties in the input layer using anonymizers Ai. The input layer provides the inputs to the anonymizers so that the UTP can compute the result. There are n parties P1...Pn, each having an input xi that will be used in computing y=f(x1, x2, ..., xn). To calculate the value of y, each input-providing party Pi generates a random number Ri that will serve as an identifier for the data provided by Pi. Pi attaches the key Ri to the data xi. One method of doing so is to simply append Ri to the data xi, that is di=xi||Ri. Pi selects an anonymizer Ai through which it will communicate with the UTP. Ai has the ability to hide the identity of the parties connecting to it, and Pi trusts Ai. Also, the communication channel used for data transfer between Pi and Ai is assumed to be secure. It is not necessary for each Pi to choose a
[Figure 1, middle and bottom layers: anonymizers A1, A2, ..., An and parties P1, P2, ..., Pn]
Figure 1: The Proposed SMC Architecture.

4.2 Formal Description of Protocol
As already indicated, our protocol is based on three layers, namely the computation layer, the secure layer and the input layer. In Figure 1, there are n parties P1...Pn, each having an input xi that will be used in computing y=f(x1, x2, ..., xn). To calculate the value of y, each input-providing party Pi takes the following steps, given in the form of an algorithm.

Algorithm Anonypro
1. Define P1, P2, ..., Pn as parties.
2. Define A1, A2, ..., An as anonymizers.
3. Generate a random key, Ri, as an identifier for each Pi.
4. Compute di=xi||Ri. /* One method of doing so is to simply append Ri to the data xi */
5. Pi selects Ai from (A1, A2, ..., An). /* Pi selects an anonymizer Ai through which it will communicate with the UTP, using a selection function over A1, A2, ..., An */
6. Send di to Ai. /* The anonymizer receives the appended data from Pi. */
7. Ai sends di to the UTP. /* The UTP takes each data unit di and extracts xi and Ri from it. */
End of Algorithm

5. PERFORMANCE AND SECURITY ACHIEVED BY THE PROTOCOL
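As a minimal sketch of the steps of Algorithm Anonypro (Section 4.2), the following Python fragment simulates the three layers in a single process. The names `UntrustedThirdParty`, `Anonymizer` and `run_anonypro` are illustrative assumptions and not part of the protocol specification. The data unit di=xi||Ri is modeled as a Python tuple rather than a byte-level concatenation, Ri is drawn with `secrets.token_hex`, and the secure channel between Pi and Ai is assumed rather than implemented.

```python
import secrets
from dataclasses import dataclass, field
from typing import Any, Callable, List, Sequence, Tuple

# Assumption: di = xi || Ri is modeled as the pair (xi, Ri).
DataUnit = Tuple[Any, str]


@dataclass
class UntrustedThirdParty:
    """Computation layer: sees only (anonymizer id, di), never a party id."""
    received: List[Tuple[str, DataUnit]] = field(default_factory=list)

    def receive(self, anonymizer_id: str, d: DataUnit) -> None:
        self.received.append((anonymizer_id, d))

    def compute(self, f: Callable[[List[Any]], Any]) -> Any:
        # Extract xi from each di (Ri is ignored here) and evaluate f.
        xs = [x for _aid, (x, _r) in self.received]
        return f(xs)


@dataclass
class Anonymizer:
    """Secure layer: forwards di without revealing which Pi sent it."""
    ident: str
    utp: UntrustedThirdParty

    def forward(self, d: DataUnit) -> None:
        self.utp.receive(self.ident, d)


def run_anonypro(inputs: Sequence[Any],
                 anonymizers: Sequence[Anonymizer],
                 f: Callable[[List[Any]], Any],
                 utp: UntrustedThirdParty) -> Any:
    for x in inputs:                      # one iteration per party Pi
        r = secrets.token_hex(16)         # step 3: random key Ri
        d = (x, r)                        # step 4: di = xi || Ri
        a = secrets.choice(anonymizers)   # step 5: Pi selects an Ai
        a.forward(d)                      # steps 6-7: Pi -> Ai -> UTP
    return utp.compute(f)                 # UTP evaluates y = f(x1, ..., xn)


if __name__ == "__main__":
    utp = UntrustedThirdParty()
    layer = [Anonymizer("A1", utp), Anonymizer("A2", utp)]
    print(run_anonypro([10, 20, 30], layer, sum, utp))
```

In this sketch the records held by the UTP have the form (anonymizer id, (xi, Ri)), so nothing in them names a party. A real deployment would additionally need authenticated, encrypted channels between Pi and Ai, which the sketch does not provide.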
The above protocol performs satisfactorily. As Ri was generated randomly by Pi, and Ai has hidden the identity of Pi from the UTP, the UTP has no way to find out which Pi a given data unit came from. The UTP then calculates the value of the function y=f(x1, x2, ..., xn) and announces the result publicly so that each Pi can see it. If the participating parties do not want the result of the computation to be publicly accessible, the UTP can return the result to the anonymizers Ai. It is then the job of each Ai to return the result to the Pi that submitted data through it. In the worst case, the third party can at most publicly announce the data it got from the anonymizers. Even though this seems to be a serious flaw, the only identifier for the actual source of the data is the randomly generated key Ri of each data source Pi, so the third party can post only triples (xi, Ri, Ai). As there is no direct relation among xi, Ri and Ai, it is difficult to map each triple (xi, Ri, Ai) to a unique Pi. Now consider the case in which, out of the n data-providing parties, k parties cooperate to uncover the data of the remaining n-k parties. Here, the UTP has posted a set of triples M = {(xi, Ri, Ai) : i = 1...n}. The k adversarial parties can remove their own k triples from M, leaving the records of the remaining n-k parties. To assign each of the remaining n-k records to its respective Pi, the number of permutations they would have to try is (n-k)!, which grows faster than any polynomial in n-k, so an exhaustive search cannot be done in polynomial time. This non-polynomial complexity is desirable from the viewpoint of the security to be achieved.

6. CONCLUSION
Privacy-preserving protocols are a well-researched area in the field of two-party computation, but research is still underway to find efficient practical solutions to SMC problems for larger numbers of parties. Even though theoretical protocols exist that can solve SMC problems, they cannot be implemented practically because of their inefficiency and complexity. Existing protocols use only two layers to solve the problem, whereas our protocol uses an extra layer that makes the identities ambiguous. In this paper, we proposed a new architecture that enables SMC by hiding the identity of the parties taking part in the process.
Further, we may describe a class of functions that give a party the additional ability to split its large data set before submitting it for computation, making it almost intractable for other parties to identify the actual source of the data. Using this protocol, BPO work becomes almost fully secure and the privacy of individuals is maintained. Further work on finding such functions, or on transforming functions into the special class defined in this paper, is underway and can help leverage the proposed architecture in different areas.

REFERENCES
[1] W. Du and M. J. Atallah, Secure Multi-Party Computation Problems and Their Applications: A Review and Open Problems, CERIAS Tech Report 2001-51, Center for Education and Research in Information Assurance and Security and Department of Computer Sciences, Purdue University, West Lafayette, IN 47906, 2001.
[2] J. Vaidya and C. Clifton, Leveraging the "Multi" in Secure Multi-Party Computation, WPES'03, October 30, 2003, Washington, DC, USA, ACM, 2003, pp 120-128.
[3] M. J. Atallah and W. Du, Secure multi-party computational geometry, Seventh International Workshop on Algorithms and Data Structures (WADS 2001), Providence, Rhode Island, USA, Aug 8-10 2001, pp 136-152.
[4] W. Du and M. J. Atallah, Privacy-preserving cooperative scientific computation, In 14th IEEE Computer Security Foundations Workshop, Nova Scotia, Canada, June 11-13 2001.
[5] U. Maurer, The role of cryptography in database security, SIGMOD 2004, Paris, France, June 13-18 2004, pp 29-35.
[6] O. Goldreich, S. Micali, and A. Wigderson, How to Play Any Mental Game: A Completeness Theorem for Protocols with Honest Majority, 19th ACM Symposium on the Theory of Computing, 1987, pp 218-229.
[7] U. Maurer, Secure multi-party computation made simple, Security in Communication Networks (SCN'02), G. Persiano (Ed.), Lecture Notes in Computer Science, Springer-Verlag, Vol. 2576, 2003, pp 14-28.
[8] A. C. Yao, Protocols for secure computations, In Proc. 23rd IEEE Symposium on Foundations of Computer Science (FOCS), IEEE, 1982, pp 160-164.
[9] R. Agrawal, A. Evfimievski, and R. Srikant, Information Sharing Across Private Databases, SIGMOD 2003, San Diego, CA, June 9-11 2003, pp 109-115.
[10] R. Canetti, Security and composition of multi-party cryptographic protocols, Journal of Cryptology, vol. 13, no. 1, 2000, pp 143-202.
[11] B. Pfitzmann, M. Schunter, and M. Waidner, Secure Reactive Systems, IBM Research Report RZ 3206, Feb. 14 2000.
[12] J. Vaidya and C. Clifton, Privacy Preserving Association Rule Mining in Vertically Partitioned Data, 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp 639-644.
[13] Y. Lindell and B. Pinkas, Privacy Preserving Data Mining, Advances in Cryptology CRYPTO 2000, Springer-Verlag, Aug 20-24 2000, pp 36-54.
[14] H. Kargupta, B. Park, D. Hershberger, and E. Johnson, Collective data mining: A new perspective toward distributed data mining, Advances in Distributed Data Mining, Philip Chan (Ed.), AAAI Press, 1999.
[15] I. Damgård and Y. Ishai, Constant-Round Multi-Party Computation Using a Black-Box Pseudo-Random Generator, Proceedings of CRYPTO 2005.
[16] A. C. Yao, How to generate and exchange secrets, In Proc. 27th IEEE Symposium on Foundations of Computer Science, 1986, pp 162-167.
[17] Y. Lindell, Parallel Coin-Tossing and Constant-Round Secure Two-Party Computation, Journal of Cryptology, 16(3), 2003, pp 143-184.