In recent years, many countries and regions have enacted data security policies, such as the General Data Protection Regulation (GDPR) of the EU. These laws and regulations have aggravated the problem of data silos, making it difficult for data owners to share data with one another. Data federation is a possible solution to this problem. In a data federation, multiple data owners jointly compute query tasks without leaking their original data, using privacy-preserving computation technologies such as secure multi-party computation. The concept has become a research trend in recent years, and a series of representative systems have been proposed, such as SMCQL and Conclave. However, for join queries, which are core operations in relational database systems, existing data federation systems still have the following problems. First, they support only a single type of join query, which makes it difficult to meet query requirements under complex join conditions. Second, there is substantial room for performance improvement: existing systems often call secure computation tool libraries directly, which leads to high runtime and communication overhead. To address these issues, this paper proposes a join algorithm for data federation. The main contributions are as follows. First, multi-party-oriented federated secure operators are designed and implemented, which support a variety of operations. Second, a federated θ-join algorithm and an optimization strategy are proposed to significantly reduce the cost of secure computation. Finally, the performance of the proposed algorithm is verified on the benchmark dataset TPC-H. The experimental results show that, compared with the existing data federation systems SMCQL and Conclave, the proposed algorithm reduces runtime and communication overhead by 61.33% and 95.26%, respectively.
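For readers unfamiliar with the term, a θ-join pairs rows from two relations whenever an arbitrary predicate θ holds, of which the equality join is only one special case. The following is a minimal plaintext sketch of that semantics only; it is not the secure federated algorithm proposed in the paper, and the function name `theta_join` and the sample relations are hypothetical illustrations.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, ...]


def theta_join(left: List[Row], right: List[Row],
               theta: Callable[[Row, Row], bool]) -> List[Row]:
    """Plaintext nested-loop theta-join: keep every (l, r) pair where theta holds.

    Illustrative only. In a data federation the predicate would have to be
    evaluated under secure multi-party computation, so that neither party
    sees the other's rows.
    """
    return [l + r for l in left for r in right if theta(l, r)]


# Hypothetical relations held by two different data owners.
orders = [(1, 10), (2, 25)]        # (order_id, amount)
limits = [(100, 20), (200, 30)]    # (limit_id, threshold)

# A non-equality join condition: amount < threshold.
result = theta_join(orders, limits, lambda o, l: o[1] < l[1])

# An equality join is the special case where theta tests equality.
equi = theta_join(orders, limits, lambda o, l: o[1] == l[1])
```

The nested loop compares every pair of rows, which gives a sense of why naively evaluating each comparison with a general-purpose secure computation library can be costly in both runtime and communication.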
Yuanyuan Zhang, Shuyuan Li, Yexuan Shi, Nan Zhou, Yi Xu, Ke Xu. Secure Multi-party θ-join Algorithms Toward Data Federation. International Journal of Software and Informatics, 2023, 13(1): 117-137.