A Framework for Privacy-Preserving Machine Learning
Privacy-Preserving Machine Learning, Secure Multi-Party Computation, Transfer Learning
Machine learning (ML) applications have become increasingly frequent and pervasive in many areas of our lives. We enjoy customized services based on predictive models built with our private data. There are, however, growing concerns about privacy, as evidenced by the enactment of the General Data Protection Law (LGPD) in Brazil and by similar legislative initiatives in the European Union and in several other countries. The trade-off between privacy and the benefits of ML applications can be mitigated by techniques that allow these computational models to be built and operated with formal guarantees of privacy preservation. Such techniques must respond adequately to challenges posed at every stage of the typical ML application life cycle, from data discovery, through feature extraction, model training, and validation, up to the model's effective application. This work presents a general framework for Privacy-Preserving Machine Learning (PPML) built on homomorphic cryptography primitives and Secure Multi-Party Computation (MPC) protocols, which allow the adequate treatment of data and the efficient application of ML algorithms with robust guarantees of privacy preservation. It brings two case studies of applications of the proposed PPML framework: text classification for fake news detection and image classification for breast cancer detection.
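To give a flavor of the MPC primitives the abstract refers to, the sketch below illustrates additive secret sharing, a common building block of MPC protocols: a value is split into random shares held by different parties, no single share reveals anything about the value, yet the parties can compute a sum by operating on their shares locally. This is a minimal illustrative example, not the framework's actual protocol; the field size and function names are our own assumptions.

```python
import secrets

# Illustrative field modulus (a Mersenne prime); real protocols fix
# this as part of the protocol specification.
PRIME = 2**61 - 1

def share(value, n_parties=3):
    """Split an integer into n additive shares modulo PRIME.

    Any subset of fewer than n shares is uniformly random and
    reveals nothing about the secret.
    """
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    """Recover the secret by summing all shares modulo PRIME."""
    return sum(shares) % PRIME

# Secure addition: each party adds its shares of x and y locally;
# the resulting shares reconstruct to x + y without any party
# ever seeing x or y in the clear.
x_shares = share(123)
y_shares = share(456)
z_shares = [(a + b) % PRIME for a, b in zip(x_shares, y_shares)]
assert reconstruct(z_shares) == 123 + 456
```

Multiplication of shared values requires extra interaction (e.g. precomputed multiplication triples), which is one reason MPC-based ML training is more costly than plaintext training.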