In the advent(ure) of cloud computation and the move from local storage towards the public cloud, integrity and confidentiality of data becomes a desirable asset. Due to distribution mechanisms of cloud services the location, where and how the data is stored, is transparent to the user. Further security measures safeguarding the cloud infrastructure may not be specified in detail. A common practice by Cloud Storage Providers (CSP) to reduce storage space includes the screening of plaintext data and the utilization of cross-user data de-duplication mechanisms to optimize their services. Methods like these do have security implications. We rely on the CSP in keeping the infrastructure secure, up-to-date and in in auditing the infrastructure against all security threats, which may cause data breaches. Further, we need to trust CSPs that they will not abuse our data for profiling or run data mining algorithms on our data. This mutual trust may be insufficient for emerging user privacy demands also imposed by the general data protection regulation (GDPR) for identifiable personal data. The approach of this work implements data protection by design, by means of encrypting data in transit and in rest with the users’ keys.
Searchable Encryption (SE) is a privacy enhancing technology, which tries to solve the problem of an efficient search over encrypted data by leaking a minimum known amount of information; i.e. SE enables a server to search over encrypted contents on behalf of a client without access to plaintext data.
Searchable Symmetric Encryption (SSE) is used in data outsourcing applications. Most SSE schemes handle the data encryption independently by secure file encryption methods allowing a map to the encrypted data by identifiers. Therefore, we only focus on index-based search techniques in this work.
Dynamic Searchable Symmetric Encryption (DSSE) allows secure and efficient updates, such as the addition or deletion of documents contained in an encrypted index.
The goal of this work is a practical evaluation of DSSE schemes applied to an application use case of a feature extension of searchable encryption for a backup software that supports encryption.
Following requirements for the DSSE scheme and the implementation have been defined:
In order to avoid a compromise of the encrypted backups by the searchable encryption technology, known leakage abuse attacks have been studied. Concluding that file injection attacks can be mitigated by the usage of forward secure dynamic schemes [ZKP16].
For the selection of an appropriate dynamic symmetric scheme an evaluation of proposals found in literature took place. Searchable encryption schemes based on inverted indexes work at a high level such as performing a tokenization on the keywords and an encryption process on padded blocks of document identifiers.
Forward secure schemes require that the client keeps track of a state. The most appropriate scheme matching the defined requirements is Dyn2Lev [CJJ+14]. This scheme consists of two instantiations, a static construction, and a dynamic one. The static construction is based on a locally optimized inverted index augmented by an array, this serves for full database backups. The latter similarly comes with an inverted index that handles the updates, which suits the incremental backups. Alternatives of forward secure extensions have been considered. Thus, the Sophos scheme based on trapdoor permutations has been selected for a comparison of the implementations [Bos16b].
The Searchitect framework, a research prototype of a client/server architecture enabling searchable encryption has been designed and implemented. Besides the core task to accomplish the integration of different SE schemes it provides a basic user management and security measures such as query authentication. We found an implementation of DynRH2Lev in the Clusion library that supports forward secure queries by sending all pre-calculated tokens in a search query [KM17b]. Two variants of the DynRH2Lev scheme have been integrated, a memory based one and another that stores the encrypted index persistently in a key/value store named DynRH2LevRocks. The third DSSE implementation Sophos is resource revealing, whereas DynRH2Lev is based on a resource hiding approach.
The empirical evaluation is taken from the perspective of a user. Several test cases with synthetic and realistic data have been defined to gain an insight in the behavior of the dynamic and the static constructions on an increasing input size. The tests of the implemented SE schemes showed that it is important to benchmark implementations to find bugs that emerge with the growth of processed data. If the test focuses on practical applicability there is a need to test with realistic data, because real world data may have a different structure of word frequencies. During the tests the Searchitect framework soon reached its limits: processing more than 250 MB sized serialized encrypted static index failed. The main pitfall in the design was to focus just on how the SE schemes work and not on an operational level (how much data they should process). Therefore, the support of a text-based search over the whole content of a fully encrypted backup is not feasible by the means of the current implementation. Although the problem was a wrong parametrization of the bucket size of the dictionary used in the static setting. This resulted in a 5 time blow up of the encrypted index size compared to the original data size. Therefore, a formula for the parametrization of the bucket size of the dictionary and the array with increasing input was empirically derived. Finally, the DynRH2LevRocks implementation has been modified and improved - and achieved amazing results compared to the previous tests.
The DynRH2LevRocks implementation needs less than half of the runtime for a search query compared to DynRH2Lev, tested in the static test case for databases containing more than 500 000 records. In the dynamic tests the time of the DynRH2LevRocks implementation needed for updates has been reduced by a factor of 100, whereas storage space for the encrypted database is 10% more for the same input compared to Sophos. The usability of a searchable encryption scheme should be measured by its primarily goal that is to retrieve a search result for a query. The DynRH2LevRocks implementation outperforms the Sophos implementation in runtime for a search query by a factor of 2.93. This result was taken over all search queries, but the time kept constant for a small result size with increasing EDB size in both implementations. The processing time for a search query in the Sophos implementation increases more with the result size than in the DynRH2LevRocks implementation. This is caused by the sequential computation of the asymmetric trapdoor permutation in the Sophos scheme.
Pic. 1: Dynamic performance comparison of the Search protocol between the DynRH2LevRocks and the Sophos implementation. Each scheme has been iteratively queried after an update consisting of 100 extracted documents of a real-world email dataset. The figure shows 80 updates and 3589 search queries in total per scheme. The dynamic construction provides a solution for incremental backups, but to bring it into practice a modification of the goal of the application use case may be considered. The application would benefit in performance if just the metadata of the files get indexed.
[CJJ+14] David Cash, Joseph Jaeger, Stanislaw Jarecki, Charanjit Jutla, Hugo Krawczyk, and Michael Steiner. Dynamic searchable encryption in very-large databases: Data structures and implementation. In Network and Distributed System Security Symposium NDSS14, 2014.
[Bos16b] Raphael Bost. Sophos - Forward Secure Searchable Encryption, 2016. Published: Cryptology ePrint Archive, Report 2016/728.
[ZKP16] Yupeng Zhang, Jonathan Katz, and Charalampos Papamanthou. All YourQueries Are Belong to Us: The Power of File-Injection Attacks on Searchable Encryption. IACR Cryptology ePrint Archive, 2016:172, 2016.
[KM17b] Seny Kamara and Tarik Moataz. Encrypted systems lab: The clusion library. https://github.com/encryptedsystems/Clusion, 2017. [accessed on 2017-09-05].