Introduction
In this article, we will start by defining the various techniques we use in Astran S5 and explore some further applications present in the literature, such as Multiparty computing (MPC), and how we differ. The first technique is the All or Nothing Transform (AONT). The second one is the Shamir Secret Sharing. We will then explain how we combined these techniques to implement our zero-trust storage solution. Later, we will explain the Secure Multiparty Computation and how it works. And finally, how the Shamir Secret Sharing is used with the SMPC.
All or Nothing Transform (AONT)
"An AONT is an unkeyed, invertible, randomized transformation, with the property that it is hard to invert unless all of the output is known." The AONT is not encryption since it does not use any secret key information. It could be seen as preprocessing a message before encrypting it. The main idea behind the AONT is first to encrypt the message (Data) with a random key (k). Second, the hash (h) of the encrypted message will be calculated using a cryptographic hash function (Hash). Finally, the AONT package is the concatenation of the ciphered data and an exclusive or (XOR) of the key (k) and a hash (h).[1]
Shamir Secret Sharing
Secret sharing (also called secret splitting) refers to methods for distributing a secret among a group so that no individual holds any intelligible information about the secret. Whereas insecure secret sharing allows an attacker to gain more knowledge with each share, secure
secret sharing is 'all or nothing' (where 'all' means the necessary number of shares). Still, when a sufficient number of individuals combine their 'shares,' the secret may be reconstructed.[2]
Adi Shamir proposed a solution to the problem, also known as sharing a secret among multiple parties. He expanded the idea geometrically by imagining the secret S as a point in space. And
the shares as points along a secret random curve. Suppose we imagine that we have a secret number shared among n people. Shamir's key incites
us to think about this number as a secret in 2D space, where the secret S is represented by a point A, with a value of the secret S on the y-axis for a given default value on the x-axis (zero in our case). Thus, the y coordinate of that point A is the number we want to keep secret.
Suppose we want to ensure that at least two shares (or points) are needed to recover the secret. We introduce randomness by selecting a point B in the 2D plan. And out of those two points, we form a secret line that protect the value of our secret S. This trick expands the secret from a point to a line. Creating lots of redundancy, out of which we can create as many shares as we wish, where each share is a point anywhere on the secret line. With two shares (e.g. two points), we can recover the secret line. Once we know the line, we can see where it intersects the y-axis, and the y-coordinate of that point will correspond to our secret S.
To generalize this, we want to ensure that at least m people are needed to recover the secret; (m<n). We introduce randomness by selecting an m-1 point in the 2D plan. We then create a curve connecting those points with the secret point. Once we obtain our secret curve, we can generate as many shares as we want.
Our own implementation of the transformation AONT-SS (ASTRAN S5)
For the AONT, we encrypt the data with AES-CTR (and also generate along the way an authenticated tag with EAX, which we add with the nonce at the end of the data slice). The key used for AES is then hidden in a new block b, with b = K ^ h, where h is a hash of the encoded data c_0, c_1, ..., c_{s-1} (SHA-256 in our current implementation). This method allows us to ensure that decoding the data is impossible without all the codewords since, for retrieving the key (k), one needs to have the real data to compute its hash. In our implementation, the result of the AONT, will be fragmented using a secret sharing scheme.
The implementation of secret sharing for our use case is a bit different than Shamir's secret sharing. The secret is not identified only with one point but will be identified with k points. The following equation presents how the size of all the blocks constituting the secret is -when summed - equal to the secret. 𝑠𝑖 corresponds to the size of the k point representing the secret and 𝑠𝑡 represent the total size of the secret. We will then create a curve that connects those points and generate as many shares as needed.
To simplify the visualization, we decide to choose the value of 2 for k, which represents also the minimum number of shares needed to get the secret. In In the next figure, points, A and B represents the secret we want to secure with the following coordinates: A(2,s2) & B(1,s1). Based on those two points a secret line is plotted. Then we can randomly choose 3 points C, E, and F, that will be used as shares to distribute in order to protect our secret 𝑠𝑡
Secure Multiparty Computation (SMPC)
Unlike traditional constructions in cryptography, where the attacker is outside the system, the idea in SMPC is that the attacker could be one of the stakeholders within the system. Distributed computing considers the scenario where several distinct yet connected computing devices (or parties) wish to perform a joint computation of some function. Secure multiparty computation aims to enable parties to carry out such distributed computing tasks securely. Secure multiparty computation (MPC / SMPC) is a cryptographic protocol that distributes a calculation across multiple parties where no individual party can see the other parties’ data and will reveal only the outputs.[3, 4]
Whereas distributed computing often deals with questions of computing under the threat of machine crashes and other unintentional faults. Secure multiparty computation is concerned with the possibility of deliberately malicious behavior by some adversarial entity (these have also been considered in the distributed literature, where they are called Byzantine faults). It is assumed that a protocol execution may come under “attack” by an external entity or a subset of the participating parties. The attacker's aim may be to learn private information from other participants or to cause an error in the computation result.
A simple way to do calculations without sharing the initial data with the participant is to have a trusted third party (TTP) that receives the inputs. The TTP ensures to compute the correct function and returns the output while not linking the input to anyone and deleting it after the computation. In other words, the SMPC's role is to achieve the same computation while respecting the same constraints without having the TTP. In SMPC, there will be a protocol which is a set of instructions that the parties need to follow, and at the end of this protocol, the output will be computed.
An adversarial entity could control some subset of the parties and wishes to attack the protocol execution. The parties under the adversary's control are called corrupted and follow the adversary's instructions. The adversary: could passively gather information or instruct the corrupted parties to act maliciously and its corruption strategy (the adversary could choose to corrupt the parties' inputs). There are two main types of adversaries:
• Semi-honest adversaries (passive): the corrupted parties faithfully follow the protocol specification.
• Malicious adversaries (active): arbitrarily deviate from the protocol specification according to the adversary’s instructions.
Depending on the number of corrupted parties, we can identify three scenarios:
Let n denote the number of participating parties, and let t represent a bound on the number of parties that may be corrupted [3].
• t < n/3 (less than a third of the parties can be corrupted), secure multiparty protocols with fairness and guaranteed output delivery can be achieved for any function with computational security.
• For t < n/2 (honest majority), secure multiparty protocols with fairness and guaranteed output delivery can be achieved for any function with computational and information- theoretic security.Astran – Sensitive Data Cloud 7
• For t < n (dishonest majority), secure multiparty protocols (without fairness or guaranteed output delivery) can be achieved. Finding a protocol capable of guaranteeing the security of the inputs while t= n-1, is the most robust privacy guarantee. But it is tough to achieve and less efficient.
Different techniques have been developed for constructing MPC protocols with different properties, and for different settings. The following protocols target the semi-honest adversary model present in the literature [6].
• Yao’s GC Garbled Circuits Protocol
• Goldreich-Micali-Wigderson (GMW)
• Ben-Or, Goldwasser, and Wigderson (BGW)
• Constant-Round Multi-Party Computation (BMR)
• Gate Evaluation Secret Sharing (GESS)
The following protocols are designed to resist malicious adversaries:
• Cut-and-Choose
• GMW compiler
• Authenticated Secret Sharing: BDOZ & SPDZ
• Authenticated garbling
SMPC and Shamir Secret Sharing
In secret-sharing-based MPC, the goal is to obtain a secret-shared representation of the inputs to the computation. The goal is to use a secret-sharing scheme that ensures that any possible corrupted party cannot learn anything about the underlying secret. In the case of honest- majority MPC based on secret sharing, the arithmetic circuit (comprised of multiplication and addition gates) is over a finite field Zp with p > n, as above. The parties participating in the MPC need first to share their inputs using Shamir’s secret sharing as explained in the Shamir Secret Sharing paragraph. Each party will secure its secret and share the secret shares with the different participants. Each participant will receive the extra shares and make the computations. Once the calculation is done, they can obtain the outputs by simply sending their results to each other and reconstructing the outputs via interpolation. [5, 7, 8]
Let's consider A as the secret of the first person; she will apply the Shamir Secret Sharing and obtain different secret shares that she will share with other participants. The second person has the K secret, and the third has the L secret. They will also apply the Shamir Secret Sharing and obtain different secret shares. The obtained shares will be used to make the computation. The result of the first calculation will be used to make the computation of the output without knowing every person's input.
Conclusion
To summarize, both Astran S5 and SMPC uses Secret Sharing, with a different flavor, as the objective is not the same: One mainly for storage, the other for compute. In one hand, Astran S5 goal is to store data without dealing with keys bringing security, confidentially and compliance. Astran S5 is based on the AONT and a specific implementation of Secret Sharing primitives. On the other hand, SMPC goal is to apply computation on data without sharing it among the participants by using Secret Sharing primitives.
References
[1] Rivest, R. L. (1997, January). All-or-nothing encryption and the package transform. In International Workshop on Fast Software Encryption (pp. 210-218). Springer, Berlin, Heidelberg.
[2] Shamir, A. (1979). How to share a secret. Communications of the ACM, 22(11), 612-613.
[3] Lindell, Y. (2020). Secure multiparty computation. Communications of the ACM, 64(1), 86-96.
[4] Escudero, D. (2022). An Introduction to Secret-Sharing-Based Secure Multiparty Computation. Cryptology ePrint Archive.
[5] Cramer, R., & Damgård, I. B. (2015). Secure multiparty computation. Cambridge University Press.
[6] Evans, D., Kolesnikov, V., & Rosulek, M. (2018). A pragmatic introduction to secure multi- party computation. Foundations and Trends® in Privacy and Security, 2(2-3), 70-246.
[7] https://medium.com/partisia-blockchain/mpc-techniques-series-part-2-security-taxonomy-andactive-security-6b5f14a15217
[8] https://medium.com/partisia-blockchain/mpc-techniques-series-part-1-secret-sharingd8f98324674a