Tag Archive for 'encryption'

No Wonder People get Confused about Cloud Security

Another week goes by in cloud computing with more vendor hyperbole leaving us and a few others scratching our heads. ParaScale’s recent announcement is a good example. We took the liberty of pulling a few quotes for reference:

“ParaScale also provides highly scalable data encryption that secures data without requiring storage of keys within the cloud or third-party key management. Cloud storage management functions such as replication and migration are performed on encrypted content with multiple protocols and algorithms supported.”

Source: Computer Technology Review

“With ParaScale’s keyless encryption, a user’s authentication into the back-end system generates an encryption key on the fly to write to the user’s apportioned virtual file system (VFS). A similar process will allow users secure access to reading data.”

Source: SearchStorage

“The enhanced security measures also means that data on stolen discs or nodes can’t be accessed unless the thief has the dedication to make an exact copy of your storage cluster on their end, Norris says.”

Source: Computer Technology Review

From our standpoint, although ParaScale claims highly secure data encryption, no details are offered regarding how the keys are generated, or where, if at all, the encryption keys are stored within ParaScale’s system.  It would be incorrect to assume encrypting one’s data is the end of the story when it comes to storing data securely.  The biggest outstanding question, and where most people go wrong in their implementation, is in how keys are managed and secured.  Without this information it is impossible to evaluate the security of ParaScale’s approach.

Since there doesn’t appear to be much information available to the public, we attempted to fill in some of the details we thought were missing from their release.  Below we consider several possibilities that fit with the scant amount of information available, and analyze the pros and cons of each approach.

The first possibility is that ParaScale is generating keys randomly.  Since the process of generating encryption keys is random, decrypting the data requires recovering the same randomly generated key from some location.  If ParaScale is taking this approach, the question is how and where are these keys being secured?  Are they stored on other disks, other nodes?  What process is able to recover these keys and what stops an attacker from mimicking that process?

One possible answer is that ParaScale’s system maintains an internal private key, perhaps one for each “virtual file system”, which is used to encrypt the random keys; the encrypted keys are then stored with the encrypted data. Should this be the case, where are the private keys being stored, and what would happen to the user’s data should there be data loss affecting the private key?  If replication is used to protect the keys against loss, then each copy is another vector for attack.
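The wrapped-key arrangement described above can be sketched as follows. This is our guess at one possible design, not ParaScale’s documented method: the names `wrap_key` and `unwrap_key` and the per-VFS master key are hypothetical, and the XOR-with-HMAC pad is a toy stand-in for a real key-wrap algorithm.

```python
import hashlib
import hmac
import secrets

def _pad(master_key: bytes, nonce: bytes) -> bytes:
    # Toy 32-byte pad derived from the master key and a per-wrap nonce.
    return hmac.new(master_key, nonce, hashlib.sha256).digest()

def wrap_key(master_key: bytes, data_key: bytes) -> bytes:
    """Return nonce || (data_key XOR pad), to be stored next to the encrypted data."""
    nonce = secrets.token_bytes(16)
    pad = _pad(master_key, nonce)
    return nonce + bytes(a ^ b for a, b in zip(data_key, pad))

def unwrap_key(master_key: bytes, wrapped: bytes) -> bytes:
    """Recover the data key; only works with the same master key."""
    nonce, body = wrapped[:16], wrapped[16:]
    pad = _pad(master_key, nonce)
    return bytes(a ^ b for a, b in zip(body, pad))

master_key = secrets.token_bytes(32)   # hypothetical per-VFS master key
data_key = secrets.token_bytes(32)     # random key for one stored object
wrapped = wrap_key(master_key, data_key)

recovered = unwrap_key(master_key, wrapped)               # holder of the master key
wrong = unwrap_key(secrets.token_bytes(32), wrapped)      # anyone else
```

The sketch makes the two questions concrete: whoever holds `master_key` can unwrap every data key stored under it, and if `master_key` is lost, every wrapped key (and the data behind it) is lost with it.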

A second possibility is that the keys are generated deterministically instead of randomly.  The advantage of generating keys deterministically is that no keys need to be stored anywhere.  The way this would work is that some set of information related to the write request is entered into a function to derive a key.  Then when reading the data, the same information could be available to generate the same key and decrypt the data.  For example, that information might consist of the “virtual file system id”, “the id of the node where the data is stored”, “the name of the data chunk being stored”, etc.
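A minimal sketch of such a deterministic derivation, assuming the three example inputs named above and SHA-256 as the derivation function. The function name `derive_key` and the sample identifiers are ours, purely for illustration; nothing here is taken from ParaScale’s product.

```python
import hashlib

def derive_key(vfs_id: str, node_id: str, chunk_name: str) -> bytes:
    """Derive a 256-bit key from metadata about the write request."""
    # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
    material = b"".join(
        len(part.encode()).to_bytes(4, "big") + part.encode()
        for part in (vfs_id, node_id, chunk_name)
    )
    return hashlib.sha256(material).digest()

# The writer and a later reader compute the same key independently,
# so nothing has to be stored.
write_key = derive_key("vfs-7", "node-3", "chunk-0001")
read_key = derive_key("vfs-7", "node-3", "chunk-0001")
other_key = derive_key("vfs-7", "node-3", "chunk-0002")
```

Note that every input is ordinary metadata: the key is only as secret as those values are.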

This seems to fit with the quote from Norris: “data on stolen discs or nodes can’t be accessed unless the thief has the dedication to make an exact copy of your storage cluster on their end”.  If this is their approach, an attack would be somewhat easier than creating an exact copy of the storage cluster: the attacker just needs to predict, or sufficiently narrow down, the set of parameters that go into the key derivation function.  In our example, if the attacker can guess the node id, the vfs id, and the chunk name, they could trivially derive the key to decrypt the data.  Predicting this metadata is likely to be much easier than attempting to guess a randomly generated key.  Therefore, while much simpler than managing keys, this approach is significantly less secure.
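To illustrate how small the search can be, here is a sketch of the guessing attack under assumptions of our own: keys are a SHA-256 hash of the three metadata fields, the id ranges are small and known, and the attacker knows the stolen chunk starts with a PDF header. The `xor_stream` cipher is a toy built from SHA-256 so the example needs only the standard library; it is not a real cipher and not anything ParaScale has described.

```python
import hashlib
import itertools

def derive_key(vfs_id: int, node_id: int, chunk_name: str) -> bytes:
    # Hypothetical derivation: the key depends only on guessable metadata.
    return hashlib.sha256(f"{vfs_id}|{node_id}|{chunk_name}".encode()).digest()

def xor_stream(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher (SHA-256 in counter mode); encrypts and decrypts.
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)

# Victim side: a chunk encrypted under a metadata-derived key.
plaintext = b"%PDF-1.4 quarterly report"
ciphertext = xor_stream(derive_key(12, 3, "chunk-0042"), plaintext)

# Attacker side: only the stolen ciphertext, plus a guess at the id ranges.
def recover(stolen: bytes, magic: bytes = b"%PDF-1.4"):
    candidates = itertools.product(range(32), range(8), range(1000))
    for tries, (vfs, node, chunk) in enumerate(candidates, start=1):
        key = derive_key(vfs, node, f"chunk-{chunk:04d}")
        if xor_stream(key, stolen[:len(magic)]) == magic:
            return (vfs, node, chunk), tries
    return None, 32 * 8 * 1000

found, tries = recover(ciphertext)
```

The whole candidate space here is 256,000 combinations, which a laptop exhausts in moments; a randomly generated 128-bit key offers no such shortcut.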

Finally, there was the quote from SearchStorage that the user’s authentication into the system is what generates the key.  The third possibility, then, is that the user’s credentials are used to generate the encryption key.  This is similar to the above approach, except that instead of using information related to the write request, something else, like the user’s password, would be used to generate the encryption key.  The limitation here is that users’ passwords are significantly easier to brute force than a 128-bit or 256-bit encryption key.  The other downside is that you wouldn’t be able to change your password without first re-encrypting the data or keys which are protected by the old password.
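Both limitations can be seen in a short sketch. We assume PBKDF2 (from Python’s standard `hashlib`) as the password-to-key function and an 8-character password drawn at random from letters and digits; both are our assumptions for illustration, and the helper names are hypothetical.

```python
import hashlib
import math
import secrets

def password_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on the entropy of a uniformly random password, in bits."""
    return length * math.log2(alphabet_size)

def key_from_password(password: str, salt: bytes, rounds: int = 100_000) -> bytes:
    # PBKDF2 slows down each guess, but cannot add entropy the password lacks.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, rounds, dklen=32)

# Eight random letters and digits: under 48 bits, far short of a 128-bit key.
bits = password_bits(alphabet_size=62, length=8)

salt = secrets.token_bytes(16)
old_key = key_from_password("correct horse", salt)
new_key = key_from_password("battery staple", salt)   # after a password change
```

The derived key is always 256 bits long, but an attacker only has to search the password space, and real passwords are rarely chosen uniformly at random. Because `new_key` differs from `old_key`, anything encrypted under the old key has to be re-encrypted (or its wrapped keys re-wrapped) when the password changes.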

It is impossible for us to evaluate the security of ParaScale’s approach without more knowledge of where the keys are stored within the system. However, this much would at least be clear: since the ParaScale system itself is generating, managing and using these keys, it seems likely that there are one or more control nodes within their system which represent a single point of compromise.  While their encryption may protect against theft of a disk or node, the remote compromise of an on-line node would yield keys or decrypted user data.  For cloud environments, the most secure approach would be for the end-user, and only the end-user, to be in charge of putting their data back together, rather than having the cloud storage provider decrypt data for them.

Cleversafe’s approach to key management

Our approach is not only clearly explained, but relies on well-known and well-analyzed techniques for achieving data security.  Moreover, it places control of assembling and decrypting data squarely in the hands of the end-user while still avoiding the need for key management.

When we completed our 2.0 announcement that included our SecureSlice™ technology, we discussed keyless encryption. With SecureSlice™ technology a service provider cannot go through a public cloud to access customer data or the encryption keys. Only the end-customers with credentials are in control of who can actually access their data. As such, the security of data comes down to the access control mechanism each customer puts in place.

Federal CIO challenges with mandate towards cloud computing

A recent InfoWorld Cloud Computing column by David Linthicum covered the pending mandate on cloud computing usage for government agencies.

The article cites a December Channel Insider article that explains:

According to various published reports, the OMB will mandate in the fiscal year 2011 (which starts in October 2010) that federal agencies not using cloud computing or making cloud computing part of new IT projects explain why. By fiscal year 2013, the policy will require agencies to provide details and road maps on their plans for adopting cloud-based technologies.

With the OMB pushing towards Cloud Computing, the question is what challenges exist for CIOs?

The largest issue I see for Federal CIOs moving to the cloud is addressing security, particularly of data. Data security in stand-alone systems relies on securing the perimeter, but in a cloud there is commingling of data on the same hardware. Current guidance for securing data in the cloud is to encrypt it, but encryption introduces additional challenges such as key management and the requirement to decrypt data before it can be searched or computed on.

People are already addressing these items – the Homeland Security Newswire recently discussed how researchers are working on being able to search encrypted documents.

And some storage providers (such as us) are working on different methods to actually store the data itself, for example, using Information Dispersal Algorithms (IDAs) to bit-split data into slices, which results in no entire copy of the data residing on any single piece of hardware and is essentially encrypted data without the key management issue.

A great read is the Cloud Security Alliance’s Security Guidance for Critical Areas of Focus in Cloud Computing, which helps in understanding the requirements.

Federal CIOs are going to have to take a closer look at their storage platforms to see if secure data is an intrinsic characteristic, or a bolt-on, and question if the bolt-on approach is going to work in the long run.

3 Reasons Why Encryption is Overrated

UPDATE 7-31-09:
This post caused a great deal of controversy.  Some readers left with the impression that we believe encryption to be obsolete or unnecessary.  That was not our intended message; rather it was to expose common problems with conventional approaches to data encryption and what dispersal offers to address them.  Other readers disputed the veracity of our claims, which is not surprising given that the post lacked technical details to back them up.  To provide technical details in defense of the claims made in this post, we have written three follow-up responses: Part 1, Part 2, and Part 3, which we invite you to read.

When it comes to storage and security, discussions traditionally center on encryption.  The reason encryption – or the use of a complex algorithm to encode information – is accepted as a best practice rests on the premise that while it’s possible to crack encrypted information, most malicious hackers don’t have access to the amount of computer processing power they would need to decrypt information.

But not so fast.  Let’s take a look at three reasons why encryption is overrated.

1) Future processing power

While processing power today may keep encrypted files (that are stored in the cloud, for example) safe, as processing power improves, archived encrypted files will require systematic re-encryption to remain safe from potential hackers. Systematic re-encryption, though, is difficult, laborious and expensive.

2) Key management

To decode the encrypted files, a user needs the encryption key.  Unfortunately, managing a large number of encryption keys can be painful. Yes, there are enterprise key management (EKM) solutions that promise the ability to manage and change keys throughout their life cycle – but these serve more as a band-aid for the fundamental pain of dealing with numerous keys. As a chain is only as strong as its weakest link, an enterprise key manager is only as good as the integrated key management systems that use it. If any system downstream from a secure key manager exposes the key, or is not designed to cover a certain threat, the whole system becomes insecure.

3) Disclosure laws

Beyond technology, breach disclosure laws, which require organizations to notify individuals when personal information has been, or is reasonably believed to have been, acquired by an unauthorized entity, can result in a PR nightmare for a business that encryption can’t resolve.  A quick visit to the Privacy Rights Clearinghouse turns up its compilation of data breaches since 2005 that expose individuals to identity theft, as well as breaches that qualify for disclosure under state laws.  Not a short list.

A technologist with a good understanding of encryption methods may be comfortable with some of the breaches or data losses reported due to the strengths of the encryption.  But this doesn’t matter in the court of public opinion; once data – encrypted or not – is lost, so is the trust of the general public.  Encryption is simply not enough to counter business concerns about the security of their data.

Consider Dispersal

With full disclosure – Cleversafe’s storage solution is based on Dispersal – consider its security benefits. Dispersed Storage technology divides data into slices, which are stored in different geographies.  Each slice contains too little information to be useful, but any threshold number of slices can be used to recreate the original data.  Translation – a malicious party holding fewer slices than the threshold (a slice, or two, or three) cannot recreate the data, no matter what the advances in processing power.  And Dispersal does not require the time and energy of re-encryption to sustain data protection.
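The threshold property can be illustrated with a short sketch. To be clear about the substitution: the code below is Shamir’s secret sharing, a textbook threshold scheme, used here only as a stand-in to show “any threshold of slices recreates the data, fewer do not”. It is not the dispersal algorithm used in our product, and the function names, the choice of prime, and the 3-of-5 configuration are assumptions made for the example.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this

def split(secret: int, n: int, k: int) -> list[tuple[int, int]]:
    """Split `secret` into n slices such that any k of them recover it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):       # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, n + 1)]

def combine(slices: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(slices):
        num, den = 1, 1
        for j, (xj, _) in enumerate(slices):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

data = int.from_bytes(b"cloud data", "big")
slices = split(data, n=5, k=3)           # five slices, any three suffice

from_first_three = combine(slices[:3])
from_other_three = combine([slices[4], slices[1], slices[3]])
```

With fewer than three slices, every possible value of the secret remains equally consistent with what the attacker holds, so additional processing power does not help; that is the sense in which the protection does not age.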

Maybe encryption alone is “good enough” in some cases now  – but Dispersal is “good always” and represents the future.