Managing Sensitive Data in Memory with Java

In today’s digital age, securing sensitive data such as Primary Account Numbers (PANs), Expiration Dates, Card Security Codes (CSCs), and Personal Identification Numbers (PINs) is paramount.

Working in the Payment Card Industry, for example, means regularly handling this type of sensitive data. That poses unique challenges for software development, where data security must meet specific standards set by various certifications, including PCI SSF.

The Java Case

Java is one of the most widely used languages in the world and, of course, it is also used in many applications that handle sensitive data.

Java offers a powerful environment with built-in security features, but it also presents a specific challenge: its strings are immutable. Handling sensitive data as strings in Java carries the risk of exposing that data in memory dumps.

Why Is String Immutable in Java?

In an interview, James Gosling, the creator of Java, answered the question of why strings are immutable by saying:

I would use an immutable whenever I can.

In his opinion, this design choice serves multiple purposes: security, by preventing unwanted or unauthorized modification of string data; performance, by allowing a string pool (known as the String Intern Pool or String Constant Pool) in which identical strings are stored only once; and thread safety, which simplifies concurrent programming by eliminating the need for additional synchronization.

However, for sensitive data, immutability has a drawback: it prevents secure data disposal. Once a string is created, its content cannot be overwritten, which means that sensitive data remains in memory until the Garbage Collector reclaims it at some unpredictable point, a potentially risky scenario, as I have written in the past.

To mitigate this risk, a practical solution involves converting sensitive string data to byte (or char) arrays. This allows data in memory to be overwritten after use, reducing (but not eliminating) the risk of exposure in the event of a memory dump.
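
As a minimal sketch of that idea, the secret below lives in a char array and is overwritten in a finally block, so the wipe happens even if processing throws. The class name WipeAfterUse and the process method are hypothetical placeholders for whatever actually consumes the data.

```java
import java.util.Arrays;

final class WipeAfterUse {

    // Hypothetical stand-in for the code that actually consumes the secret.
    static int process(char[] pin) {
        return pin.length;
    }

    // Uses the secret, then overwrites it even if processing throws.
    static int useAndWipe(char[] pin) {
        try {
            return process(pin);
        } finally {
            Arrays.fill(pin, '\0'); // overwrite every element with the NUL character
        }
    }

    public static void main(String[] args) {
        char[] pin = { '1', '2', '3', '4' };
        int length = useAndWipe(pin);
        System.out.println("Processed " + length + " characters");
        System.out.println("First char code after wipe: " + (int) pin[0]);
    }
}
```

The same pattern works with a byte array; only the fill value changes to (byte) 0.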

Converting Strings to Byte or Char Arrays

The main question is: should I use a byte or a character array? There are many debates on this topic.

For example, in the OWASP Mobile Application Security test for sensitive data in memory, it says:

To properly clean sensitive information from memory, store it in primitive data types, such as byte-arrays (byte[]) and char-arrays (char[]). You should avoid storing the information in mutable non-primitive data types.

Looking at the Java documentation, for example, the Java Cryptography Architecture (JCA) Reference Guide, it says:

It would seem logical to collect and store the password in an object of type java.lang.String. However, here’s the caveat: Objects of type String are immutable, i.e., there are no methods defined that allow you to change (overwrite) or zero out the contents of a String after usage. This feature makes String objects unsuitable for storing security sensitive information such as user passwords. You should always collect and store security sensitive information in a char array instead.
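
One place where this guidance shows up in the standard library is javax.crypto.spec.PBEKeySpec, which takes the password as a char array and offers clearPassword(). The sketch below is only an illustration (the class name KeySpecPasswordDemo is made up, and the key derivation step is left out). It shows that the spec keeps its own copy of the password, so both the spec and the caller's array have to be cleared.

```java
import java.util.Arrays;
import javax.crypto.spec.PBEKeySpec;

final class KeySpecPasswordDemo {

    // Returns true if the spec refuses to hand out the password after clearing.
    static boolean isClearedAfterUse(char[] password) {
        PBEKeySpec spec = new PBEKeySpec(password); // the spec stores a copy of the array
        try {
            // ... the spec would be passed to a SecretKeyFactory here ...
        } finally {
            spec.clearPassword();        // wipes the copy held by the spec
            Arrays.fill(password, '\0'); // wipes the caller's own array
        }
        try {
            spec.getPassword();
            return false;
        } catch (IllegalStateException expected) {
            // getPassword() throws once the password has been cleared
            return true;
        }
    }

    public static void main(String[] args) {
        char[] password = { 'R', 'o', 'm', 'a', '1', '9', '2', '7', '!' };
        System.out.println("Spec cleared: " + isClearedAfterUse(password));
    }
}
```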

The Java methods that handle sensitive data generally use char arrays, but in general you can use either byte or char arrays to store sensitive data. Don’t worry. The important thing is to remember to clear them as soon as possible.
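
If you receive a char array but need bytes, the conversion itself can avoid a String. The following is a sketch, assuming UTF-8 is the encoding you want. The class and method names are hypothetical. It encodes through a CharBuffer and then wipes the temporary buffer that the encoder produced.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsToBytes {

    // Encodes the chars as UTF-8 without creating a String,
    // then wipes the temporary buffer produced by the encoder.
    static byte[] toUtf8Bytes(char[] chars) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        byte[] result = new byte[encoded.remaining()];
        encoded.get(result); // copy only the bytes that were actually written
        if (encoded.hasArray()) {
            Arrays.fill(encoded.array(), (byte) 0); // the backing array may be larger than result
        }
        return result;
    }

    public static void main(String[] args) {
        char[] pin = { '1', '2', '3', '4' };
        byte[] pinBytes = toUtf8Bytes(pin);
        System.out.println("Encoded " + pinBytes.length + " bytes");
        // Both arrays now hold the secret, so both must be cleared after use.
        Arrays.fill(pin, '\0');
        Arrays.fill(pinBytes, (byte) 0);
    }
}
```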

Code Examples

To better understand secure data handling in Java, let’s dive into some practical examples. We are geeks: we want to see practical things, not read documentation.

Around the Internet

Even though it isn’t completely up to date, java2s offers a wide range of code snippets in its utility methods section that can be great for studying and experimenting with Java arrays.

The Apache Commons Lang library is also very useful for working with arrays. It provides a host of helper utilities for the java.lang API. In our case, we are particularly interested in the ArrayUtils class, which offers all the extended methods we need.

Some Custom Snippets

Here are a couple of custom code snippets I created, as the library mentioned above doesn’t include these specific methods. There are probably better ways to do this, but let’s use the simplest ones for now.

If you have code like this to check if a string is a number:

// Requires: import java.math.BigInteger;
public static boolean isBigInteger(String text) {
    try {
        new BigInteger(text);
        return true;
    } catch (NumberFormatException e) {
        return false;
    }
}

You can change it to use byte[] instead of a String object:

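// Note: unlike new BigInteger(text), this rejects a leading sign
// and returns true for an empty array.
public static boolean isNumeric(byte[] text) {
    for (byte singleChar : text) {
        // Check if the byte is within the ASCII range for '0' to '9'
        if (singleChar < '0' || singleChar > '9') {
            return false;
        }
    }
    return true;
}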

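Instead of that if, we could also use the built-in Java method Character.isDigit(char ch). Keep in mind that it takes a char, so each byte needs a cast, and that it accepts any Unicode digit, not only the ASCII characters '0' to '9'.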
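
To make the difference concrete, here is a small sketch for char arrays with both variants side by side. The class and method names are mine, not from any library. The Arabic-Indic digits used in main are one example of characters that Character.isDigit accepts but the plain range check rejects.

```java
final class DigitCheck {

    // Accepts only the ASCII characters '0' to '9'.
    static boolean isAsciiNumeric(char[] text) {
        for (char c : text) {
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Accepts any character that Unicode classifies as a decimal digit.
    static boolean isUnicodeNumeric(char[] text) {
        for (char c : text) {
            if (!Character.isDigit(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] ascii = { '4', '2' };
        char[] arabicIndic = { '\u0664', '\u0662' }; // Arabic-Indic digits four and two
        System.out.println(isAsciiNumeric(ascii) + " " + isUnicodeNumeric(ascii));
        System.out.println(isAsciiNumeric(arabicIndic) + " " + isUnicodeNumeric(arabicIndic));
    }
}
```

For PANs and PINs, the stricter ASCII check is probably the one you want.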

Here is another example. In this case, we want to parse a byte[] into a Long:

byte[] sensitiveData = { 51, 56, 48, 52, 49, 50, 50, 48, 57, 54, 48, 49 };
Long longFromSensitiveData = Long.parseLong(new String(sensitiveData));

The problem here is that we create a new String object containing the sensitive data just to convert it to a Long, and since strings are immutable, we cannot wipe that copy afterwards.

To avoid the String object, we can write a loop like this:

byte[] sensitiveData = { 51, 56, 48, 52, 49, 50, 50, 48, 57, 54, 48, 49 };
long longFromSensitiveData = 0;
for (byte singleChar : sensitiveData) {
    // Append the next decimal digit to the value accumulated so far
    longFromSensitiveData = longFromSensitiveData * 10 + (singleChar - '0');
}

The trick here is that Java treats the char literal '0' as its numeric code, which is 48 in ASCII. Subtracting it from singleChar gives the numeric value of the digit (for example, 51 - 48 = 3), which is then added to longFromSensitiveData. The * 10 operation shifts the digits accumulated so far one decimal place to the left, making room for the next digit.
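
The loop above trusts its input. A slightly more defensive variant, sketched here with hypothetical names, rejects non-digit bytes and detects overflow with Math.multiplyExact and Math.addExact. The exception messages deliberately do not include the data itself.

```java
final class DigitParser {

    // Parses ASCII digits into a long without creating a String.
    // Throws NumberFormatException on empty or non-digit input
    // and ArithmeticException if the value does not fit in a long.
    static long parseDigits(byte[] digits) {
        if (digits.length == 0) {
            throw new NumberFormatException("empty input");
        }
        long result = 0;
        for (byte b : digits) {
            if (b < '0' || b > '9') {
                throw new NumberFormatException("non-digit byte in input");
            }
            result = Math.addExact(Math.multiplyExact(result, 10L), (long) (b - '0'));
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] sensitiveData = { 51, 56, 48, 52, 49, 50, 50, 48, 57, 54, 48, 49 };
        long value = parseDigits(sensitiveData);
        System.out.println("Parsed a value with " + sensitiveData.length + " digits: " + (value > 0));
    }
}
```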

Clear ‘Em All!

Always remember to overwrite arrays with zeros (or random data) after use to securely clear sensitive information! Otherwise this article would be really useless.

Here are some examples you can use:

import java.util.Arrays;
import java.util.Random;

public class ClearExamples { // illustrative class name

    public static void main(String[] args) {
        // Password example. new String(...) is used here only to print the demo
        // output; real code should never copy sensitive data into a String.
        byte[] sensitiveData = { 82, 111, 109, 97, 49, 57, 50, 55, 33 };
        System.out.println("Clear data: " + new String(sensitiveData));

        // Overwrite with the ASCII character '0' (48) using a simple for loop;
        // use (byte) 0 instead if you want true zero bytes
        for (int i = 0; i < sensitiveData.length; i++) {
            sensitiveData[i] = (byte) '0';
        }
        System.out.println("Zeros with loop: " + new String(sensitiveData));

        // Restored for testing
        sensitiveData = new byte[] { 82, 111, 109, 97, 49, 57, 50, 55, 33 };
        System.out.println("Clear data: " + new String(sensitiveData));

        // Overwrite with the ASCII character '0' (48) using the fill method
        Arrays.fill(sensitiveData, (byte) '0');
        System.out.println("Zeros with fill: " + new String(sensitiveData));

        // Restored for testing
        sensitiveData = new byte[] { 82, 111, 109, 97, 49, 57, 50, 55, 33 };
        System.out.println("Clear data: " + new String(sensitiveData));

        // Overwrite with random digits using a simple for loop
        Random random = new Random();
        for (int i = 0; i < sensitiveData.length; i++) {
            // This generates an ASCII byte value from 48 to 57
            sensitiveData[i] = (byte) (random.nextInt(10) + '0');
        }
        System.out.println("Random with loop: " + new String(sensitiveData));
    }
}

There are more elegant ways to handle sensitive data; these are just a taste.
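
One such way, shown here only as a sketch, is to wrap the array in a small AutoCloseable holder so that try-with-resources does the clearing for you. SensitiveBytes is a hypothetical class written for this article, not part of the JDK or of Apache Commons Lang.

```java
import java.util.Arrays;

// Hypothetical helper: owns a byte array and zeroes it when closed.
final class SensitiveBytes implements AutoCloseable {

    private final byte[] data;
    private boolean wiped;

    // Takes ownership of the array (no copy), so only one copy of the secret exists.
    SensitiveBytes(byte[] data) {
        this.data = data;
    }

    byte[] bytes() {
        if (wiped) {
            throw new IllegalStateException("data already wiped");
        }
        return data;
    }

    boolean isWiped() {
        return wiped;
    }

    @Override
    public void close() {
        Arrays.fill(data, (byte) 0); // overwrite with true zero bytes
        wiped = true;
    }

    public static void main(String[] args) {
        byte[] pin = { 49, 50, 51, 52 };
        try (SensitiveBytes secret = new SensitiveBytes(pin)) {
            System.out.println("Working with " + secret.bytes().length + " bytes");
        }
        // The array has been overwritten by close() at this point.
        System.out.println("After the block: " + Arrays.toString(pin));
    }
}
```

The holder does not copy the array on purpose: every extra copy is one more thing to clear.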

Conclusion

Effectively managing sensitive data in Java involves using techniques like byte arrays and memory clearing. These practices align with secure coding standards and help ensure that your applications handle data responsibly.

Unfortunately, I notice that many programmers are not interested in this topic, which suggests that security remains a distant concept often seen as secondary or an afterthought. However, security should be integrated into the Software Development Life Cycle, following the principle of Secure by Design.

I will never stop saying it!