Why do multiple threads corrupt data?

Java threads use processor registers and per-processor caches to speed up memory access. This improves performance, but for the same reason a write made by one thread is not guaranteed to be immediately visible to other threads. As a result, a thread may keep working with a stale value that another thread has already updated, and the final output will be wrong.

How can we guarantee consistency of memory operations?

Using these concepts:

  • Synchronization
  • Volatile variables

What is synchronization?

Synchronization is a way to make code thread safe. Any code that can be accessed by multiple threads must be made thread safe.

Thread-safe code is code that can be called from multiple threads without corrupting the state of the object, or more simply, code that still does what it must do, in the right order, when several threads call it.

Synchronization enforces a re-entrant mutex (mutual exclusion: one thread at a time), preventing more than one thread at a time from executing a block of code protected by a given monitor. Re-entrant means that a thread that already holds the monitor can acquire it again without blocking.
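Re-entrancy is easiest to see when one synchronized method calls another on the same object. The class below is a minimal sketch; ReentrantCounter and its method names are hypothetical and exist only for this illustration.

```java
// Hypothetical class, used only to illustrate re-entrancy of the monitor.
class ReentrantCounter {
    private int value = 0;

    // Acquires the monitor of this instance on entry.
    public synchronized int incrementTwice() {
        increment(); // the same thread acquires the monitor again without blocking
        increment();
        return value;
    }

    // Also synchronized on this instance.
    public synchronized void increment() {
        value++;
    }
}
```

If the monitor were not re-entrant, the nested call to increment() would wait forever for a lock its own thread holds. As written, new ReentrantCounter().incrementTwice() returns 2.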

Synchronization also plays a significant role in memory visibility: the JVM executes memory barriers when a thread acquires and releases a monitor (lock).

  • When a thread acquires a monitor, it executes a read barrier — invalidating any variables cached in thread-local memory (such as an on-processor cache or processor registers), which will cause the processor to re-read any variables used in the synchronized block from main memory.
  • Similarly, upon monitor release, the thread executes a write barrier — flushing any variables that have been modified back to main memory.

The combination of mutual exclusion and memory barriers means that as long as programs follow the correct synchronization rules (that is, synchronize whenever writing a variable that may next be read by another thread or when reading a variable that may have been last written by another thread), each thread will see the correct value of any shared variables it uses.
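The rule in parentheses applies to readers as well as writers. As a rough sketch (Pair and PairDemo are hypothetical names, not part of the original example), the writer below updates two fields under the lock and the reader checks them under the same lock, so it should never observe a half-finished update:

```java
// Hypothetical class: two fields that must always be equal.
class Pair {
    private int first = 0;
    private int second = 0;

    // Writer: both fields change while holding the monitor of this instance.
    public synchronized void setBoth(int v) {
        first = v;
        second = v;
    }

    // Reader: takes the same monitor, so it sees both writes or neither.
    public synchronized boolean isConsistent() {
        return first == second;
    }
}

class PairDemo {
    // Runs one writer thread against a reading caller and
    // returns how many inconsistent observations the reader made.
    static int run(int iterations) {
        Pair pair = new Pair();
        Thread writer = new Thread(() -> {
            for (int i = 1; i <= iterations; i++) {
                pair.setBoth(i);
            }
        });
        writer.start();

        int inconsistent = 0;
        for (int i = 0; i < iterations; i++) {
            if (!pair.isConsistent()) {
                inconsistent++;
            }
        }

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return inconsistent;
    }
}
```

With both methods synchronized, PairDemo.run(...) returns 0. Removing synchronized from either method removes that guarantee.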

Give an example of a case where we need synchronization.

For example, take this little class:

public class Example {
     private int value = 0;
     public int getNextValue(){
         return value++;
     }
 }

It’s really simple and works well with one thread, but absolutely not with multiple threads. An increment like this is not a simple action, but three actions:

  • Read the current value
  • Add one to the current value
  • Write that new value

Normally, if two threads invoke getNextValue(), you would expect the first to get 0 and the next to get 1 (value++ returns the value before the increment), but it is possible that both threads get 0. Imagine this situation:

Thread 1 : read the value, get 0, add 1, so the new value is 1

Thread 2 : read the value, get 0, add 1, so the new value is 1

Thread 1 : write 1 to the field value and return 0

Thread 2 : write 1 to the field value and return 0

These situations come from what we call interleaving. Interleaving describes the possible orders in which several threads can execute their statements. Even with only three operations and two threads, there are many possible interleavings.
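A real race is hard to reproduce on demand, so the sketch below replays one unlucky interleaving by hand in a single thread. InterleavingReplay is a hypothetical helper, and the local variables stand in for what each thread would hold in its registers. Because value++ evaluates to the value before the increment, each thread returns the value it read.

```java
// Hypothetical single-threaded replay of one interleaving of two value++ calls.
class InterleavingReplay {
    // Returns {result of thread 1, result of thread 2, final field value}.
    static int[] replay() {
        int value = 0;          // the shared field

        int read1 = value;      // Thread 1 reads 0
        int read2 = value;      // Thread 2 reads 0, before thread 1 has written
        int new1 = read1 + 1;   // Thread 1 computes 1
        int new2 = read2 + 1;   // Thread 2 computes 1
        value = new1;           // Thread 1 writes 1
        value = new2;           // Thread 2 writes 1 again: one increment is lost

        // value++ yields the old value, so each thread returns what it read.
        return new int[] { read1, read2, value };
    }
}
```

The replay returns {0, 0, 1}: two calls produced the same result and the field advanced only once.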

So we must make the operations atomic for the code to work with multiple threads.

In Java, the first way to do that is to use a lock. Every Java object has an intrinsic lock, and we’ll use that lock to make methods or statements atomic. While a thread holds a lock, no other thread can acquire it; other threads must wait for the first thread to release it. To use the intrinsic lock, you use the synchronized keyword, which automatically acquires and releases the lock around a piece of code.

You can add the synchronized keyword to a method to acquire the lock before the method body runs and release it when the method returns. You can refactor the getNextValue() method using the synchronized keyword:

public class Example {
     private int value = 0;
     public synchronized int getNextValue(){
         return value++;
     }
 }

With that, you have the guarantee that only one thread at a time can execute the method on a given instance. The lock used is the intrinsic lock of the instance.

If the method is static, the lock used is the one of the Class object. If you have two instance methods with the synchronized keyword, only one of the two can be executing on the same instance at any time, because the same lock is used for both methods. You can also write it using a synchronized block:

public class Example {
     private int value = 0;
     public int getNextValue() {
         synchronized (this) {
             return value++;
         }
     }
 }
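To check which lock each form actually takes, you can ask the JVM with Thread.holdsLock(Object). The following is a small sketch; LockOwnerDemo and its method names are made up for this illustration.

```java
// Hypothetical class that reports which intrinsic lock the current thread holds.
class LockOwnerDemo {
    // synchronized instance method: locks the instance.
    public synchronized boolean methodHoldsInstanceLock() {
        return Thread.holdsLock(this);
    }

    // synchronized static method: locks the Class object.
    public static synchronized boolean staticMethodHoldsClassLock() {
        return Thread.holdsLock(LockOwnerDemo.class);
    }

    // synchronized (this) block: locks the same instance as the synchronized method.
    public boolean blockHoldsInstanceLock() {
        synchronized (this) {
            return Thread.holdsLock(this);
        }
    }

    // No synchronized keyword: the calling thread holds no lock on the instance.
    public boolean plainMethodHoldsInstanceLock() {
        return Thread.holdsLock(this);
    }
}
```

The first three methods return true and the last returns false, provided the caller does not already hold the lock on the instance.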

Using synchronized blocks, you can choose the lock to block on. For example, if you don’t want to use the intrinsic lock of the current object, you can use another object just as a lock:

public class Example {
     private int value = 0;
     private final Object lock = new Object();
     public int getNextValue() {
         synchronized (lock) {
             return value++;
         }
     }
 }
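One way to convince yourself that the lock works is to hammer the counter from several threads and check that no increment is lost. The sketch below repeats the private-lock version under the hypothetical name LockedCounter; LockedCounterDemo and its parameters are also assumptions made for this illustration.

```java
// Same idea as the class above, renamed so the sketch is self-contained.
class LockedCounter {
    private int value = 0;
    private final Object lock = new Object();

    public int getNextValue() {
        synchronized (lock) {
            return value++;
        }
    }
}

class LockedCounterDemo {
    // Starts `threads` threads that each call getNextValue() `callsPerThread` times,
    // waits for them, then returns the next value the counter hands out.
    static int run(int threads, int callsPerThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    counter.getNextValue();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.getNextValue();
    }
}
```

With the synchronized block in place, LockedCounterDemo.run(4, 10000) returns 40000. Without it, the result could be lower because of lost updates.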
 

What is the difference between Synchronization and Volatile variables?

The volatile keyword ensures only visibility, not atomicity. Synchronized blocks ensure both visibility and atomicity. So you can use the volatile keyword on fields that don’t need atomicity (for example, if you only read and write the field without depending on its current value).
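A stop flag is a typical field of that kind: one thread only writes it, another only reads it, and neither depends on the previous value. The sketch below assumes hypothetical Worker and WorkerDemo classes.

```java
// Hypothetical worker that runs until another thread asks it to stop.
class Worker implements Runnable {
    // volatile: the write in stop() becomes visible to the loop in run().
    private volatile boolean running = true;

    // Only the worker thread touches this field, so it needs no protection.
    private long iterations = 0;

    public void stop() {
        running = false; // plain write, does not depend on the current value
    }

    @Override
    public void run() {
        while (running) { // plain read of the flag
            iterations++;
        }
    }
}

class WorkerDemo {
    // Starts a worker, asks it to stop, and reports whether it terminated
    // within the (arbitrarily chosen) five-second timeout.
    static boolean run() {
        Worker worker = new Worker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.stop();
        try {
            thread.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```

Without volatile, the worker thread is allowed to keep using a stale value of running and might never leave the loop.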

Volatile does not guarantee the atomicity of composite operations such as incrementing a variable, whereas the synchronized modifier does.

Example:

private static volatile int sno = 0;
 public static int getNextSno() {
     return sno++;
 }
 

The above method won’t work properly without synchronization. Invocations are not guaranteed to return unique values because the increment operator performs three operations: read, add, store. First it reads the value, then it adds one, then it stores the new value back. After one thread reads the field, another thread may read the same value before the first thread writes the incremented value back. Both threads then return the same sno. Use synchronized on the method getNextSno() to ensure that the increment operation is atomic; the volatile modifier can then be removed from sno, as long as every access to sno goes through synchronized code.
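A possible corrected version is sketched below. The wrapper class names SerialNumbers and SerialNumbersDemo are assumptions, since the original fragment does not show its enclosing class. The demo collects every returned number in a concurrent set and counts the distinct values.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical enclosing class for the corrected method.
class SerialNumbers {
    // No volatile: every access happens while holding the lock of SerialNumbers.class.
    private static int sno = 0;

    public static synchronized int getNextSno() {
        return sno++;
    }
}

class SerialNumbersDemo {
    // Calls getNextSno() from several threads and returns
    // the number of distinct serial numbers that were handed out.
    static int run(int threads, int callsPerThread) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    seen.add(SerialNumbers.getNextSno());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }
}
```

Since each call returns a unique number, SerialNumbersDemo.run(4, 1000) returns 4000, one distinct value per call.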


Rakesh Kalra
