As HPC systems grow in scale, the frequency of system-wide failures is expected to increase. The performance of Coordinated Checkpoint/Restart (C/R), the traditional fault tolerance technique, degrades under high failure rates because of frequent global rollbacks, which are themselves susceptible to failures. We propose CoLoR, a fault tolerance scheme that (i) requires only the failing process to recover, (ii) overlaps re-execution with restart, and (iii) avoids the cumulative effect of successive failures. Our theoretical analysis shows that such a scheme yields a lower expected completion time than coordinated C/R. We also provide a proof-of-concept implementation in MPI using receiver-based message logging and colocated rescuer (CoLoR) processes, and evaluate its performance on several HPC benchmarks. Our experimental results, combined with observations from the theoretical analysis, show that CoLoR can outperform both traditional C/R and replication over a large range of system sizes, without using extra logger nodes.