Hadoop 0.20.2 had a bug where fixing missing block replicas would not respect the rack aware placement policy. Over the course of many years we had lost enough drives to start losing blocks whenever we lost a drive. Took us a while to figure out what was happening. Luckily the data in hadoop was not the primary source, but we had to recopy data from origin. Then copy all files in HDFS to a new HDFS cluster (almost every file had at least one block affected) Petabyte scale copies don't happen quick. :)
For further actions, you may consider blocking this person and/or reporting abuse
We're a place where coders share, stay up-to-date and grow their careers.
Hadoop 0.20.2 had a bug where fixing missing block replicas would not respect the rack aware placement policy. Over the course of many years we had lost enough drives to start losing blocks whenever we lost a drive. Took us a while to figure out what was happening. Luckily the data in hadoop was not the primary source, but we had to recopy data from origin. Then copy all files in HDFS to a new HDFS cluster (almost every file had at least one block affected) Petabyte scale copies don't happen quick. :)