Used to update the data_locality.csv file when other transforms have been applied to the HDFS file-splitting algorithm; the updated information is extracted from the hadoop.log file.