Abstract
<jats:p>This chapter focuses on the timeliness dimension of information quality and considers situations in which the value of data fluctuates over time. If timeliness is treated rigidly, as blindly minimizing the time between an event's occurrence and the use of its data, decision-making and analysis may rely on inappropriate information even when the data are otherwise accurate, complete, and free of duplicates. Many types of information, from economics to weather and traffic, follow periodic patterns in which knowledge of events from the more distant past may be more important than recent data. This occurs when the pattern transitions into a new cycle. This chapter proposes that timeliness is more than time lag and presents a method for automatically quantifying how appropriate various historical portions of a dataset are for machine learning and analysis in a given present context. Our contribution is an algorithm based on topological data analysis (TDA) that simplifies the dataset and identifies its invariant properties. The algorithm is evaluated on a synthetic dataset to better understand how it performs with different types of cyclical patterns and how its parameter values affect its behavior. Our experiments show that the approach outperforms time-agnostic, general-purpose TDA, and that the algorithm can be configured to behave stably across types of cyclical pattern while still performing well when assigning timeliness values to the dataset.</jats:p>