The measurement of flow velocity at high frequencies (20-200 Hz) has been made easier over the past two decades by the development and commercialization of a variety of instruments, many of which can measure multiple sampling volumes simultaneously. A variety of methods have been proposed to remove errors from velocity time series and to classify data quality. However, most methods are implemented as custom algorithms written to handle custom data formats, and they remain hidden from a wider audience. For this project we developed the Multi-Instrument Turbulence Toolbox (MITT), a new set of open-source algorithms written in Matlab, to: i) organize the data output from multiple instruments into a common format; ii) present the data in a variety of interactive figures for visualization and assessment; iii) clean the data by removing spikes and noise; and iv) classify data quality. We hope that these algorithms will form the nucleus of an evolving toolbox that will help to accelerate the training of hydraulic researchers and practitioners, ensure the consistent application of methods for turbulence analysis, remove the bias of poor-quality data from the scientific literature, and ease collaboration through the sharing of data and methods. (MacVicar et al., 2014, Computers and Geosciences)
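As a rough illustration of the cleaning step (iii), the sketch below flags spikes as samples that deviate from a moving median by more than a multiple of a robust scale estimate, then fills them by linear interpolation. This is a generic stand-in rather than MITT's actual routine: the function name despikeSimple and the suggested window length and threshold are hypothetical.

```matlab
function [uClean, isSpike] = despikeSimple(u, win, k)
%DESPIKESIMPLE Illustrative moving-median despiking of a velocity series.
%   u   - velocity time series (vector)
%   win - window length for the moving median, e.g. 15 samples
%   k   - outlier threshold in robust standard deviations, e.g. 3
u = u(:);                                 % work with a column vector
uMed  = movmedian(u, win);                % local trend estimate
resid = u - uMed;                         % deviation from the local median
sigma = 1.4826 * median(abs(resid - median(resid)));  % robust scale (MAD)
isSpike = abs(resid) > k * sigma;         % flag outliers
t      = (1:numel(u))';
uClean = u;
% fill flagged samples by linear interpolation between good neighbors
uClean(isSpike) = interp1(t(~isSpike), u(~isSpike), t(isSpike), ...
    'linear', 'extrap');
end
```

Interpolation is the simplest possible replacement strategy; as the second abstract notes, replacement schemes can alter the statistics of the series, which is one motivation for the ARMA approach described below.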
High-resolution velocity profiling instruments have enabled a new generation of turbulence studies by greatly increasing the number and quality of simultaneous velocity measurements that can be obtained. As with all velocity profiling instruments, however, the collected data are susceptible to erroneous spikes and poor-quality time series that must be corrected or removed before the results are analyzed and interpreted. We therefore added comprehensive data cleaning based on autoregressive moving average (ARMA) models to MITT. The recommended approach for detecting and replacing outliers in profiled velocity data is a spatial 'seasonal' filter, which takes advantage of the information available in neighboring cells, combined with a low-order ARMA model. The recommended data quality metrics are the spike frequency and the coefficients of the fitted model. This approach is more precise than the most common despiking method in current use, offers a seamless way of generating replacement values that does not change the statistics of the velocity time series, and provides simple metrics with a clear physical interpretation that can be used to compare the quality of different datasets.
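A minimal sketch of the ARMA idea, assuming the Matlab Econometrics Toolbox (arima, estimate, infer, forecast): fit a low-order ARMA model, flag samples whose innovations exceed a threshold, and replace them with the model's one-step-ahead forecasts so the replacements are statistically consistent with the rest of the series. The function name armaDespike, the suggested ARMA(2,2) order, and the threshold are illustrative assumptions, and the spatial 'seasonal' filter across neighboring cells described above is not shown.

```matlab
function [uClean, isSpike, EstMdl, spikeFreq] = armaDespike(u, p, q, k)
%ARMADESPIKE Illustrative ARMA-based spike detection and replacement.
%   Requires the Econometrics Toolbox. Hypothetical, not the MITT routine.
%   u    - velocity time series (vector)
%   p, q - AR and MA orders of the ARMA(p,q) model, e.g. p = q = 2
%   k    - innovation threshold in standard deviations, e.g. 3
u = u(:);
Mdl    = arima(p, 0, q);                  % ARMA(p,q) == ARIMA(p,0,q)
EstMdl = estimate(Mdl, u, 'Display', 'off');
res    = infer(EstMdl, u);                % model innovations (residuals)
isSpike = abs(res) > k * sqrt(EstMdl.Variance);
uClean = u;
% replace each flagged sample with the model's one-step-ahead forecast,
% so replacement values share the statistics of the fitted series
for i = find(isSpike)'
    if i > max(p, q)                      % need enough presample data
        uClean(i) = forecast(EstMdl, 1, 'Y0', uClean(1:i-1));
    end
end
% the two quality metrics recommended above: spike frequency and the
% fitted coefficients (EstMdl.AR and EstMdl.MA, stored as cell arrays)
spikeFreq = nnz(isSpike) / numel(u);
end
```

Fitting each cell of a profile in this way and comparing spikeFreq and the AR/MA coefficients across cells gives the kind of simple, physically interpretable quality comparison the abstract describes.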