Functional data analysis

Introduction

Functional data analysis (FDA) deals with the analysis of data that are defined on a continuum, such as curves observed over time. In recent years this approach has received increasing attention and has been applied across many fields of research, including biomechanics, psychology, environmental studies, sports, energy efficiency, public health, economics, and finance.

The idea behind FDA is that when the objects of the analysis are continuous functions, the discrete data are treated as finite snapshots of the underlying continuous processes. One first estimates these underlying processes; the subsequent analysis and inference are then based on the estimated continuous processes, which are referred to as the fitted functional data.
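As a rough illustration of this first estimation step, the sketch below turns noisy discrete observations of one curve into a fitted function by least squares on a small Fourier basis. The basis choice, the number of harmonics, the synthetic data, and the helper names (`fourier_basis`, `fit_curve`, `evaluate_curve`) are all assumptions made for the example; they are not taken from any of the projects described here.

```python
import numpy as np

def fourier_basis(t, n_harmonics, period=1.0):
    """Design matrix with columns 1, sin(w_k t), cos(w_k t) for k = 1..n_harmonics."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period
        cols.append(np.sin(w * t))
        cols.append(np.cos(w * t))
    return np.column_stack(cols)

def fit_curve(t_obs, y_obs, n_harmonics=3, period=1.0):
    """Least-squares basis coefficients for one curve observed at discrete times."""
    B = fourier_basis(t_obs, n_harmonics, period)
    coef, *_ = np.linalg.lstsq(B, y_obs, rcond=None)
    return coef

def evaluate_curve(coef, t_new, period=1.0):
    """Evaluate the fitted function at arbitrary points of the continuum."""
    n_harmonics = (len(coef) - 1) // 2
    return fourier_basis(t_new, n_harmonics, period) @ coef

def truth(t):
    # Hypothetical underlying process, used only to generate synthetic data.
    return 2.0 + np.sin(2.0 * np.pi * t) + 0.5 * np.cos(4.0 * np.pi * t)

rng = np.random.default_rng(0)
t_obs = np.linspace(0.0, 1.0, 50, endpoint=False)   # discrete "snapshots"
y_obs = truth(t_obs) + rng.normal(scale=0.1, size=t_obs.size)

coef = fit_curve(t_obs, y_obs)
t_fine = np.linspace(0.0, 1.0, 200)                 # any grid can be used after fitting
fitted = evaluate_curve(coef, t_fine)
max_err = float(np.max(np.abs(fitted - truth(t_fine))))
```

Once the coefficients are estimated, the curve can be evaluated anywhere on the interval, which is what allows later analysis to work with functions rather than with the original sampling times.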

FDA offers several advantages for data analysis. A significant one is that when the data take the form of smooth functions, FDA allows us to study not only the level of the data but also its derivatives, such as rates of change. Moreover, as long as the underlying processes are continuous, FDA can provide consistent estimators that do not suffer from model misspecification.
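To make the point about derivatives concrete, here is a minimal sketch that fits a smooth function to noisy discrete observations and then differentiates the fitted function rather than the raw data. It uses a polynomial fit from NumPy purely for convenience; the degree, the noise level, and the synthetic signal are assumptions for illustration, and in practice a spline or other basis may be preferable.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)

# Synthetic discrete observations of a smooth process (assumed for illustration).
t_obs = np.linspace(0.0, 1.0, 60)
y_obs = np.sin(2.0 * np.pi * t_obs) + rng.normal(scale=0.02, size=t_obs.size)

# Fit a smooth function to the data, then differentiate the fit.
curve = Polynomial.fit(t_obs, y_obs, deg=7)
velocity = curve.deriv(1)   # first derivative of the fitted function

# Compare with the true derivative away from the boundaries,
# where derivative estimates are typically least reliable.
t_fine = np.linspace(0.1, 0.9, 100)
true_velocity = 2.0 * np.pi * np.cos(2.0 * np.pi * t_fine)
velocity_err = float(np.max(np.abs(velocity(t_fine) - true_velocity)))
```

Differencing the raw observations directly would amplify the noise; differentiating the fitted function gives a derivative estimate that is itself a smooth function.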

Ongoing Projects

We have completed, and are currently working on, projects in a variety of fields that adopt the ideas of FDA. They are summarized below.

Basketball. FDA, as a nonparametric estimation tool, is used to model the score difference between the home and away teams, allowing for an arbitrary dependence structure among the score-change increments. The analysis based on the estimated functional score difference shows that momentum and home-court advantage are both important factors in determining game outcomes.

Energy efficiency. FDA techniques are used to analyze steam usage across different customers. Time-based data are treated as functional, and potential patterns are identified. The results are used to measure external demand within a specified geographic area, which in turn motivates the effort to maximize energy efficiency.

Blood pressure. FDA is used to compare blood pressure rhythms, time point by time point, between two patient groups: group A, patients with delayed complicated aortic dissection, which should be treated by surgery; and group B, patients whose aortic dissection remains uncomplicated, which can be treated conservatively. The goal is to find the blood pressure pattern most likely to lead to complicated aortic dissection. We would also like to compare three-dimensional models of healthy and pathological aortic arches and use a similar method to find the pattern most likely to lead to aortic dissection.

Distribution equality tests for functions. A functional data approach is used to examine whether GDP functions from different versions of the Penn World Table (PWT) share the same distribution. The discrete GDP observations are modeled using FDA, and a bootstrap test is performed for the hypothesis that the distribution functions of GDP are pairwise equal across PWT versions.
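For readers unfamiliar with bootstrap tests of this kind, the following is a generic sketch of a two-sample test for equality of distribution between two sets of curves evaluated on a common grid. It is not the procedure used in the project: the energy-distance statistic, the pooled resampling scheme, the function names, and the synthetic curves are all assumptions chosen to keep the example short.

```python
import numpy as np

def energy_statistic(X, Y):
    """Energy distance between two samples of curves (one curve per row)."""
    def mean_dist(A, B):
        # Root-mean-square distance between every curve in A and every curve in B.
        diff = A[:, None, :] - B[None, :, :]
        return np.sqrt((diff ** 2).mean(axis=2)).mean()
    return 2.0 * mean_dist(X, Y) - mean_dist(X, X) - mean_dist(Y, Y)

def bootstrap_equality_test(X, Y, n_boot=200, seed=0):
    """Pooled bootstrap p-value for H0: both samples share one distribution."""
    rng = np.random.default_rng(seed)
    observed = energy_statistic(X, Y)
    pooled = np.vstack([X, Y])
    n, m = len(X), len(Y)
    exceed = 0
    for _ in range(n_boot):
        # Under H0 the group labels carry no information, so both
        # bootstrap samples are drawn from the pooled set of curves.
        idx = rng.integers(0, n + m, size=n + m)
        Xb, Yb = pooled[idx[:n]], pooled[idx[n:]]
        if energy_statistic(Xb, Yb) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)

# Synthetic curves on a common grid: random amplitudes, second group shifted in level.
rng = np.random.default_rng(42)
grid = np.linspace(0.0, 1.0, 25)

def sample_curves(n, level):
    amp = rng.normal(loc=1.0, scale=0.2, size=(n, 1))
    return level + amp * np.sin(2.0 * np.pi * grid)

X = sample_curves(30, level=0.0)
Y_same = sample_curves(30, level=0.0)
Y_shifted = sample_curves(30, level=1.0)

stat_same, p_same = bootstrap_equality_test(X, Y_same)
stat_shifted, p_shifted = bootstrap_equality_test(X, Y_shifted)
```

A small p-value, as expected for the shifted group, is evidence against equality of the two distributions. With more than two groups, the same test could be applied to each pair, with an appropriate adjustment for multiple comparisons.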

Functional dynamic factor models. A functional data analysis approach is adopted to generalize dynamic factor models so that they can accommodate possibly time-varying factor loadings nonparametrically. Large-sample theory and simulation results are provided, and an application of the model to a widely used macroeconomic data set is presented.

Continuous-time GARCH. A nonparametric method for the estimation and prediction of continuous-time GARCH models is proposed, using the ideas of FDA. The goal is to develop a continuous-time generalization of GARCH that helps avoid potential model misspecification problems and yields good estimates and forecasts.

Bootstrap for functional linear regression models. A bootstrap method is proposed for functional linear regression models with cross-sectional dependence. The goal is to provide bootstrap-based alternatives to asymptotic confidence sets for statistical inference in functional linear regression analysis. The properties of the bootstrap method will be proved, and simulation and empirical analyses will be presented.