Lan Luo, of the Department of Biostatistics at the University of Michigan and candidate for a faculty position in the Division of Biostatistics, will present:
“Real-time Regression Analysis with Streaming Health Datasets”
Abstract: This research is largely motivated by the challenges of modeling and analyzing streaming health data, which are becoming increasingly popular data sources in biomedical science and public health. In this work, the term “streaming data” refers to high-throughput recordings of large volumes of observations collected sequentially and perpetually over time, such as data from national disease registries, mobile health, and disease surveillance. Because of the large volume and frequent updates intrinsic to this type of data, the major challenges in analyzing streaming data pertain to data storage and information updating. This talk primarily concerns the development of a real-time statistical estimation and inference method for regression analysis, with the particular objective of addressing challenges in streaming data storage and computational efficiency. Termed “renewable estimation”, this method helps overcome the data-sharing barrier, reduce data storage costs, and improve computing speed, all without loss of statistical efficiency. The proposed algorithms for real-time streaming regression will be demonstrated in generalized linear models (GLMs) for cross-sectional data. I will discuss both the conceptual understanding and the theoretical guarantees of the renewable method and illustrate its performance via numerical examples. This is joint work with my supervisor Peter Song at the University of Michigan.
A social tea will be held at 9:30 a.m. in A434 Mayo. All are welcome.