August 20, Track 3 (1000 - 1300)

Chair: Prof. Fugee Tsung

Zoom Host: Miss Min Jiang

ThBT3 & ThCT3 Workshop 4 - Quality Analytics

 [1] Quality Big Data

  Prof. Fugee TSUNG

With the digital transformation paradigm brought by big data, quality professionals need a new collection of quality tools to solve complicated challenges and uncover the hidden value in data. This talk gives a holistic summary of innovative methodologies, from the perspectives of creative thinking and statistical thinking, for leveraging the tremendous opportunities offered by big data. The new collection of quality tools is illustrated sequentially through five incremental phases for solving a real quality problem: Design, Measure, Analyze, Improve and Control.

 [2] A Robust and Sparse Multivariate Functional Principal Component Analysis for Large Metro Flow Profile Datasets

  Dr. Kai WANG

The wide application of automated fare collection devices and smart cards generates massive amounts of traffic transaction data, from which we can infer the passenger flow at any time of day and at any station in a public transit system. In this work, we analyze a large metro flow dataset from Hong Kong from the perspective of functional data analysis. In particular, the passenger inflow or outflow counts at a station over a day are regarded as a profile in the time domain. A multivariate functional principal component analysis (MFPCA) is proposed to handle a large collection of profiles at different stations across a whole metro system. Furthermore, to mitigate the effect of outliers and enhance model interpretability, a robust and sparse version of MFPCA is developed. The proposed methodology is applied to the MTR dataset in Hong Kong, where it discovers more meaningful passenger flow patterns than existing methods. The results can also be used for clustering, correlation analysis and outlier identification.
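As a point of reference for the profile-based view described above, the following is a minimal sketch of a plain multivariate PCA on discretized daily profiles. It is a baseline only, not the robust and sparse MFPCA of the talk. The function name `mfpca_baseline`, the array layout (days, stations, time points) and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def mfpca_baseline(profiles, n_components=2):
    """Plain PCA on stacked multi-station profiles (hypothetical helper).

    profiles: array of shape (n_days, n_stations, n_times), one discretized
    flow curve per station per day. Neither robust nor sparse.
    """
    n_days, n_stations, n_times = profiles.shape
    # Concatenate all stations of a day into one long observation vector.
    X = profiles.reshape(n_days, n_stations * n_times)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the orthonormal principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components].reshape(n_components, n_stations, n_times)
    scores = U[:, :n_components] * s[:n_components]
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return mean.reshape(n_stations, n_times), components, scores, explained

# Synthetic data (an assumption): one morning peak shared by 4 stations,
# scaled by a random daily amplitude, plus small noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 48)
peak = np.exp(-((t - 0.35) ** 2) / 0.005)
days, stations = 30, 4
amp = rng.normal(1.0, 0.3, size=(days, 1, 1))
load = np.array([1.0, 0.8, 0.5, 0.2]).reshape(1, stations, 1)
profiles = amp * load * peak + 0.01 * rng.normal(size=(days, stations, 48))

mean, comps, scores, explained = mfpca_baseline(profiles, n_components=2)
print(comps.shape, scores.shape)
```

Each component spans all stations at once, so a single score per day summarizes a system-wide pattern; a robust or sparse variant would replace the SVD step with a penalized or outlier-resistant estimator.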

 [3] A Statistical Transfer Learning Approach to Improve Passenger Inflow Forecasts

  Mr. Zhenli SONG

With the proliferation of smart cities, public transportation services such as Urban Railway Transit (URT) systems are increasingly important to commuters' mobility. In addition, the integration of sensing and information technology within the URT system provides abundant passenger commuting data, including the boarding time and station collected by the Automatic Fare Collection (AFC) system. This plentiful commuting data has been used to support decisions on resource allocation and scheduling in the URT system. Despite this rich source of data, accurately forecasting the number of passengers entering each station remains challenging, especially for newly activated stations with little historical data. To remedy this, we propose a state-space model to investigate the passenger inflow time series within a statistical transfer learning framework. Distinct but closely related stations can be studied simultaneously by using domain and engineering knowledge to impose a Bayesian prior distribution on the state-space parameters. The proposed methodology produces more accurate and robust estimates of the state-space parameters and therefore improves forecasting performance.
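To illustrate how prior knowledge from related stations can help a station with little history, here is a simplified sketch using a local-level state-space model and a Kalman filter. It is not the model from the talk: for brevity the prior is placed on the initial inflow level rather than on the state-space parameters, and the function name, noise variances and inflow counts are all assumptions.

```python
import numpy as np

def local_level_filter(y, m0, p0, q, r):
    """Kalman filter for y_t = x_t + v_t, x_t = x_{t-1} + w_t (hypothetical helper).

    m0, p0: prior mean and variance of the initial level.
    q, r:   state and observation noise variances.
    Returns the filtered means and variances of the level.
    """
    m, p = m0, p0
    means, variances = [], []
    for obs in y:
        p = p + q                # predict: the level drifts as a random walk
        k = p / (p + r)          # Kalman gain
        m = m + k * (obs - m)    # update the mean with the new observation
        p = (1.0 - k) * p        # update the variance
        means.append(m)
        variances.append(p)
    return np.array(means), np.array(variances)

# Three days of inflow counts at a newly opened station (made-up numbers).
y_new = np.array([520.0, 480.0, 510.0])

# Vague prior: nothing is borrowed from other stations.
m_vague, p_vague = local_level_filter(y_new, m0=0.0, p0=1e9, q=4.0, r=100.0)
# Informative prior: level suggested by similar stations (assumed 500 +/- 25).
m_info, p_info = local_level_filter(y_new, m0=500.0, p0=25.0 ** 2, q=4.0, r=100.0)

# In this model the one-step-ahead forecast equals the last filtered mean.
print(m_vague[-1], m_info[-1])
print(p_vague[-1], p_info[-1])
```

With the same data, the informative prior always yields a smaller posterior variance for the level, which is the basic mechanism by which borrowing from related stations stabilizes forecasts when data are scarce.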

 [4] Spatiotemporal Prediction based on Weakly Graph Tensor Decomposition and Completion

  Mr. Ziyue LI

Low-rank tensor decomposition and completion have attracted significant interest from academia given the ubiquity of tensor data. However, low-rank structure is a global property, which does not hold when the data exhibit complex and weak dependencies under specific graph structures. One application that motivates this study is spatiotemporal data analysis. As shown in a preliminary study, weak dependencies can degrade the performance of low-rank tensor completion. In this paper, we propose a novel low-rank CANDECOMP/PARAFAC (CP) tensor decomposition and completion framework that introduces an L1-norm penalty and a graph Laplacian penalty to model the weak dependencies on the graph. We further propose an efficient estimation algorithm based on Block Coordinate Descent. A case study on metro passenger flow data in Hong Kong demonstrates improved performance over regular tensor completion methods.
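The two penalties named above can be made concrete with a small sketch. It evaluates an L1-norm penalty and a graph Laplacian penalty on a factor matrix whose rows correspond to graph nodes (for example, stations). The toy graph, the factor matrix and the function names are assumptions; how the penalties enter the authors' objective is not specified in the abstract.

```python
import numpy as np

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W of a symmetric weight matrix W."""
    return np.diag(W.sum(axis=1)) - W

def laplacian_penalty(A, L):
    """tr(A^T L A): small when rows of A at connected nodes are similar."""
    return float(np.trace(A.T @ L @ A))

def l1_penalty(A):
    """Sum of absolute entries of A, which encourages sparse factors."""
    return float(np.abs(A).sum())

# Toy path graph 0 - 1 - 2 with unit edge weights (an assumption).
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
# Hypothetical rank-2 factor matrix, one row per node.
A = np.array([[1.0, 0.0],
              [1.0, 0.5],
              [3.0, 0.5]])

L = graph_laplacian(W)
pen = laplacian_penalty(A, L)

# Equivalent pairwise form: 0.5 * sum_ij W_ij * ||a_i - a_j||^2.
pairwise = 0.5 * sum(
    W[i, j] * np.sum((A[i] - A[j]) ** 2)
    for i in range(3) for j in range(3)
)
print(pen, pairwise, l1_penalty(A))
```

The pairwise form shows why the Laplacian term acts locally: it only penalizes differences between factor rows of nodes that share an edge, in contrast to the global low-rank constraint.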

 [5] Customer Satisfaction Monitoring through Integrating Online and Offline Data

  Ms. Yinghui HUANG

Nowadays, all industries face a highly competitive business environment. As a result, customer satisfaction has become a core part of modern business. However, most methods for measuring customer satisfaction are based on surveys, which have several inevitable disadvantages: they are expensive in terms of time and money, and the data may quickly become outdated. In contrast, vast amounts of customer reviews are generated every day on multiple online platforms. The major challenges in associating online reviews with survey-based customer satisfaction measurement are that survey data and online reviews are completely different in nature and are collected through different channels. In this paper, we propose a comprehensive customer satisfaction measurement system that integrates survey and online data. A simulation study is conducted to evaluate the effectiveness of the proposed method, and an empirical study is provided to demonstrate its practical value.