Robust Rolling K-Means (R2K-Means): an Updateable Nonlinear K-Means Clustering Methodology for Financial Time Series

K-Means is a popular clustering algorithm that groups data points into k clusters. In the financial industry, grouping funds or assets can isolate behaviors and define investment universes using any number of performance measures, holdings, or alternative features. Rerunning standard K-Means at each time increment produces extremely unstable results because of random initialization and cluster mislabeling. Robust Rolling K-Means (R2K-Means) extends K-Means to time series, allowing investors to dynamically track and group funds in a stable and updateable framework. Since a learning-based model is only as powerful as...

Continue reading
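To make the cluster mislabeling problem in the R2K-Means summary concrete: two consecutive K-Means runs can find nearly the same clusters yet number them differently. The sketch below is not the R2K-Means procedure, which the excerpt does not describe. It is a minimal illustration, assuming one simple remedy: relabel each new run by matching its centroids to the previous run's centroids. The function names and the toy centroids are hypothetical.

```python
from itertools import permutations


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_labels(prev_centroids, new_centroids):
    """Map each new cluster label to the previous label it most plausibly continues.

    Assumption: both runs use the same k. Brute force over all permutations,
    so this is only practical for small k.
    """
    k = len(prev_centroids)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        # Total distance if new cluster i is relabeled as previous cluster perm[i].
        cost = sum(
            squared_distance(new_centroids[i], prev_centroids[perm[i]])
            for i in range(k)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return {new_label: prev_label for new_label, prev_label in enumerate(best_perm)}


# Toy data (hypothetical): the same three clusters, found in a different order.
prev_centroids = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
new_centroids = [[9.8, 0.3], [0.2, -0.1], [5.1, 4.9]]

mapping = match_labels(prev_centroids, new_centroids)

# Relabel the new run's assignments so they are comparable with the previous run.
new_labels = [0, 0, 1, 2, 1]
stable_labels = [mapping[label] for label in new_labels]
```

For larger k, an assignment solver would replace the brute-force search; the matching idea stays the same.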

Robust Rolling Regime Detection (R2-RD): A Data-Driven Perspective of Financial Markets

The nonstationary and high-dimensional nature of financial markets makes them difficult to navigate. Temporally stable regime classification offers one way to manage these challenges. We propose the Robust Rolling Regime Detection (R2-RD) framework, which adaptively retrains on streaming data and employs temporal ensemble, label assignment, and threshold policies to address the temporal instability that results from nonstationarity, model mismatches, and similar effects. Since a learning-based model is only as powerful as the data it trains on, the more stable results of the R2-RD make it a better candidate for use across AI-based applications.

Continue reading

Robust Rolling PCA (R2-PCA): Managing Time Series and Multiple Dimensions

Principal Component Analysis (PCA) is an important methodology for reducing large datasets and extracting meaningful signals from them. Financial markets introduce time and nonstationarity, so applying standard PCA methods may not give stable results. Our robust rolling PCA (R2-PCA) accommodates these additional aspects and mitigates common obstacles, including eigenvector sign flipping and the management of multiple dimensions of the dataset. Since a learning-based model is only as powerful as the data it trains on, the more stable results of the R2-PCA (versus standard PCA) make it a better candidate for use across AI-based applications.
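Eigenvector sign flipping arises because an eigenvector and its negation are equally valid, so refitting PCA on a rolling window can reverse a component's sign from one window to the next. The sketch below is not the R2-PCA method itself, which the excerpt does not spell out. It is a minimal illustration, assuming one common remedy: flip any new component that points against its counterpart from the previous window. The function name and the toy vectors are hypothetical.

```python
def align_signs(prev_components, new_components):
    """Flip each new component whose dot product with the previous one is negative.

    Assumption: components are already in corresponding order in both windows;
    this sketch handles sign only, not reordering.
    """
    aligned = []
    for prev, new in zip(prev_components, new_components):
        dot = sum(p * n for p, n in zip(prev, new))
        sign = -1.0 if dot < 0 else 1.0
        aligned.append([sign * n for n in new])
    return aligned


# Toy data (hypothetical): the first component has flipped sign between windows.
prev_window = [[0.8, 0.6], [-0.6, 0.8]]
new_window = [[-0.78, -0.62], [-0.62, 0.78]]

aligned = align_signs(prev_window, new_window)
# The first component is flipped back; the second is left unchanged.
```

Applied window by window, this keeps factor loadings and scores comparable over time, which matters when they feed downstream models.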

Continue reading

ChatGPT Mutual Fund suggestions — Good or Bad?

Not good. Buyer beware: the results were inconsistent both textually and in performance, with suggestions of index trackers (without specifying indices) and otherwise generally poor-performing funds. This is not surprising, as it is akin to asking an English major to solve a differential equation (pun intended). The math here lies in ingesting existing published results and making them contextually available, not in training the models to accurately select the asset. As such, the basis of ChatGPT is the Large Language Models (LLMs) that are trained on existing ‘outcomes’ that are solicited from...

Continue reading

Is Artificial Intelligence deployment the new Y2K?

In the realm of Artificial Intelligence (AI), some amazing things are being done by some amazing people that are leading to some amazing results. Hopefully this pacifies the shallow learning experts. Now for the deep(er) learning aspects of the current push of AI everywhere and for all. For the older engineers, most of the models being deployed (with updates) have been around for many years, so what gives? Well, for one, we know that great strides in readily available computing power have been a great catalyst for the more pervasive push of AI. Another has been the ever-increasing...

Continue reading