
Federated and Imbalanced Learning
Federated Learning Compatible with Imbalanced Clinical Data
We are developing federated learning for NLP algorithms that extract and classify data from clinical text, along with new federated learning methods that work with imbalanced clinical data.
Advancing Federated Learning for NLP in Healthcare
Problem and Need for the Study
Natural Language Processing (NLP) algorithms offer a way to conduct research with clinical text data. Federated learning is a decentralized approach to algorithm training that lets researchers train models on smaller, localized datasets without sharing sensitive health information. However, a key challenge arises when the datasets are imbalanced, meaning some categories have significantly more training data than others. A model trained on such data can score well overall while performing poorly on the rare categories, which produces misleading results, especially in critical applications like NLP in healthcare.
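To illustrate why imbalance can mislead, here is a minimal sketch in Python. It is not part of the project's methods. The labels, the always-majority classifier, and the helper names `accuracy` and `balanced_accuracy` are hypothetical and chosen only for illustration. The example compares plain accuracy with the average per-class recall on a toy dataset where 95 of 100 clinical notes belong to one category.

```python
# Hypothetical toy data: 95 notes in the common category (0), 5 in the rare one (1).
labels = [0] * 95 + [1] * 5

# A degenerate classifier that always predicts the majority category.
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Average of per-class recall, so each category counts equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


acc = accuracy(labels, predictions)
bal = balanced_accuracy(labels, predictions)
print(f"accuracy={acc:.2f}, balanced accuracy={bal:.2f}")
```

In this toy case the classifier never identifies a single rare-category note, yet its plain accuracy looks high. The per-class average exposes the failure, which is the kind of gap that imbalance-aware training frameworks aim to address.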
Innovation and Impact
This project is a continuation of the research team's previous work on federated learning for medical images.
The primary goals are:
- To develop and validate federated learning for NLP algorithms that extract and classify data from clinical text.
- To develop new federated learning methods compatible with imbalanced learning frameworks and capable of working under constraints.
Currently, federated learning algorithms are limited to solving problems without constraints. The methods developed in this study are intended to be the first federated learning methods that work with constraints and are compatible with frameworks that account for imbalanced clinical data.
Key Personnel


Performance Sites
University of Minnesota
- Multiple Principal Investigators: Ju Sun and Rui Zhang
Grant Details
- This project is funded by Cisco Systems, Inc.
- Project dates: 01-January-2023 to 31-December-2023