Large, High-quality Dataset of Bone Marrow Cell Images Developed for Training AI Systems to Identify Blood Diseases

Artificial intelligence-based systems can greatly improve the speed of diagnosis, and one area that could benefit from AI-based tools is the diagnosis of blood disorders. Identifying blood disorders can be incredibly time-consuming, as it requires analysis of bone marrow cells under an optical microscope. That process is still conducted manually many thousands of times a day.

Algorithms can be developed to analyze medical images and identify blood disorders, but they need to be trained, which requires access to large, high-quality datasets. Now a database of 171,374 high-quality microscopic images of bone marrow cells has been developed, which can be used to train AI-based systems to identify anomalies that are characteristic of blood disorders.

The dataset includes single-cell images taken from 945 patients with a variety of blood diseases and was developed by a team of researchers at Helmholtz Munich, LMU University Hospital Munich, the MLL Munich Leukemia Lab, and the Fraunhofer Institute for Integrated Circuits.

“To our knowledge, this image database is the most extensive one available in the literature in terms of the numbers of patients, diagnoses, and single-cell images included,” explained the researchers in the paper. “It allows us to train high-quality classifiers of leukocyte cytomorphology that identify a wide range of diagnostically relevant cell species with high precision and recall.”
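For readers unfamiliar with the two metrics the researchers mention, the following is a minimal Python sketch of how per-class precision and recall can be computed for a cell classifier. It is not the researchers' code: the function name, the class names, and the labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {class_name: (precision, recall)} for every class seen in either list."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    actual_count = Counter(true_labels)          # how often each class really occurs
    predicted_count = Counter(predicted_labels)  # how often each class was predicted
    true_positives = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )

    scores = {}
    for cls in sorted(set(true_labels) | set(predicted_labels)):
        # Precision: of the cells predicted as this class, the share that were correct.
        precision = (
            true_positives[cls] / predicted_count[cls] if predicted_count[cls] else 0.0
        )
        # Recall: of the cells that truly belong to this class, the share that were found.
        recall = true_positives[cls] / actual_count[cls] if actual_count[cls] else 0.0
        scores[cls] = (precision, recall)
    return scores


# Hypothetical expert annotations and classifier outputs for six single-cell images.
true_labels = ["lymphocyte", "lymphocyte", "blast", "blast", "blast", "erythroblast"]
predicted_labels = ["lymphocyte", "blast", "blast", "blast", "erythroblast", "erythroblast"]

scores = per_class_precision_recall(true_labels, predicted_labels)
for cls, (precision, recall) in scores.items():
    print(f"{cls}: precision={precision:.2f}, recall={recall:.2f}")
```

In this toy example, every cell the classifier calls a lymphocyte really is one (precision 1.0), but it finds only one of the two true lymphocytes (recall 0.5). A classifier needs both values to be high for a class before it can be said to identify that class reliably.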

The researchers applied convolutional neural networks (CNNs) to the dataset and found that they outperformed previous feature-based machine learning algorithms in terms of accuracy and generalizability. The researchers believe their CNNs could solve the problem of classifying single bone marrow cells. “The analysis of bone marrow cells has not yet been performed with such advanced neural networks,” said Christian Matek, Ph.D., a postdoctoral researcher at the Helmholtz Zentrum München. That is because, until now, high-quality public datasets of bone marrow cells have not been available.

The researchers plan to expand their bone marrow cell database to capture a broader range of findings, and say further work is needed to evaluate the performance of the network in a real-world diagnostic setting. The database is already publicly available, however, and can be freely used for research and training purposes and as a reference for AI-based diagnostic approaches, including the identification of blood cancers.

You can read more about the study in the paper – Highly accurate differentiation of bone marrow cell morphologies using deep neural networks on a large image data set – which was recently published in Blood. DOI: 10.1182/blood.2020010568