Unlocking the Power of Open Cloud Datasets in Healthcare AI Development

29/11/2023

In a recent presentation at the Society for Imaging Informatics in Medicine (SIIM), Dr. Erin Chu, DVM, PhD, from Amazon Web Services (AWS), shed light on the untapped potential of open cloud datasets for advancing AI development in healthcare. 

Dr. Chu pointed to a bottleneck that data professionals face across many fields: time spent acquiring data is time not spent analyzing it. "The whole reason my team exists is because we believe that sharing open data in the cloud lets people spend more time analyzing and innovating on that data rather than acquiring data," she said.

As the leader of AWS's open data team, Dr. Chu oversees the Registry of Open Data, a digital catalog of the open datasets hosted on AWS. Most of this data resides in object storage, in Amazon S3 buckets, which can be reached through native AWS interfaces. The registry aims to bring users closer to the data by pointing them to actual S3 buckets that they can explore in the console, with third-party tools, or via the command-line interface. It also provides usage examples, publications, and articles showing how others have used the available datasets.
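To make the idea of exploring a bucket concrete, here is a minimal Python sketch that lists the top level of a publicly readable S3 bucket over plain HTTPS, using the S3 `ListObjectsV2` REST call and only the standard library. The bucket name `example-open-data-bucket` and the helper names are hypothetical placeholders, not entries from the registry, and the sketch assumes the bucket owner permits anonymous listing, which not every open dataset does.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def build_list_url(bucket, prefix="", max_keys=20):
    """Build an unsigned ListObjectsV2 URL for a publicly listable bucket."""
    query = urlencode({
        "list-type": "2",      # selects the ListObjectsV2 API
        "prefix": prefix,      # only keys that start with this string
        "delimiter": "/",      # group deeper keys into folder-like prefixes
        "max-keys": str(max_keys),
    })
    return f"https://{bucket}.s3.amazonaws.com/?{query}"


def parse_listing(xml_text):
    """Return (folders, objects) from a ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    folders = [
        p.findtext("s3:Prefix", namespaces=S3_NS)
        for p in root.findall("s3:CommonPrefixes", S3_NS)
    ]
    objects = [
        (c.findtext("s3:Key", namespaces=S3_NS),
         int(c.findtext("s3:Size", namespaces=S3_NS)))
        for c in root.findall("s3:Contents", S3_NS)
    ]
    return folders, objects


def list_public_bucket(bucket, prefix=""):
    """Fetch and parse one page of a bucket listing (needs network access)."""
    with urlopen(build_list_url(bucket, prefix), timeout=30) as resp:
        return parse_listing(resp.read())


if __name__ == "__main__":
    # Hypothetical bucket name: replace it with one taken from a registry entry.
    folders, objects = list_public_bucket("example-open-data-bucket")
    for name in folders:
        print("DIR ", name)
    for key, size in objects:
        print("FILE", key, size)
```

The command-line route mentioned above is similar in spirit: `aws s3 ls --no-sign-request s3://<bucket-name>/` lists a public bucket without AWS credentials.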

Dr. Chu further outlined the Open Data Sponsorship Program, an application-based initiative covering the costs associated with storing and distributing high-value, high-impact data. Notably, this program supports renowned datasets such as the Imaging Data Commons, NYUMets, FastMRI, and the Emory Breast Imaging Dataset (EMBED). Dr. Chu emphasized that the ownership of these datasets remains with the individuals managing them. 

For open datasets to thrive, Dr. Chu stressed the importance of optimizing data for analysis, regardless of how it is stored. Building a community around the data is equally crucial, because it fosters collaboration and encourages diverse uses of the shared information.

Dr. Chu acknowledged that a significant challenge in medical imaging is the need for a gold-standard deidentification process, which has traditionally relied on human review. As datasets scale up to thousands or hundreds of thousands of images, however, an efficient AI-driven deidentification process becomes imperative. Benchmarks of AI-based deidentification have been explored, but Dr. Chu noted that their practical acceptance in the medical imaging field is still a work in progress.
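As a rough illustration of what benchmarking an automated tool against human review can look like, the sketch below compares the set of images a human reviewer flagged as containing identifying information (the gold standard) with the set an automated tool flagged. The function name, image IDs, and choice of metrics are assumptions made for illustration; this is not a method described in the presentation.

```python
def score_deidentification(gold_flagged, auto_flagged):
    """Compare automated flags against a human-reviewed gold standard.

    Both arguments are collections of image IDs flagged as containing
    identifying information.
    """
    gold, auto = set(gold_flagged), set(auto_flagged)
    caught = gold & auto
    return {
        # Share of truly identifying images that the tool caught.
        "recall": len(caught) / len(gold) if gold else 1.0,
        # Share of the tool's flags that the human reviewer agreed with.
        "precision": len(caught) / len(auto) if auto else 1.0,
        # Images the tool missed: the costly error for patient privacy.
        "missed": sorted(gold - auto),
    }


if __name__ == "__main__":
    # Hypothetical image IDs, for illustration only.
    gold = {"img1", "img2", "img3", "img4"}
    auto = {"img1", "img2", "img3", "img9"}
    print(score_deidentification(gold, auto))
```

Recall is usually the figure to watch in this setting, since a single missed identifier can expose a patient, whereas a false alarm only costs some extra review time.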

The insights from Dr. Chu's presentation underscore the transformative impact of open cloud datasets on healthcare AI development, offering a pathway to streamline data access, encourage collaboration, and accelerate innovation in medical imaging. 

----------------------------------

Meet DrAid™ for Data Lake and Data Management, a healthcare cloud system designed to store and keep up to date the data from the various application systems used in a hospital, along with the data generated by healthcare staff. The infrastructure also allows data and information to be extracted on demand, at any time and from any location.

Explore the capabilities of DrAid™ for Data Lake and Data Management today to witness the transformative impact of integrating technology into the medical field!