Tech Talk [Bayer]: Data-Centric Revolution: Enhancing AI with Croissant Metadata

Speaker

Steffen Vogler (Bayer)

Host

Christiana Kalfas
Sr. CRC, CSAIL Alliances
The deployment of AI-based medical software has yielded mixed results, highlighting the urgent need for a data-centric approach to machine learning. This talk investigates the transformative potential of the community tool called "Croissant", a standardized metadata format developed to enhance the ML-readiness of datasets. By shifting the focus from model architecture to the quality and structure of input data, Croissant facilitates the creation of more robust and accurate AI systems. We examine a proposal for an ecosystem supporting Croissant, emphasizing its integration with data providers, tech enablers, and governance frameworks. Furthermore, we explore how this tool can streamline the development process, improve data discoverability, and promote responsible AI practices, ultimately leading to improved outcomes in healthcare and other domains.



Steffen Vogler (he/him) is a Principal Imaging Technology Scientist at Bayer Radiology-R&D, leading research and product development on AI in medical computer vision with a special focus on the Radiology domain. His interests are in data-centric machine learning, ethical AI and health equity. He has been a member of the ITU-WHO Focus Group "Artificial Intelligence for Health" and is currently curator of the discovery track “Data-centric Machine Learning for Good” at “AI for Good”. Prior to joining Bayer, he completed a PhD in Neurobiology and worked on basic research questions around memory formation in the mammalian brain.

- New metadata exchange format: https://github.com/mlcommons/croissant
- Bayer’s AI Innovation Platform: https://app.innovationplatform.ai/