Cancer.Poly: A Hybrid Polystore/LLM Simplifying Access To Knowledge in Clinical Data Lakes
Speaker
Michael Gubanov
Florida State University
Host
Michael Stonebraker
Title:
Cancer.Poly: A Hybrid Polystore/LLM Simplifying Access To Knowledge in Clinical Data Lakes
Abstract:
Clinical data lakes store a wealth of patient information that is often life-saving.
However, reaching that information is too time-consuming because of knowledge representation barriers and the scale of the data.
Hence, very few people use all available information, and consequently patients often do not get the best treatment.
In this talk, I will describe our work on Cancer.Poly, a hybrid in-memory analytical Polystore/LLM designed to simplify access to
complex clinical data lakes. I will briefly review its architecture and key use cases, then focus in more detail on our custom embeddings,
which are the foundation of most system components.
Bio:
Michael Gubanov is an Assistant Professor at Florida State University. His research interests include Polystore databases, Data fusion,
Artificial Intelligence (AI), and Large Language Models (LLMs). His most recent project, in collaboration with Moffitt Cancer Center and Research Institute,
is a Hybrid Polystore/LLM for Cancer Data Lakes.
Gubanov earned his PhD from the University of Washington (UW), where he worked on scaling up Data fusion.
During the PhD program, he spent time working at IBM Almaden Research Center on Data Integration (project Clio, productized as a part of IBM Infosphere);
at Google on Web-search and Large-scale Machine Learning (productized as parts of SETI and Froogle);
and at Microsoft Research on Natural Language Processing and Web-search (productized as a part of Bing!).
After UW, Gubanov was a Postdoc at MIT CSAIL working with Prof. Michael Stonebraker, A.M. Turing Award Winner, on Web-scale fusion and profiling
of unstructured and structured data.
Gubanov's team was a finalist in the Nokia X-Prize 2.25M challenge and the Vodafone Americas Foundation Annual Wireless Innovation Project.
He is a recipient of the NASSHP Young Investigator Award, IEEE ICDE Best Paper Award, ACM SIGMOD Research Highlight Award,
Communications of the ACM (CACM) Research Highlight Award, and Amazon AWS Artificial Intelligence (AI) Award.
His research is sponsored by NSF, Amazon, and the Florida Department of Health Casey DeSantis Florida Cancer Innovation Fund.
Tianyu Li is inviting you to a scheduled Zoom meeting.
Join Zoom Meeting
https://mit.zoom.us/j/96099569635?pwd=clk4bUVYbnNaZmlOTjVYUFAzK0Rqdz09
Password: 071013
One tap mobile
+16465588656,,96099569635# US (New York)
+16699006833,,96099569635# US (San Jose)
Meeting ID: 960 9956 9635
US: +1 646 558 8656 or +1 669 900 6833
International Numbers: https://mit.zoom.us/u/abaKYlLdE
Join by SIP
96099569635@zoomcrc.com
Join by Skype for Business
https://mit.zoom.us/skype/96099569635