Juan Manuel Zambrano Chaves


Google Scholar






I am a physician-scientist, currently a PhD candidate in the Department of Biomedical Data Science at Stanford University. I am passionate about developing methods that help clinicians provide better care using tools such as artificial intelligence.

Most recently, I am interested in developing foundation models that combine multiple data modalities, such as medical images, text, and tabular medical record data. I aim to use these methods to improve risk prediction for diseases.

I have undergraduate degrees in biomedical engineering and medicine from Universidad de los Andes, located in my native Colombia. Outside of work I enjoy cycling and spending time in nature.

Current Research

I research foundation models in medicine. I am fortunate to be co-advised by Daniel Rubin, Akshay Chaudhari, and Curt Langlotz. I am also fortunate to collaborate with other talented researchers in Radiology at Stanford, including Robert Boutin and Andreas Loening, with whom I work on opportunistic imaging and on developing automated protocoling tools in radiology, respectively.

Opportunistic imaging

As an example of multimodal data fusion work, I have developed methods to assess the risk of ischemic heart disease, the world's leading cause of morbidity and mortality, using abdominal computed tomography and medical records. Our best-performing fusion model improves predictive performance compared to models recommended by current clinical guidelines (an increase in F1 score of 19 points). Even more exciting: we show this analysis can be performed on images already acquired during routine practice, providing additional diagnostic value to the patient without additional radiation exposure. We also developed improved methods for model interpretability in this project. An initial version of this work is available as a preprint. Stay tuned for the (extensively) peer-reviewed version with some surprises to come.
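To give a flavor of the late-fusion idea, here is a toy sketch: concatenate features derived from the CT image with features from the medical record and apply a single logistic layer. This is not our actual model (which uses learned deep features); every feature, weight, and number below is made up for illustration:

```python
import math

def late_fusion_risk(ct_features, ehr_features, weights, bias):
    """Toy late-fusion risk model: concatenate CT-derived and
    record-derived features, then apply one logistic layer."""
    fused = list(ct_features) + list(ehr_features)
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))  # predicted probability of disease

# Toy example with made-up values
p = late_fusion_risk(
    ct_features=[0.8, 0.1],        # e.g. body-composition measures from CT
    ehr_features=[1.0, 0.0, 1.0],  # e.g. binary risk factors from the record
    weights=[0.5, -0.2, 0.3, 0.1, 0.4],
    bias=-0.5,
)
```

In practice the weights are learned, and the fused representation can feed a deeper network rather than a single logistic layer.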

I've also framed disease prediction as a multitask problem. The idea is that instead of developing separate models for individual diseases, we can train a single model that leverages images (and other data modalities) from multiple patient disease cohorts to improve label efficiency and predictive performance. One setting where this could be particularly useful is assessing the risk of multiple cardiometabolic diseases (e.g., diabetes mellitus, hypertension, atherosclerotic cardiovascular disease), which share common risk factors and often present together in the same patient. In an early example of this work, we demonstrated the potential of this approach using computed tomography images alone (MICCAI 2022). We are working to show that this approach can also benefit from multimodal fusion, as in the ischemic heart disease example above.
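The multitask framing can be sketched minimally as one shared feature representation scored by a lightweight head per disease. The disease names and all weights below are illustrative, not our trained model:

```python
import math

def multitask_risks(shared_features, task_heads):
    """Toy multitask prediction: one shared representation,
    one linear head (weights, bias) per disease."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))
    risks = {}
    for disease, (weights, bias) in task_heads.items():
        logit = sum(w * x for w, x in zip(weights, shared_features)) + bias
        risks[disease] = sigmoid(logit)
    return risks

# Toy example: the same image-derived features scored for three conditions
features = [0.6, -0.3, 1.2]
heads = {
    "diabetes":     ([0.4, 0.1, 0.2], -0.3),
    "hypertension": ([0.2, -0.5, 0.3], 0.0),
    "ascvd":        ([0.5, 0.0, 0.1], -0.2),
}
risks = multitask_risks(features, heads)
```

The appeal is that the shared representation is trained on all cohorts at once, so each head benefits from labels collected for the other diseases.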

These methods can also help advance our understanding of disease. For example, we can scale them to analyze body composition across an adult population, deriving imaging biomarkers of disease. In work presented at the Society for Advanced Body Imaging 2022 meeting, I showed how such population-wide analyses can be used to study hundreds of associations between imaging biomarkers and diseases, uncovering new associations and validating previous ones. Full paper on this coming soon.

We can also use these methods to improve the quality of disease documentation. In work presented at RSNA 2022, I shared how our methods can diagnose sarcopenia in a large population, and how the prevalence of sarcopenia revealed by imaging (~30% in our cohort, comparable to the ~20% prevalence in the general population) starkly contrasts with very low documentation in the medical record (<1% of cases). Our methods can help bridge this documentation gap, enabling improved quality of care. Moreover, they can facilitate future research by clinicians in many specialties, a prime example being sarcopenia, which, as we showed, is associated with hundreds of diseases.

Radiology NLP

I've also developed tools for automated protocoling in radiology. Protocoling consists of taking a physician's reason for exam, together with the patient's health record, and specifying a radiology protocol (i.e., the specific scanner-level parameters and sequences that will be used to acquire the image). This process is currently done manually. Automated methods such as the ones I have developed, which correctly protocol >90% of cases, can save valuable time and potentially reduce human errors. We presented early work on this at ISMRM 2022, submitted a patent disclosure on which I am a co-inventor, and are in the review process for a first manuscript. This work has led me to take a deeper dive into natural language processing and understanding methods in the radiology domain.
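To make the task concrete, here is a deliberately simplistic rule-based sketch of protocoling; our actual system uses learned NLP models rather than keyword rules, and the protocol names and keywords below are invented for illustration:

```python
# Toy sketch: map a free-text reason for exam to a protocol by keyword lookup.
# The table below is illustrative only, not clinical guidance.
PROTOCOL_KEYWORDS = {
    "mri brain with contrast": ["tumor", "mass", "metastasis"],
    "mri brain without contrast": ["headache", "stroke"],
}

def suggest_protocol(reason_for_exam):
    """Return the first protocol whose keywords match, or None to
    defer the case to a human protocoler."""
    text = reason_for_exam.lower()
    for protocol, keywords in PROTOCOL_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return protocol
    return None

suggest_protocol("45F with new headache")  # -> "mri brain without contrast"
```

A learned model replaces the keyword table with representations of the free text and record, but the input/output contract, free text in and protocol (or deferral) out, is the same.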

Lately, I’ve been working on building improved representations for radiology text, as well as ways to evaluate them. Stay tuned for soon-to-be submitted work in this space. In the meantime, check out our work facilitating vision and language research in medical imaging presented as an ACL 2022 demo.

Prior Research

Prior to Stanford, I performed research as a biomedical engineering undergraduate and as a medical student at the Universidad de los Andes Department of Biomedical Engineering and School of Medicine.


I worked with Juan Manuel Cordovez and the Mathematical and Computational Biology - BIOMAC group, where I co-developed an agent-based model of adults walking in a city. Such models can help policymakers study the impact of public transportation on promoting healthy habits such as walking for transportation.

I also worked with Olga Lucia Sarmiento and the Epidemiology group - EpiAndes, where I contributed to a cross-sectional study of adults walking for transportation in Bogotá. We identified the role of TransMilenio, a bus rapid transit system, in promoting physical activity among adults in the city.

Wet lab

As a member of the Basic Medical Sciences - CBMU group led by John Mario González, I helped develop an in vivo zebrafish (Danio rerio) model for the parasite causing Chagas disease (Trypanosoma cruzi). Chagas disease is common in the Americas, with over 8 million infected individuals. Unfortunately, it can cause irreversible heart damage requiring heart transplant, and it particularly affects low-resource communities. To make things worse, there is still no effective treatment for chronic disease, which is when most people realize they have it. Our model, which has since been further developed by researchers in the lab, may aid in understanding the pathophysiology of the disease and in developing pharmacological interventions. I was fortunate to win a paid trip to present my contribution at the Latin American Zebrafish Association - LAZEN 2014 meeting and course (see a picture of younger me in Figure 1 :)).

Last but not least, I co-led work identifying the potential of gene therapy for recovery after cervical spinal cord injury in an animal model. The therapy, an adeno-associated virus delivery of TrkB (a receptor for brain-derived neurotrophic factor), achieved a ~4x improvement over the control group in recovery of electrical activity in the diaphragm (the main breathing muscle) after spinal cord injury. As a main contributor to this project, my roles ranged from co-designing the study and performing animal surgery to statistical analyses (some of my first) and manuscript preparation. This work led to my first first-author publication.