
AI models need to be built on more complete and global datasets

Empowering patients with a broader and more inclusive view of what's possible can lead to more personalized care, says one CEO. Failing to do so means AI will always fall short of its true potential to support better outcomes.
By Bill Siwicki
John Orosco, CEO of Red Rover Health
Photo: John Orosco

As artificial intelligence begins to transform healthcare, a big question often is overlooked: Are hospitals, health systems and IT vendors training AI on the right data?

Many AI models rely heavily on data from U.S. and European sources. This can create biases that limit treatment options and leave out valuable insights from other parts of the world. In fact, research has shown biased datasets can contribute to healthcare disparities and overlook effective treatments available outside the U.S.

John Orosco has plenty of experience in AI and datasets through his work as CEO of Red Rover Health. The company specializes in simplifying EHR integration through a platform that uses secure RESTful APIs to connect third-party software with EHR systems. RESTful APIs are a type of web interface that lets a consumer send requests to retrieve data from, or post updates to, a source system.

All of this is designed to enable healthcare organizations to enhance existing EHRs with best-of-breed systems, improving access to real-time patient data and streamlining clinical workflows.
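To make the RESTful pattern described above concrete, here is a minimal Python sketch of the two operations the description names: retrieving data and posting an update. It only builds the requests and does not send them. The base URL, resource paths, and field names are hypothetical placeholders for illustration, not Red Rover Health's or any EHR vendor's actual API.

```python
import json
from urllib.request import Request

# Hypothetical base URL and resource paths (assumptions for illustration only;
# this is not a real Red Rover Health or EHR vendor endpoint).
BASE_URL = "https://ehr.example.com/api"


def build_get_patient(patient_id: str) -> Request:
    """Build (but do not send) a GET request that would retrieve one patient record."""
    return Request(
        f"{BASE_URL}/patients/{patient_id}",
        headers={"Accept": "application/json"},
        method="GET",
    )


def build_post_note(patient_id: str, text: str) -> Request:
    """Build (but do not send) a POST request that would add a note to a patient record."""
    body = json.dumps({"text": text}).encode("utf-8")
    return Request(
        f"{BASE_URL}/patients/{patient_id}/notes",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Retrieval uses GET; an update to the source system uses POST with a JSON body.
get_req = build_get_patient("12345")
post_req = build_post_note("12345", "Follow-up scheduled.")
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url)
```

In a real integration, the requests would also carry authentication and travel over a secured connection, details the article does not specify.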

We spoke with Orosco about the primary challenge with AI and data, how AI can reach its full potential in healthcare by being trained on more diverse and global data, AI's connection with genomics and precision medicine, and why AI models should consider non-mainstream therapies to provide patients with the best possible treatment.

Q. What is the primary challenge today with artificial intelligence and data?

A. The main problem with AI in healthcare today isn't the technology itself – it's that we're still in the early stages of its evolution. Large language models continue to mature at a rapid pace, and while they're already showing incredible promise, it's clear we've only scratched the surface.

Early indicators suggest AI is here to stay and that it will fundamentally reshape how we approach automation, decision making and productivity across industries, especially in healthcare.

But as powerful as these models are becoming, their effectiveness hinges on access to data. AI can only be as good as the information it has to work with. And in healthcare, where data is often fragmented across different systems, buried in unstructured notes, or locked behind outdated interfaces, that's a real challenge.

Integration isn't just helpful – it's essential. Without access to comprehensive, well-connected data sources, the full potential of AI is effectively neutralized. It becomes a brilliant tool that's only able to see part of the picture.

So, the real focus right now shouldn't just be on what AI might do in the future, but on what we can do today to prepare for that future. That means breaking down data silos, building smarter infrastructure, and ensuring LLMs have access to as much relevant, high-quality data as possible.

As the models continue to improve – and they will – this foundation will be what determines how much real value we're able to unlock. In short, the models are maturing fast – now it's up to us to make sure the data is ready for them.

Q. You believe AI in healthcare can only reach its full potential if it's trained on more diverse and global data than it is today. Please elaborate.

A. It can only reach its full potential if it's trained on diverse, global datasets. Right now, much of the data being used to train LLMs comes from specific regions, mostly the U.S. While that might seem like a good starting point given the amount of healthcare data in the U.S., it's actually limiting.

Training AI solely on regional or national data bakes in the cultural, systemic and clinical biases of that region. It gives us a narrow lens through which AI understands medicine and health, and that fundamentally restricts its usefulness.

Take the U.S., for example. Our healthcare system tends to favor certain approaches to treatment, like prescribing medications or recommending surgery. In contrast, other countries might rely more heavily on natural remedies, alternative therapies or different care pathways altogether.

If AI is only trained on U.S.-based data, it will naturally reflect and reinforce those treatment patterns, even when other approaches might be equally or more effective in different contexts. That's one reason why many American patients who can afford it seek care abroad – because they believe there are effective treatments available outside the boundaries of FDA approvals or U.S. clinical norms.

If we truly want AI to support better health outcomes globally, we need to think beyond borders. That means training models on a wide range of data from different countries, cultures and care models. It's not just about volume, it's about variety. Diverse data makes AI smarter, more adaptable, and ultimately more equitable.

Without it, we risk building tools that are technically advanced but functionally narrow. If we want AI to reflect the full spectrum of human health and treatment possibilities, we need to give it a fuller picture of the world.

Q. You say genomics and precision medicine can offer more personalized care. What is the AI and data connection?

A. There's a powerful connection between AI, data and the future of personalized care through genomics and precision medicine. Think of the human body as an operating system. Each of us runs on our own unique source code, which is our DNA.

Mapping the genome is essentially decoding that system. It tells us how we're wired to respond to certain medications, how we metabolize drugs, and even what conditions we might be predisposed to. Despite this insight, however, much of modern medicine still takes a trial-and-error approach.

We prescribe treatments, then say things like, "Let's see how you feel in a week." That's inherently imprecise and often inefficient or even risky.

This is where AI can play a transformative role. When AI is trained on genomic data in combination with other clinical data like treatment protocols, lab results and real-world evidence, it becomes far more precise in its predictions and recommendations.

By including genomics in the data mix, AI can help identify the most effective treatments for each individual before trial and error even begins. It can also help avoid serious side effects by flagging medications a person is likely to metabolize poorly or not respond to at all.

The future of precision medicine depends on this kind of integration. Genomic data on its own is valuable, but its full potential is only realized when it's combined with broader datasets and analyzed at scale by AI. When that happens, we move closer to care that's not just personalized but proactive, predictive and safer. AI becomes the engine that turns data into insight, and genomics becomes a foundational layer in truly individualized care.

Q. You also suggest AI models should consider non-mainstream therapies to provide patients with the best possible treatment. What do you mean here?

A. What I mean is that AI models should expand their view beyond just local, mainstream treatment protocols, especially when those protocols are defined by regional governing bodies. Too often, AI systems are trained on datasets that reflect only what's been approved or reimbursed in one country, usually based on regulatory or insurance parameters.

While that might make sense from a compliance standpoint, it limits the potential of AI to offer patients a truly comprehensive view of available treatment options. Just because a therapy isn't FDA-approved or isn't covered by insurance doesn't mean it lacks merit. In fact, it might be widely accepted and effective in another country.

Non-mainstream or alternative therapies used globally shouldn't be ignored by AI simply because they fall outside the local medical playbook. Patients deserve to know what's available – not just within their zip code or insurance network, but around the world.

Of course, access and reimbursement are real barriers, and there are political and regulatory complexities, especially here in the U.S. But the role of AI should be to inform and expand the conversation, not narrow it.

If a patient sees an AI-generated treatment recommendation that includes a promising therapy used internationally, they can then discuss it with their physician and make an informed decision together.

At the end of the day, AI should serve as an unbiased guide, not one constrained by local policies or insurance limitations. Empowering patients with a broader view of what's possible can lead to more personalized, thoughtful care. It won't be easy to implement this globally inclusive mindset, but failing to do so means AI will always fall short of its true potential to support better health outcomes.

Follow Bill's HIT coverage on LinkedIn: Bill Siwicki
Email him: bsiwicki@himss.org
Healthcare IT News is a HIMSS Media publication.
