In our prior post, we introduced structuring and abstraction, two processes needed to get real-world medical records into a form that we can draw insights from. We also explained how those processes map to the latest AI techniques. In this post, we go deeper into what medical records look like in the wild and what happens when we try to apply AI techniques to them.
Real-world medical records quickly break common assumptions used to develop AI techniques in academic settings, and it’s these real-world issues that make our job difficult and interesting. They’re also the reason that “pretty good on a benchmark” has never translated into a great solution for the tasks we train on.
Noise
Medical records are filled with mistakes. Among our hemophilia patients, each of whom has one of two subtypes of the disease (A or B), we see that nearly 30% have a contradictory diagnosis somewhere in their records. This could represent a true misdiagnosis, but more often it reflects mundane realities like a provider picking the wrong option from a drop-down menu and other folks copying and pasting that result forward in EHR software. My own record from a leading healthcare institution had my body temperature listed at 98 degrees C, which would be unpleasant if true. In another case, we received a faxed record with some unfortunate artifacts mangling the word “Not” in the context of printing “Not Detected” for the results of a disease test panel.
Tolerance to these types of errors is table stakes for working with records. For us, it most often means looking at records longitudinally to make sure we have a consistent picture of patient health given all of the evidence. No one record tells the story of a patient’s healthcare journey. In practice, this means training an LLM that looks at snippets from multiple records together, and considers who wrote a piece of information, when, and how it connects to other evidence, before drawing a conclusion.
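To make the idea of weighing evidence across records concrete, here is a toy sketch. It is not the LLM described above, only a simple weighted vote standing in for it. The `Mention` schema, the specialty weights, and the function name are all hypothetical and chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One statement of a hemophilia subtype found in one record (hypothetical schema)."""
    subtype: str           # "A" or "B"
    author_specialty: str  # who wrote it


# Assumed weights: a hematologist's note counts more than any other source.
SPECIALTY_WEIGHT = {"hematology": 3.0}
DEFAULT_WEIGHT = 1.0


def resolve_subtype(mentions):
    """Return (winning subtype, its share of total weight, whether records conflict)."""
    if not mentions:
        return None, 0.0, False
    totals = Counter()
    for m in mentions:
        totals[m.subtype] += SPECIALTY_WEIGHT.get(m.author_specialty, DEFAULT_WEIGHT)
    winner, weight = totals.most_common(1)[0]
    return winner, weight / sum(totals.values()), len(totals) > 1


mentions = [
    Mention("A", "hematology"),
    Mention("A", "hematology"),
    Mention("B", "primary care"),  # e.g. a wrong drop-down choice copied forward
]
subtype, share, conflict = resolve_subtype(mentions)
print(subtype, round(share, 2), conflict)
```

Returning the conflict flag alongside the answer keeps the contradiction visible for review instead of silently discarding the minority evidence.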
It also means building a multi-layered QC system that lets us define both data conformance and plausibility rules. These rules make sure a disease-modifying therapy didn’t start before the patient was diagnosed, or a condition only possible in females is not identified for a male patient.
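As a rough illustration of how conformance and plausibility rules like these might be layered, consider the sketch below. The `PatientSummary` schema, the example condition list, and the temperature bounds are assumptions made for this example, not a description of our production rules.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PatientSummary:
    """Hypothetical flattened view of one patient's structured data."""
    sex: str
    diagnosis_date: Optional[date] = None
    therapy_start_date: Optional[date] = None
    conditions: List[str] = field(default_factory=list)
    body_temp_celsius: Optional[float] = None


ALLOWED_SEX_VALUES = {"female", "male", "unknown"}  # assumed vocabulary
FEMALE_ONLY_CONDITIONS = {"ovarian cyst"}           # illustrative entry only
PLAUSIBLE_TEMP_RANGE_C = (30.0, 45.0)               # assumed bounds


def check_conformance(p):
    """Layer 1: does the data fit the expected format and vocabulary?"""
    issues = []
    if p.sex not in ALLOWED_SEX_VALUES:
        issues.append(f"unexpected sex value: {p.sex!r}")
    return issues


def check_plausibility(p):
    """Layer 2: is well-formed data clinically believable?"""
    issues = []
    if (p.diagnosis_date and p.therapy_start_date
            and p.therapy_start_date < p.diagnosis_date):
        issues.append("therapy started before diagnosis")
    if p.sex == "male":
        for condition in p.conditions:
            if condition in FEMALE_ONLY_CONDITIONS:
                issues.append(f"female-only condition for male patient: {condition}")
    low, high = PLAUSIBLE_TEMP_RANGE_C
    if p.body_temp_celsius is not None and not low <= p.body_temp_celsius <= high:
        issues.append(f"implausible body temperature: {p.body_temp_celsius} C")
    return issues


def run_qc(p):
    """Run both layers and return every issue found, for human review."""
    return check_conformance(p) + check_plausibility(p)


patient = PatientSummary(
    sex="male",
    diagnosis_date=date(2020, 5, 1),
    therapy_start_date=date(2019, 1, 1),
    conditions=["ovarian cyst"],
    body_temp_celsius=98.0,
)
for issue in run_qc(patient):
    print(issue)
```

Each rule returns a message rather than raising an error, so one record can surface several issues at once.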
And most importantly, it means having a technical feedback loop. From the very start of PicnicHealth, it’s been clear that we can’t anticipate all of the artifacts and corner cases we’ll see when working with medical data. Instead, what is most important is building systems to identify mistakes, fix them, and incorporate that result into all of the future processing we do.
… so what does this mean? While a sophisticated OCR model may contend with strikethroughs like this, we are still left with an interpretive question – was evidence of a heart attack detected or not? Even simple examples like this make clear that looking across multiple records is absolutely necessary when building an accurate, consistent model of a patient’s health.
Records are big, so LLM context generation and provenance matter
The biggest single record we’ve received was over 24,000 pages. Yup. It combined documents from visits spanning decades within a large hospital system for a patient with Sickle Cell Disease, a population that tends to have many long tables of lab results in their records. Below, we show the distribution of record sizes in our data today, noting that even if we only processed 1,000 records a week, we’d see one with more than 10k pages each week.
We regularly receive individual records that have more than 1,000 pages. In fact, if we processed only 1,000 records a week, we’d see at least one with more than 10k pages each week.
LLMs are improving in their ability to use and consistently latch onto information in long context windows, and meanwhile exciting new techniques like Selective State Space models might come fully online to work with more input data. But today, to use the most battle-tested foundational models, we have to manage context window limitations carefully. For our average record of about 30 pages, with about 280 words per page, we need a 32k context window to grok one record at a time (if we assume 3.5 tokens per word). For an easy patient with 20 records, we would need at least 600k context size to find patterns that occurred over time in their care journey.
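The arithmetic behind those numbers can be checked with a short script. The page, word, and token figures come from the text above; reading “32k” as 32,768 tokens is an assumption, and the largest record is described only as “over 24,000 pages,” so its token count is a lower bound.

```python
import math

# Figures quoted in the text above.
PAGES_PER_RECORD = 30
WORDS_PER_PAGE = 280
TOKENS_PER_WORD = 3.5
LARGEST_RECORD_PAGES = 24_000  # "over 24,000 pages", so a lower bound

# Assumption: "32k" means 32,768 tokens.
CONTEXT_WINDOW_TOKENS = 32_768


def tokens_for_pages(pages):
    """Estimate the token count for a given number of pages."""
    return math.ceil(pages * WORDS_PER_PAGE * TOKENS_PER_WORD)


def windows_needed(tokens, window=CONTEXT_WINDOW_TOKENS):
    """How many context windows are needed to hold this many tokens."""
    return math.ceil(tokens / window)


average_record_tokens = tokens_for_pages(PAGES_PER_RECORD)
twenty_record_tokens = 20 * average_record_tokens
largest_record_tokens = tokens_for_pages(LARGEST_RECORD_PAGES)

print(average_record_tokens)   # 29400, which fits in one 32k window
print(twenty_record_tokens)    # 588000, roughly the 600k quoted above
print(largest_record_tokens)   # 23520000
print(windows_needed(largest_record_tokens))
```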
Identifying relevant context is particularly important for abstraction tasks. It is both a practical matter and an appropriate representation of how clinicians flip through a stack of records when their time is limited. When it comes to validating abstracted information, and building software interfaces that make validation easy, well-designed context generation becomes a powerful mechanism for illuminating how we uncovered the evidence behind a conclusion.
We find that provider specialty and dates are important first-pass guideposts and lend themselves to easily explainable provenance for the data we produce. Searching for individual concepts also helps, but only once we have aligned across the many variations of how they’re written and the many semantically-different but clinically-similar concepts. Connections between concepts – a multiple sclerosis patient must have their diagnosis confirmed by a neurologist to be definitive, a patient with Paroxysmal nocturnal hemoglobinuria (PNH) will always have a particular panel of labs drawn – also help us build the right LLM inputs and iteratively refine answers.
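A minimal sketch of first-pass context selection by specialty and date, with provenance carried alongside each snippet, might look like the following. The `Snippet` schema, the word budget, and the function names are hypothetical; concept search and concept relationships would be layered on top of this.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snippet:
    """A piece of text with enough metadata to trace it back (hypothetical schema)."""
    record_id: str
    page: int
    visit_date: date
    provider_specialty: str
    text: str


def select_context(snippets, specialties, start, end, max_words):
    """Keep snippets from the given specialties and date window, oldest first,
    stopping before the word budget would be exceeded."""
    candidates = sorted(
        (s for s in snippets
         if s.provider_specialty in specialties and start <= s.visit_date <= end),
        key=lambda s: s.visit_date,
    )
    selected, used = [], 0
    for s in candidates:
        n_words = len(s.text.split())
        if used + n_words > max_words:
            break
        selected.append(s)
        used += n_words
    return selected


def provenance(selected):
    """Explainable trail: which record, page, and date each snippet came from."""
    return [(s.record_id, s.page, s.visit_date.isoformat()) for s in selected]


snippets = [
    Snippet("rec-1", 4, date(2021, 3, 2), "neurology", "MRI consistent with MS."),
    Snippet("rec-2", 1, date(2021, 6, 9), "cardiology", "Routine follow-up."),
    Snippet("rec-3", 7, date(2015, 1, 5), "neurology", "Headache evaluation."),
]
chosen = select_context(
    snippets, {"neurology"}, date(2020, 1, 1), date(2022, 12, 31), max_words=500
)
print(provenance(chosen))
```

Because the filter uses only specialty and date, the reason a snippet reached the model can be stated in one sentence, which is what makes the provenance easy to explain.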
Nuance
When we say nuance, we mean uses of language in medical records that we don’t expect to find often in other corpora – and most importantly, in the pre-training dataset for common LLMs.
For example, simple things like the date of a visit – an obvious question from a patient’s perspective – can be absolutely buried within a record. In one recent example we looked at, a document had the date it was printed, the date and time the physician note was first written, the date and time it was amended, the date and time it was signed, the date labs were ordered, the date and time the samples were received at a laboratory (after the visit had concluded), the date and time when those results became available, etc.
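To see why finding dates is the easy part, consider the sketch below. The sample note is invented for illustration and mirrors the kinds of timestamps listed above; a plain regular expression stands in for a date extractor. It finds every date, but nothing in its output says which one is the visit date.

```python
import re
from datetime import datetime

# Matches dates written as MM/DD/YYYY (an assumed format for this example).
DATE_PATTERN = re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b")

# Hypothetical note text, invented for illustration.
SAMPLE_NOTE = """\
Printed: 03/14/2023
Note written: 03/02/2023 10:15
Note amended: 03/03/2023 08:40
Note signed: 03/03/2023 17:05
Labs ordered: 03/02/2023
Samples received: 03/02/2023 16:30
Results available: 03/06/2023 09:00
"""


def find_dates(text):
    """Return every MM/DD/YYYY date in the text, in order of appearance."""
    return [
        datetime.strptime(match, "%m/%d/%Y").date()
        for match in DATE_PATTERN.findall(text)
    ]


dates = find_dates(SAMPLE_NOTE)
distinct_dates = sorted(set(dates))
print(len(dates), "dates found,", len(distinct_dates), "distinct")
for d in distinct_dates:
    print(d.isoformat())
```

Choosing the visit date from that list depends on the labels and on how the facility documents its workflow, which is the interpretive step that extraction alone does not cover.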
In this sea of similar data, the dates are easy to spot with off-the-shelf Named Entity Recognition (NER) models, but the meaning of those dates requires interpretation. In the example we looked at, the dates were mapping out the workflow of that particular facility for that particular outpatient visit type. When we ask even a powerful model like GPT-4 to answer a simple question like “when did this appointment happen [as the patient thinks of it]?” we find that models not trained directly on labeled records data easily miss the nuances in the data they’re looking at. We find this same behavior often when navigating the many names of providers, technicians, and other support staff documented in records, as illustrated below.
The healthcare system still relies heavily on faxes, CDs, and other seemingly outdated technology for sending information between providers. This means that even records produced by modern EHR systems can quickly get distorted as they zip around. In this example, the signature line has become barely legible after transmission. Without careful training or access to more information, an NER model applied to the text will confidently flag a single provider name, but associating that provider with the visit would be incorrect, since the uncertain signatory is the one who saw the patient. Better to build a model that knows to cross-reference other parts of the record to find the answer.
It’s in cases like this where the power of our training dataset shines – extensively labeling records (for almost 10 years!) has given us more than enough examples of the esoteric patterns of how medical encounters are documented. Our data captures patterns by facility, by specialty, among clusters of similarly-run practices, etc., and these get incorporated in our LLM when we fine-tune on such a large volume of data. This level of complexity also demonstrates how much further models need to go than off-the-shelf NER to perform even basic real-world applications that require interpreting records.
Clinical Knowledge Matters in the Long Tail
Another tricky dynamic of medical records data is that what matters most is not always apparent in the statistics of the data. That lab panel we mentioned for PNH only happens once per patient and, across the full population of our users, only among a few patients – but it is of the utmost importance. Even an LLM with 99% recall may miss it, and that is a problem we have to solve.
For us, the practical place to start was by getting clinicians, epidemiologists, biostatisticians, and AI experts to work together all day, every day. PicnicHealth has teams that intermingle all of these specialties, and we’ve built technical systems to encode domain expertise into even the lowest levels of our structuring tasks. This means tools to help quickly examine data (e.g., to report precision and recall on obscure concepts), QC rules that automatically flag plausibility and conformance issues for human review, and workflows to amplify and augment specific training examples so that our data distribution reflects importance rather than simply prevalence. It also means making sure we can flag and manually review records that may be affected by a blind spot.
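As a small illustration of why per-concept reporting matters, the sketch below computes precision and recall separately for each concept. The concept names and document IDs are made up, and the function is a generic implementation rather than our internal tooling.

```python
from collections import defaultdict


def per_concept_metrics(gold, predicted):
    """Compute precision, recall, and support per concept.

    gold and predicted are sets of (document_id, concept) pairs.
    A metric is None when its denominator is zero.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for pair in predicted:
        counts[pair[1]]["tp" if pair in gold else "fp"] += 1
    for pair in gold - predicted:
        counts[pair[1]]["fn"] += 1

    metrics = {}
    for concept, c in counts.items():
        predicted_total = c["tp"] + c["fp"]
        gold_total = c["tp"] + c["fn"]
        metrics[concept] = {
            "precision": c["tp"] / predicted_total if predicted_total else None,
            "recall": c["tp"] / gold_total if gold_total else None,
            "support": gold_total,
        }
    return metrics


# Made-up example: a common lab is always found, a rare panel is missed.
gold = {("d1", "cbc"), ("d2", "cbc"), ("d3", "cbc"), ("d4", "pnh_panel")}
predicted = {("d1", "cbc"), ("d2", "cbc"), ("d3", "cbc")}

metrics = per_concept_metrics(gold, predicted)
overall_recall = len(gold & predicted) / len(gold)
print("overall recall:", overall_recall)
for concept in sorted(metrics):
    print(concept, metrics[concept])
```

In this toy data the overall recall looks acceptable while recall on the rare panel is zero, which is the kind of blind spot that aggregate metrics hide.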
Today, our LLM powers both patient-facing and research-facing products. For patients, we provide a platform that empowers them with ownership of their data and gives them tools for curating records, managing and coordinating care, and accessing our own care services. For researchers, we help run far more efficient observational studies, in part by developing software that helps abstractors generate regulatory-grade Case Report Form data quickly and accurately from the information already contained in patient records. This sidesteps the need to run physical sites that are expensive, complicated, and ultimately a burden to all involved. The features and products that our AI enables are universally exciting, but what we do today is just the beginning.
1. Provider assessments
PicnicHealth’s providers can schedule virtual visits with study participants to conduct assessments required by the study protocol. Using clinical expertise, these assessments help evaluate participants' symptoms, overall health, and functional ability.
2. Diagnostics
The PicnicHealth care team can order specific diagnostic tests, such as labs or imaging, if they weren't part of the patient's routine care. This ensures that sponsors have all the necessary data to address their unique research questions.
3. Safety and adverse event reporting
PicnicHealth’s clinical team can provide support to ensure appropriate safety reporting. This includes monitoring for safety events to support safety adjudication.
4. Primary Investigator (PI) oversight
The PI of the PicnicHealth Virtual Site provides clinical oversight to ensure appropriate study conduct, including assessing whether the study is following study protocol, meeting compliance with regulatory standards and good clinical practice guidelines, collecting data accurately, and maintaining documentation and producing progress reports as required.
25,966 patients onboarded to platform
1,427,368 medical visits processed
56,861 facilities provided medical records
255,101 healthcare providers
95+ research programs
12 published posters and manuscripts
10 partnerships with top 30 pharma
New Research
Discover how PicnicHealth data powered medical research in 2021
This year, experts from PicnicHealth joined podcasts, webisodes, virtual summits and much more to speak to the importance of patient-centric approaches when building complete, deep real-world datasets.
Sickle cell (SC) is the most common inherited blood disorder in the United States. Red blood cells become rigid and shaped like crescent moons, preventing oxygen from getting to parts of the body. This can cause fatigue, severe pain, organ damage or stroke.
List the names of all the doctors, hospitals, and other facilities your loved one visits regularly, along with those they have visited in the past. Try to go back as far as you can, striving for at least the last 5-10 years, but do your best. Even if you can’t remember them all, having a strong baseline can help you quickly identify gaps in records.
Ensure You Have the Appropriate Legal Status
It is important to make sure that you are fully empowered to make decisions on behalf of your loved one with Alzheimer’s. Your relationship status with the patient may not be enough to legally give you access to your loved one's medical information. It is a good idea to talk to an expert about securing special legal status, such as Power of Attorney (POA), a legal document that allows an individual to name someone as their decision maker should they no longer be able to make decisions on their own.
Gather and Organize the Medical Records in One Place
It’s important to have all of your loved one’s medical records together in one spot. This makes it much easier for you and your loved one’s physicians to accurately map the patient’s medical journey and more easily share information between doctors. Fortunately, tools exist to make record management and access simple. A free resource like PicnicHealth helps you collect and organize all of this information. PicnicHealth’s intuitive timeline allows you to pinpoint data across the medical history, eliminating your need for keeping heavy binders filled with paper records or keeping track of multiple software portal logins.
Review the Medical Records to be an Informed Advocate
The better you understand your loved one's medical history, the better you can advocate on their behalf. Access to and understanding of this information will help you ask physicians informed questions. Through regular communication backed by the data in the medical records, you can help your loved one’s care team develop a more successful care plan.
Learn more about PicnicHealth’s commitment to the Alzheimer’s community and the Alzheimer’s Association
When you’re juggling appointment times and insurance claims, putting a robust support system together might not strike you as the most urgent task. But investing the time to cultivate relationships with people you can turn to in times of need will pay dividends. The next time you need a last-minute ride or just someone to listen, you won’t be on your own. There are many condition-specific support groups, as well as general support groups for caregivers, that meet in person or online. In addition to the encouragement and empathy they provide, support groups can be a helpful source of tips, resources, and recommendations for navigating caregiving.
2. Stay organized.
The backbone of effective caregiving is organization. Keep medical information, appointment schedules, and medication lists in order. Use a planner or a digital service like PicnicHealth to stay on top of your responsibilities. This attention to detail can prevent future complications and reduce day-to-day stress.
3. Explore treatments and clinical trials.
We’ve seen incredible breakthroughs in treatment over the past couple of years, powered by patients and their caregivers participating in research. Stay in the loop about the latest in medical advancements and available resources that could benefit your loved one. Whether it’s a new therapy option or a community service that aids independence, being informed can make a world of difference in the quality of care you provide.
4. Make time for self-care.
It may seem self-centered to focus on self-care—but when you feel good, you can be a better caregiver. Whether it’s exercise, a mindfulness practice, a soak in the bath, or just time to rest when you need it, carve out those moments in the day when you can unwind, reset, and stay healthy mentally and physically. Think of it as building up your reserves of kindness, patience, and understanding—which can only benefit your loved one. No one can pour from an empty cup.
Having trouble managing your loved one's medical records?
Easily manage all of your loved one's medical records and contribute to ongoing Alzheimer's research with PicnicHealth.
Tip: Download or print the poster at the end of this article to review before your next appointment!
However, it's important to consult with a healthcare provider or registered dietitian to determine the appropriate amount of protein for your individual needs. In general, a diet with moderate protein intake (about 0.8 grams per kilogram of body weight per day) is recommended for people with kidney diseases.
Learn more about contributing to IgAN research with PicnicHealth.
A tablet, phone, or laptop with a working camera, microphone, and stable internet connection.
A quiet, distraction-free area with enough space to walk a few steps if applicable.
A chair that you can use during any movements or tasks you’ll be asked to perform.
The tripod mailed to you via Amazon.
What to Expect
Before your video call:
Book Your Assessment
Visit your to-do list on your PicnicHealth Research Dashboard or click the scheduling link sent to your email. Note: Search for “New task for the ORBIT-CIDP Study” to find the video call scheduling link.
Receive Confirmation
Check your email for a confirmation with your scheduled video call time and instructions.
On the day of your video call:
Click on Video Link
Join your personal video call using the link we sent by email or text message, or find it on your research dashboard.
Meet your nurse
A Registered Nurse (RN) will guide your virtual assessment, which will last about 30 minutes.
Complete the Physical Activity Assessment (INCAT)
The nurse will guide you through questions and, if needed, physical tasks to help researchers gain a deeper understanding of CIDP.
Complete the Movement Assessment (Optional)
If you participate, a nurse will guide you through three short recorded movement activities to complete as best you can:
Chair Task
While seated with your arms crossed over your chest and hands on opposite shoulders, you’ll be asked to stand up, remain standing for 20 seconds, and then sit back down.
Arm Movement Task
While seated with your arms resting at your sides, you’ll be asked to raise both arms out to the sides until they meet above your head, then lower them back to your lap.
Finger Dexterity Task
While seated, raise your right hand with fingers extended. Touch your thumb to each fingertip in order, then reverse. Repeat with your left hand.
Earn Compensation
Receive up to $55 for your participation:
$25 for completing the Physical Activity Assessment (INCAT).
$30 for the Optional Movement Assessment.
Recording: Your research assessment may be recorded to ensure accurate data collection. If you participate in the optional Movement Assessment, it will also be recorded. These recordings may capture your voice and responses, but identifiable information like your face, name, or background will be removed to protect your privacy.
Opt Into the Smart Insole Study Activity
Complete the opt-in survey to confirm your participation.
Receive Your Smart Insoles
Your smart insoles will be shipped to your home via FedEx and should arrive within 1 week.
Create Your Account
You’ll receive an email from Celestra Health with your account details. Follow those steps to set up your account.
If you don’t see an email from Celestra Health in your inbox, please check your spam or junk folder.
Download the App
After creating your account, you’ll be directed to a landing page with links to the App Store or Google Play. Use the link to download the correct version of the app for your device.
For illustrative purposes only, your insoles may look different
Log In
Open the app and log in using the email address and password you used when creating your account.
Enable Permissions
For iOS users: Enable Motion & Fitness and allow access to Apple Health.
For Android users: Enable Activity Recognition permissions.
Connect Your Insoles
Turn on Bluetooth, and follow the app's instructions to connect your smart insoles.
Enable Notifications
Enable push notifications to stay updated on reminders and activity progress.
Start Walking Sessions
When you’re ready to perform a walking session, tap ‘Start’ on the Ad Hoc Walking task card in the app.
Smart insoles are designed to fit comfortably into any pair of closed shoes
Need Help?
Should you need to contact Celestra Health support for any reason, you can submit a ticket through the Help section of the app by tapping the Submit A Ticket card and filling out the form. A Celestra Health representative will typically respond within one business day.
A fully charged device (smartphone, tablet, or laptop) with a working camera, microphone, and stable internet connection.
A quiet, well-lit space that is free from distractions.
Good lighting so your face is clearly visible; having a small flashlight or your phone’s flashlight nearby can help with skin, scalp, or joint checks.
Flexible device positioning so you can easily adjust or prop up your device hands-free if the research staff asks to view specific areas (such as your face, hands, or scalp).
Space to move in case you are briefly asked to stand or walk a few steps.
Your medication information, including your current steroid(s) and BENLYSTA® (belimumab) — either the medication bottles or a list with doses and schedule.
Time to focus without interruptions so the visit can be completed comfortably.
Before Your Video Call:
Schedule your visit
Use the scheduling link on your PicnicHealth Research Dashboard or the link sent to your email. Tip: Search your inbox for “New task for the BEACON-SLE Study - schedule your remote visit” to find the scheduling email.
Check your confirmation
You’ll receive an email with your appointment time and instructions for joining the video call.
On the Day of Your Video Call:
Join the call
Click the Zoom link sent to you by email or text message, or use the link available on your research dashboard.
Meet with the research staff member
They will ask you structured questions about your health and any lupus symptoms you’ve experienced over the past 30 days.
If needed, they may guide you through a few simple visual checks (such as looking at your skin, hair, joints, or mouth). You can always tell them if you’re not comfortable with anything.
Receive Compensation
You’ll receive up to $60 for completing your visit.
AI is modernizing non-interventional research and solving researchers’ greatest challenges, including incomplete data. Read the full article to learn more.
At Web Summit 2025 in Lisbon, our Co-founder and CTO, Troy Astorino, shared his thoughts on how AI can enable patient-focused care and research participation. He discusses how PicnicHealth simplifies access to diverse information sources and drives improved health outcomes. Check out this recap of his discussion!
Today, PicnicHealth released a preprint on LLMD, its state-of-the-art medical large language model (LLM). LLMD achieves human-level accuracy when extracting clinical insights from medical records, and outperforms the most powerful general and industry-specific LLMs. This model highlights how AI can realize the potential of real-world data (RWD) and offers life science companies a more efficient, powerful way to gain valuable clinical insights.