Typically, COTA partners with life science companies, providers, and payers to apply real-world evidence in order to improve cancer research and care delivery. With the need to quickly understand the current COVID-19 pandemic, COTA has redirected a significant amount of its resources to study more than 3,000 hospitalized COVID-19 patients in collaboration with Hackensack Meridian Health (HMH). Through this research, the organizations are generating real-time learnings on the pandemic through the analysis of real-world data.
Initial research findings include:
To further explain COTA’s in-depth analysis and research methods from this ongoing work, Kathryn Tanenbaum, Senior Manager of Operations at COTA, spoke with two of our data experts leading COTA’s analysis: Eric Hansen, director of analytics, and Shivam Mathura, director of product and strategy. The two describe how, in the span of 24 hours, they shifted gears from oncology data analytics to COVID-19 research in order to quickly generate insights to inform the doctors and nurses treating these patients.
KATHRYN: How did the collaboration with HMH initially begin, and what were the initial goals of the collaboration?
ERIC: Once HMH decided they were going to curate their own COVID-19 data set, they reached out to us pretty early on to help with the analysis and research effort. Some of the doctors working on this project are very familiar with COTA as real-world data experts and knew we were capable of handling this type of analysis for them.
As far as initial goals go, it was really a global goal, one that goes beyond COTA and HMH. Collectively, we have this novel disease that we don’t know very much about, so the goal of this project was to learn as much as we can and quickly develop actionable insights. It’s similar to other efforts we are now a part of, like the COVID-19 Accelerator Program.
Ultimately, we’re trying to answer questions like: What are the demographics and clinical characteristics of these patients? What are the major risk factors? What types of disease characteristics lead to better or worse outcomes? And what types of treatments are more effective than others?
SHIVAM: I would agree that our initial goals were definitely to help characterize the patient population within their catchment area that was being hospitalized with COVID-19 and then start to address questions over time around the safety and effectiveness of therapeutic options. The key issue was that there wasn’t a lot of information about COVID-19 available when we started the collaboration. The initial collaboration was mostly about getting a good foundation of information into the hands of the physicians so that they could understand at a very high level what was happening across their network, and then beginning to use that information to direct more complicated analysis.
K: What was COTA’s role in the project?
S: COTA was primarily the real-world data companion for HMH. They had built a system and a team to collect data, but they wanted to leverage our knowledge and expertise of how to curate real-world data in a way that it can be impactful in a care setting. We know there are many ongoing randomized clinical trials investigating therapeutic options for COVID-19.
Obviously, clinical trials take a long time, and for good reason: they’re the cornerstone of why agencies make certain policy decisions. But that also means there isn’t a lot of available data that can be used in real time by these physicians.
E: Adding to this, we have worked with the teams at HMH in the past, so they were familiar with our expertise and the quality of our work. This familiarity and trust combined with our start-up mentality allowed us to be lean and nimble in our approach. So, kind of overnight, our team went from focusing strictly on oncology data to focusing entirely on COVID-19 analysis.
At the end of the day, you’re dealing with data that’s coming from an EHR, and you’re analyzing many of the same data elements, whether it’s patient demographics, comorbidities, treatments, or adverse events. We’re also using many of the same endpoints that we would use in oncology, such as overall survival, which means we can apply similar methods and models.
K: Without proven treatment paths or best practices, how do you even get started in this type of research, particularly when the global understanding of this virus continues to evolve?
E: Doctors are in a position where they have to make decisions in real time without established treatment guidelines, so there’s quite a bit of experimentation because they don’t have a lot to go on. This highlights the importance of this research from both COTA and HMH, in addition to all of the other companies and hospital systems that are doing this type of work. Collectively, we need insights: we need to figure out what is working and then share that information as quickly as possible.
But for us, it starts with doing a full characterization of the data, then collaborating with the front-line doctors and asking them, “What stands out here based on your experience?” We characterize all variables that are available to us, and we split the patients into different cohorts, such as survivor versus nonsurvivor. Using this, the doctors can see the different proportions and determine things like whether a patient with hypertension is more likely to be in the nonsurvivor group than in the survivor group. We then dig into those surface insights a little bit more, but it really had to start with a strictly exploratory data phase.
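The cohort split Eric describes can be sketched in a few lines of Python. This is only an illustrative sketch, not COTA's actual pipeline: the records, the field names `survived` and `hypertension`, and the helper `prevalence_by_cohort` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records. Field names and values are illustrative only
# and are not drawn from the HMH data set.
patients = [
    {"survived": True,  "hypertension": False},
    {"survived": True,  "hypertension": True},
    {"survived": True,  "hypertension": False},
    {"survived": True,  "hypertension": False},
    {"survived": False, "hypertension": True},
    {"survived": False, "hypertension": True},
    {"survived": False, "hypertension": False},
]

def prevalence_by_cohort(records, cohort_key, feature_key):
    """Return {cohort value: share of that cohort that has the feature}."""
    counts = defaultdict(lambda: [0, 0])  # cohort -> [with_feature, total]
    for record in records:
        counts[record[cohort_key]][0] += int(bool(record[feature_key]))
        counts[record[cohort_key]][1] += 1
    return {cohort: with_f / total for cohort, (with_f, total) in counts.items()}

shares = prevalence_by_cohort(patients, "survived", "hypertension")
for cohort, share in sorted(shares.items()):
    label = "survivor" if cohort else "nonsurvivor"
    print(f"{label}: {share:.0%} with hypertension")
```

A table like this only shows where the cohorts differ. It says nothing yet about why, which is the point of taking it back to the physicians.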
S: Before we even started to build anything we could show to the physicians regarding the clinical equipoise, we wanted to characterize the population so that they understand who their patients are at a very high level. And then once they understand it at a high level, they can use their clinical expertise and their resources to help identify deeper patterns or insights. For instance, they may see that certain patients are having poor outcomes because they may be treated at one particular facility. There are many factors that could be driving this. It could mean they’re coming mostly from a long-term care facility, or they’re coming from a facility where they may have been exposed earlier than others, or they have more comorbid conditions. These observations help identify those deeper patterns.
Even before we do that, we have to do two things. The first: always look at quality. We know that the data coming out of EHRs can be particularly messy and that’s where our expertise comes into play. Through this process, our goal is to make sure that the data that is coming to us can be transformed into a usable asset. The second is to build a robust and scalable infrastructure for analysis that makes this process repeatable and the analysis reproducible.
For this project specifically, we were also very aware that the volume of patients entering the system was changing constantly. We could get a new data set every single day with updated patients, increased volume, and updated data. And we needed to apply the same metrics and measures each time to keep the analysis consistent. This also involved looking at trends and working with the physicians to identify things that were changing or different on a day-to-day basis. Given the circumstances, we had to move really, really quickly, but we also needed to be as consistent as we could with our approach so that we didn’t fall into any data or statistical fallacies.
K: How did you go about finding insights from this data once you had the information pulled from the EHR? What have been some of the initial results and the initial data you looked at to reach those results?
S: The first steps were definitely looking at quality and then setting up the infrastructure for us to begin rapid analysis. Then, once we had that aggregated data set, we wanted to provide population level insights to the physicians so that they could start asking those harder questions about safety and efficacy.
To do this, we applied some quality assurance (QA) measures and then conducted a univariate analysis of important clinical factors on different patient outcomes like survival or ICU admission. From there, we could look at attributes that were documented in clinical literature as being very important to COVID-19 patients. Using this information, we were creating overviews and sharing them with the FDA and the CDC, so they could see initial trends in the data collected by HMH in the care setting. We also knew there were attributes based on research that had been done in China and in Europe and that was just beginning to be published in scientific journals. This offered insight into clinical lab tests, as well as features of the patient and presenting symptoms, that would be very relevant to how the patient might progress, whether they would survive, and their survival time.
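As a rough illustration of what a univariate pass can look like, the sketch below computes an odds ratio and an approximate 95% confidence interval for one binary factor at a time against one binary outcome. The interview does not say which statistic the team used, so the choice of odds ratios is an assumption, and the factor names and counts are invented for illustration.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: factor present, outcome occurred    b: factor present, no outcome
    c: factor absent, outcome occurred     d: factor absent, no outcome
    """
    if min(a, b, c, d) == 0:
        # Haldane-Anscombe correction so an empty cell does not divide by zero.
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's method).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper

# Hypothetical counts for ICU admission by factor; not HMH results.
factor_tables = {
    "hypertension": (30, 70, 10, 90),
    "diabetes": (20, 80, 20, 80),
}

results = {name: odds_ratio_with_ci(*table) for name, table in factor_tables.items()}
for name, (estimate, lower, upper) in results.items():
    print(f"{name}: OR {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Each factor is examined on its own here, so none of these estimates accounts for the other variables.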
Going deeper into the risk factor analysis, we were looking at BMI, comorbidities, and whether or not patients had received insulin as part of their diabetes treatment, to identify the most prognostic factors. Part of the concern with real-world evidence is that you don’t have a control group like you do in randomized clinical trials. Therefore, you need to be able to adjust for different variables and be extra cautious to try to avoid confounding factors. For example, in a real-world setting the time to get to the ICU is influenced by a number of patient-independent factors, like current ICU bed availability, as well as how long patients have had symptoms and their time to hospitalization. All those things are real-world data concerns that you’re not always able to adjust for.
Speaking with the physicians about what they’re seeing in the practice and using their care experience is really the invaluable piece because it’s not something that you can see just by looking at the data. Once we have the data, we can work with physicians to identify the key attributes that we should be looking at, giving us a much better idea of what to collect based on their expertise in the clinical sense.
E: Through this process that Shivam described, we found out that it’s very hard to tease out the different factors because a lot of the variables relate to each other. For example, while older patients are less likely to survive than younger patients, older patients are also more likely to have something like hypertension or arrhythmia. So we wanted to know, is it really the age or is it really the hypertension or arrhythmia that’s driving that survival difference? Trying to tease out those different variables and figure out what’s actually the driving force behind the differences in survival rates was something that was very difficult.
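One simple way to see the problem Eric describes is to compare a crude odds ratio with one that is stratified by age. The sketch below uses the Mantel-Haenszel pooled odds ratio, which is a different and simpler technique than the regularized model the team went on to use. The age bands and counts are invented so that hypertension has no effect within either age group.

```python
def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into a single 2x2 table."""
    a, b, c, d = (sum(table[i] for table in strata.values()) for i in range(4))
    return (a * d) / (b * c)

def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio that adjusts for the stratifying variable."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
    return numerator / denominator

# Hypothetical counts per age band, in the order:
# (hypertension & died, hypertension & survived,
#  no hypertension & died, no hypertension & survived)
strata = {
    "under 65": (2, 18, 8, 72),
    "65 and over": (32, 48, 8, 12),
}

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)
print(f"crude OR: {crude:.2f}")
print(f"age-adjusted OR: {adjusted:.2f}")
```

In this toy data, hypertension looks strongly associated with death when age is ignored, but the association disappears once patients are compared within the same age band. That is the kind of entanglement that makes the real analysis difficult.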
To address these challenges, we used a regularized model, which is meant to limit the number of variables that remain in the final model in order to prevent overfitting. We could very precisely model survivorship in this HMH population, which is based in New Jersey, but what good is that to the rest of the United States, or to a population in Ohio or California, if we’ve completely fitted it to this population? Instead, we used techniques like cross-validation, which trains the model on part of your population and tests it on unseen parts of your population. This helped immensely with the generalizability of our results. It allows us to say that this model is cross-validated and shows good performance, so we’re able to translate those findings to a different population.
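The interview does not name the specific model, so the following is only a minimal sketch of the general idea: an L1-penalized (lasso-style) logistic regression, which can shrink uninformative coefficients to exactly zero, scored with k-fold cross-validation. The synthetic data, the penalty strength `lam`, and every function name here are assumptions made for illustration. In practice a library such as scikit-learn would normally replace the hand-written fitting loop.

```python
import math
import random

def sigmoid(z):
    # Split by sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def soft_threshold(value, cutoff):
    # Proximal step for the L1 penalty: small weights become exactly zero.
    return math.copysign(max(abs(value) - cutoff, 0.0), value)

def fit_l1_logistic(X, y, lam, lr=0.5, epochs=200):
    """Proximal gradient descent for L1-penalized logistic regression."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj / n, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def accuracy(w, b, X, y):
    hits = sum((predict(w, b, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y))
    return hits / len(y)

def cross_validate(X, y, lam, k=5):
    """Mean held-out accuracy: fit on k-1 folds, score on the unseen fold."""
    indices = list(range(len(X)))
    scores = []
    for fold in range(k):
        test_idx = indices[fold::k]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        w, b = fit_l1_logistic([X[i] for i in train_idx], [y[i] for i in train_idx], lam)
        scores.append(accuracy(w, b, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return sum(scores) / k

# Synthetic data: five features, of which only the first two drive the outcome.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [int(random.random() < sigmoid(2.0 * x[0] - 1.5 * x[1])) for x in X]

cv_accuracy = cross_validate(X, y, lam=0.05)
weights, intercept = fit_l1_logistic(X, y, lam=0.05)
kept = [j for j, wj in enumerate(weights) if wj != 0.0]
print(f"cross-validated accuracy: {cv_accuracy:.2f}")
print(f"features kept by the L1 penalty: {kept}")
```

Note that cross-validation estimates performance on unseen patients drawn from the same population. Confirming that a model carries over to a different population would still require validating it on data from that population.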
K: Coming off of that very technical review, for people who aren’t as well versed in data science, how would you explain the impact of this work?
E: From the time we first started this collaboration, there has been a severe shortage of information available to the doctors and nurses who are on the front lines. They’re trying to make the best decisions for their patients with what they have available to them but it’s often not sufficient, especially in the early stages of this pandemic.
It’s kind of like going into a boxing match blindfolded and you’re just throwing punches and hoping something lands. But with each additional analysis, each additional published finding, it gives doctors more information on how to treat the disease as effectively and efficiently as possible. So our findings, once published, will go into that library of knowledge that is growing every day.
S: While the demand was really high to see things as quickly as possible, we also wanted to balance that with the quality and validity of what we were producing. We wanted to be sure that anything we put in front of the team at HMH would hold up not just to our standards, but also to scrutiny from the public or our peers if we had to put it in front of them.
In terms of impact, certainly we have been successful at filling some gaps in knowledge that existed – and continue to exist – because of ongoing trials, and the lack of available documented literature in the United States. Being able to push out new findings to the physicians, the researchers, and policymakers very quickly was great.
K: So what happens now, is this research going to continue? What are the future plans?
E: HMH and COTA are continuing the collaboration effort. Specifically, we’ve joined other companies in the COVID-19 Accelerator Program, and we’re doing a parallel analysis with several other companies where we will all do the same analysis, but using our own data sets. This is similar to efforts we’ve done with Friends of Cancer Research in the past where anywhere from five to 10 companies will come together with their own oncology data set and do the same analysis to increase the acceptance of real-world data and real-world evidence. This is probably slightly different in scope, because we’re still trying to characterize this disease and learn more about it. But we’re moving fast on that effort.
Second, HMH has the ability to continue abstracting patients. As our research questions change, we are able to go back and capture information that we didn’t capture initially. As an example, when they asked us to analyze hydroxychloroquine, they needed more information on dose and how long the patient was on hydroxychloroquine, as well as the time from hospital admission to the first dose. They actually went back to the 3,000 patients, captured that information, and then sent it in a refreshed data set. As the research questions change, and as the FDA poses questions to us, this collaboration puts us in a good position to provide answers quickly.
K: Finally, do you think COVID-19 will change how real-world data and real-world evidence are used by healthcare stakeholders, regulators, and others?
S: Yes, and I think it will change the industry’s view of these data sources positively. Clinical trials have always been the gold standard for what is required to make policy decisions. What we’ve seen firsthand with COVID-19 is that there are situations where you need data and insights that you can’t generate with a clinical trial in time or with the existing infrastructure. By having a robust real-world evidence architecture, you’re able to quickly give clinicians, who are being impacted greatly by the situation, the tools to make better decisions.