Introducing CAILIN: The Latest Development from COTA’s AI Lab

At COTA, we know that high-quality oncology real-world data (RWD) holds the keys to the next breakthrough in cancer care. Through our work with life sciences partners, oncologists, and hospitals across the U.S., we’ve seen firsthand how data can power discoveries about new treatments and inform the right steps in care for patients. 

We also know that working with RWD comes with challenges. Transforming data into analysis-ready formats takes significant time and effort, and only those with data science or analytics expertise have the skills to query the data themselves. That led us to explore how artificial intelligence (AI) could remove these barriers and make RWD analyses and insight generation faster, more efficient, and more democratized.

Today, we’re proud to introduce CAILIN, COTA’s suite of cancer AI solutions that will expand the possibilities for cancer RWD analyses. CAILIN powers RWD research in two key ways: 

  1. Empowers individuals to simply ask their research questions of a dataset as they would in a search engine – then receive an answer in moments.
  2. Facilitates data abstraction and curation to help human abstractors prepare datasets for research more efficiently, decreasing the cost of abstraction.

CAILIN will transform ways of working with RWD, and we’re already seeing the impact through our work with provider partners. Together, we’ll use CAILIN to rapidly process fragmented data across their vast networks, unlocking insights that can deliver and accelerate precision medicine for patients.

Here’s a glimpse into how we’re already using CAILIN internally to address some of the major RWD pain points that our experts and customers face.  

Generating insights rapidly from high-quality data

Until recently, if someone wanted to ask a simple research question of a dataset – for example, “in people with a specific type of cancer, how many patients are taking a third-line therapy?” – it could take weeks to months to get an answer. Analyzing RWD required coding knowledge, plus the effort of actually writing the code to query and analyze the data. Within a life sciences company, that meant every question was sent to a data science or analytics team, which then had to decide how to prioritize it against the other items on a mounting to-do list.
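To make that coding burden concrete, here is a minimal sketch of the kind of query an analyst might write by hand to answer the third-line therapy question. It is an illustration only: the record layout, the patient IDs, and the function name `count_patients_reaching_line` are all hypothetical and do not reflect COTA's data model or how CAILIN works.

```python
# Hypothetical, made-up records for illustration only:
# (patient_id, cancer_type, line_of_therapy)
TREATMENT_LINES = [
    ("P1", "multiple myeloma", 1),
    ("P1", "multiple myeloma", 2),
    ("P1", "multiple myeloma", 3),
    ("P2", "multiple myeloma", 1),
    ("P3", "multiple myeloma", 1),
    ("P3", "multiple myeloma", 2),
    ("P3", "multiple myeloma", 3),
    ("P4", "breast cancer", 1),
    ("P4", "breast cancer", 2),
    ("P4", "breast cancer", 3),
]


def count_patients_reaching_line(records, cancer_type, line):
    """Count distinct patients with `cancer_type` who have a record at `line`."""
    patients = {
        patient_id
        for patient_id, cancer, line_of_therapy in records
        if cancer == cancer_type and line_of_therapy == line
    }
    return len(patients)


print(count_patients_reaching_line(TREATMENT_LINES, "multiple myeloma", 3))  # prints 2
```

Even for a question this simple, someone has to know the data layout, write the filter, and deduplicate patients, which is why such requests have traditionally gone to a technical team.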

With CAILIN, the only entry requirement for participating in RWD research is curiosity. Anyone with the proper permissions can type their research questions into an LLM chat interface and receive a correct answer in seconds. Not only does this shorten the time it takes to answer simple questions at the start of a research project, it also democratizes access to insights through self-service RWD analyses and allows more researchers to dive deeper and ask more complex research questions. Moving beyond simple patient counts, CAILIN could help non-technical users understand treatment patterns. For example, a user could ask how many patients with multiple myeloma have been exposed to an immunotherapy such as CAR T.

For life sciences organizations, this means teams from Discovery to Medical to Health Economics and Outcomes Research (HEOR) can access all the insights they need. For example, development leads can rapidly learn about biomarker expression to fine-tune the eligibility criteria for an upcoming trial. Likewise, medical affairs leads can address their strategic questions in real time during planning meetings, without needing to tap their data science resources for help.

Streamlining RWD abstraction and curation with AI

The traditional process of preparing datasets for analysis takes significant time and effort, and the resulting cost has led teams to shy away from doing this work. However, when this work isn’t done, the data may be unreliable. CAILIN will make this work faster, easier, and less costly.

CAILIN makes medical data abstractors more efficient and effective. One use that is rapidly increasing efficiency is answering curators’ open questions by pulling guidance from training documents and literature, eliminating the time needed to track down the next best step. Rather than abstractors having to locate and read the relevant guidance themselves, CAILIN can find the answer in seconds.

CAILIN can also learn the correct abstraction techniques from medical staff’s entries into its large language model (LLM). With training and validation from medical experts, CAILIN will be able to understand how to code certain fields from semi-structured and unstructured data, and then make the right choices when extracting information from medical records.

Exploring what’s possible for AI-enabled cancer research

We’re continuing to explore how much faster and more productive we can be for our partners and customers with CAILIN in our RWD toolkit. We’re making quick progress; today, we’re teaching the model to answer questions like, “what is the average age at diagnosis for a cohort of patients with multiple myeloma?” in seconds. Soon, CAILIN will be able to tell us about the treatment options for patients who have already received three types of therapy, or how overall response rates differ between patients on different courses of therapy. We look forward to continuing to realize CAILIN’s potential to add more value to RWD findings and change the field with broader access to RWD.
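For readers curious what sits behind a question like the average age at diagnosis for a multiple myeloma cohort, the underlying calculation can be sketched in a few lines. This is a simplified illustration under stated assumptions, not CAILIN's implementation: the cohort records, field names, and the function `average_age_at_diagnosis` are hypothetical.

```python
from statistics import mean

# Hypothetical, made-up cohort for illustration only.
COHORT = [
    {"patient_id": "P1", "diagnosis": "multiple myeloma", "age_at_diagnosis": 62},
    {"patient_id": "P2", "diagnosis": "multiple myeloma", "age_at_diagnosis": 70},
    {"patient_id": "P3", "diagnosis": "multiple myeloma", "age_at_diagnosis": 66},
    {"patient_id": "P4", "diagnosis": "breast cancer", "age_at_diagnosis": 51},
]


def average_age_at_diagnosis(patients, diagnosis):
    """Return the mean age at diagnosis for one diagnosis, or None if no patients match."""
    ages = [p["age_at_diagnosis"] for p in patients if p["diagnosis"] == diagnosis]
    return mean(ages) if ages else None


print(average_age_at_diagnosis(COHORT, "multiple myeloma"))
```

The arithmetic itself is simple; the value of a natural-language interface is that the person asking never has to write or maintain code like this.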

While we don’t expect CAILIN to ever replace the role of human experts and medical abstractors, it can significantly streamline their work so they can get more done with the high-quality data COTA is known for. Ultimately, this will bring more information to researchers’ fingertips faster, unlocking new opportunities for research.

I applaud, and am so grateful to, each individual within COTA who has worked tirelessly to bring CAILIN to life, and I extend my appreciation to our partners who will fearlessly explore the future of AI-enabled RWD research with us. Together, we can deliver even more value even faster for people touched by cancer.