[In Development] An application to parse freetext inclusion criteria and produce a structured cohort definition that can be executed against OMOP CDM

OHDSI/Criteria2Query

Criteria2Query

Criteria2Query 2.4 is published!

Introduction

Criteria2Query (C2Q) is an automatic cohort identification system. It supports human-computer collaboration in converting complex eligibility criteria text into more accurate and feasible cohort SQL queries. It combines machine efficiency with the judgment of domain experts by enabling real-time user intervention for criteria selection and simplification, parsing error correction, and context-dependent concept mapping.

Features

  • An editable user interface with functions to prioritize or simplify the eligibility criteria text for cohort querying;
  • Accessible and portable cohort SQL query formulation based on the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) version 5;
  • Real-time cohort query execution with result visualization.

Interface and use case example

System Requirements

  • Java 8+
  • Apache Maven 3
  • Apache Tomcat
  • Python 3.7.6+
  • PostgreSQL DBMS (optional; needed only to demonstrate real-time cohort SQL query execution)

Dependencies

  • SynPUF_1K and SynPUF_5% datasets, or any other dataset in CDM Version 5.2.2 format.
  • OMOP CDM Vocabulary version 5 files. These can be obtained from Athena.
  • Usagi for concept mapping.

Getting Started

  1. Install all required system dependencies as specified in the System Requirements section above.

  2. Clone this repository with Git.

  3. Download the negation scope detection model and place it in the NegationDetection directory.

  4. Create a Python virtual environment and install the required packages from venv_requirements.txt. (Instructions: https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/)

  5. Update the directory paths for Negation Detection and the Python virtual environment in the following file: /criteria2query/src/main/java/edu/columbia/dbmi/ohdsims/pojo/GlobalSetting.java

// Set the directory for the negation detection model
public final static String negateDetectionFolder = "/opt/tomcat/NegationDetection";

// Set the path to the Python virtual environment
public final static String virtualEnvFolder = "/opt/tomcat/python_virtualenvs/C2Q_NEGATION/bin";
// For Windows, use the following format:
// public final static String virtualEnvFolder = "D:\\C2Q\\python_virtualenvs\\C2Q_NEGATION\\Scripts";
  6. Download Usagi, and implement a POST API endpoint (referred to as the Concept Hub) that allows searching concepts by term and domain.

  7. Configure the Concept Hub endpoint by updating the concepthub URL in the following file to point to your POST API endpoint: /criteria2query/src/main/java/edu/columbia/dbmi/ohdsims/pojo/GlobalSetting.java

// Set the Concept Hub POST endpoint
public final static String concepthub = "http://localhost:8081/concepthub";
  8. Import the SynPUF datasets into your PostgreSQL database. (Skip this step if the datasets are already imported.)

    • Download the SynPUF_1K and SynPUF_5% datasets in OMOP CDM version 5.2.2 format.
    • Download the OMOP CDM vocabulary files (v5) from Athena.
    • Follow the instructions in the OHDSI Common Data Model repository to instantiate the CDM schema in your PostgreSQL database for both datasets.
    • You may also use your own datasets, as long as they conform to the OMOP CDM v5.2.2 format.
  9. Configure the database connection settings. Update the database URLs, username, and password in the following file to connect to your SynPUF_1K and SynPUF_5% databases: /criteria2query/src/main/java/edu/columbia/dbmi/ohdsims/pojo/GlobalSetting.java

// Connect to the databases
public final static String databaseURL1K = "jdbc:postgresql://localhost/synpuf1k";
public final static String databaseURL5pct = "jdbc:postgresql://localhost/synpuf5pct";
public final static String databaseUser = "Please connect to a database.";
public final static String databasePassword = "*****";
  10. Deploy the C2Q application. Once configured, deploy the application and open it in your web browser.
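
Because the settings above are compiled into GlobalSetting.java, a mistyped path or URL typically shows up only after deployment. The following is a minimal sketch of a standalone pre-deployment check, not part of the C2Q code base: the class SettingsPreflight and all of its methods are hypothetical names introduced here for illustration, and the values passed in main are simply the example values shown in the steps above. It checks only that the two directories exist and that the URLs are well formed; it does not contact the Concept Hub or the databases.

```java
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of C2Q): sanity-checks values copied from GlobalSetting.java. */
public class SettingsPreflight {

    /** True if the path is non-empty and points to an existing directory. */
    public static boolean isExistingDirectory(String path) {
        return path != null && !path.isEmpty() && Files.isDirectory(Paths.get(path));
    }

    /** True if the value starts with the PostgreSQL JDBC prefix (format check only). */
    public static boolean isPostgresJdbcUrl(String url) {
        String prefix = "jdbc:postgresql:";
        return url != null && url.startsWith(prefix) && url.length() > prefix.length();
    }

    /** True if the value parses as an http or https URL that has a host. */
    public static boolean isHttpUrl(String url) {
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            return uri.getHost() != null
                    && ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme));
        } catch (Exception e) {
            // Covers null input and malformed URLs alike.
            return false;
        }
    }

    /** Collects a human-readable message for every setting that fails its check. */
    public static List<String> findProblems(String negationDir, String venvDir,
                                            String conceptHub, String... jdbcUrls) {
        List<String> problems = new ArrayList<>();
        if (!isExistingDirectory(negationDir)) {
            problems.add("negateDetectionFolder is not a directory: " + negationDir);
        }
        if (!isExistingDirectory(venvDir)) {
            problems.add("virtualEnvFolder is not a directory: " + venvDir);
        }
        if (!isHttpUrl(conceptHub)) {
            problems.add("concepthub is not an http(s) URL: " + conceptHub);
        }
        for (String jdbcUrl : jdbcUrls) {
            if (!isPostgresJdbcUrl(jdbcUrl)) {
                problems.add("database URL is not a PostgreSQL JDBC URL: " + jdbcUrl);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Example values from the steps above; replace them with your own settings.
        List<String> problems = findProblems(
                "/opt/tomcat/NegationDetection",
                "/opt/tomcat/python_virtualenvs/C2Q_NEGATION/bin",
                "http://localhost:8081/concepthub",
                "jdbc:postgresql://localhost/synpuf1k",
                "jdbc:postgresql://localhost/synpuf5pct");
        if (problems.isEmpty()) {
            System.out.println("All settings look plausible.");
        } else {
            problems.forEach(p -> System.out.println("Problem: " + p));
        }
    }
}
```

The sketch uses only the Java 8 standard library, so it can be compiled and run with javac and java on the deployment host before building the application with Maven.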

Publications

Fang, Y., Idnay, B., Sun, Y., Liu, H., Chen, Z., Marder, K., Xu, H., Schnall, R., & Weng, C. (2022). Combining human and machine intelligence for clinical trial eligibility querying. Journal of the American Medical Informatics Association : JAMIA, ocac051. Advance online publication. https://doi.org/10.1093/jamia/ocac051

Yuan, C., Ryan, P. B., Ta, C., Guo, Y., Li, Z., Hardin, J., Makadia, R., Jin, P., Shang, N., Kang, T., & Weng, C. (2019). Criteria2Query: a natural language interface to clinical databases for cohort definition. Journal of the American Medical Informatics Association : JAMIA, 26(4), 294–305. https://doi.org/10.1093/jamia/ocy178

Support

If you have any questions/comments/feedback, please submit a form here or contact Dr. Chunhua Weng at Columbia University.
