Vidul Ayakulangara Panickan

I'm a Research Associate at Harvard Medical School's Department of Biomedical Informatics and a Research Scholar at the Harvard–MIT Center for Regulatory Science, with appointments at VA Boston Healthcare System and Brigham and Women's Hospital. I also serve as Deputy Director of the HMS/CELEHS Data Science in Action Summer Program.

My current work centers on clinical NLP and medical language models deployed on HPC clusters across the Veterans Affairs and Mass General Brigham healthcare systems. These support phenotyping research, multi-institution EHR harmonization, medical code crosswalks, clinical representation learning, and biomedical knowledge graphs.

News

Apr 2026
Serving as Judge for the Health Systems Innovation Lab Hackathon 2026, Boston Hub, Harvard School of Public Health.
Sep 2025
Guest instructor for EHR computer lab, BMIF 204, Harvard Medical School.
Sep 2025
PEHRT preprint released — a pipeline for harmonizing EHR data for translational research.
Nov 2024
Guest instructor for EHR computer lab, BMIF 300qc, Harvard Medical School.
Jul 2024
CELEHS/HMS Data Science Summer Program wrap-up covered by the Harvard T.H. Chan School of Public Health.
Oct 2023
EVAR postmarket surveillance study published in JAMA Internal Medicine.
Dec 2022
Secure Science with CITADEL: scaled NLP and co-occurrence computation on the Summit supercomputer to analyze veterans' health records. Covered in HPCwire.

Research

My work sits at the boundary of applied ML, clinical informatics, and software engineering — turning unstructured EHR data into research-ready datasets and building the pipelines that produce them. I'm interested in how representation learning, language models, and harmonization infrastructure can scale across institutions while handling the messiness of real-world clinical data.

Natural Language Processing

I develop clinical NLP systems for codified and narrative EHR data, ranging from rule-based pipelines to fine-tuned language models for entity extraction, phenotyping, and medical code retrieval.

EHR Infrastructure & Knowledge Graphs

I maintain multi-institutional pipelines that harmonize EHR data into analysis-ready datasets, and I build medical code crosswalks, cross-institutional knowledge graphs, and HPC-scale NLP deployments.

Selected Publications

arXiv Sep 2025
Gronsbell J, Panickan VA, Zhou D, Lin C, Charlon T, Hong C, Xiong X, Wang L, Gao J, Zhou S, Tian Y, Shi Y, Gan Z, Cai T.
JAMA Internal Med Oct 2023
Wang X*, Panickan VA*, Cai T*, Xiong X, Cho K, Cai T, Bourgeois FT.
Science Jul 2024
Verma A, Huffman JE, Rodriguez A, Conery M, Liu M, Ho Y-L, Kim Y, Heise DA, Guare L, Panickan VA, et al.
Nature Medicine 2023
Potential Pitfalls in the Use of Real-World Data for Studying Long COVID
Zhang HG, Honerlaw JP, Maripuri M, Samayamuthu MJ, Beaulieu-Jones BR, Baig HS, L'Yi S, Ho Y-L, Morris M, Panickan VA, Wang X, Weber GM, Liao KP, et al. (4CE Consortium).
J Biomed Inform 2025
DOME: Directional Medical Embedding Vectors from Electronic Health Records
Wen J, Xue H, Rush E, Panickan VA, Cai T, Zhou D, Ho Y-L, Costa L, Begoli E, Hong C, Gaziano JM, Cho K, Liao KP, Lu J, Cai T.
J Biomed Inform 2025
ARCH: Large-Scale Knowledge Graph via Aggregated Narrative Codified Health Records Analysis
Gan Z, Zhou D, Rush E, Panickan VA, Ho Y-L, Ostrouchov G, Xu Z, Shen S, Xiong X, et al.
NPJ Digital Med 2025
Label-Efficient Phenotyping for Long COVID Using Electronic Health Records
Hong C, Wen J, Zhang HG, Panickan VA, Yang DY, Chen AW, Xiong X, Wang X, Morris M, et al.
JAMIA 2024
CIPHER: Centralized Interactive Phenomics Resource — An Integrated Online Phenomics Knowledgebase for Health Data Users
Honerlaw J, Ho Y-L, Fontin F, Murray M, Galloway A, Heise D, Connatser K, Davies L, Gosian J, Maripuri M, Russo J, Sangar R, Tanukonda V, Zielinski E, Dubreuil M, Zimolzak AJ, Panickan VA, et al.
Patterns 2024
LATTE: Label-Efficient Incident Phenotyping from Longitudinal Electronic Health Records
Wen J, Hou J, Bonzel CL, Zhao Y, Castro VM, Gainer VS, Weisenfeld D, Cai T, Ho Y-L, Panickan VA, Costa L, Hong C, Gaziano JM, Liao KP, Lu J, Cho K, Cai T.
NPJ Digital Med 2024
Multisource Representation Learning for Pediatric Knowledge Extraction from Electronic Health Records
Li M, Li X, Pan K, Geva A, Yang D, Sweet SM, Bonzel CL, Panickan VA, Xiong X, Mandl K, Cai T.
NPJ Digital Med 2021
KESER: Clinical Knowledge Extraction via Sparse Embedding Regression with Multi-Center Large-Scale EHR Data
Hong C, Rush E, Liu M, Zhou D, Sun J, Sonabend A, Castro VM, Schubert P, Panickan VA, et al.

Teaching

BMIF 204 · 2025

Foundations of Clinical Data - Computer Lab

Co-led a computer lab session on working with EHR data for master's and PhD students at HMS.

Lab Instructor · Harvard Medical School

BMIF 300qc · 2024

Working with MIMIC-IV data - Computer Lab

Led the computer lab workshop on working with EHR data for AI in Medicine PhD students at HMS.

Lab Instructor · Harvard Medical School

2023–present

Data Science in Action Summer Program

I lead curriculum design and operations for Harvard's data science summer program for high school students, covering machine learning, statistics, and Python.

Deputy Director · CELEHS / HMS

NIH AIM-AHEAD

MIMIC-IV Data Preparation

Python/Jupyter tutorial hosted by the NIH AIM-AHEAD consortium as training material for AI/ML health-equity research.

Tutorial Developer

Open Methods

EHR Processing Tutorial

Computational workflows for processing EHR data — extraction, preprocessing, and feature engineering for clinical NLP modeling.

Tutorial Developer

Service

2026
Judge: HSIL Hackathon 2026, Boston Hub
Health Systems Innovation Lab, Harvard School of Public Health
2026
Reviewer: AMIA 2026 Amplify Informatics Conference
American Medical Informatics Association
2026
Reviewer: IEEE ICHI 2026
IEEE International Conference on Healthcare Informatics

Appointments

2019 – present
Research Associate, Biomedical Informatics
Harvard Medical School, Department of Biomedical Informatics
2022 – present
Research Scholar
Harvard–MIT Center for Regulatory Science
2020 – present
Data Scientist (Contractor)
VA Boston Healthcare System — Million Veteran Program
2020 – present
Sponsored Staff Collaborator
Brigham and Women's Hospital — Verity Bioinformatics Core
2020 – 2023
Associated Personnel Staff
Boston Children's Hospital — Pediatric Phenotyping
2023 – present
Deputy Director, Data Science in Action Summer Program
Harvard Medical School / CELEHS

Education

2016 – 2019
M.S. in Computer Science
University of Massachusetts Amherst
2011 – 2015
B.Tech. in Computer Science and Engineering
Amrita University, Amritapuri, Kerala, India

Affiliations

Harvard Medical School, DBMI
Harvard–MIT Center for Regulatory Science
VA Boston Healthcare System
Brigham and Women's Hospital
NIH AIM-AHEAD Consortium
4CE Consortium

Contact

Department of Biomedical Informatics
Harvard Medical School
10 Shattuck St, Suite 514
Boston, MA 02115