i2b2Blog

Wednesday, October 13, 2010

GARLIC—Genomic Analysis Results Library Integration Cell

Brian Wilson presented the roadmap for i2b2 (v2) for representing genome-scale data, ranging from SNP arrays to assembled full genomes or the short reads from which they were assembled. This preliminary roadmap is outlined below. Given the general interest in the GARLIC application domain (i.e., querying for patient populations with specified genome-scale characteristics), there are likely to be several national groups and/or users that have experience and interest. We plan to contact those we know of, and we would appreciate other groups reaching out to help, collaborate, or inform.

Friday, October 8, 2010

NLP kickoff

Ozlem, Guergana, Margarita, Jiaping Zeng, Pete Zak

Reviewed the co-reference NLP challenge and what needs to be done.

Discussed the challenges of combining packages with different open source licenses. A web service API is generally seen as a useful way to keep packages with different licenses working together. The cost is increased installation and dissemination complexity.

Kickoff meeting of DBP #1

Ashwin Ananthakrishnan (IBD), Philip L. De Jager (MS), Beth Karlson, Helena Canhão, Kat, Sordo, Guergana, Stan, Peter Szolovits, Robert Plenge, Raoul Guzman, Vivian Gainer, Xia Zongqi (Neuro-rheumatology), LJ Wei, Tianxi Cai

The PIs described how they propose to build on the experience of the RA DBP to go into further depth on the phenotypes (around RA) and drug effects/efficacy, and to see how far we can go with EHR-derived phenotypes. Further, we will be studying the shared and differing pathotypes across a larger range of autoimmune diseases, including inflammatory bowel disease and multiple sclerosis (for which we have expert representation in this DBP). We also reviewed the imperative to identify subcohorts (based on combinations of clinical and genomic stratification) that have distinguishing therapeutic efficacy and/or adverse events.

Temporal Reasoning

Present: Susanne Churchill, Ozlem, Zak, Guergana, Griffin Weber, Dan Nigrin

Welcome to my first blog post in the recompeted i2b2.

We discussed temporal reasoning requirements in the new i2b2. We reviewed the requirements for interval and point logic and how they were informed by Nigrin's master's thesis. Integration of NLP-derived temporal relations was preliminarily reviewed.
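To make the distinction between interval and point logic concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Interval` type, the function names, and the example dates are all hypothetical, and none of this is taken from i2b2 code or from the thesis mentioned above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Interval:
    """A closed time interval, e.g. a medication course (hypothetical type)."""
    start: date
    end: date

    def __post_init__(self):
        # Reject intervals that end before they begin.
        if self.end < self.start:
            raise ValueError("interval end precedes start")


def point_in_interval(point: date, interval: Interval) -> bool:
    """Point logic: did a dated event fall within the interval (inclusive)?"""
    return interval.start <= point <= interval.end


def interval_before(a: Interval, b: Interval) -> bool:
    """Interval logic: did `a` finish strictly before `b` started?"""
    return a.end < b.start


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """Interval logic: do `a` and `b` share at least one day?"""
    return a.start <= b.end and b.start <= a.end


# Made-up example data, not drawn from any real record.
steroid_course = Interval(date(2010, 1, 5), date(2010, 2, 5))
admission = Interval(date(2010, 2, 1), date(2010, 2, 10))
lab_draw = date(2010, 1, 20)

print(point_in_interval(lab_draw, steroid_course))
print(intervals_overlap(steroid_course, admission))
print(interval_before(steroid_course, admission))
```

A point query asks about a single timestamp (a lab draw), while an interval query relates two spans (a medication course and an admission). A temporal query tool would presumably need both kinds of comparison.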

New DBP: Diabetes and CVD

Present: Stan Shaw, Vivian Gainer, Margarita Sordo, Kat, Susanne Churchill, Ozlem, Zak, Guergana

The new DBP led by Stan Shaw addresses the epigenetics (DNA methylation, histone acetylation, histone methylation assayed using full genome-scale resequencing) of heart disease in the context of diabetes mellitus.

Kat reviewed the overall approach that we take in DBPs in defining cohorts through billing codes.

Ozlem gave an update on the new de-identification pipeline.

We reviewed which fellow we might find to do some of the heavy lifting with Stan.

Sunday, September 19, 2010

i2b2 for EPIC users

Courtesy of Keith Marsolo, here is some very useful guidance.

We've posted some documentation that describes our work with the Epic Clarity database. It can be found at the following location:

https://bmi.cchmc.org/svn/i2b2/i2b2/public/data/Documents/

The first document: "Moving Data from Epic to I2B2.docx" describes, well, our approach to moving data from Epic to i2b2. It's not complete, but it provides an overview of how we load demographics, diagnoses, and medications. There's some information on how we create metadata XML for each content type. This is similar to the upcoming "modifier" functionality in version 1.6. Until we release our code, it's more or less CCHMC-specific, but the content would be applicable to either approach.

The other document "epic_dw_master_tables.pdf" details some of the steps used to create our "Master" Epic datamart that we provide to users for reporting and other purposes. Not all of the tables are included, but we've provided a few of the more frequently used ones. The scripts provide an overview of what columns/tables to use to look for certain content.

Words of "wisdom":

1. The tables/columns in Epic may have misleading names. Always check the Epic UserWeb/Clarity data dictionaries to determine the true purpose of the field.

2. Never assume that the Epic documentation is accurate and/or complete. Each institution's Epic implementation is different, and data may be in unique locations, particularly if that data is fed by an interface. To verify that the data you are working with is correct (i.e., that it means what you think it means), I would encourage you to work with your Clarity Reporting Team or end users to ensure that the data in the database matches what appears on the screen.

Friday, August 20, 2010

i2b2 in Japan

As our summer winds down, the various i2b2 core development teams are returning to Boston. Meanwhile, our colleagues in Japan have been busy implementing their version of i2b2. Kudos!