Tuesday, November 17, 2009

Open Source

A very insightful response from Fred Trotter. He lays down the options in a very nuanced and clear fashion. Much appreciated.

Thursday, November 12, 2009

Major Depressive Disorder

Perlis, Smoller, et al.,


Minutes courtesy of Patience Gallagher.


·       Longitudinal Classifier

o   Roy and Victor will finalize parameters of algorithm

o   Victor will then report (and provide a visualization of):

§ % depressed, % well, # of all notes, etc

o   To query Crimson: Victor will provide a list of medical record numbers within the two groups of interest (responsive and resistant) to Lynn Bry who can then report how many samples are currently available within Crimson

§ Parameters may need to be readjusted, depending on response from Crimson

·       Discussion of Validation

o   The issue with last week’s approach:

§ The algorithm is based on the text, so the first level of validation should not use information (i.e. clinician’s extensive knowledge of a patient) that is not in the text.

o   New validation plan:

§ STEP ONE:

·       GOAL: Determine if an expert clinician’s classification (based on notes only) is the same as the result of the algorithm

·       Pull successfully classified notes

o   All of a patient’s notes will be reviewed

o   Clinicians will be blinded to the results of the classifier

o   The sample of notes will reflect the results of the algorithm: “Random but representational”

o   To keep the validation clean, it will be a “case-control” model - Treatment resistant vs. Responsive

o   For now, only the electronic medical record will be used.

§ The consensus was that patient charts would be more annoying than beneficial, and the electronic records probably have sufficient information

§ Additionally, the information from the paper records is not integrated into the algorithm

§ May look at paper records down the road

§ STEP TWO: Compare list of patients that Roy knows are treatment resistant or responsive and run their notes through the classifier

·       (Tianxi says this will be beneficial as it will give more power to the classifier)

§ STEP THREE: Use the Quick Inventory of Depression Severity (QIDS) as an external source of validation

·       Many patients have QIDS scores in their charts – can determine if the algorithm classification is consistent with performance on the QIDS – this would be a cross-sectional measure

o   The output of this approach would be: “Among patients classified as depressed, the mean QIDS score is ____”

§ NEXT STEPS:

·       Do first level of the validation over the next few weeks.

o   Victor will pass the notes to Roy.

o   Roy will be the sole clinician reviewing the notes

·       Manuscript:

o   Victor and Tianxi have provided their input to Roy

o   Roy will integrate this information and then re-distribute the manuscript to the group

·       PV

o   Update from Victor on obesity:

§ Compared the BMIs of this data set with those of all other patients: the distribution of the MDD sample is very similar in shape, but right-shifted relative to the majority’s BMI distribution

·       This makes sense! - Being depressed (and on antidepressants) leads to weight gain
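
A minimal sketch of how Steps One and Three of the validation plan in these minutes might be tallied. It assumes hypothetical per-patient records holding the classifier's label, the blinded clinician's label, and an optional QIDS score; the field names and sample values are made up for illustration and are not study data. Cohen's kappa is my own addition here, since the minutes do not name an agreement statistic.

```python
from statistics import mean

def agreement_rate(records):
    """Fraction of patients where the blinded clinician label matches the classifier."""
    matches = sum(1 for r in records if r["classifier"] == r["clinician"])
    return matches / len(records)

def cohen_kappa(records):
    """Chance-corrected agreement between the classifier and the clinician."""
    n = len(records)
    labels = {r["classifier"] for r in records} | {r["clinician"] for r in records}
    observed = agreement_rate(records)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    expected = sum(
        (sum(r["classifier"] == lab for r in records) / n)
        * (sum(r["clinician"] == lab for r in records) / n)
        for lab in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

def mean_qids_by_class(records):
    """Mean QIDS score per classifier label, skipping patients without a score."""
    by_class = {}
    for r in records:
        if r.get("qids") is not None:
            by_class.setdefault(r["classifier"], []).append(r["qids"])
    return {label: mean(scores) for label, scores in by_class.items()}

# Hypothetical, made-up records for illustration only.
records = [
    {"classifier": "resistant", "clinician": "resistant", "qids": 18},
    {"classifier": "resistant", "clinician": "responsive", "qids": 12},
    {"classifier": "responsive", "clinician": "responsive", "qids": 5},
    {"classifier": "responsive", "clinician": "responsive", "qids": None},
]
```

Calling `mean_qids_by_class(records)` fills in the blank in the Step Three sentence for each class, while `agreement_rate` and `cohen_kappa` summarize Step One.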

Friday, October 30, 2009

Widening the Use of Electronic Health Records Data for Research

Wisconsin North, Oct 30, 2009

A symposium hosted by the National Center for Research Resources (NCRR). Louise Ramm, Deputy Director of NCRR, provided framing challenges and a welcome. Zak Kohane introduced use cases and sources of data, and reviewed the false dichotomy between health-record-based research and clinical trials.

Gary Gibbons provided a perspective of disease in the African-American population as an exemplar of a complex orphan disease in the sense that, like rare orphan diseases, it is understudied and insufficiently treated. He also pointed out how in many parts of the country underserved minorities are located away from the academic health centers that have made the most inroads in the use of electronic health records. Therefore, institutions such as Morehouse School of Medicine have their work cut out for them (and not a lot of resources) to integrate data from a large number of only lightly affiliated practices. That same challenge presents an opportunity to be even more impactful in an orphan disease of epidemic quality. Professor Gibbons also urged a broadening of the captured context beyond what is conventionally captured in a standard (brief) healthcare visit. Environmental variables that are highly penetrant, much more so than many genomic markers, are poorly captured. He concluded by reviewing the current compelling information about pharmacogenomic differences and population genetic risks, and also the wide holes in our knowledge of these as they pertain to various groups within the USA.

Andrew Auerbach from UCSF addressed comparative effectiveness research and its translation into "Health system innovation research". He described how much can be done with charge data, and how additional codified data types (e.g. medications) can further improve the quality of that data. He closed with a discussion of how the various stakeholders in using EHR data for research (e.g. NIH, payors, health system leaders, physicians, and patients) might or might not be well aligned. He put us on notice that IRBs are unfamiliar with distributed query systems and/or grids and that this is becoming at least an obstacle for many CER studies. He summarized several use cases, such as: What is the optimal length of treatment for pneumonia? Can a patient-focused discharge checklist reduce the risk of readmission?

Robert Plenge described his use of i2b2 and electronic health records for genotypic research and discovery of endophenotypes.
John Brownstein reviewed non-traditional public health research using institutional data and non-traditional, non-institutional healthcare data extraction and analysis.

Wednesday, October 28, 2009

i2b2 Academics Users Group—Natcher Building, NIH

Zak Kohane summarized some of the recently announced i2b2-based projects: a South Carolina consortium (a GO grant funds the automated consent component); a pediatric rheumatology research network including 60 sites (NIH GO grant); Shawn Murphy's work with an imaging consortium to better integrate images into i2b2 instances using XNAT/BIRN infrastructure (CTSA administrative supplement); and a GO grant to B.U. and U. Mass to study health disparities using i2b2 as infrastructure. Lynn Bry received a 2-year R01 under ARRA for the Crimson-i2b2 integration project.

Susanne Churchill welcomed the group and noted that there is a consortium of European i2b2 users/implementors/refiners that are putting together a proposal for joint work across the European Union (for EU funding). She noted that the AUG now numbers over 100 and represents 30 healthcare/academic institutions including 5 internationally.

Shawn Murphy summarized several new developments, including:

  • Release Candidate 1.4 to support the Enterprise, including: analysis views, improved role-based access and auditing, replacement of Gridsphere with web services and an AJAX client, Microsoft Active Directory integration, and obfuscation of results for privacy purposes.
  • In the future, i2b2 will be distributed via configurable VMs so that the functionality can be attached to local databases without requiring a full install. The full source code compile and install will still be available.
  • An Eclipse plug-in "store" where developers can contribute their own plug-ins and where users can download the plug-ins they want.
  • More support for derived data (e.g. to systematically return NLP concepts derived from the clinical notes).
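
To make the "obfuscation of results" item concrete, here is a small sketch of the general idea: an aggregate patient count is perturbed with random noise and very small counts are suppressed, so that a user cannot pin down individuals by issuing narrowly differing queries. This is not the i2b2 implementation; the function name, threshold, and noise width are all assumptions chosen for illustration.

```python
import random

def obfuscate_count(true_count, sigma=1.5, small_count_threshold=3, rng=None):
    """Return a privacy-protected version of an aggregate patient count.

    Assumed policy (hypothetical, not taken from i2b2): counts below the
    threshold are suppressed (returned as None); otherwise Gaussian noise is
    added, and the rounded result is floored at the threshold.
    """
    rng = rng or random.Random()
    if true_count < small_count_threshold:
        return None  # too few patients to report safely
    noisy = round(true_count + rng.gauss(0, sigma))
    return max(noisy, small_count_threshold)
```

Passing a seeded `random.Random` as `rng` makes the output repeatable for testing; a real deployment would presumably also need to limit repeated identical queries, which this sketch does not attempt.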

Lynn Bry described the Crimson system and how discarded samples identified through i2b2 have far higher utilization rates than other samples (e.g. those in biorepositories). She also described the Sample Ontology and how she is borrowing from WHO and SNOMED for standardization. The system includes an ontology manager to allow local ontology management and updating of samples. She also described the IRB permissions data from RC 1.4. She will work towards multisite studies within the two-year implementation time frame. Lynn described the Enterprise Master Specimen Index, which tracks samples and patient relationships at various levels of identity (consented identified, de-identified, and anonymous samples).

Dan Housman and Peter Emerson from Recombinant were invited into the AUG only for the discussion segment about sample management (because of the AUG's wish to keep companies at arm's length) to discuss their own efforts in sample management. They made it clear that all the developments they are involved in will be contributed back to the i2b2 community as fully open source code.

Andy McMurry summarized the status of SHRINE, the distributed querying system that is now implemented at several Harvard-affiliated hospitals and several West Coast academic health centers (e.g. UCSF and UW) and is now fully IRB approved (at Harvard) for queries returning aggregate numbers (across demographics, laboratory results, medications and diagnoses). Andy also made very clear the several technical hurdles that were overcome, including the on-the-fly ontology matching process. Finally, he announced the availability of SHRINE code in a fully open source codebase.
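
A rough sketch of the pattern described here: each site translates a shared query term into its own local code on the fly and returns only an aggregate count. This is not SHRINE code; the site names, term mappings, codes, and patients below are hypothetical.

```python
def local_count(site, shared_term):
    """Map a shared ontology term to the site's local code and count matching patients."""
    local_code = site["term_map"].get(shared_term)
    if local_code is None:
        return 0  # this site has no mapping for the shared term
    return sum(1 for codes in site["patients"].values() if local_code in codes)

def federated_count(sites, shared_term):
    """Collect per-site aggregate counts; no patient-level rows leave a site."""
    per_site = {site["name"]: local_count(site, shared_term) for site in sites}
    return {"per_site": per_site, "total": sum(per_site.values())}

# Hypothetical sites, each with its own local coding system.
sites = [
    {
        "name": "site_a",
        "term_map": {"SHARED:diabetes": "A-250"},
        "patients": {"p1": {"A-250"}, "p2": {"A-401"}, "p3": {"A-250", "A-401"}},
    },
    {
        "name": "site_b",
        "term_map": {"SHARED:diabetes": "B-DM"},
        "patients": {"q1": {"B-DM"}, "q2": {"B-HTN"}},
    },
]
```

In a real network the per-site function would run behind each institution's firewall and only the counts would cross the wire; here everything lives in one process purely to show the shape of the exchange.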

i2b2 CICTR was presented by Nick Anderson. They have been able to query multi-institutional "anonymized PHI data". The application is in diabetes and cardiovascular disease. He described the technical, governance, ontology and evaluation processes that CICTR is driving, and the heterogeneous systems that CICTR has to query across. Nick stressed the need for high-level institutional support, which is a sine qua non for success, and the need for a broad range of paid technical personnel.

Keith Marsolo from Cincinnati Children's reported on their Epic rollout and how it relates to their i2b2. He described their quality assurance efforts, noted the challenge of the firehose, and made the acute observation that most investigators just want a spreadsheet and that anything more complicated tends to get ignored. Keith also emphasized their goal of allowing streamlined addition of research data to the clinical data. He made the important point that "age at FACT" is essential for pediatric applications so that the data can be easily accessed in the i2b2 workbench. Keith mentioned using i2b2 for research databases for eosinophilic esophagitis and IBD.
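
A small sketch of what "age at FACT" could look like in practice, assuming it means the patient's age on the date a given observation was recorded (my reading; the talk did not spell out a definition). The function name and the years-plus-months granularity are assumptions; finer units would matter for neonates.

```python
from datetime import date

def age_at_fact(birth_date, fact_date):
    """Return (years, months) of completed age on the date the fact was recorded."""
    if fact_date < birth_date:
        raise ValueError("fact_date precedes birth_date")
    months = (fact_date.year - birth_date.year) * 12 + (fact_date.month - birth_date.month)
    if fact_date.day < birth_date.day:
        months -= 1  # the current month of life is not yet complete
    return divmod(months, 12)
```

Storing this value with each observation, rather than only the patient's current age, lets a query ask for "lab results at age 4" without recomputing dates for every row.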

Phil Reeder from UT Houston talked about medication mapping. It is a challenge: they chose to map to RxNorm and then had to map manually into SNOMED CT (perhaps their database was out of date). They started from an Allscripts database and used a semi-automated process. He noted that every year there are at least 1,000 new drugs (i.e. different packaging, pill sizes, etc.). He brought up the thorny (and annoying, IMHO) issue of proprietary mappings to standard vocabularies.

Ralph Zottola and Edward Westrick described the effort at U. Mass (data sourced from the Meditech system, REDCap EDC, biorepository, EMPI, Allscripts, and departmental systems), where they are up to 2,000,000 patients. Edward described the managed care network (1,000 physicians) that plugs into U. Mass and how quality measures inform the discussions and bargaining with payors. He reviewed different measures including HEDIS, patient experience, as well as the increasingly important Relative Resource Utilization (Efficiency). He demonstrated how knowing what is going on in the healthcare institution allows for a sober and leveraged discussion with payors. The healthcare system approached the medical school and settled on i2b2, and they have already seen that they can accurately forecast their performance and provide a feedback loop (with financial incentives) to healthcare providers. Ralph pointed out that the fact that clinical operations are using i2b2 is also causing an improvement in the quality of the data being delivered to the data marts.

Iain Sanderson and Jihad Obeid. Iain started by describing a very comprehensive set of informatics initiatives in South Carolina. They have a unified IRB with a goal of clinical trials across the state. There is both a scientific and a funding motivation in this. Three informatics initiatives have dovetailed (CTSA biomedical informatics, the HSSC IT business plan, and a GO grant on consent). This has resulted in the South Carolina Integrated Platform for Research (SCIPR), which uses i2b2 for the clinical research data warehouse. In the process they are adopting a wide range of open source solutions including Sun Microsystems' Java CAPS. Iain reports that the data sharing agreements between the 6 centers across HSSC are under way and likely to result in an MoU in short order. Iain also described the beginnings of the consent management/gathering system, the permissions ontology, and the documenting of the different consent processes at the institutional members of the HSSC. Finally, Iain discussed how personal patient health portals may be used to provide the patient-facing part of the network.

Bethesda, Oct 28, 2009
BOS, Oct 28, 2009

Friday, October 23, 2009

Rheumatoid Arthritis

Plenge et al.,

3/4 of the genotyping is completed.

Manuscript to be submitted today regarding phenotyping accuracies.

What's next

Discussed what might be the priority areas for i2b2 in Core 2 for the competitive re-competition.

The RFA has not yet been announced so the discussion was, of necessity, wide-ranging.

Major Depressive Disorder (and Bipolar Disease)

Smoller, Perlis, et al.,

Discussed the challenge of merging note-level NLP conclusions into holistic patient evaluations.

Reviewed various Bayesian and (more broadly) machine learning approaches to defining which notes contribute most to the accurate classification of the patient phenotype.
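
As one illustration of the kind of approach discussed (not the method the group settled on), note-level classifier probabilities can be pooled into a patient-level probability by summing log-odds. The sketch assumes notes are conditionally independent given the patient's true phenotype, a naive Bayes simplification that real clinical notes will violate, and that every note-level probability was produced under the same prior. All names are hypothetical.

```python
import math

def _logit(p, eps=1e-6):
    """Log-odds of p, clamped away from 0 and 1 to avoid infinite values."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def note_contributions(note_probs, prior=0.5):
    """Log-odds shift contributed by each note relative to the prior.

    Larger magnitude means the note moves the patient-level estimate more.
    """
    prior_logit = _logit(prior)
    return [_logit(p) - prior_logit for p in note_probs]

def patient_posterior(note_probs, prior=0.5):
    """Combine note-level probabilities into one patient-level probability."""
    total = _logit(prior) + sum(note_contributions(note_probs, prior))
    # Numerically stable logistic function.
    if total >= 0:
        return 1 / (1 + math.exp(-total))
    return math.exp(total) / (1 + math.exp(total))
```

Sorting notes by the absolute value of `note_contributions` gives a crude ranking of which notes drive the patient-level call, which is one way to frame the "which notes contribute most" question.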