AI News, In The Hospital Of The Future, Big Data Is One Of Your Doctors

In The Hospital Of The Future, Big Data Is One Of Your Doctors

(Patients on the map are grouped by how closely related their health data is, based on clinical readings like blood sugar and cholesterol.) From this map and others like it, Dudley might be able to pinpoint genes that are unique to diabetes patients in the different clusters, giving new ways to understand how our genes and environments are linked to disease, symptoms, and treatments.

(The eventual goal is to enroll 100,000 patients in what’s called the BioMe platform to explore the possibilities in having access to massive amounts of data.) “There’s nothing like that right now–where we have a sort of predictive modeling engine that’s built into a health care system,”

Almost every web company was born swimming in easily harvested and mined data about users, but in health care the struggle has long been simpler: get health records digitized and keep them private, while making them available to individual doctors, insurers, billing departments, and patients when they need them.

Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records

This section presents the deep patient method and describes the pipeline implemented to evaluate the benefits of this representation in the task of predicting future diseases.

Each layer of the network is trained to produce a higher-level representation of the observed patterns, based on the data it receives as input from the layer below, by optimizing a local unsupervised criterion (Fig.

Briefly, an autoencoder takes an input $x$ and first transforms it (with an encoder) to a hidden representation $y$ through a deterministic mapping $y = f_\theta(x) = s(Wx + b)$, parameterized by $\theta = \{W, b\}$, where $s(\cdot)$ is a non-linear transformation (e.g., sigmoid, tangent) named “activation function”, $W$ is a weight coefficient matrix, and $b$ is a bias vector.
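The encoder step (a weight matrix applied to the input, plus a bias, passed through an activation function) can be sketched in a few lines. This is only an illustration, assuming a sigmoid activation and plain Python lists; the names `sigmoid` and `encode` and the toy numbers are hypothetical and do not come from the paper.

```python
import math

def sigmoid(a):
    """Logistic activation, one possible choice of non-linear transformation."""
    return 1.0 / (1.0 + math.exp(-a))

def encode(x, W, b):
    """Map an input vector x to a hidden representation s(Wx + b).

    W holds one row of weights per hidden unit; b holds one bias per hidden unit.
    """
    return [sigmoid(sum(w_ji * x_i for w_ji, x_i in zip(row, x)) + b_j)
            for row, b_j in zip(W, b)]

# Toy example: 3 input descriptors, 2 hidden units (arbitrary values).
x = [1.0, 0.0, 1.0]
W = [[0.5, -0.2, 0.1],
     [0.0, 0.3, -0.4]]
b = [0.0, 0.1]
y = encode(x, W, b)
```

Each entry of `y` lies strictly between 0 and 1 because of the sigmoid, which is what lets the output of one layer serve as the input of the next.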

This can be viewed as simulating the presence of missed components in the EHRs (e.g., medications or diagnoses not recorded in the patient records), thus assuming that the input clinical data is a degraded or “noisy” version of the actual clinical situation.
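One common way to simulate such missing components is masking noise: a randomly chosen fraction of the input entries is set to zero before encoding. The sketch below assumes that corruption scheme; the function name `mask_noise`, the 25% fraction, and the toy vector are illustrative choices, not values from the source.

```python
import random

def mask_noise(x, fraction, rng):
    """Return a copy of x with a random `fraction` of its components set to zero."""
    n_masked = round(fraction * len(x))
    masked = set(rng.sample(range(len(x)), n_masked))
    return [0.0 if i in masked else v for i, v in enumerate(x)]

rng = random.Random(0)    # fixed seed, only so the example is repeatable
x = [1.0] * 20            # toy patient vector with every descriptor present
x_tilde = mask_noise(x, 0.25, rng)
```

The original vector `x` is left untouched, since the reconstruction target during training is the clean input, not the corrupted one.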

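The parameters of the model $\theta$ and $\theta'$ are optimized over the training dataset to minimize the average reconstruction error, $\theta^*, \theta'^* = \arg\min_{\theta, \theta'} \frac{1}{N} \sum_{i=1}^{N} L\left(x^{(i)}, z^{(i)}\right)$, where $z^{(i)}$ is the reconstruction of the input $x^{(i)}$, $L(\cdot)$ is a loss function, and $N$ is the number of patients in the training set.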

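We used the reconstruction cross-entropy function as loss function, i.e., $L_H(x, z) = -\sum_{k=1}^{d} \left[ x_k \log z_k + (1 - x_k) \log(1 - z_k) \right]$, with $z$ the reconstruction of the input $x$ and $d$ the number of input dimensions. Optimization is carried out by mini-batch stochastic gradient descent, which iterates through small subsets of the training patients and modifies the parameters in the opposite direction of the gradient of the loss function to minimize the reconstruction error.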
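To make the training procedure concrete, here is a small pure-Python sketch of one denoising autoencoder trained with mini-batch stochastic gradient descent on the reconstruction cross-entropy. It is a toy under stated assumptions, not the authors' implementation: sigmoid units in both encoder and decoder, untied weights, per-component masking noise, and hypothetical hyperparameters (`n_hidden`, `noise`, `lr`, `epochs`, `batch_size`) and data.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def layer(v, W, b):
    """Affine map followed by a sigmoid: s(Wv + b)."""
    return [sigmoid(sum(w * vi for w, vi in zip(row, v)) + bj) for row, bj in zip(W, b)]

def cross_entropy(x, z, eps=1e-12):
    """Reconstruction cross-entropy between input x and reconstruction z."""
    return -sum(xk * math.log(zk + eps) + (1 - xk) * math.log(1 - zk + eps)
                for xk, zk in zip(x, z))

def mean_loss(data, params):
    """Average reconstruction error of the clean inputs."""
    W, b, W2, b2 = params
    return sum(cross_entropy(x, layer(layer(x, W, b), W2, b2)) for x in data) / len(data)

def train(data, n_hidden=3, noise=0.2, lr=0.5, epochs=200, batch_size=4, seed=0):
    rng = random.Random(seed)
    d = len(data[0])
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(n_hidden)]
    W2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(d)]
    b, b2 = [0.0] * n_hidden, [0.0] * d
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            batch = [data[i] for i in order[start:start + batch_size]]
            gW = [[0.0] * d for _ in range(n_hidden)]
            gW2 = [[0.0] * n_hidden for _ in range(d)]
            gb, gb2 = [0.0] * n_hidden, [0.0] * d
            for x in batch:
                # Masking noise: zero each component with probability `noise`.
                xt = [0.0 if rng.random() < noise else v for v in x]
                y = layer(xt, W, b)
                z = layer(y, W2, b2)
                # For sigmoid outputs, d(loss)/d(output pre-activation) = z - x,
                # where the target x is the clean (uncorrupted) input.
                dz = [zk - xk for zk, xk in zip(z, x)]
                dy = [sum(dz[k] * W2[k][j] for k in range(d)) * y[j] * (1 - y[j])
                      for j in range(n_hidden)]
                for k in range(d):
                    gb2[k] += dz[k]
                    for j in range(n_hidden):
                        gW2[k][j] += dz[k] * y[j]
                for j in range(n_hidden):
                    gb[j] += dy[j]
                    for i in range(d):
                        gW[j][i] += dy[j] * xt[i]
            # Step against the batch-averaged gradient.
            step = lr / len(batch)
            for k in range(d):
                b2[k] -= step * gb2[k]
                for j in range(n_hidden):
                    W2[k][j] -= step * gW2[k][j]
            for j in range(n_hidden):
                b[j] -= step * gb[j]
                for i in range(d):
                    W[j][i] -= step * gW[j][i]
    return W, b, W2, b2

# Toy binary "patient" vectors with two co-occurrence patterns.
data = [
    [1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0], [0, 0, 1, 1, 1, 0], [1, 1, 0, 0, 0, 0], [0, 0, 0, 1, 1, 0],
]
untrained = train(data, epochs=0)   # same seed, so same initial parameters
trained = train(data)
```

Comparing `mean_loss(data, untrained)` with `mean_loss(data, trained)` shows the average reconstruction error dropping as the parameters move against the gradient.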

The learned encoding function is then applied to the clean input x and the resulting code y is the distributed representation (i.e., the input of the following autoencoder in the SDA architecture or the final deep patient representation).
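The stacking step can be pictured as feeding the clean input through each trained encoder in turn, with each layer's code becoming the next layer's input. The sketch below assumes sigmoid encoders and uses made-up weights in place of trained ones; `deep_representation` is a hypothetical helper name.

```python
import math

def encode(x, W, b):
    """Sigmoid encoder: s(Wx + b)."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + bj)))
            for row, bj in zip(W, b)]

def deep_representation(x, layers):
    """Pass the clean input through each (W, b) encoder in order."""
    code = x
    for W, b in layers:
        code = encode(code, W, b)
    return code

# Two stacked encoders: 4 inputs -> 3 hidden units -> 2 hidden units.
# The weights are arbitrary placeholders standing in for trained parameters.
layers = [
    ([[0.4, -0.3, 0.2, 0.1], [0.0, 0.5, -0.5, 0.2], [-0.1, 0.1, 0.3, -0.2]],
     [0.0, 0.0, 0.0]),
    ([[0.3, -0.2, 0.1], [0.2, 0.2, -0.4]],
     [0.1, -0.1]),
]
x = [1.0, 0.0, 1.0, 0.0]
rep = deep_representation(x, layers)
```

No noise is applied at this stage: corruption is used only while training each autoencoder, and the final representation is computed from the uncorrupted record.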

The data related to patients who visited the hospital prior to 2003 was migrated to the electronic format as well but we may lack certain details of hospital visits (i.e., some diagnoses or medications may not have been recorded or transferred).

For each patient in the dataset, we retained some general demographic details (i.e., age, gender and race), and common clinical descriptors available in a structured format such as diagnoses (ICD-9 codes), medications, procedures, and lab tests, as well as free-text clinical notes recorded before the split-point.

All the clinical records were pre-processed using the Open Biomedical Annotator to obtain harmonized codes for procedures and lab tests, normalized medications based on brand name and dosages, and to extract clinical concepts from the free-text notes [19].

Negated tags were identified using NegEx, a regular expression algorithm that implements several phrases indicating negation, filters out sentences containing phrases that falsely appear to be negation phrases, and limits the scope of the negation phrases [23].
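The three ingredients named above (negation phrases, phrases that only look like negations, and a limited scope) can be illustrated with a deliberately simplified sketch. This is not NegEx itself: the phrase lists, the five-token scope window, and the function name `is_negated` are assumptions made for the example.

```python
import re

# Tiny, hypothetical phrase lists; the real algorithm uses far larger ones.
NEGATION = re.compile(r"\b(no evidence of|no|denies|without|not)\b", re.I)
PSEUDO = re.compile(r"\b(no increase|not only|no change)\b", re.I)
TERMINATOR = re.compile(r"\b(but|however|although)\b", re.I)
SCOPE_TOKENS = 5  # assumed window: words after the trigger that can be negated

def is_negated(sentence, concept):
    """Return True if `concept` falls within the scope of a negation phrase."""
    # Skip sentences containing phrases that only appear to be negations.
    if PSEUDO.search(sentence):
        return False
    for trigger in NEGATION.finditer(sentence):
        rest = sentence[trigger.end():]
        # The scope ends at a terminating word or after a fixed token window.
        stop = TERMINATOR.search(rest)
        if stop:
            rest = rest[:stop.start()]
        scope = " ".join(rest.split()[:SCOPE_TOKENS])
        if re.search(r"\b" + re.escape(concept) + r"\b", scope, re.I):
            return True
    return False

sentence = "Patient denies chest pain but reports nausea."
chest_pain_negated = is_negated(sentence, "chest pain")
nausea_negated = is_negated(sentence, "nausea")
```

In the example sentence, "chest pain" sits inside the scope opened by "denies", while "nausea" follows the terminator "but" and is therefore left as an affirmed concept.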

To this aim we modeled the parsed notes using topic modeling [25], an unsupervised inference process that captures patterns of word co-occurrences within documents to define topics and represent a document as a multinomial over these topics.

This list was filtered to retain only diseases that had at least 10 training patients, and was manually polished by a practicing physician to remove all the diseases that could not be predicted from the considered EHR labels alone because they were related to social behaviors (e.g., HIV) or external life events (e.g., injuries, poisoning), or because they were too general (e.g., “other form of cancers”).

Descriptors appearing in more than 80% of patients or present in fewer than five patients were removed from the dataset to avoid biases and noise in the learning process, leading to a final vocabulary of 41,072 descriptors.
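The frequency filter is simple to express in code. The sketch below assumes each patient is represented as a set of descriptor codes; the function name `filter_descriptors` and the toy codes are hypothetical, while the two thresholds follow the text.

```python
from collections import Counter

def filter_descriptors(patients, max_fraction=0.8, min_patients=5):
    """Keep descriptors that are neither near-universal nor very rare.

    `patients` is a list of sets of descriptor codes, one set per patient.
    """
    counts = Counter(code for descriptors in patients for code in descriptors)
    n = len(patients)
    return {code for code, c in counts.items()
            if c >= min_patients and c <= max_fraction * n}

# Toy cohort of 10 patients: "common" appears in 9, "useful" in 6, "rare" in 2.
patients = ([{"common", "useful"} for _ in range(6)]
            + [{"common"} for _ in range(3)]
            + [set()])
patients[0].add("rare")
patients[1].add("rare")

vocabulary = filter_descriptors(patients)
```

Here only `"useful"` survives: `"common"` exceeds the 80% ceiling and `"rare"` falls below the five-patient floor.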

In particular, we found that using 500 hidden units per layer and a noise corruption factor led to a good generalization error and consistent predictions when tuning the model using the validation data set.

In particular, we considered principal component analysis (i.e., “PCA” with 100 principal components), k-means clustering (i.e., “K-Means” with 500 clusters), Gaussian mixture model (i.e., “GMM” with 200 mixtures and full covariance matrix), and independent component analysis (i.e., “ICA” with 100 independent components).

In particular, PCA uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of linearly uncorrelated variables called principal components, whose number is less than or equal to the number of original variables.
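As a small illustration of what a principal component is, the sketch below estimates only the leading component of a toy two-variable dataset by power iteration on the sample covariance matrix. This is a teaching device rather than the procedure used in the study, which would rely on a full decomposition; the function name and data are made up.

```python
import math

def first_principal_component(rows, iterations=200):
    """Estimate the leading principal component by power iteration
    on the sample covariance matrix of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize to unit length.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Two strongly correlated variables, roughly following y = 2x.
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
pc1 = first_principal_component(rows)
```

The returned direction has unit length and a slope close to 2, so a single new variable along it captures almost all of the variance of the two correlated originals.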

To predict the probability that patients might develop a certain disease given their current clinical status, we implemented random forest classifiers trained over each disease using a dataset of 200,000 patients (one-vs.-all learning).
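In a one-vs.-all setup, each disease gets its own binary classifier, trained on a target that marks which patients develop that disease. The sketch below shows only the construction of those per-disease targets; the classifier itself is omitted, and the helper name and toy labels are hypothetical.

```python
def one_vs_all_targets(patient_diseases, diseases):
    """For each disease, build a binary target vector over all patients."""
    return {disease: [1 if disease in diagnosed else 0 for diagnosed in patient_diseases]
            for disease in diseases}

# Each set lists the diseases one patient went on to develop (toy data).
patient_diseases = [{"diabetes"}, {"diabetes", "hypertension"}, set(), {"hypertension"}]
targets = one_vs_all_targets(patient_diseases, ["diabetes", "hypertension"])
```

Each entry of `targets` would then be paired with the same patient feature matrix to fit one classifier per disease.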

Translational bioinformatics in the era of real-time biomedical, health care and wellness data streams

With the advent of large-scale adoption of health-monitoring devices, health care and wellness monitoring data are currently captured at an unprecedented scale.

For example, remote monitoring of implanted life-critical devices has been implemented at several institutions [113, 114], and assessments of these continuous data streams found that they lead to non-inferior patient outcomes [115, 116], cost savings [116, 117] and earlier identification of monitor malfunction [118].

A systematic approach should be established to obtain consent from a potential user; once consent is complete, the data can be transferred, using secured data protocols, to verified software and database systems at individual hospitals or the health system where the patient seeks care.

Unstructured data captured in text format using web forms in wellness or clinical applications can be used as a possible data resource for clinical research using natural language processing (NLP) tools.

Historical data on the use of over-the-counter (OTC) medication and diet journaling [119] from health monitoring tools and online journals compiled by patients can be extracted using NLP tools to better understand both health and disease state.

Data elements, data capturing technologies, aggregation systems and clinical or health care ontologies for integrating health-monitoring data into EMR were partially defined as the current ecosystem of standards, toolkits and resources (Tables 3 and 4).

Insight into the individual’s longitudinal health and holistic considerations meets the Patient-Centered Outcomes Research Institute mission to improve the ability to discern which health care options are best for a particular patient, and potentiates informed, individualized health decisions.

Artificial Intelligence Transforms the Future of Medicine

Electronic health records (EHRs) weren’t originally designed to predict disease risk or determine a more precise treatment.

More specifically, Deep Patient uses deep learning, a machine-learning approach that essentially mimics the neural networks of the brain, allowing the computer system to learn new things without being programmed to do so.

After scanning millions of photos, a computerized neural network achieved nearly 75% accuracy in identifying cats, without receiving any information on cats or cat traits.

In 2012, computer scientists at Google fed millions of YouTube images into a giant computerized neural network to see if it could learn what cats looked like without being fed any information on cat characteristics.

With Deep Patient, scientists fed deidentified data from 700,000 EHRs into a computer neural network, which randomly combined data to make and test new variables for disease risk.

Led by Regina Barzilay, PhD, a scientist at the Massachusetts Institute of Technology, the effort uses natural language processing to teach computers how to read and interpret EHR data, including nonstandardized parts known as “free text.” In other words, algorithms learn to read breast pathology reports and accurately identify if cancer is present.

That’s a big deal, according to Hughes, because it means researchers and clinicians can use AI to sort and identify massive swaths of relevant pathology data that were once only intelligible to humans—and that holds the potential for revealing new insights into cancer prevention, detection, and treatment.

Hughes added that right now, only about 3% of cancer patients participate in clinical trials, which means much of the field’s therapeutic recommendations are based on a small cohort of patients.

“This is the next step beyond that.” Today, about 7,000 clinicians from 500 institutions worldwide are using and building a diagnostic and management tool that uses AI to glean useful information from the world’s collective medical knowledge.

“The challenge with deep learning in general is that it’s hard to peek inside the box.” But Dudley doesn’t think that should keep scientists from using deep learning in research or clinical settings.

In fact, he predicts that the largest health care company of the future will be focused on data and won’t own any hospitals—a situation similar to Uber, the world’s largest taxi company, which doesn’t actually own any vehicles.

Mount Sinai Announces Appointment of Joel Dudley, PhD, as Executive Vice President for Precision Health

Joel Dudley, PhD, an internationally recognized investigator in translational bioinformatics and precision medicine, has been named Executive Vice President for Precision Health for the Mount Sinai Health System.

Precision medicine is an innovative model of health care that customizes diagnosis and treatment for an individual patient based not only on genetic data but also on medical history, laboratory tests, health history, lifestyle, and environmental influences.

“Our goal with the Precision Health Enterprise is to continue to personalize therapies for our patients, including those with conditions such as cancer, chronic diseases such as diabetes and Alzheimer’s disease, or rare genetic conditions.

We believe that new product development, prototyping, artificial intelligence, predictive analysis, prevention, and partnerships will help us deliver better care in a rapidly changing health care ecosystem.”

Within the Institute, Dr. Dudley is developing an integrated translational biomedical research model at the nexus of advances in omics, clinical medicine, digital health, and artificial intelligence.

“I look forward to working with Joel and his team to deploy next-generation strategies in health engagement, precision wellness, and prevention to empower and care for patients through their health and wellness journey.”

“Joel’s vision of developing a new generation of care that is more intelligent, contextually relevant, and rapidly designed for the people we serve is closely aligned with the mission of the Department of Health Systems Design,”

Moving from Precision Medicine to Next Generation Health Care

A Department of Medicine Grand Rounds presented by Joel Dudley, PhD, Executive Vice President for Precision Health, Director of the Institute for Next ...

SINAInnovations 2015: Keynote Address - Quantified Disease and Precision Prevention

Linda Avey, Co-Founder and CEO, We are Curious, Inc., speaks on personal data, genomics, quantified disease, and precision prevention. Introduction by Joel ...

Healthy Aging: Promoting Well-being in Older Adults

The population of older Americans is growing and living longer than ever.

Treatment-free remission (TFR) – the safety of stopping treatment in chronic myeloid leukemia (CML)

Jan Geissler, Co-founder of the CML Advocates Network, discusses the safety of stopping treatment in chronic myeloid leukemia (CML), which was a dominant ...

New and Emerging Therapies for Heart Failure | Gregg Fonarow, MD

Heart Failure Update UCLA Heart Failure Symposium 2013.

Realizing the Opportunities of Genomics in Health Care - Geoffrey Ginsburg, M.D., Ph.D.

Dr. Ginsburg is the founding director of the Center for Applied Genomics in the Duke University Medical Center and the founding executive director of the Center ...

CSHL Keynote: Dr. Eric Schadt, Mt Sinai Medical Center

"Considering the digital Universe of data to better diagnose and treat patients" from the 2013 Systems Biology: Networks meeting.



At the Heart of Precision Medicine

Jasonee was diagnosed with Arrhythmogenic Right Ventricular Dysplasia/Cardiomyopathy (ARVD/C) in 2012. After a sudden dramatic cardiac event, a series of ...

Making Precision Medicine a Reality with SAP Healthcare and Mercy

Learn more about SAP's work with Mercy hospital system at HIMSS 2016: In this interview, I sit down with Curtis ..