Abstract
Traditional clinical prediction models focus on parameters of the individual patient. For infectious diseases, sources external to the patient, including characteristics of prior patients and seasonal factors, may improve predictive performance. We describe the development of a predictive model that integrates multiple sources of data in a principled statistical framework using a posttest odds formulation. Our method enables electronic real-time updating and flexibility, such that components can be included or excluded according to data availability. We apply this method to the prediction of etiology of pediatric diarrhea, where 'pretest' epidemiologic data may be highly informative. Diarrhea has a high burden in low-resource settings, and antibiotics are often overprescribed. We demonstrate that our integrative method outperforms traditional prediction in accurately identifying cases with a viral etiology, and show that its clinical application, especially when used with an additional diagnostic test, could result in a 61% reduction in inappropriately prescribed antibiotics.
Introduction
Healthcare providers use clinical decision support tools to assist with patient diagnosis, often to improve accuracy of diagnosis, reduce cost by avoiding unnecessary laboratory tests, and, in the case of infectious diseases, deter the inappropriate prescription of antibiotics (Sintchenko et al., 2008). Typically, data entered into these tools is related directly to the patient’s individual characteristics, but data sources external to the patient can be informative for diagnosis. For example, climate, seasonality, and epidemiological data inform predictive models for communicable disease incidence (Colwell, 1996, Chao et al., 2019, Fine et al., 2011). The emergence of advanced computing and machine learning has enabled the incorporation of large data sources in the development of clinical support tools (Shortliffe and Sepúlveda, 2018) such as SMARTCOP for predicting the need for intensive respiratory support for pneumonia (Charles et al., 2008) or the ALaRMS model for predicting inpatient mortality (Tabak et al., 2014).
Clinical decision support tools rely on the availability of information sources and computing at the time of patient encounter. Although the increased availability of internet and mobile phones has increased access to information and computing power in low-resource settings, there may be times when connectivity, computing power, or data-collection infrastructure is unavailable. Thus, there is a need to build clinical decision support tools which can flexibly include features of external sources when available, or function without them if unavailable. Methods that enable the dynamic updating of predictive models are advantageous due to potential cyclical patterns of infectious etiologies. Furthermore, with the emergence of point-of-care (POC) tests for clinical decision-making (Price, 2001), predictive models that are able to integrate results of such diagnostic testing could enhance their usefulness.
We develop a novel method for diagnostic prediction which integrates multiple data sources by utilizing a posttest odds formulation, with proof-of-concept in antibiotic stewardship for pediatric diarrhea. Our formulation first fits separate models from different sources of data, and then combines the likelihood ratios from each of these independent models into a single prediction. This method allows the multiple components to be flexibly included or excluded. We apply this method to the prediction of diarrhea etiology with data from the Global Enteric Multicenter Study (GEMS) (Kotloff et al., 2013) and assess the performance of this tool, including with the addition of a synthetic diagnostic, using two forms of internal validation and by showing its potential effect on reducing inappropriate antibiotic use.
Materials and methods
We present our approach to building and assessing a flexible multisource clinical prediction tool with (1) the data sources, (2) the individual prediction models, (3) the use of the likelihood ratio for integrating predictive models, (4) validation of the method, (5) the impact of an additional diagnostic, and (6) a simulation of conditionally dependent tests. We program our prediction tool using R version 3.6.2 (R Project for Statistical Computing, RRID:SCR_001905).
Data sources
We apply our posttest odds model using clinical data from GEMS, a prospective, case-control study which took place from 2007 to 2011 in seven countries in Africa and Asia. Methods for the GEMS study have been described in detail (Kotloff et al., 2012). Briefly, 9439 children with moderate-to-severe diarrhea were enrolled at local health care centers along with one to three matched control children. A fecal sample was taken from each child at enrollment to identify enteropathogens, and clinical information was collected, including demographic, anthropometric, and clinical history of the child. We used the quantitative real-time PCR-based (qPCR) attribution models developed by Liu et al., 2016 in order to best characterize the cause of diarrhea. Our dependent variable was presence or absence of viral etiology, defined as a diarrhea episode with at least one viral pathogen with an episode-specific attributable fraction (AFe ≥ 0.5) and no bacterial or parasitic pathogens with an episode-specific attributable fraction. Prediction of viral attribution is clinically meaningful since it indicates that a patient would not benefit from antimicrobial therapy. We defined other known etiologies as having a majority attribution of the diarrhea episode by at least one other nonviral pathogen. We exclude patients with unknown etiologies when fitting the model, though it has been previously shown that these cases have a similar distribution of viral predictions, using a model with presenting patient information, as those cases with known etiologies (Brintz et al., 2020).
We obtained weather data local to each site’s health centers during the GEMS study using NOAA’s Integrated Surface Database (Smith et al., 2011). The incidence of many pathogens, including rotavirus (Cook et al., 1990), norovirus (Ahmed et al., 2013), cholera (Emch et al., 2008), and Salmonella (Mohanty et al., 2006), is known to have seasonal patterns, and other analyses have established climatic factors to be associated with diarrheal diseases (Colwell, 1996, Chao et al., 2019, Farrar et al., 2019). Stations near GEMS sites such as in The Gambia exhibit seasonal patterns (Figure 1). We used daily temperature and rain data weighted most by those weather stations closest to the GEMS sites (Appendix 1).
Construction of predictive models
We define each model as an additive logistic regression model using the features described in the subsections below. Each model can be trained using a sample of data from a specific country, continent, or all available data.
Predictive model (A) presenting patient
The patient model derived from the GEMS data treats each enrolled patient as an observation and uses their available patient data at presentation to predict viral-only versus other etiology of their infectious diarrhea. In order to make a parsimonious model, we used the previously published random forests variable importance screening (Brintz et al., 2020). Using the screened variables (Table 1), we fit a logistic regression including the top five variables that would be accessible to providers at the time of presentation. These include age, blood in stool, vomiting, breastfeeding status, and mid-upper arm circumference (MUAC), an indicator of nutritional status. We note that while variables such as fever and diarrhea duration were shown to be important in previous studies (Fontana et al., 1987), adding these variables did not improve performance. Additionally, we excluded 'Season', since variables representing it are included in the climate predictive model (discussed below), as well as 'Height-for-age Z-score', another indicator of nutritional status, which would require a less feasible calculation than measurement of MUAC.
Predictive model (B) climate
We use an aggregate (mean) of the weighted (Appendix 1) local weather data over the prior 14 days to create features that capture site-specific climatic drivers of etiology of infectious diarrhea. By taking an aggregate, we create a moving average that reflects the seasonality seen in Figure 1. An example of the aggregate climate data from The Gambia is shown in Figure 1—figure supplement 1. From the figure, which also shows a moving average of the viral rate, we see that the periods with more viral cases of diarrhea tend to have low temperatures and less rain.
Predictive model (C) seasonality
We include a predictive model with sine and cosine functions as features, as explored in Stolwijk et al., 1999. Assuming a periodicity of 365.25 days, we have functions $\sin(\frac{2\pi t}{365.25})$ and $\cos(\frac{2\pi t}{365.25})$. We show that standardized seasonal sine and cosine curves correlate with a rolling average of daily viral etiology rates in The Gambia over time (Figure 1—figure supplement 2). These functions can be used to model the country-specific seasonality of the viral etiology rate, representing multiple underlying processes that result in that seasonality.
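The seasonal features can be illustrated with a minimal sketch. It is written in Python for illustration (the analysis itself was programmed in R), and the function name `seasonal_features` and the reference date `origin` are hypothetical choices, not part of the published code.

```python
import math
from datetime import date

PERIOD_DAYS = 365.25  # periodicity assumed in the text


def seasonal_features(day, origin=date(2007, 1, 1)):
    """Return the (sine, cosine) seasonal features for a calendar date.

    `origin` is a hypothetical reference date defining t = 0. Any fixed
    choice only shifts the phase, which the fitted coefficients absorb.
    """
    t = (day - origin).days
    angle = 2.0 * math.pi * t / PERIOD_DAYS
    return math.sin(angle), math.cos(angle)
```

Because both the sine and cosine terms enter the logistic regression, their fitted coefficients jointly determine the amplitude and the timing of the seasonal peak.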
Use of the likelihood ratio to integrate predictive models from multiple data sources
We integrate predictive models from the multiple sources of data described above using the posttest odds formulation. We use Bayes’ Theorem, $P(A\mid B)=\frac{P(B\mid A)\cdot P(A)}{P(B)}$, to construct the posttest odds of having a viral etiology,

$$\begin{aligned}
\frac{P(V=1\mid {T}_{1}={t}_{1},\cdots,{T}_{k}={t}_{k})}{P(V=0\mid {T}_{1}={t}_{1},\cdots,{T}_{k}={t}_{k})} &= \frac{P({T}_{1}={t}_{1},\cdots,{T}_{k}={t}_{k}\mid V=1)\cdot P(V=1)}{P({T}_{1}={t}_{1},\cdots,{T}_{k}={t}_{k}\mid V=0)\cdot P(V=0)} && (1)\\
&= \frac{P({T}_{1}={t}_{1},\cdots,{T}_{k}={t}_{k}\mid V=1)}{P({T}_{1}={t}_{1},\cdots,{T}_{k}={t}_{k}\mid V=0)}\cdot \frac{P(V=1)}{P(V=0)} && (2)\\
&= \prod_{j=1}^{k}\frac{P({T}_{j}={t}_{j}\mid V=1)}{P({T}_{j}={t}_{j}\mid V=0)}\cdot \frac{P(V=1)}{P(V=0)} && (3)
\end{aligned}$$

where $V=1$ represents a viral etiology and $V=0$ represents another known etiology, ${T}_{1},{T}_{2},\cdots,{T}_{k}$ represent the k tests (the distributions of the predictions from one or more predictive models) used to obtain the posttest odds, and $\frac{P(V=1)}{P(V=0)}$ is the pretest odds. Note that going from line (2) to line (3) requires conditional independence between the tests, that is, that $P({T}_{i}={t}_{i},{T}_{j}={t}_{j}\mid V=1)=P({T}_{i}={t}_{i}\mid V=1)\cdot P({T}_{j}={t}_{j}\mid V=1)$ and $P({T}_{i}={t}_{i},{T}_{j}={t}_{j}\mid V=0)=P({T}_{i}={t}_{i}\mid V=0)\cdot P({T}_{j}={t}_{j}\mid V=0)$ for all i and j. We test for conditional independence to assess the necessity of making higher-dimensional kernel density estimates using the $ci.test$ function from the $\{bnlearn\}$ package in R (Scutari, 2010). We derive each $P({T}_{j}={t}_{j}\mid V=1)$ and $P({T}_{j}={t}_{j}\mid V=0)$ using Gaussian kernel density estimates on conditional predictions from a logistic regression model fit on the training set (Silverman, 1986). The distribution of $P({T}_{j}\mid V)$ is derived using the kernel density estimator $f({t}_{j})=\frac{1}{nh}{\sum}_{i=1}^{n}K\left(\frac{{t}_{j}-{x}_{i}}{h}\right)$ where, in our case, $K(x)=\varphi (x)$, the standard normal density function, and the bandwidth, h, is Silverman’s 'rule of thumb', the default chosen in the $density$ function in R (Parzen, 1962).
Figure 2 shows an example of the frequency of predictions from a logistic regression model conditional on the viral-only status (V = 0 and V = 1) determined from attributable fractions. Additionally, we overlaid the estimated one-dimensional kernel density. To obtain the value of $\frac{P({T}_{j}={t}_{j}\mid V=1)}{P({T}_{j}={t}_{j}\mid V=0)}$, the predicted odds, from a model’s prediction, we divide the kernel density estimate from the $V=1$ set (right) by the kernel density estimate from the $V=0$ set (left). It is feasible to estimate a multidimensional kernel density so that it is not necessary to make the conditional independence assumption to move from line 2 to line 3 in the equation above. Figure 2—figure supplement 1 shows an example two-dimensional contour plot for kernel density estimates of predicted values obtained from logistic regression on GEMS seasonality and climate data in Mali, which we will discuss further below. The density was created using the R function $kde2d$ (Venables and Ripley, 2002).
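The one-dimensional likelihood ratio and its combination with the pretest odds can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used in the analysis. All function names are hypothetical, and the bandwidth follows the common statement of Silverman's rule of thumb, $0.9\cdot \min(\text{sd}, \text{IQR}/1.34)\cdot n^{-1/5}$, which is assumed here to approximate the R default.

```python
import math
import statistics


def silverman_bandwidth(xs):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    sd = statistics.stdev(xs)
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34)
    if spread == 0.0:  # fall back to the standard deviation if the IQR is zero
        spread = sd
    return 0.9 * spread * len(xs) ** (-0.2)


def gaussian_kde(xs, h):
    """Return f(t) = 1/(n*h) * sum_i phi((t - x_i) / h) as a function of t."""
    norm = 1.0 / (len(xs) * h * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(math.exp(-0.5 * ((t - x) / h) ** 2) for x in xs)

    return density


def likelihood_ratio(t, preds_viral, preds_other):
    """Density of prediction t among viral cases over that among other cases.

    Far from the training predictions the denominator can underflow to zero,
    so a production version would need to guard against that.
    """
    f1 = gaussian_kde(preds_viral, silverman_bandwidth(preds_viral))
    f0 = gaussian_kde(preds_other, silverman_bandwidth(preds_other))
    return f1(t) / f0(t)


def posttest_odds(pretest_prob, likelihood_ratios):
    """Multiply the pretest odds by each available component likelihood ratio."""
    odds = pretest_prob / (1.0 - pretest_prob)
    for lr in likelihood_ratios:
        odds *= lr
    return odds
```

Because `posttest_odds` takes a list of likelihood ratios, a component whose data source is unavailable is simply left out of the list, which mirrors the flexible inclusion or exclusion of tests described above.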
Pretest odds from historical data
We calculated pretest odds using historical rates of viral diarrhea by site and date. We utilize available diarrhea etiology data for a given date, regardless of year, and site using a moving average such that pretest probability ${\pi}_{d}$ for date d is
where ${k}_{d}$ is the number of observed patients on date d, ${D}_{di}$ is 1 if the etiology of the patient’s diarrhea is viral and 0 otherwise, and n is the number of days included on both sides of the moving average. We would expect ${\pi}_{d}$ to represent a pretest probability of observing a viral diarrhea etiology on date d. Given that this rate information will likely be unavailable in new sites without established etiology studies, we provide an alternative formula based on recent patients’ presentations (Appendix 2). Additionally, we include a sensitivity analysis by calculating pretest odds using conventional diagnostic methods data, as qPCR data are unlikely to be available in high-burden settings.
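One moving-average form consistent with these definitions pools the viral cases and the observed patients over the window of n days on either side of date d. The expression below is a sketch under that pooling assumption and may differ in detail from the formula used in the analysis:

$$\pi_d=\frac{\sum_{j=d-n}^{d+n}\sum_{i=1}^{k_j}D_{ji}}{\sum_{j=d-n}^{d+n}k_j},\qquad \text{pretest odds}=\frac{\pi_d}{1-\pi_d}.$$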
Validating the method
Given the temporal nature of some of the tests we developed, we estimate model performance using within rolling-origin-recalibration evaluation. This method evaluates a model by sequentially moving values from a test set to a training set and retraining the model on all of the training set (Bergmeir and Benítez, 2012); for example, we train on the first 70% of the data and test on the remaining 30%, then train on the first 80% of the data and test on the remaining 20%. No data from the training set is used as part of the prediction for the test set. In each iteration of evaluation, predictions on the test set are produced and corresponding measures of performance obtained: the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC), also known as the C-statistic, along with AUC confidence intervals (LeDell et al., 2015). Figure 3 depicts one iteration of within rolling-origin-recalibration evaluation.
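The splitting scheme can be sketched in a few lines. This is an illustrative Python helper, not the study code: the name `rolling_origin_splits` is hypothetical, and the observations are assumed to be sorted by enrollment date before indexing.

```python
def rolling_origin_splits(n_obs, train_fractions=(0.7, 0.8)):
    """Yield (train_indices, test_indices) for time-ordered observations.

    Each split trains on the earliest fraction of the data and tests on
    everything after it, so no test observation precedes a training one.
    """
    for frac in train_fractions:
        cut = int(n_obs * frac)
        yield list(range(cut)), list(range(cut, n_obs))
```

In each split the component models and their kernel densities would be refit on the training indices only, and the AUC computed on the corresponding test indices.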
We additionally include a joint density for the climate and seasonal data in which we estimate a two-dimensional kernel density (not shown in Figure 3). This model is called 'Joint' in the results to follow. To assess how this model might generalize to a site that was not used for model training, we used a leave-one-site-out validation. By excluding a site and training the model’s tests at a higher level, such as on the entire continent, we get an idea of performance at a new site within one of the continents for which we have data. Lastly, we define a threshold for the predicted odds ratio based on the desired specificity of the model. We use this threshold to evaluate the effect of the model on prescription or treatment of patients with antibiotics in the GEMS data.
Modeling the impact of an additional diagnostic test
We include a theoretical diagnostic which indicates viral versus other etiology with a given sensitivity and specificity, specifically to show the effect of an additional diagnostic-type test, such as a host biomarker-based point-of-care stool test, on the performance of our integrated posttest odds model. We include three scenarios: (1) 70% sensitivity and 95% specificity, (2) 90% sensitivity and 95% specificity, and (3) 70% sensitivity and 70% specificity. In order to estimate the performance of an additional diagnostic test, for each patient in each of 500 bootstrapped samples of our test data, we randomly simulated a test result based on the sensitivity or specificity of the diagnostic test. From the simulated test result, we derive the likelihood ratio of the component directly from the specified sensitivity and specificity of the test. A positive test results in a component likelihood ratio of $\frac{\text{sensitivity}}{1-\text{specificity}}$ and a negative test results in a component likelihood ratio of $\frac{1-\text{sensitivity}}{\text{specificity}}$. We then take the average of the performance measure across the bootstrapped samples.
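A minimal sketch of the simulated diagnostic, in Python for illustration, is given below. The function names are hypothetical, and the random generator is passed in explicitly so that a bootstrap loop can control the seed.

```python
import random


def simulate_test_result(is_viral, sensitivity, specificity, rng):
    """Draw a positive (True) or negative (False) result for one patient."""
    if is_viral:
        return rng.random() < sensitivity      # true positive rate
    return rng.random() < 1.0 - specificity    # false positive rate


def diagnostic_likelihood_ratio(test_positive, sensitivity, specificity):
    """Component likelihood ratio implied by the simulated test result."""
    if test_positive:
        return sensitivity / (1.0 - specificity)
    return (1.0 - sensitivity) / specificity
```

For scenario (1), with 70% sensitivity and 95% specificity, a positive result multiplies the odds of a viral etiology by approximately 14, while a negative result multiplies them by approximately 0.32.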
Simulation of conditionally dependent tests
We demonstrate the utility of the two-dimensional kernel density estimate through simulation. In each iteration of the simulation (100 iterations), we generate 3366 responses from a random Bernoulli variable Z with a $\frac{1}{3}$ probability of success (the approximate proportion of GEMS cases with a viral etiology). Then, conditioned on Z we generate predictive variables X and Y such that:
where $\sigma $ is a random draw from the standard normal distribution and values of $\gamma $ ranging from −10 to 10 determine the level of conditional dependence between the two predictors conditional on the value of Z. $\gamma =0$ indicates conditional independence. Using an 80% training set, we derive the kernel density estimate for the likelihood ratio (no pretest odds included) using X and Y as two separate tests and as a single two-dimensional test, and calculate the AUC from the 20% test set.
Determination of appropriate antibiotic prescription
We demonstrate the clinical usefulness of our models by applying them directly to the prescription of antibiotics. For each version of the model, we determined the prediction thresholds that would attain model specificities of 0.90 and 0.95. Since the prediction of a viral-only etiology of diarrhea indicates that antibiotics should not be prescribed, we chose these high specificities due to the potential harm or even death that could occur if a patient who needed antibiotics did not receive them. Using the thresholds, we determine the patients for whom our models would correctly predict a viral-only etiology of their diarrhea (true positives) as well as the patients for whom our models would incorrectly predict a viral-only etiology (false positives).
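The thresholding step can be sketched as below. This is an illustrative Python sketch with hypothetical function names. It assumes the threshold is read off the empirical distribution of posttest odds among non-viral cases, which is one plausible reading of the procedure rather than a confirmed detail of the analysis.

```python
import math


def threshold_for_specificity(scores_nonviral, target_specificity):
    """Smallest observed cut-off such that at least the target share of
    non-viral cases score at or below it (predict viral when score > cut-off)."""
    ordered = sorted(scores_nonviral)
    # A small tolerance protects the ceiling from floating-point noise.
    k = math.ceil(target_specificity * len(ordered) - 1e-9)
    k = min(max(k, 1), len(ordered))
    return ordered[k - 1]


def confusion_counts(scores, is_viral, threshold):
    """Count true positives and false positives for a 'viral' prediction."""
    tp = sum(1 for s, v in zip(scores, is_viral) if v and s > threshold)
    fp = sum(1 for s, v in zip(scores, is_viral) if not v and s > threshold)
    return tp, fp
```

The true positives are patients for whom antibiotics could be withheld, and the false positives are patients who would be wrongly classified as viral, which is why the target specificity is set high.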
Results
Integrative posttest odds models outperformed traditional models for prediction of diarrhea etiology
Of the 3366 patients in GEMS with an attributable identified pathogen, 1049 cases were attributable to viral-only etiology. We first examined whether our integrative posttest odds model can better discriminate between patients with diarrhea of viral-only etiology and patients with other etiologies than a traditional prediction model which includes only the presenting patient’s information. We found that the best integrative model, with an AUC of 0.839 (0.808–0.870), performed statistically better than the traditional model, with an AUC of 0.809 (0.776–0.842) (p-value of 0.01; DeLong, two-sided). Overall, using the AUC as a discrimination metric, the integrative models (AUC: 0.837 (0.806–0.869)) outperformed the traditional model (AUC: 0.809 (0.776–0.842)). The best performing models were ones in which either the seasonal sine and cosine curves or the prior patient pretest component alone was added to the presenting patient information, with AUCs of 0.830 and 0.839 (with 80% training data), respectively (Figure 4). Including additional components, especially including both climate and seasonality (although not as a joint density), appears to reduce the performance. As expected, a reduced testing set increases the AUC but also increases the variance of the estimate (Figure 4—figure supplement 1). Using conventional diagnostic methods data to calculate pretest odds instead of qPCR data reduces the AUC slightly from 0.839 to 0.829 (0.798–0.860).
To assess our model’s performance more granularly, we then examined performance of the top two predictive models by individual sites. We found that the AUC, with 80% training and 20% testing, varied greatly by site, ranging from 0.63 in Kenya to 0.95 in Bangladesh (Table 2). Of note, the African sites have fewer patients in their testing and training sets than the Asian sites due to a combination of fewer patients enrolled at those sites and proportionately fewer patients with known etiologies. In leave-one-site-out validation testing, we found that the climate test tends to outperform the seasonality test, and that there were notable differences in C-statistics between sites, with the order of performance similar to within rolling-origin-recalibration evaluation (Figure 4—figure supplement 2).
Addition of a diagnostic test to integrative models improves discrimination
Emerging efforts to develop diagnostic devices, including laboratory assays as well as POC tests, have focused on the performance of the test used in isolation. Here, we consider the use of a diagnostic device in combination with clinical predictive models. We used the integrative model to examine the impact that an additional diagnostic would have on discrimination of two of the best performing models. We show that an additional diagnostic, with varying sensitivity and specificity, would improve the cross-validated AUC as expected (Table 3). An additional test with a 70% sensitivity and 70% specificity increases the AUC by 3–5%, while a more specific test could increase the AUC by 10%.
We next examined ROC curves, which visually demonstrate the effect of additional diagnostics with varying levels of sensitivity and specificity (Figure 5). We show that a similar level of sensitivity and specificity is achievable by the model with the pretest information versus the model with seasonal information. Additionally, the additional diagnostics result in improved overall sensitivity and specificity corresponding to the sensitivity and specificity of the diagnostic. The overall sensitivity and specificity of each model are greater than those of the diagnostic alone.
Breaking the conditional independence assumption can be addressed using 2D Kernel density estimates
Our integrative posttest odds method assumes the conditional independence of its component tests, and thus we performed a simulation of increasingly conditionally dependent components to assess the performance of the method when the assumption is broken. We showed that the AUC of the posttest odds model deteriorates quickly as the conditional independence assumption is violated (Table 4). With no conditional dependence between predictions from models X and Y, the result using the one-dimensional kernel density is comparable to the result with the two-dimensional kernel density model. However, as the conditional correlation between the tests increases in magnitude to −0.90, the one-dimensional AUC decreases by about 11%, while the posttest odds with the two-dimensional test performs consistently across this range of conditional correlation.
Clinician use of an integrative predictive model for diarrhea etiology could result in large reductions in inappropriate antibiotic prescriptions
Given that one potential application of an integrative predictive model for diarrhea etiology would be as support for clinical decision making for antibiotic use (i.e., antibiotic stewardship), we then examined the impact that the top predictive model would have on prescription of antibiotics by clinicians in GEMS. Of the 3366 patients included in our study, 2653 (79%) were treated with or prescribed antibiotics, including 806 (30%) with a viral-only etiology as determined by qPCR. Here, we examined how use of an integrative predictive model could have altered antibiotic use in our sample. Of the 681 patients in the 20% test set, 540 (79%) were prescribed antibiotics, including 166 (30%) with a viral-only etiology. Of those prescribed or given antibiotics, the model with pretest odds, with threshold chosen for an overall specificity of 0.90, identified 88 (53%) viral cases as viral, and 29 nonviral cases as viral. With an additional diagnostic with a sensitivity and specificity of 0.70, the same model would on average identify 102 (61%) viral cases as viral, with 31 nonviral cases identified as viral. Assuming that clinicians would not prescribe antibiotics for those cases identified as viral by the predictive model (without or with the additional diagnostic), we would avoid 88 (53%) and 102 (61%) of inappropriate antibiotic prescriptions in the two scenarios described. The majority of the false positives (29 and 30 in the two scenarios) were episodes majority attributed to Shigella, ST-ETEC, and combinations of rotavirus with a nonviral pathogen (Table 3—source data 1). All of these false positives, with the exception of one case, had non-bloody diarrhea, and thus would have been deemed as not requiring antibiotics by WHO IMCI guidelines.
Discussion
The management of illness in much of the world relies on clinical decisions made in the absence of laboratory diagnostics. Such empirical decision-making, including decisions to use antibiotics, is informed by variable degrees of clinical and demographic data gathered by the clinician. Traditional clinical prediction rules focus on the clinical data from the presenting patient alone. In this analysis, we present a method that allows flexible integration of multiple data sources, including climate data and clinical or historical information from prior patients, resulting in improved predictive performance over traditional predictive models utilizing a single source of data. Using this formulation, if certain sources of data such as climate or previous patient information are not available (e.g. due to a lack of internet connection or data infrastructure), the prediction can still be made using current patient information or seasonality, as appropriate. A mobile phone application is an ideal platform for a decision support tool implemented in low-resource settings. Through internet access by wifi or cellular data, a smartphone platform could automatically download recent patient or climate data, while its portability would make it easier for clinicians to enter current patient clinical information. We show that application of such a predictive model, especially with an additional diagnostic test, may translate to reductions in inappropriate antibiotic prescriptions for pediatric viral diarrhea.
The global burden of acute infectious diarrhea is highest in low- and middle-income countries (LMICs) in southeast Asia and Africa (Walker et al., 2013), where there is limited access to diagnostic testing. The care of children in these regions could greatly benefit from an accurate and flexible decision-making tool. Decisions for treatment are often empiric and antibiotics are overprescribed (Rogawski et al., 2017), although the majority of cases of diarrhea do not benefit from antibiotic use and many instances of acute watery diarrhea are self-limiting. For example, 2653 (79%) of the 3366 patients in our study were treated with or prescribed antibiotics. Of these, 806 (30%) had a viral-only etiology. Unnecessary antibiotic use exposes children to significant adverse events, including serious allergic reactions (Logan et al., 2016, Marra et al., 2009) and Clostridium difficile infection (Jernberg et al., 2010), and contributes to increased antimicrobial resistance. We show that a predictive model can be used to discriminate between those with and without a viral-only etiology and that the inappropriate use of antibiotics can be avoided in 53% of cases using our model with no additional diagnostics.
We found using within rolling-origin-recalibration evaluation that models which include either the pretest odds calculated from historical rates or the seasonal test were the best at discriminating between viral etiologies and other etiologies, a finding that held true across training and testing set sizes. However, in the leave-one-site-out validation, models which included the alternate pretest odds and climate tended to perform the best. This difference is likely due to the generalizability of the individual tests, i.e., the leave-one-site-out tests are trained at the continental level and the effect of climate on etiology is intuitively more generalizable than seasonal curves, which are very specific to each location. We found that our integrative model with only the historical (pretest) information included (without additional diagnostics) would have identified a viral-only etiology in 88 (53%) patients who received antibiotics. We then show that even the use of an additional diagnostic test with modest performance (70% sensitivity and specificity) would further decrease inappropriate antibiotic use by another 14 patients (for a total of 102, or 61%). In the context of calls by the WHO for the development of affordable rapid diagnostic tools (RDTs) for antibiotic stewardship (Declaration, 2017), our findings suggest that development and evaluation of novel RDTs should not be performed in isolation. Potential for integration of rapid diagnostic tests into clinical prediction algorithms should be considered, although this needs to be balanced with the additional time and resources needed. The incremental improvement in discriminative performance achieved by the addition of an RDT to a clinical prediction algorithm may not be cost-effective in lower resourced settings.
Finally, providing this model in the form of a decision support tool to the clinician could translate to reductions in inappropriate use of antibiotics, although further research needs to be done to explore the degrees of certainty that clinicians require to alter treatment decisions.
The novel use of kernel density estimates to derive the conditional tests when calculating the posttest odds enabled flexibility in model input. While kernel density estimates have been used for conditional feature distributions in Naïve Bayes classifiers (John and Langley, 1995, Murakami and Mizuguchi, 2010), here we show that they can be used to derive conditional likelihoods for diagnostic tests constituting one or more features, stressing the effect of the overall test on the posttest odds and not individual features. As such, complicated machine learning models can be combined with simple diagnostics as part of the posttest odds. For example, we could have fit neural networks in lieu of logistic regression models, and in addition to these more complicated models, it is possible to incorporate the result of an RDT that makes results available to the clinician at the point of care. Additionally, our method of using two-dimensional kernel density estimates can be used to overcome the conditional independence assumption for tests based on potentially interrelated diagnostic information. Densities with more than two dimensions can be considered, though computational limitations are likely in both speed and, we expect, accuracy, as the dimensions increase.
Our study has a number of limitations. First, a robust training set of both cases and non-cases is required to adequately build the conditional kernel densities. Second, the posttest odds calculation, at the time of prediction, lacks interpretation on a feature level like a logistic regression or decision tree. Although we do observe the effect of a test on an observation, we cannot see which features caused that effect without diving deeper into the training of the diagnostic tests. Third, the prediction algorithm generated by the posttest odds model using GEMS data was only validated internally, and further studies are needed for external validation and field implementation. Fourth, our estimation of antibiotic use reduction used data from a clinical research study, which may have biases inherent to such studies. Last, our study uses the AFe cutoff of greater than or equal to 0.5 to assign etiology from the qPCR data. This cutoff was selected based on expert elicitation, but the effect of using this cutoff has not been explored. Bacterial cases with AFe < 0.5 were excluded in our analysis, but may still benefit from antibiotic treatment.
In conclusion, we have developed a clinical prediction model that integrates multiple sources external to the presenting patient through use of a posttest odds framework, and showed that it improved diagnostic performance. When applied to the etiological diagnosis of pediatric diarrhea, we demonstrate its potential for reducing inappropriate antibiotic use. The flexible inclusion or exclusion of output from its components makes it ideal for decision support in lower resourced settings, when only certain data may be available due to limitations in information, computation, or connectivity. Additionally, the ability to incorporate new training data in real time to update decisions allows the model to improve as more data is collected. Such a predictive model has the potential to improve the management of pediatric diarrhea, including the rational use of antibiotics in lower resourced settings.
Appendix 1
Weighted weather station data
Daily local weather information was constructed from weather stations within 200 km of the site of interest. We chose 200 km because one of our sites, Mozambique, does not have any stations nearer than 180 km. We then collected temperature and rainfall data from the five closest weather stations and took a weighted average, with each station weighted inversely by its distance so that closer stations have more effect on the average. For instance, for temperature on day d across the five closest weather stations: ${T}_{d\cdot}=\frac{{\sum}_{i=1}^{5}{T}_{di}\cdot {d}_{i}^{-1}}{{\sum}_{i=1}^{5}{d}_{i}^{-1}}$ where ${T}_{di}$ is the average temperature for weather station i on day d and ${d}_{i}$ is the distance from weather station i to the site.
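The inverse-distance weighting described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function name, argument names, and example values are hypothetical, and distances are assumed to be precomputed in kilometres and strictly positive.

```python
from typing import Sequence


def inverse_distance_weighted_mean(values: Sequence[float],
                                   distances_km: Sequence[float],
                                   max_distance_km: float = 200.0,
                                   n_nearest: int = 5) -> float:
    """Weighted mean of station values with weights 1 / distance.

    Only stations within max_distance_km are eligible, and at most the
    n_nearest closest stations are used.
    """
    if len(values) != len(distances_km):
        raise ValueError("values and distances_km must have equal length")
    if any(d <= 0 for d in distances_km):
        raise ValueError("distances must be strictly positive")

    # Pair each distance with its value, keep eligible stations, closest first.
    stations = sorted(
        (d, v) for v, d in zip(values, distances_km) if d <= max_distance_km
    )[:n_nearest]
    if not stations:
        raise ValueError("no weather station within the distance limit")

    weights = [1.0 / d for d, _ in stations]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, stations))
    return weighted_sum / sum(weights)


# Hypothetical daily mean temperatures (degrees C) and distances (km);
# the third station lies beyond 200 km and is ignored.
temperature = inverse_distance_weighted_mean([30.0, 20.0, 100.0],
                                             [10.0, 20.0, 250.0])
```

The same function can be applied to rainfall by passing the stations' daily rainfall values instead of temperatures.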
Appendix 2
Pretest odds from prior patient predictions for prediction in new sites
We calculated pretest odds by combining past predictions from predictive model A, the presenting patient model. By taking a weighted average of the recently predicted odds of viral etiology, we attempt to capture recent local trends in diarrhea pathogens, such as localized outbreaks. This is similar to heuristic decision making historically used by clinicians. We aggregated the odds calculated from the presenting patient model on their probability scale for each site over the past d days such that pretest probability ${\pi}_{d}$ for day d is
where ${P}_{di}$ are the $i=1,\mathrm{\cdots},k$ current patient predictions converted from the odds scale to the probability scale on day d and n is the number of prior days included in the calculation. Provided the greatest weights are put on the most recent predictions, we would expect an influx of certain symptoms related to a viral etiology to be represented by ${\pi}_{d}$.
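As a rough illustration of this kind of recency-weighted aggregation, the Python sketch below averages each prior day's predicted probabilities and then combines the daily means with weights that decay with the lag. The 1/j weights, the per-day averaging, and all names are assumptions made for illustration only; they were chosen simply because they place the greatest weight on the most recent predictions and are not taken from the authors' formula.

```python
from typing import Sequence


def pretest_probability(daily_predictions: Sequence[Sequence[float]]) -> float:
    """Recency-weighted average of prior predicted probabilities.

    daily_predictions[j - 1] holds the predicted probabilities of viral
    etiology for patients seen j days ago (j = 1 is the most recent day).
    Assumption: day j receives weight 1 / j.
    """
    numerator = 0.0
    denominator = 0.0
    for lag, predictions in enumerate(daily_predictions, start=1):
        if not predictions:
            continue  # no patients seen on that day
        weight = 1.0 / lag
        numerator += weight * sum(predictions) / len(predictions)
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no prior predictions available")
    return numerator / denominator


def probability_to_odds(probability: float) -> float:
    """Convert a probability (strictly below 1) to odds."""
    return probability / (1.0 - probability)


# Hypothetical example: two patients yesterday, one patient two days ago.
pi_d = pretest_probability([[0.8, 0.6], [0.4]])
pretest_odds = probability_to_odds(pi_d)
```

In this example the daily means are 0.7 and 0.4, so the weighted average is (0.7 + 0.5 × 0.4) / 1.5 = 0.6, corresponding to pretest odds of 1.5.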
Data availability
GEMS data are available to the public from https://clinepidb.org/ce/app/. Data and code needed to reproduce all parts of this analysis are available from the corresponding author's GitHub page: https://github.com/LeungLab/GEMS_PostTestOdds [copy archived at https://archive.softwareheritage.org/swh:1:rev:67f4f5a0cdc3d569e756142f0142aaa23a9b1e03/].

ClinEpiDB ID DS_841a9f5259. Study GEMS1 Case Control.
References

On the use of cross-validation for time series predictor evaluation. Information Sciences 191:192–213. https://doi.org/10.1016/j.ins.2011.12.028

Clinical predictors for etiology of acute diarrhea in children in resource-limited settings. PLOS Neglected Tropical Diseases 14:e0008677. https://doi.org/10.1371/journal.pntd.0008677

The seasonality of diarrheal pathogens: a retrospective study of seven sites over three years. PLOS Neglected Tropical Diseases 13:e0007211. https://doi.org/10.1371/journal.pntd.0007211

SMART-COP: a tool for predicting the need for intensive respiratory or vasopressor support in community-acquired pneumonia. Clinical Infectious Diseases 47:375–384. https://doi.org/10.1086/589754

Global seasonality of rotavirus infections. Bulletin of the World Health Organization 68:171.

Seasonality of cholera from 1974 to 2005: a review of global patterns. International Journal of Health Geographics 7:31. https://doi.org/10.1186/1476-072X-7-31

Integrating spatial epidemiology into a decision model for evaluation of facial palsy in children. Archives of Pediatrics & Adolescent Medicine 165:61–67. https://doi.org/10.1001/archpediatrics.2010.250

Simple clinical score and laboratory-based method to predict bacterial etiology of acute diarrhea in childhood. The Pediatric Infectious Disease Journal 6:1088–1091. https://doi.org/10.1097/00006454-198706120-00004

Estimating continuous distributions in Bayesian classifiers. Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann Publishers Inc. pp. 338–345.

Computationally efficient confidence intervals for cross-validated area under the ROC curve estimates. Electronic Journal of Statistics 9:1583–1607. https://doi.org/10.1214/15-EJS1035

The microbiome and mental health: looking back, moving forward with lessons from allergic diseases. Clinical Psychopharmacology and Neuroscience 14:131–147. https://doi.org/10.9758/cpn.2016.14.2.131

Antibiogram pattern and seasonality of Salmonella serotypes in a North Indian tertiary care hospital. Epidemiology and Infection 134:961–966. https://doi.org/10.1017/S0950268805005844

On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33:1065–1076. https://doi.org/10.1214/aoms/1177704472

Use of antibiotics in children younger than two years in eight countries: a prospective cohort study. Bulletin of the World Health Organization 95:49–61. https://doi.org/10.2471/BLT.16.176123

Learning Bayesian networks with the bnlearn R package. Journal of Statistical Software 35:1–22. https://doi.org/10.18637/jss.v035.i03

Decision support systems for antibiotic prescribing. Current Opinion in Infectious Diseases 21:573–579. https://doi.org/10.1097/QCO.0b013e3283118932

The integrated surface database: recent developments and partnerships. Bulletin of the American Meteorological Society 92:704–708. https://doi.org/10.1175/2011BAMS3015.1

Studying seasonality by using sine and cosine functions in regression analysis. Journal of Epidemiology & Community Health 53:235–238. https://doi.org/10.1136/jech.53.4.235

Using electronic health record data to develop inpatient mortality predictive model: acute laboratory risk of mortality score (ALaRMS). Journal of the American Medical Informatics Association 21:455–463. https://doi.org/10.1136/amiajnl-2013-001790

Global burden of childhood pneumonia and diarrhoea. The Lancet 381:1405–1416. https://doi.org/10.1016/S0140-6736(13)60222-6
Decision letter

Joshua T Schiffer, Reviewing Editor; Fred Hutchinson Cancer Research Center, United States

Eduardo Franco, Senior Editor; McGill University, Canada

Joe Brown, Reviewer; Georgia Tech University, United States
In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.
Acceptance summary:
We were excited by your idea to integrate population-level data such as seasonality and recent weather into clinical decision-making tools. This approach has potential widespread utility for multiple pathogens with seasonal variability, and even for the SARS-CoV-2 pandemic, which is notable for wide spatiotemporal variation in the incidence of new infections. In the case of pediatric diarrhea, the potential to limit antibiotic overprescribing is of enormous public health importance.
Decision letter after peer review:
Thank you for submitting your article "A modular approach to integrating multiple data sources into real-time clinical prediction for pediatric diarrhea" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Joe Brown (Reviewer #1).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
Summary:
This paper describes a computationally advanced method for clinical prediction of pediatric diarrhea cases based on the inclusion of prior data on the etiology of disease and additional data beyond that available at patient presentation. Overprescription of antibiotics is a global issue and a major contributor to antimicrobial resistance. This paper's methods would allow more accurate diagnosis of viral causes of diarrhea based not just on individual data but also on preexisting distributions of cases in other patients and on regional factors such as seasonality.
The approach is sound and the conclusions justified. The authors present a flexible approach to predictive modeling of viral diarrhea that is globally adaptable to local conditions and available resources. The work is extensive and thorough, with major thought given to sensitivity of model validation and calibration. Though the difference in AUC with these added non-patient-specific variables is only slight, inclusion into bedside diagnostic tools could decrease the unnecessary use of antibiotics significantly.
The statistical methodology is rigorous and extensive, with substantial detail given to model assumptions and validation.
Essential revisions:
1) The utility of the method depends on the quality (and diagnostic accuracy) of prior data that may not generally be available in high-burden settings. Models rely on highly specific quantitative pathogen data that may not be widely available outside of the dataset that the authors use. The GEMS quantitative reanalysis from which data were derived (Liu et al., 2016) argues convincingly that quantitative data are required to establish etiology of cases, given the high prevalence of asymptomatic carriage of enteric pathogens in children in high-burden cohorts. This does not necessarily preclude the utility of this analysis, however, given that more such data may become available in high-burden settings. But it should be acknowledged that the current approach as described is likely to be restricted to settings where high-quality, accurate prior etiological data are available. Currently that excludes most high-burden sites.
A sensitivity analysis that accounts for suboptimal prior etiology data on study outcomes would be a short but useful addition to the current analysis. That is: how does the model perform when prior / training data are less than 100% accurate?
2) A reader may wonder whether a regional or global dataset on prior etiology could substitute for local data in training models for clinical prediction, given the current and near-term scarcity of available local data at specific sites of interest. A straightforward way to test this would be to train models in country A, same region (e.g., Bangladesh) and test the predictive model against the cohort from country B, same region (e.g., Pakistan). Such an analysis would help convey to readers how local the training data need be to improve diagnostic accuracy / reduce prescription of unnecessary antibiotics. Similarly, a model trained using data taken from all GEMS sites could be applied to specific countries, in order to assess model performance when no reliable, locally available data on etiology exist. That may well be the norm in many settings of interest.
3) Of 9439 children, only 3366 had known etiologies, which indicates that the likely true cause of diarrhea remains indeterminate for almost 2/3 of cases. Unknown etiologies were not included in models (AFe>=0.5 only), so the authors don't consider diarrheal episodes with no clear majority, for example where AFe viral=0.49 and AFe bacterial=0.48. Such cases could be important to evaluate, particularly if they are highly prevalent, and they may still benefit from antibiotic treatment. Please consider describing this as a limitation or performing sensitivity analyses to account for the uncertainty of the AFe cutoff.
https://doi.org/10.7554/eLife.63009.sa1
Author response
Essential revisions:
1) The utility of the method depends on the quality (and diagnostic accuracy) of prior data that may not generally be available in high-burden settings. Models rely on highly specific quantitative pathogen data that may not be widely available outside of the dataset that the authors use. The GEMS quantitative reanalysis from which data were derived (Liu et al., 2016) argues convincingly that quantitative data are required to establish etiology of cases, given the high prevalence of asymptomatic carriage of enteric pathogens in children in high-burden cohorts. This does not necessarily preclude the utility of this analysis, however, given that more such data may become available in high-burden settings. But it should be acknowledged that the current approach as described is likely to be restricted to settings where high-quality, accurate prior etiological data are available. Currently that excludes most high-burden sites.
A sensitivity analysis that accounts for suboptimal prior etiology data on study outcomes would be a short but useful addition to the current analysis. That is: how does the model perform when prior / training data are less than 100% accurate?
Thank you for suggesting this additional analysis. We have included a sensitivity analysis in which we use only “conventional diagnostic methods” data for the pretest odds calculation, instead of the qPCR data. The conventional methods, as described in the initial GEMS report (Kotloff et al., 2013), while more likely to be available in high-burden settings, identified fewer pathogens as attributable causes of diarrhea, and are regarded as less accurate than qPCR with regard to pathogen attribution. We additionally provide alternatives to using prior etiology data, such as the climate component, which can be trained at the regional level, with results shown in the supplement.
We added “Additionally, we include a sensitivity analysis by calculating the pretest odds using conventional diagnostic methods data, as qPCR data are unlikely to be available in high-burden settings” to the Materials and methods section and “Using conventional diagnostic methods data to calculate pretest odds instead of qPCR data reduces AUC slightly from 0.839 to 0.829 (0.798–0.860).” to the Results section.
2) A reader may wonder whether a regional or global dataset on prior etiology could substitute for local data in training models for clinical prediction, given the current and near-term scarcity of available local data at specific sites of interest. A straightforward way to test this would be to train models in country A, same region (e.g., Bangladesh) and test the predictive model against the cohort from country B, same region (e.g., Pakistan). Such an analysis would help convey to readers how local the training data need be to improve diagnostic accuracy / reduce prescription of unnecessary antibiotics. Similarly, a model trained using data taken from all GEMS sites could be applied to specific countries, in order to assess model performance when no reliable, locally available data on etiology exist. That may well be the norm in many settings of interest.
Thank you for highlighting this. Our submission included a “leave-one-out cross-validation” in Figure 4—figure supplement 2, in which we trained at the continent (“region”) level and tested on the left-out site. We describe this in the Materials and methods subsection “Validating the method”. These findings are included in the text in the Results section: “In leave-one-site-out cross-validation testing, we found that the climate test tends to outperform the seasonality test”. Additionally, Table 2 contains country-specific results for the present patient model alone, which is trained on 80% of each country (though the country of interest is included in that training).
3) Of 9439 children, only 3366 had known etiologies, which indicates that the likely true cause of diarrhea remains indeterminate for almost 2/3 of cases. Unknown etiologies were not included in models (AFe>=0.5 only), so the authors don't consider diarrheal episodes with no clear majority, for example where AFe viral=0.49 and AFe bacterial=0.48. Such cases could be important to evaluate, particularly if they are highly prevalent, and they may still benefit from antibiotic treatment. Please consider describing this as a limitation or performing sensitivity analyses to account for the uncertainty of the AFe cutoff.
We agree that the lack of further exploration of the AFe cutoff is a limitation of our study. We have now added the following to the limitations paragraph of the Discussion:
“Last, our study uses the AFe cutoff of greater than or equal to 0.5 to assign etiology from the qPCR data. This cutoff was selected based on expert elicitation, but the effect of using this cutoff has not been explored. Bacterial cases with AFe<0.5 were excluded in our analysis, but may still benefit from antibiotic treatment.”
https://doi.org/10.7554/eLife.63009.sa2
Article and author information
Author details
Funding
National Center for Advancing Translational Sciences (8UL1TR000105)
 Ben J Brintz
 Benjamin Haaland
 Tom Greene
National Institute of Allergy and Infectious Diseases (R01AI135114)
 Daniel T Leung
Bill and Melinda Gates Foundation (OPP1198876)
 Daniel T Leung
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
This investigation was supported by the University of Utah Study Design and Biostatistics Center, with funding in part from the National Center for Research Resources and the National Center for Advancing Translational Sciences, National Institutes of Health, through Grant 8UL1TR000105 (to BJB, BH, and TG). Research reported in this publication was supported by the NIAID of the NIH under award number R01AI135114 (to DTL), and the Bill and Melinda Gates Foundation award OPP1198876 (to DTL). The authors would like to thank Bill and Melinda Gates for their active support (JLP and DC) of the Institute for Disease Modeling and their sponsorship through the Global Good Fund.
Senior Editor
 Eduardo Franco, McGill University, Canada
Reviewing Editor
 Joshua T Schiffer, Fred Hutchinson Cancer Research Center, United States
Reviewer
 Joe Brown, Georgia Tech University, United States
Publication history
 Received: September 11, 2020
 Accepted: January 17, 2021
 Version of Record published: February 2, 2021 (version 1)
Copyright
© 2021, Brintz et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.