ABSTRACT
Introduction:
The most commonly used scoring systems for estimating the predicted death rate (PDR) in the pediatric intensive care unit are the “pediatric risk of mortality” (PRISM) and the “pediatric index of mortality” (PIM) scores. The aim of this study is to evaluate the calibration and discrimination of PRISM III and PIM II scores in predicting mortality in a tertiary university hospital pediatric intensive care unit in Turkey.
Methods:
Demographic data of patients hospitalized in the pediatric intensive care unit between January 1, 2015 and December 31, 2018 were retrieved from the electronic records. PRISM III and PIM II scores, PDR, and standardized mortality ratio (SMR) were calculated. To assess the discrimination of the scores, the area under the ROC curve (AUC) was calculated, and an AUC above 0.80 was accepted as indicating good discrimination. The Hosmer-Lemeshow goodness-of-fit test was used to evaluate calibration, and p>0.05 was considered to indicate good calibration.
Results:
After exclusions, 825 patients were included in the study. The mean value of the PRISM III score was 9.5±6.8 and the mean value of the PIM II score was 1.9±8.2. The calculated SMR was 1.03 according to the PRISM III score and 0.76 according to the PIM II score. In the ROC analysis performed to evaluate discrimination, the AUC values for PRISM III PDR and PIM II PDR were 0.908±0.017 (p<0.001) and 0.855±0.024 (p<0.001), respectively. When PRISM III and PIM II PDR values were analyzed in groups, the difference between predicted and observed mortality was not statistically significant (p>0.05).
Conclusion:
This study shows that the discrimination and calibration of the PRISM III and PIM II scores are good in predicting mortality in a tertiary pediatric intensive care unit where medical and surgical patients are accepted.
Introduction
Since the mid-1990s, the number of pediatric intensive care units in Turkey that are structured independently of adult and neonatal intensive care units has increased rapidly. In the following decade, pediatric intensive care officially became a subspecialty (minor) program in medical education, and its training program was clearly defined. In this process, minimum standards for new intensive care units opened within the national health system have been defined and are continuously inspected.1 The main purpose of an intensive care unit is to reduce mortality.2 For this reason, one of the defined standards is to estimate the expected mortality rates in intensive care units with standard scoring systems and to compare them with the actual mortality rates. Improvements in infrastructure, the effect of technological developments on patient care, and the growing number of qualified health personnel have created a need to re-evaluate the calibration and discrimination of the scoring systems used to assess mortality.
In addition, scoring systems are important for reducing bias in clinical trials, because they allow patients with similar disease severity to be selected.2,3 If the number and distribution of observed deaths are similar to those predicted by the scores, the performance of the institution can be considered equivalent to that of institutions elsewhere in the world in which these scores have been validated.4 The most commonly used scoring systems for the evaluation of mortality in the pediatric intensive care unit are the “pediatric risk of mortality” (PRISM) and the “pediatric index of mortality” (PIM) scores.2 The PRISM III score uses the most abnormal values of the patient’s variables during the first 12 or 24 hours in the intensive care unit (PRISM III-24 score), and it predicts the probability of death during that hospitalization.5 The PIM II score estimates the risk of death from data available at the time of admission to the intensive care unit and has therefore been reported to be suitable for continuous monitoring of the quality of pediatric intensive care.6 The aim of this study is to evaluate the calibration and discrimination of PRISM III and PIM II scores in predicting mortality in a tertiary university hospital pediatric intensive care unit in Turkey.
Materials and Methods
Patients and Data
The data of patients hospitalized in the Akdeniz University Pediatric Intensive Care Unit between January 1, 2015 and December 31, 2018 were retrieved from electronic records. Age, gender, underlying disease, reason for admission to the intensive care unit, duration of invasive and non-invasive ventilation, length of stay in the intensive care unit, tracheostomy requirement and outcome were recorded. The predicted death rate (PDR) was calculated from the PRISM III and PIM II scores using the logistic formulas recommended for these scores.7,8
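The conversion from a score to a PDR can be illustrated with a minimal sketch. It assumes only that each score yields a logit (linear predictor) that is mapped to a probability by the standard logistic function; the published PRISM III and PIM II coefficients are not reproduced here, and the logit values below are hypothetical.

```python
import math


def predicted_death_rate(logit: float) -> float:
    """Map a score's logit (linear predictor) to a probability of death."""
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical logit values, for illustration only (not real patient data).
for logit in (-4.0, -2.0, 0.0):
    pdr_percent = 100 * predicted_death_rate(logit)
    print(f"logit={logit:+.1f} -> PDR={pdr_percent:.1f}%")
```

A more negative logit gives a lower PDR, and a logit of 0 corresponds to a 50% predicted risk of death.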
Standardized mortality ratio (SMR) was calculated for each scoring system by dividing the observed mortality rate by the mean of the PDR values obtained from that score. Ideally, the SMR is expected to be close to 1. A value above 1 was interpreted as the score predicting lower mortality than was observed, and a value below 1 as the score predicting higher mortality than was observed.
Features of the Unit Where the Study was Performed
Akdeniz University Pediatric Intensive Care Unit is an independent 8-room unit separated by an automatic door system. Two of these rooms are full isolation rooms. All beds are equipped with a centrally connected advanced monitoring system and advanced ventilators. During the study period, one faculty member, one subspecialty (minor) fellow, three residents (one of whom was a senior resident), and 14 nurses worked in the unit. All medical and surgical patients aged 1 month to 18 years, including trauma, congenital heart surgery, and organ transplantation patients, are accepted. Advanced treatments such as high-frequency oscillatory ventilation, continuous renal replacement therapy, and extracorporeal membrane oxygenation (ECMO) are performed. The use of ECMO is limited for economic reasons (fewer than 5 cases per year).
Exclusion Criteria
Patients were excluded from the study if they were hospitalized in the intensive care unit for less than 24 hours, could not be stabilized within the first 2 hours after admission following cardiopulmonary arrest, had inaccessible data, had undergone bone marrow transplantation, or had known chromosomal anomalies.3,9,10
Statistical Analysis
Statistical evaluation was performed using the Statistical Package for Social Science (SPSS) 23 software. Descriptive statistics are presented as frequency and percentage (%) for categorical variables and as mean, standard deviation (SD), median, minimum and maximum values for numerical variables. The chi-square test was used to compare categorical variables, and the Mann-Whitney U test was used to compare numerical variables. A p-value below 0.05 was considered significant.
The area under the ROC curve (AUC) was calculated to evaluate how well the PRISM III and PIM II scores discriminated between survivors and non-survivors. An AUC higher than 0.80 was accepted as the threshold for good discrimination.
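As an illustration of what the AUC measures, the following minimal sketch computes it as the proportion of (non-survivor, survivor) pairs in which the non-survivor has the higher PDR, counting ties as one half. The function name and the toy data are hypothetical; the study itself used SPSS.

```python
def roc_auc(pdr_values, died):
    """AUC as the probability that a random non-survivor has a higher PDR
    than a random survivor (ties count as 0.5)."""
    non_survivors = [p for p, d in zip(pdr_values, died) if d]
    survivors = [p for p, d in zip(pdr_values, died) if not d]
    if not non_survivors or not survivors:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = 0.0
    for p_dead in non_survivors:
        for p_alive in survivors:
            if p_dead > p_alive:
                wins += 1.0
            elif p_dead == p_alive:
                wins += 0.5
    return wins / (len(non_survivors) * len(survivors))


# Toy data, for illustration only: PDR per patient and outcome (1 = died).
pdr = [0.02, 0.05, 0.10, 0.30, 0.40, 0.80]
died = [0, 0, 1, 0, 1, 1]

auc = roc_auc(pdr, died)
print(f"AUC = {auc:.3f}, good discrimination: {auc > 0.80}")
```

In this toy example 8 of the 9 pairs are ranked correctly, so the AUC is 8/9, which is above the 0.80 threshold.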
To evaluate the calibration of the scoring systems, the patients were divided into 5 risk categories, and the observed number of deaths, expected number of deaths, observed number of survivors and expected number of survivors in each category were compared using the Hosmer-Lemeshow goodness-of-fit test. A p-value above 0.05 indicated no statistically significant difference between observed and expected outcomes, and the calibration of the score was then considered good.
Ethical approval for the study was obtained from the Akdeniz University Faculty of Medicine Clinical Research Ethics Committee (decision dated 09/04/2019, number 70904504).
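The calibration test can be sketched as follows. This is a simplified illustration under several assumptions that the text does not specify: patients are split into 5 equal-sized groups after sorting by PDR, the statistic is referred to a chi-square distribution with groups - 2 = 3 degrees of freedom (the conventional choice), and the data are synthetic. The closed-form survival function applies only to 3 degrees of freedom.

```python
import math


def hosmer_lemeshow_chi2(pdr_values, died, groups=5):
    """Hosmer-Lemeshow statistic over equal-sized groups sorted by PDR."""
    paired = sorted(zip(pdr_values, died))
    n = len(paired)
    chi2 = 0.0
    for g in range(groups):
        chunk = paired[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        expected_deaths = sum(p for p, _ in chunk)
        observed_deaths = sum(d for _, d in chunk)
        expected_survivors = size - expected_deaths
        observed_survivors = size - observed_deaths
        chi2 += (observed_deaths - expected_deaths) ** 2 / expected_deaths
        chi2 += (observed_survivors - expected_survivors) ** 2 / expected_survivors
    return chi2


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Synthetic cohort: 5 risk levels with 20 patients each (illustration only).
levels = [(0.02, 0), (0.05, 1), (0.10, 2), (0.20, 4), (0.50, 10)]
pdr, died = [], []
for risk, deaths in levels:
    pdr += [risk] * 20
    died += [1] * deaths + [0] * (20 - deaths)

chi2 = hosmer_lemeshow_chi2(pdr, died)
p_value = chi2_sf_df3(chi2)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, good calibration: {p_value > 0.05}")
```

Because observed and expected deaths nearly coincide in every group of this synthetic cohort, the statistic is small and the p-value is well above 0.05, which is the pattern the study interprets as good calibration.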
Results
Thirty-six patients with known chromosomal abnormalities, 55 patients who underwent bone marrow transplantation, and a total of 324 patients who were hospitalized in the intensive care unit for less than 24 hours or were unstable at the 2nd hour after cardiopulmonary resuscitation or had missing data were excluded from the study in accordance with the exclusion criteria (Figure 1). Three hundred seventy-eight (45.8%) of the patients included in the study were girls, and the mean age was 46.7 months (1-22) years. Among the reasons leading to intensive care hospitalization, respiratory failure (19.9%), trauma (18.4%), congenital heart surgery (16.1%), and postoperative follow-up (16%) were the most common ones (Table 1). Of the patients, 493 (59.75%) had a known chronic disease (Table 2). The duration of mechanical ventilation in the study group was 3.6 days (SD 6.0), and the mean intensive care unit stay was 7.1 days (SD 12.2). Tracheostomy was performed in 53 (6.42%) patients. The mortality observed in the study group was 8.60% (n=71). Mortality was 7.6% in males and 9.8% in females (p=0.265).
In the study group, the mean PRISM III score was 9.5 (SD 6.8), the mean PRISM III PDR was 8.3, and the PIM II score was 11.38. The SMR calculated according to the PRISM III score was 1.03, and the SMR according to the PIM II score was 0.76.
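The reported SMR values can be checked with a short calculation. This sketch assumes the conventional definition of the SMR (observed mortality divided by mean predicted mortality) and treats the value 11.38 as the mean PIM II PDR in percent; both are assumptions made for illustration.

```python
# Observed mortality from the reported counts: 71 deaths among 825 patients.
observed_mortality = 71 / 825 * 100  # percent

# Mean predicted death rates in percent; 11.38 is assumed to be the PIM II PDR.
mean_pdr = {"PRISM III": 8.3, "PIM II": 11.38}


def smr(observed: float, expected: float) -> float:
    """Standardized mortality ratio: observed over expected mortality."""
    return observed / expected


results = {name: smr(observed_mortality, pdr) for name, pdr in mean_pdr.items()}
for name, value in results.items():
    print(f"{name}: SMR = {value:.2f}")
```

Under these assumptions the calculation gives roughly 1.04 for PRISM III and 0.76 for PIM II, close to the reported 1.03 and 0.76; the small difference for PRISM III is presumably due to rounding of the mean PDR.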
In the ROC analysis performed to evaluate the discrimination of the PRISM III PDR, the area under the curve (AUC) was 0.908±0.017 (p<0.001). Similarly, the AUC for the PIM II PDR was 0.855±0.024 (p<0.001). Because both AUC values were above 0.80, the discrimination of both scores was considered good (Table 3).
The Hosmer-Lemeshow goodness-of-fit test was applied to evaluate the calibration of the PRISM III score. When the PRISM III PDR values of the 825 patients were analyzed in groups, the difference between predicted and actual mortality was not statistically significant (p=0.753). The calibration of the PIM II score was evaluated in the same way, and the difference between predicted and actual mortality was again not statistically significant (p=0.251). Because neither p-value was significant, the calibration of both scores was considered good (Table 4).
Discussion
Scoring systems are needed in pediatric intensive care units to evaluate disease severity and response to treatment in study groups created for scientific research and to determine the expected mortality. PRISM, PIM, PELOD and mSOFA scores are the ones preferred in studies of critically ill children conducted in our country (Table 5). Most of these studies are retrospective, include a small number of patients, are generally conducted on non-homogeneous groups, and come from units whose facilities are not sufficiently comparable. As in this study, the discrimination of the PRISM III score was found to be good in the studies that evaluated it; however, its calibration was not evaluated in one of these studies, and it was reported to be poor in another study conducted by Oymak and Bayrakci.11,12 In the evaluation of expected and observed mortality rates in this study, both the calibration and discrimination of PRISM III and PIM II scores were found to be good. As with the studies conducted in our country, the results of studies conducted outside the countries where the scores were developed are not homogeneous (Table 6).
The discrimination and calibration results of the scores also differ in studies conducted on specific patient groups. Köner et al.13 reported that the discrimination and calibration of the PIM II score were good in children followed up in the intensive care unit after congenital heart surgery, whereas the discrimination of the baseline and peak mSOFA scores was superior to that of the PIM II score in predicting mortality. No comparison was made with the PRISM score in that study.13 In another study conducted in the USA in children followed up for surgical and medical heart disease, the PRISM III score discriminated mortality well. However, in terms of calibration, the expected mortality was lower than the observed mortality in lower-risk cardiac pathologies and higher than the observed mortality in higher-risk pathologies; therefore, the calibration was not good in that study group.14 Kesici et al.15 reported that the calibration of PRISM III and PIM II scores was not good in children who were all followed up on mechanical ventilation, and that using the oxygenation index as a criterion in this group might be beneficial. In a retrospective study including 338 patients in a pediatric intensive care unit in Brazil where cancer patients were followed, mortality was reported as 18.34%, with an SMR of 0.78 and an AUC of 0.71 for the PRISM III score, and an SMR of 0.77 and an AUC of 0.76 for the PIM II score. The authors concluded that the scores were well calibrated but overestimated the expected mortality.16
When the current and older versions of the PRISM, PIM and PELOD scores were evaluated together in 398 patients followed up for sepsis, the PIM score predicted lower mortality, and the AUC values obtained in ROC analysis for the PRISM III, PIM II and PELOD II scores were 0.75, 0.78 and 0.75, respectively.17 The group included in our study was not a homogeneous disease group, and the results may have been affected by the distribution of the subgroups. To minimize this problem, patients with known chromosomal anomalies and patients who had undergone bone marrow transplantation, groups shown in previous studies to have unique risk factors, were excluded from the study group.
Study Limitations
The most important limitations of this study are that it is a single-center, retrospective evaluation and that updated versions of the scores used are available. PRISM IV and PIM III scores have been developed and made available. On the other hand, a study that evaluated these versions reported that the discrimination of PRISM IV and PIM III scores was not better than that of the previous versions, and that the AUC values (0.70 and 0.76 for PRISM IV and PIM III, respectively) were similar.17 The results obtained in our study could not be compared with other scoring systems or with newer versions of the existing scores.
Conclusion
In this study, it was shown that the discrimination and calibration of PRISM III and PIM II scores were good in a tertiary pediatric intensive care unit where medical and surgical patients were accepted. Discrimination and calibration of newly developed versions of these scores and less commonly used updated scores such as PELOD II and mSOFA should be evaluated in a multicenter national study. In this way, the scientific outputs of studies conducted in different units and on relatively small groups can be interpreted more accurately and used in the development of health policies.