Original research

Diagnostic prediction model for screening of elevated low-density and non-high-density lipoproteins in young Thai adults between 20 and 40 years of age

Abstract

Background Low-density lipoprotein cholesterol (LDL-C) and non-high-density lipoprotein cholesterol (non-HDL-C) levels are paramount in atherosclerotic cardiovascular disease risk management. However, 94.4% of Thai young adults are unaware of their condition. A diagnostic prediction model may assist in screening and alleviate underdiagnosis.

Objectives Development and internal validation of diagnostic prediction models on elevated LDL-C (≥160 mg/dL) and non-HDL-C (≥160 mg/dL).

Methods Retrospective annual health examination data from a single-centre, tertiary-care hospital, collected from 29 March 2018 to 30 August 2023, were analysed. Two models with 11 predictors from anthropometry and bioimpedance were fitted with multivariable binary logistic regression to predict elevated LDL-C and non-HDL-C. Predictor selection used backward stepwise elimination. Four performance metrics were quantified: discrimination using area under the receiver-operating characteristic curve (AuROC); calibration by calibration plot; utility by decision curve analysis; and instability by performance instability plots. Internal validation was carried out using 500 repetitions of bootstrap resampling.

Results The dataset included 2222 LDL-C and 5149 non-HDL-C investigations, of which 303 were classed as elevated LDL-C (13.6%) and 1013 as elevated non-HDL-C (19.7%). Two predictors, gender and metabolic age, were identified in the LDL-C model with AuROC 0.639 (95% CI 0.617 to 0.661), poor calibration and utility in the 7%–25% probability range. Three predictors, gender, diastolic blood pressure and metabolic age, were identified in the non-HDL-C model with AuROC 0.722 (95% CI 0.705 to 0.738), good calibration and utility in the 9%–55% probability range.

Discussion and conclusion Overall results demonstrated acceptable discrimination for the non-HDL-C model but inadequate performance of the LDL-C model for clinical practice. An external validation study should be planned for the non-HDL-C model.

What is already known on this topic

  • Low-density lipoprotein cholesterol (LDL-C) and non-high-density lipoprotein cholesterol (non-HDL-C) play a major role in atherosclerotic cardiovascular disease risk management; however, a case detection strategy is needed to counteract underdiagnosis in young adults, especially in a low-resource setting.

What this study adds

  • Development and internal validation of diagnostic prediction models on elevated LDL-C and non-HDL-C in young Thai adults.

How this study might affect research, practice or policy

  • Three predictors—gender, diastolic blood pressure and metabolic age—can be used as predictors with acceptable discrimination and calibration for elevated non-HDL-C screening but have poor discrimination and calibration for elevated LDL-C screening.

Introduction

Low-density lipoprotein cholesterol (LDL-C) and non-high-density lipoprotein cholesterol (non-HDL-C) are the major risk factors contributing to the development of atherosclerotic cardiovascular disease (ASCVD), which can ultimately result in premature disability and mortality. An epidemiological study quantifying the association between exposure to LDL-C and ASCVD translated a lifetime cumulative exposure to LDL-C of 5000 mg-year, to an all-age ASCVD risk of 1%. Concerningly, this exposure–event relationship showed a logarithmic–linear association, resulting in a doubling of the risk every 1250 mg-year.1 The proportion of cases of ASCVD potentially prevented from LDL-C reduction was initially explored in the meta-analysis of randomised controlled intervention of statins in advanced age. The research found a relative risk reduction of 22% (adjusted risk ratio (aRR) 0.78, 95% CI 0.76 to 0.80).2 However, a Mendelian randomisation study investigating favourable genetic LDL-C single-nucleotide polymorphisms, which by implication confer a lifetime of lower LDL-C exposure, demonstrated a substantially larger relative risk reduction of 54% (aRR 0.46, 95% CI 0.41 to 0.52).3 4 In a recent prospective childhood to middle adult cohort study, it was found that both elevated LDL-C and non-HDL-C were independently associated with ASCVD (adjusted HR 1.35, 95% CI 1.13 to 1.60 and 1.94, 95% CI 1.23 to 3.06, respectively). The difference in ASCVD occurrence in both cases of lipid profile abnormalities can be attributed to exposure–time interaction. In summary, the risk period for the development of ASCVD needs to be minimised by earlier life course detection and intervention in cases with elevated LDL-C and non-HDL-C.5

The 2019 American College of Cardiology and American Heart Association guidelines regarding the primary prevention of cardiovascular disease emphasised control of lifelong ASCVD risk factors, including hyperlipidaemia. In young adults aged 20–39 years, discussions on treatment should be initiated at LDL-C ≥160 mg/dL or non-HDL-C ≥190 mg/dL and definitive treatment in suspected cases of familial hypercholesterolaemia with LDL-C ≥190 mg/dL.6 Nevertheless, access to lipid screening remains suboptimal among young adults in both the USA and Thailand, with 44.1% of American and 94.6% of Thai young adult cases unaware of their elevated LDL-C levels.7 8

Diagnostic prediction studies offer a potential strategy to enhance screening access for elevated LDL-C cases across countries in the low-income to middle-income group with limited resources as a major constraint.9–13 Regression-based statistical models with anthropometric, demographic and bioelectrical impedance predictors have been developed to detect the occurrence of dyslipidaemia,9 10 12 13 or specifically elevated LDL-C,11 using determinants including age,9–13 gender,9–13 family history10 13 (diabetes,10 hypertension,10 cerebrovascular disease,10 dyslipidaemia13), education level,9 11–13 income,9 12 13 marital status,9–13 exercise,9–13 occupation,9 13 occupational risk factors,13 body mass index (BMI),10 11 waist circumference,9–11 body fat percentage,10 11 vital signs (systolic blood pressure (SBP),10 diastolic blood pressure (DBP),10 pulse10), smoking,10–13 drinking10 13 and personal underlying diseases (diabetes,12 heart disease12 and hypertension13). The limitations of these prediction models can be summarised into five aspects. First, the source population consisted of elderly, rural populations, whereas contemporary concepts require detection of cases at an early age to prevent the progression of atherosclerotic plaque and residual ASCVD risk.6 Second, the prediction models had been constructed on different diagnostic endpoint definitions. Four out of five models focused on dyslipidaemia9 10 12 13 as a composite diagnostic endpoint of low HDL-C, elevated LDL-C, elevated triglyceride levels (TG) and elevated total cholesterol (TC). Nevertheless, current practice guidelines6 call for a different therapeutic agent for each type of lipid abnormality, thereby limiting the applicability of these models. Third, different thresholds of elevated LDL-C were used, specifically ≥130 mg/dL,11 ≥160 mg/dL and ≥240 mg/dL.9 10 13 This limits the generalisability of the models to other countries with different clinical practice guidelines. Fourth, the prediction models have limited performance, with an area under the receiver-operating characteristic curve (AuROC) of 0.69–0.74.9–13 Finally, since direct LDL-C measurement is limited to only a few secondary to tertiary care centres, the Thai guidelines on dyslipidaemia use a surrogate marker, non-HDL-C, as an alternative measurement of lipid profile. As a consequence of these limitations, this study aimed to develop and internally validate models for specific lipid types: LDL-C and its surrogate marker, non-HDL-C, in a source population of young Thai adults, in order to increase the use of preventative screening for public and occupational health purposes.

Material and methods

Population, participants and source of data

The study population was consecutively recruited from a retrospective database in the single-centre, tertiary-care, university hospital registry of annual occupational health examinations carried out on employees from 29 March 2018 to 30 September 2020 and from 27 January 2022 to 30 August 2023 (figure 1).

Participant flow diagram for the development and internal validation of elevated LDL-C and elevated non-HDL-C prediction model. BMI, body mass index; LDL-C, low-density lipoprotein cholesterol; non-HDL-C, non-high-density lipoprotein cholesterol.

Inclusion criteria were healthcare and non-healthcare workers in a tertiary-care hospital aged 20–40 years who had undergone the annual occupational health examination with lipid profile investigation in 2018, 2019, 2020, 2022 and 2023. No annual health examinations were carried out in 2021 owing to limited manpower, as annual health examiners were allocated to the national COVID-19 vaccination campaign. The minimal investigation package, consisting of TC and HDL-C, is fully supported and incurs no charge. The examinee may request additional LDL-C and TG tests at a discounted price. Exclusion criteria were any medical diagnoses, coded according to the International Classification of Diseases 10th edition and recorded before the check-up day, that may affect serum cholesterol level. These could include essential or familial hypercholesterolaemia (E78), hypothyroidism (E03), anorexia nervosa (F50), nephrotic syndrome (N04) and current pregnancy (Z34). Participants were also excluded if they had received, at any time earlier in life, medication with a direct effect on serum LDL-C level. These could include statins, ezetimibe, proprotein convertase subtilisin/kexin type 9 (PCSK9) inhibitors and fibrates.14 In addition, participants who had taken medication with a potential indirect effect on serum LDL-C level in the previous 3 months, for example, thiazide diuretics, glucocorticoids, amiodarone and ciclosporin, were excluded.14

Predictors

Eleven predictors were identified, based on potential predictors reported in the literature on related prediction models, in combination with the availability of the predictors in our setting. All 11 predictors were measured in the single health examination visit before the measurement of the diagnostic endpoint, as follows: age (years), calculated as the visiting date minus the date of birth divided by 365.25; gender, male or female; BMI (kg/m2), calculated as weight (kg) divided by the square of height in metres (m2); and waist circumference (cm), measured at the umbilicus by a trained nurse practitioner. The Tanita model BC-418 body composition analyser (Tanita Corporation, Tokyo, Japan) was used for measurement of body composition, including estimated fat mass (kg), muscle mass (kg), visceral fat index (unit), basal metabolic rate (kcal/day) and metabolic age (years), through bioelectrical impedance analysis (BIA) of the bilateral plantar surfaces.15 16 The measurement protocol was carried out as follows. First, the measurement pad of the equipment was cleaned and the equipment was placed on a flat surface with the electrical source plugged in to ensure proper calibration and measurement precision. Participants then removed their shoes and socks and stood still on the pedestal pad of the machine under guidance from the nurse. The height and age of the participant were entered into the equipment along with a prespecified recommended deduction of 0.6 kg to exclude the weight of the participant's clothes. The machine was set to standard body composition mode. The output slip was retrieved by the principal investigator for raw data. The SBP and DBP (mm Hg) were measured using an A&D TM-2657P machine (A&D Company, Ann Arbor, Michigan, USA).
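The two derived predictors described above, age and BMI, can be sketched in a few lines. This is a minimal illustration of the stated definitions only; the function names and the example participant are hypothetical and do not come from the study data.

```python
from datetime import date


def age_years(visit: date, birth: date) -> float:
    """Age in years: days from birth to visit, divided by 365.25."""
    return (visit - birth).days / 365.25


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by the square of height."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical participant (illustrative values, not study data)
example_age = age_years(visit=date(2023, 8, 30), birth=date(1993, 8, 30))
example_bmi = bmi(weight_kg=64.0, height_m=1.60)
print(round(example_age, 2), round(example_bmi, 1))
```

Dividing by 365.25 rather than 365 absorbs leap years on average, so the example participant is a fraction of a day short of exactly 30 years.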

Diagnostic endpoint

The diagnostic endpoint had been objectively determined without knowledge of the predictors by a separate team of biomedical technicians at the central laboratory of the hospital from the blood samples taken just before the start of the health examination. Quantification had been carried out using Cobas LDL-Cholesterol Generation 3 (LDLC3) (Roche Holding AG, Basel, Switzerland).

Sample size considerations

Sample size was calculated using the method based on mean absolute prediction error (MAPE) developed by van Smeden et al17 for the minimum sample size required to develop a multivariable clinical prediction model. Prespecified inputs were as follows: a MAPE of 0.025 (an average absolute error of 2.5 percentage points in the model-predicted probability), an LDL-C hypercholesterolaemia proportion of 0.17 according to the 2009 Thai National Health Survey7 and 11 candidate predictors. The calculation yielded a minimum sample size of 1374 participants with 234 events, equal to an events-per-variable index (EPV) of 21.3.
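The event count and EPV quoted above follow from simple arithmetic on the reported inputs. The sketch below reproduces only that arithmetic; it does not implement the van Smeden sample size method itself, and the variable names are illustrative.

```python
import math

n_required = 1374            # minimum sample size reported in the text
outcome_proportion = 0.17    # anticipated proportion with elevated LDL-C
n_candidate_predictors = 11  # candidate predictors entered into the model

# Expected number of events, rounded up to a whole participant
expected_events = math.ceil(n_required * outcome_proportion)

# Events per candidate predictor
epv = expected_events / n_candidate_predictors

print(expected_events, round(epv, 1))
```

With these inputs the expected events come to 234 and the EPV to about 21.3, matching the figures in the text.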

Data preparation, statistical analysis and missing data handling

Data preparation and quality checking were done on site after each day of health examination, with data recorded in a Microsoft Access database. Descriptive statistics for each status of the binary diagnostic endpoint were presented as mean±SD for normally distributed predictors, median (IQR) for skewed continuous predictors and count with proportion (n (%)) for categorical predictors. Missing data were explored and handled by multiple imputation using chained multivariate regression,18 19 coupled with predictive mean matching and the K-nearest neighbour method with five nearest neighbours. The imputation was prespecified as follows: a substantive imputation model comprising all 11 potential predictors, and auxiliary variables consisting of the other lipid profile parameters (TG, TC and HDL-C). A preliminary 25 sets of burn-in imputation were run to stabilise the imputation before proceeding to a further 25 sets of multiple imputation. Primary data analyses were carried out in the imputed dataset, with alternative analyses in the complete-case dataset.
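To illustrate the idea behind predictive mean matching with five nearest neighbours, the following is a deliberately simplified sketch: a single incomplete variable, a single predictor and one imputation pass. It is not the chained-equations procedure used in the study, which cycles over all variables, draws regression parameters at random and produces multiple imputed datasets. The function name and toy data are hypothetical.

```python
import random


def pmm_impute(x, y, k=5, seed=0):
    """Fill missing y values (None) with observed y values from the k donors
    whose regression-predicted means are closest to the recipient's."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]

    # Ordinary least squares for y ~ x using the observed rows only
    n = len(observed)
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(xi):
        return intercept + slope * xi

    donors = [(predict(xi), yi) for xi, yi in observed]
    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        target = predict(xi)
        # k observed rows with predicted means nearest to the target
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        # Borrow the observed value of one randomly chosen donor
        completed.append(rng.choice(nearest)[1])
    return completed


# Toy data (hypothetical): chronological age and metabolic age with two gaps
age = [22, 25, 28, 31, 34, 37, 40, 26, 33, 39]
metabolic_age = [24, 26, 30, None, 36, 41, 43, 27, None, 42]
completed = pmm_impute(age, metabolic_age, k=5)
```

Because imputed values are borrowed from real participants, they always stay within the range of plausible observed measurements, which is the main appeal of the method for skewed body composition variables.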

Multivariable binary logistic regression was fitted without any handling of target class imbalance. The initial model consisted of the untransformed original values of the 11 predictors, after which predictor selection was performed by backward stepwise elimination (online supplemental material 1). Three major performance measures were evaluated. First, discrimination was presented as AuROC accompanied by the non-parametric ROC plot. Second, calibration was presented as the expected to observed ratio (E:O) and the calibration slope, together with a visual representation in the calibration plot. Third, decision curve analysis was presented to demonstrate the range of predicted probability over which the model offers potential net benefit (figure 2). Additionally, risk groups were created to aid model interpretation according to the predicted probability quartiles, with probability below the 25th percentile labelled as ‘low risk’, above the 75th percentile as ‘high risk’ and the remainder as ‘moderate risk’.
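The three performance measures named above have compact definitions, sketched here in plain Python under common textbook formulations. The study itself used Stata, so this is an illustration of the concepts, not the authors' code; the function names and toy data are hypothetical.

```python
def auroc(y_true, y_prob):
    """Probability that a random case receives a higher predicted probability
    than a random non-case (ties count as one half)."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in cases
        for q in controls
    )
    return wins / (len(cases) * len(controls))


def expected_observed_ratio(y_true, y_prob):
    """E:O ratio: expected events (sum of predictions) over observed events."""
    return sum(y_prob) / sum(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of testing everyone at or above the probability threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


# Toy data (hypothetical): 1 = elevated lipid level, with predicted probabilities
y = [0, 0, 1, 0, 1, 1, 0, 0]
p = [0.05, 0.10, 0.30, 0.20, 0.40, 0.15, 0.10, 0.25]

print(auroc(y, p), expected_observed_ratio(y, p), net_benefit(y, p, 0.20))
```

An E:O ratio below 1 indicates that the model predicts fewer events than were observed; a decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing against the test-all and test-none strategies.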

The performance of multivariable binary logistic regression diagnostic prediction model. (A) Discriminative performance of LDL-C model; (B) Discriminative performance of non-HDL-C model; (C) Calibration plot of LDL-C model; (D) Calibration plot of non-HDL-C model; (E) Decision curve analysis of LDL-C model; (F) Decision curve analysis of non-HDL-C model. AUC, area under the curve; CITL, calibration-in-the-large; E:O, expected to observed ratio; LDL-C, low-density lipoprotein cholesterol; non-HDL-C, non-high-density lipoprotein cholesterol; ROC, receiver-operating characteristic curve.

Bootstrapped internal validation was done with 500 repetitions of sampling-resampling on the entire development dataset to estimate the optimism of the models’ predictions, since performance is usually higher in the development dataset than in a validation dataset owing to the specific distribution of predictors and diagnostic endpoints. We presented the optimism of AuROC and E:O as the possible performance limitation in other relevant populations. Uniform bootstrapped shrinkage was applied to all model coefficients using the bootstrapped calibration slope. Finally, individual-level prediction instability derived from each bootstrapped model was explored, including prediction instability, calibration instability and MAPE20 (figure 3). Prediction model fairness was presented as performance measures across subgroups of social determinants in the relevant source population context, consisting of different occupation groups classified by the International Standard Classification of Occupations (online supplemental material 1).
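The optimism correction and uniform shrinkage described above reduce to two small calculations once the bootstrap results are available. The sketch below assumes those results have already been produced by refitting the model in each bootstrap sample; all numbers, coefficient names and function names are hypothetical, and only three replicates are shown where the study used 500.

```python
from statistics import mean


def optimism_corrected(apparent, boot_apparent, boot_test):
    """Apparent performance minus the mean bootstrap optimism.

    boot_apparent: performance of each bootstrap model in its own bootstrap sample
    boot_test: performance of the same bootstrap model in the original dataset
    """
    optimism = mean(a - t for a, t in zip(boot_apparent, boot_test))
    return apparent - optimism


def shrink_coefficients(coefs, calibration_slope):
    """Uniform shrinkage: multiply every predictor coefficient by the same
    bootstrapped calibration slope."""
    return {name: beta * calibration_slope for name, beta in coefs.items()}


# Hypothetical AuROC values (illustrative only, not study results)
corrected = optimism_corrected(
    apparent=0.700,
    boot_apparent=[0.705, 0.700, 0.704],
    boot_test=[0.701, 0.699, 0.700],
)

# Hypothetical logit coefficients and calibration slope
shrunk = shrink_coefficients({"male": 0.50, "metabolic_age": 0.04}, 0.98)
print(round(corrected, 3), shrunk)
```

After the predictor coefficients are shrunk, the intercept is normally re-estimated so that the mean predicted probability still matches the observed event proportion; that step is omitted here.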

The instability of multivariable binary logistic regression diagnostic prediction model. (A) Prediction instability of LDL-C model; (B) Prediction instability of non-HDL-C model; (C) Calibration instability of LDL-C model; (D) Calibration instability of non-HDL-C model; (E) mean absolute prediction error (MAPE) instability of LDL-C model; (F) MAPE instability of non-HDL-C model. LDL-C, low-density lipoprotein cholesterol; non-HDL-C, non-high-density lipoprotein cholesterol.

Stata V.18.0 (StataCorp) was used in the data analysis, with various user-written statistical components used as follows: <bsvalidation>21 for bootstrapped internal validation, <running>22 for generating symmetric nearest neighbour smoothing splines used in the visualisation of calibration instability, and <pmcalplot>23 for visualisation of the calibration plot. The 2024 Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis statement was used as a guide for methodological reporting throughout the study (online supplemental material 2).24

Results

In total, 5149 participants were included in the study. All 5149 participants had non-HDL-C measurements taken from the minimal investigation package, while a subset of 2222 participants (43.2%) had both non-HDL-C and LDL-C investigations from the additional package. The prevalence of elevated LDL-C was 13.6% (303 cases) and that of elevated non-HDL-C was 19.7% (1013 cases). Body composition analyser parameters demonstrated the highest proportion of missing data in both the non-HDL-C dataset (6.2%) and the LDL-C dataset (3.9%), with less than 1% missing in the other predictors. Details of descriptive statistics and univariable analysis of each predictor are presented in table 1.

Table 1
Participant characteristics for the development and internal validation of the prediction of elevated LDL-C and elevated non-HDL-C

LDL-C model

Two predictors, gender and metabolic age, were identified in the elevated LDL-C model, giving an AuROC of 0.639 (95% CI 0.617 to 0.661), poor calibration and clinical utility in the 7%–25% predicted probability range. The predictor selection process and the statistical modelling at each step of the analysis are detailed in online supplemental material 3, 4. Risk groups were categorised according to the predicted probability as low risk (<8.0%), moderate risk (8.0%–17.1%) and high risk (>17.1%). The logit coefficients and intercept are shown in table 2, the predicted probability being derived using:

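$$P(\text{elevated LDL-C}) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \beta_1 \times \text{gender} + \beta_2 \times \text{metabolic age}\right)\right]}$$

where $\beta_0$ is the intercept and $\beta_1$ and $\beta_2$ are the logit coefficients given in table 2.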

Table 2
The multivariable binary logistic regression diagnostic model for prediction of elevated LDL-C and elevated non-HDL-C

Non-HDL-C model

Three predictors, gender, DBP and metabolic age, were identified in the non-HDL-C model, giving an AuROC of 0.722 (95% CI 0.705 to 0.738), good calibration and clinical utility in the 9%–55% predicted probability range. The predictor selection process and the statistical modelling at each step of the analysis are detailed in online supplemental material 4. Risk groups were categorised according to the predicted probability as low risk (<9.1%), moderate risk (9.1%–26.8%) and high risk (>26.8%). The logit coefficients and intercept are provided in table 2, from which the predicted probability can be derived according to the following equation:

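$$P(\text{elevated non-HDL-C}) = \frac{1}{1 + \exp\left[-\left(\beta_0 + \beta_1 \times \text{gender} + \beta_2 \times \text{DBP} + \beta_3 \times \text{metabolic age}\right)\right]}$$

where $\beta_0$ is the intercept and $\beta_1$ to $\beta_3$ are the logit coefficients given in table 2.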
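As a rough illustration of how a predicted probability from either model maps onto the reported risk groups, the sketch below applies the inverse-logit transformation and then the cut-offs stated in the text. The cut-offs are taken from the article; the intercept, coefficients and participant values are placeholders invented for the example and must be replaced by the estimates in table 2.

```python
import math

# Risk-group cut-offs (predicted probability) as reported in the text
CUTOFFS = {"LDL-C": (0.080, 0.171), "non-HDL-C": (0.091, 0.268)}


def predicted_probability(intercept, coefs, values):
    """Inverse logit of the linear predictor."""
    linear_predictor = intercept + sum(coefs[name] * values[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def risk_group(probability, model):
    """Classify a predicted probability as low, moderate or high risk."""
    low, high = CUTOFFS[model]
    if probability < low:
        return "low risk"
    if probability > high:
        return "high risk"
    return "moderate risk"


# Placeholder coefficients (hypothetical, NOT the published estimates)
example_coefs = {"male": 0.5, "dbp": 0.01, "metabolic_age": 0.03}
example_person = {"male": 1, "dbp": 80, "metabolic_age": 35}

prob = predicted_probability(-3.0, example_coefs, example_person)
print(round(prob, 3), risk_group(prob, "non-HDL-C"))
```

Keeping the cut-offs in one dictionary keyed by model name makes it straightforward to reuse the same classification function for both the LDL-C and non-HDL-C models.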

Presentation format

A web-based calculator was provided using ShinyR25 (https://wuttipatk.shinyapps.io/ShinyR2/). Additional graphical scoring system charts for quick reference without calculations are provided in the supplement for the LDL-C model (online supplemental material 5) and the non-HDL-C model (online supplemental material 6).

Discussion

This study investigated a potential diagnostic method for the prediction of the important ASCVD risk factors, elevated LDL-C and non-HDL-C, using anthropometric, demographic and body composition analyser parameters. In the analyses described throughout the study, the two models differed in their performance measures. The non-HDL-C model demonstrated an acceptable level of discrimination and good calibration with a wide range of useful predicted probability. However, the LDL-C model presented with poor discrimination, moderate miscalibration and a narrow range of useful predicted probability. The discriminative performance of our LDL-C model was found to be poorer than that of other related prediction models pertinent to the composite diagnostic endpoint of dyslipidaemia in an elderly study population9 and a general population.10–13 We hypothesised that these discrepancies arose from differences in study domain, as our cohort consisted of health service employees within a narrow age range, with higher-than-average health literacy and a specific socioeconomic context. All may have contributed to a lower prevalence of the diagnostic endpoints and reduced the discriminative ability of certain predictors. It is also possible that overfitting and optimism occurred to a greater degree in the other studies.17 26

Prediction instability was also examined in both models, with evidence of miscalibration presented as an underestimation of observed probability, especially in the higher probability ranges, reflecting a higher MAPE and deviation in the calibration slope (figure 3). A possible explanation was the sparsity of individuals with high predicted probability, thus hindering the valid estimate of risk at the distal ends of predicted probability ranges.20

Implications for practice include primary prevention of ASCVD early in life,1 14 since awareness of lipid profile status can contribute to the medical advice and management crucial for the prevention of heart disease. In this study, both models were designed with the aim of characterising high-risk cases that require attention, with predictors that can be gathered objectively and conveniently by the intended users. The intended users of these models therefore constitute two groups: hospital employees who would like to assess their own risk, and the practising public health and general practitioners involved in personalised decision-making in occupational health service programmes. Examples of clinical uses at the organisation level include setting a threshold for lipid profile investigation for employees whose predicted probability is higher than the third quartile of the target population, thereby assisting in using limited healthcare resources effectively. Furthermore, since the BIA machine can be characterised as a one-off payment within the occupational health service programme cost, recurring expenditure on lipid profile screening may potentially be minimised in the long run. Finally, in a setting in which BIA cannot be undertaken, we recommend the calculation of metabolic age using three related parameters: age, gender and BMI. These were identified from the multivariable linear regression analysis on the study data (N=5149, R2=0.82, RMSE=5.19):

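$$\text{metabolic age} = \beta_0 + \beta_1 \times \text{age} + \beta_2 \times \text{gender} + \beta_3 \times \text{BMI}$$

where $\beta_0$ to $\beta_3$ are the coefficients estimated in that regression.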

Implications for future research could include the necessary step of temporal external validation, to explore the changes in performance against time26 27 as the investigators plan for the 2024 health examinations. Further geographical external validation should also be carried out in other hospitals or healthcare services with the availability of the body composition analyser. Also, updating of the model should be initiated if external validation demonstrates miscalibration.

The limitations of this study include potential issues regarding the representative nature of the study domain and the quality of predictors. The retrospective hospital registry data may not be a representative sample of the general population of all Thai young adults aged 20–40 years. As the majority of the hospital population can be categorised as working in the service sector, the diagnostic prediction may not be valid for an industrial or agricultural setting, even within the same age group, owing to various unaccounted differences in socioeconomics and occupational exposure.28 29 In addition, the quality of the prespecified predictors may be affected by measurement error, especially for waist circumference and the single measurement of blood pressure. These measurement errors, apart from decreasing the signal-to-noise ratio, can also potentially lead to both premature elimination of the variable in the stepwise selection process and an increase in model instability, particularly when the final model retains these predictors.30

One of the strengths of our methodology is the explicit handling of missing data instead of a traditional complete-case analysis. In addition, exploration of prediction instability assists in identifying certain ranges of probability that require caution in the use of the prediction model.20 Moreover, the EPV of 337.67 in the non-HDL-C model and 151.5 in the LDL-C model assisted in reducing model instability. Finally, parsimony allows our models to undergo future external validation studies more easily, as there are fewer impediments to obtaining the complete set of predictor data.

  • Contributors: WK: conceptualisation, methodology, software, formal analysis, investigation, resources, data curation, writing–original draft, writing–review and editing, visualisation, project administration; VS: conceptualisation, methodology, resources, writing–original draft, writing–review and editing, supervision; WS: conceptualisation, methodology, software, validation, resources, writing–review and editing; PP: conceptualisation, methodology, software, resources, writing–review and editing. WS is responsible for the overall content as guarantor.

  • Funding: The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests: None declared.

  • Provenance and peer review: Not commissioned; externally peer reviewed.

  • Supplemental material: This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Data availability statement

Data are available on reasonable request. The deidentified individual patient data set (IPD) and analytical codes are available on reasonable request to the corresponding author.

Ethics statements

Patient consent for publication:
Ethics approval:

This study was approved by the Institutional Review Board of the Faculty of Medicine, Chiang Mai University (Study code COM-2566-0361; date of approval 20 September 2023).

Acknowledgements

We wish to thank the Health Promotion Unit, Maharaj Nakorn Chiang Mai Hospital for the organisation of the health examination and retrieval of the data for all the analysis conducted in this study.

  1. Ference BA, Graham I, Tokgozoglu L, et al. Impact of Lipids on Cardiovascular Health: JACC Health Promotion Series. J Am Coll Cardiol 2018; 72:1141–56.
  2. Ray KK, Seshasai SRK, Erqou S, et al. Statins and all-cause mortality in high-risk primary prevention: a meta-analysis of 11 randomized controlled trials involving 65,229 participants. Arch Intern Med 2010; 170:1024–31.
  3. Ference BA, Yoo W, Alesh I, et al. Effect of long-term exposure to lower low-density lipoprotein cholesterol beginning early in life on the risk of coronary heart disease: a Mendelian randomization analysis. J Am Coll Cardiol 2012; 60:2631–9.
  4. Tan Y-D, Xiao P, Guda C, et al. In-depth Mendelian randomization analysis of causal factors for coronary artery disease. Sci Rep 2020; 10:9208.
  5. Wu F, Juonala M, Jacobs DR, et al. Childhood Non-HDL Cholesterol and LDL Cholesterol and Adult Atherosclerotic Cardiovascular Events. Circulation 2024; 149:217–26.
  6. Arnett DK, Blumenthal RS, Albert MA, et al. 2019 ACC/AHA Guideline on the Primary Prevention of Cardiovascular Disease: Executive Summary: A Report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. Circulation 2019; 140:e563–95.
  7. Aekplakorn W, Taneepanichskul S, Kessomboon P, et al. Prevalence of Dyslipidemia and Management in the Thai Population. National Health Examination Survey IV 2009.
  8. Bucholz EM, Gooding HC, de Ferranti SD, et al. Awareness of Cardiovascular Risk Factors in U.S. Young Adults Aged 18-39 Years. Am J Prev Med 2018; 54:e67–77.
  9. Wang C-J, Li Y-Q, Wang L, et al. Development and evaluation of a simple and effective prediction approach for identifying those at high risk of dyslipidemia in rural adult residents. PLoS One 2012; 7.
  10. Yang X, Xu C, Wang Y, et al. Risk prediction model of dyslipidaemia over a 5-year period based on the Taiwan MJ health check-up longitudinal database. Lipids Health Dis 2018; 17:259.
  11. Rezaei M, Fakhri N, Pasdar Y, et al. Modeling the risk factors for dyslipidemia and blood lipid indices: Ravansar cohort study. Lipids Health Dis 2020; 19:176.
  12. Seo J-H, Kim H-J, Lee J-Y, et al. Nomogram construction to predict dyslipidemia based on a logistic regression analysis. J Appl Stat 2020; 47:914–26.
  13. Wu J, Qin S, Wang J, et al. Develop and Evaluate a New and Effective Approach for Predicting Dyslipidemia in Steel Workers. Front Bioeng Biotechnol 2020; 8.
  14. Merćep I, Strikić D, Slišković AM, et al. New Therapeutic Approaches in Treatment of Dyslipidaemia-A Narrative Review. Pharmaceuticals (Basel) 2022; 15.
  15. Miyuki Shimomura (Tokyo), Soji Kurata (Saitama), inventors. US patent application for body composition analyzer (application #20070027401). 2004.
  16. Kelly J, Metcalfe J. Validity and reliability of body composition analysis using the tanita BC418-MA. J Exerc Physiol Online 2012; 15:74–83.
  17. van Smeden M, Moons KG, de Groot JA, et al. Sample size for binary logistic prediction models: Beyond events per variable criteria. Stat Methods Med Res 2019; 28:2455–74.
  18. Azur MJ, Stuart EA, Frangakis C, et al. Multiple imputation by chained equations: what is it and how does it work? Int J Methods Psychiatr Res 2011; 20:40–9.
  19. Lacerda M, Ardington C, Leibbrandt M, et al. Sequential Regression Multiple Imputation for Incomplete Multivariate Data using Markov Chain Monte Carlo. Southern Africa Labour and Development Research Unit, University of Cape Town, SALDRU Working Papers. 2007.
  20. Riley RD, Pate A, Dhiman P, et al. Clinical prediction models and the multiverse of madness. BMC Med 2023; 21:502.
  21. Fernandez-Felix BM, García-Esquinas E, Muriel A, et al. Bootstrap internal validation command for predictive logistic regression models. The Stata Journal: Promoting communications on statistics and Stata 2021; 21:498–509.
  22. Sasieni P, Royston P, Cox N. R, et al. Stata module for symmetric nearest neighbour smoothing. 2011.
  23. Ensor J, Snell KI, Martin EC, et al. PMCALPLOT: Stata module to produce calibration plot of prediction model performance. 2023.
  24. Collins GS, Moons KGM, Dhiman P, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024; 385.
  25. Gebauer JE, Adler J. Using Shiny apps for statistical analyses and laboratory workflows. Journal of Laboratory Medicine 2023; 47:149–53.
  26. Wolff RF, Moons KGM, Riley RD, et al. PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Ann Intern Med 2019; 170:51–8.
  27. Steyerberg EW, Harrell FE. Prediction models need appropriate internal, internal-external, and external validation. J Clin Epidemiol 2016; 69:245–7.
  28. Espírito Santo LR, Faria TO, Silva CSO, et al. Socioeconomic status and education level are associated with dyslipidemia in adults not taking lipid-lowering medication: a population-based study. Int Health 2022; 14:346–53.
  29. Li L, Ouyang F, He J, et al. Associations of Socioeconomic Status and Healthy Lifestyle With Incidence of Dyslipidemia: A Prospective Chinese Governmental Employee Cohort Study. Front Public Health 2022; 10:878126.
  30. Luijken K, Groenwold RHH, Van Calster B, et al. Impact of predictor measurement heterogeneity across settings on the performance of prediction models: A measurement error perspective. Stat Med 2019; 38:3444–59.

  • Received: 25 June 2024
  • Accepted: 4 January 2025
  • First published: 30 January 2025