Background and Objectives: Missing data occur in many studies, for example in regression modelling, and they reduce the efficiency of the fitted model. Many methods have been proposed for handling incomplete data, but most of them focus on missing outcome values, although covariate values can also be missing.
Materials and Methods: In this paper we study the imputation of missing covariate values by the EM algorithm with an auxiliary variable, and compare the results with complete-case analysis in a logistic regression model of the factors that influence the choice of delivery method.
Our data came from a cross-sectional study of factors associated with the choice of delivery method in pregnant women. The sample size was 365, and the data were collected through interviews using questionnaires that covered several demographic variables, delivery history, attitude, and some social factors. We used standard deviations to compare the efficiency of the two methods.
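The EM approach named above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single binary covariate `x` with missing values (coded as NaN), one fully observed covariate `z`, values missing at random, and a marginal Bernoulli model for `x` with parameter `pi`. The function names `weighted_logit` and `em_logit_missing_binary` are hypothetical. Each incomplete record is expanded into two pseudo-records (`x = 0` and `x = 1`) that are weighted by their posterior probabilities in the E-step, and the M-step is a weighted logistic regression.

```python
import numpy as np


def expit(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))


def weighted_logit(X, y, w, beta, n_iter=25):
    """Weighted logistic regression fitted by Newton-Raphson."""
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (w * (y - p))
        hess = (X * (w * p * (1 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def em_logit_missing_binary(y, x, z, n_em=100):
    """EM for logit P(y=1) = b0 + b1*x + b2*z with binary x partly missing.

    Assumption (not from the paper): x ~ Bernoulli(pi), independent of z.
    """
    miss = np.isnan(x)
    n_obs, n_mis = int((~miss).sum()), int(miss.sum())
    # Augmented data: complete rows once, incomplete rows twice (x=0, x=1).
    y_aug = np.concatenate([y[~miss], y[miss], y[miss]])
    z_aug = np.concatenate([z[~miss], z[miss], z[miss]])
    x_aug = np.concatenate([x[~miss], np.zeros(n_mis), np.ones(n_mis)])
    X_aug = np.column_stack([np.ones_like(x_aug), x_aug, z_aug])

    beta = np.zeros(3)
    pi = np.mean(x[~miss])          # starting value from the complete cases
    w = np.ones(len(y_aug))         # complete rows always keep weight 1
    for _ in range(n_em):
        # E-step: posterior weight of each candidate value of x.
        p = expit(X_aug[n_obs:] @ beta)
        lik = np.where(y_aug[n_obs:] == 1, p, 1 - p)
        prior = np.where(x_aug[n_obs:] == 1, pi, 1 - pi)
        joint = lik * prior
        denom = joint[:n_mis] + joint[n_mis:]
        w[n_obs:] = joint / np.concatenate([denom, denom])
        # M-step: weighted logistic regression and weighted mean of x.
        beta = weighted_logit(X_aug, y_aug, w, beta)
        pi = np.sum(w * x_aug) / np.sum(w)
    return beta, pi
```

Calling `em_logit_missing_binary(y, x, z)` returns the estimated coefficients and the estimated prevalence of `x`. A complete-case fit for comparison would call `weighted_logit` on the rows where `x` is observed, with all weights equal to 1.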
Results: The results show that maximum likelihood analysis by the EM algorithm is more effective than complete-case analysis.
Conclusion: The problem of missing data is common in surveys; it causes bias and reduces model efficiency. Here we show that the EM algorithm for imputation in logistic regression with missing values in a discrete covariate is more effective than complete-case analysis.
If, on the other hand, the missing values occur in a continuous covariate, other methods must be used or the variable must be converted into a discrete one.
Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.