J Cancer Prev 2022; 27(2): 89-100
Published online June 30, 2022
© Korean Society of Cancer Prevention
1Department of Public Policy and Public Affairs, John McCormack Graduate School of Policy and Global Studies, University of Massachusetts Boston, Boston, MA, 2The Center for Global Health and Health Policy, Global Health and Education Projects, Inc., Riverdale, MD, USA
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Most research on cancer patient survival uses registry-based (e.g., SEER) incidence and survival data that have limited socioeconomic status and health-risk information. In this study, we used the 1997-2015 National Health Interview Survey-National Death Index prospectively linked pooled cohort database (n = 40,291 cancer patients) to examine disparities in patient survival by a broad range of social determinants, including race/ethnicity, nativity, educational attainment, income/poverty level, occupation, housing tenure, physical and mental health status, smoking, physical activity, body mass index, and alcohol consumption. We used Cox proportional hazards models to estimate mortality hazard ratios and cause-specific 1-year, 5-year, and 10-year survival rates for all cancers combined and for lung cancer. During 1997-2015, the 10-year age-adjusted (all-cause) survival rate for cancer patients with professional and managerial occupations was 89.66%, significantly higher than the survival rate of 83.17% for laborers or 83.66% for the unemployed. Cancer patients who rented their homes had significantly lower age-adjusted survival rates than homeowners (82.65% vs. 85.80%). The 10-year age-adjusted survival rates were significantly greater among cancer patients with regular physical activity than among those without regular physical activity (90.18% vs. 83.24%). Age-adjusted survival rates were significantly reduced for cancer patients with lower income and education, poor health, and serious psychological distress, and among current and former smokers. The gap in survival narrowed with additional sociodemographic, health, or behavioral adjustment. Similarly large differentials were found in lung cancer survival. Marked disparities in all-cancer and lung cancer survival were found by a wide range of sociodemographic and health characteristics.
Keywords: Cancer, Survival, Social class, Health behaviors, Lung neoplasm
In 2019, there were more than 16.9 million cancer survivors in the United States. The 5-year relative survival rates increased from 49% during 1975-1977 to 67% during 2010-2016, due to advances in treatment and earlier diagnosis for some cancers [1,2]. Although survival of cancer patients has improved, 1.9 million new cancer cases were expected to be diagnosed in 2021 and 608,570 cancer patients were estimated to die [1,3]. In particular, lung and bronchus cancer, the most common cause of cancer death for both men and women, accounted for 22% of total cancer deaths in 2021, which is two times higher than the number of prostate cancer deaths and 1.47 times higher than the number of breast cancer deaths. The 5-year relative survival rate for lung cancer patients was only 21% during 2010-2016, increasing from 12% during 1975-1977 [1,2].
Common risk factors of cancer incidence [4-7], mortality [4,5,7-9], and survival [4,5,8,10-14] are well-documented, including sex, race/ethnicity, income, education level, occupation, housing tenure, cigarette smoking, excess body weight, alcohol consumption, physical activity, and psychological distress [4-6,12,14,15]. Besides these factors, use of screening tests has decreased mortality rates [4,8], and stage at diagnosis and treatment affect cancer mortality or survival rates. Non-Hispanic Blacks have higher death rates for all cancers combined, compared with other racial/ethnic groups [5,8,16]. Low socioeconomic position in education, occupation, and income has been shown to increase risk of lung cancer incidence by 61%, 48%, and 37%, respectively, compared with the highest socioeconomic status (SES) levels, while one meta-analysis found that lung cancer survival was associated with income but not consistently with education. Health behaviors are important factors, given that 28.8% of cancer deaths are attributable to cigarette smoking, followed by obesity (6.5%), alcohol intake (4.0%), fruit and vegetable consumption (2.7%), and physical inactivity (2.2%). Cancer mortality has been shown to be 33% higher for adults with serious psychological distress, compared to those without psychological distress.
Existing registry-based studies of survival have focused on disparities in survival of cancer patients by sex and race/ethnicity, limited measures of SES, comorbidity, stage, or area-level SES. Patient survival rates are found to be significantly lower for males and non-Hispanic blacks, compared with females and non-Hispanic whites [1,2,5,8]. During 2010-2016, the 5-year relative survival rate for all cancers combined for females was 68.5%, higher than the rate of 66.4% for males. The 5-year relative survival rate for whites was 68%, compared with 63% for blacks [1,2]. The 5-year survival for female lung cancer patients aged 66-74 years with no comorbidity was 60.3% at localized disease stage, 27.8% at regional disease stage, and 3.7% at distant disease stage, compared with 36.3%, 15.9%, and 1.8%, respectively, for those with severe comorbidity. Differences in cancer survival have also been found by socioeconomic deprivation. During 1988-1999, the 10-year survival rate for cancer patients was 41% in the most-deprived neighborhoods, compared with 60.4% for those in the least-deprived neighborhoods. However, most research on cancer patient survival uses registry-based (e.g., Surveillance, Epidemiology, and End Results [SEER]) incidence and survival data that have limited SES, nativity/immigrant status, and health-risk information.
To address this gap in research, given the highest rate of lung cancer incidence in the United States, our study examines disparities in the predicted all-cause and cancer survival rates among all cancer patients and lung cancer patients in the United States by sociodemographic, health, and behavioral characteristics, using a nationally representative longitudinal dataset with 19 years of mortality follow-up.
The data for this study are derived from the 1997-2014 cross-sectional National Health Interview Survey (NHIS) linked to the 1997-2015 cumulative National Death Index (NDI). As a nationally representative, annual cross-sectional household interview survey, NHIS provides demographic, socioeconomic, and health characteristics of the civilian, non-institutionalized population in the United States. The NHIS uses geographically clustered sampling techniques to select the sample of dwelling units. The sample is designed so that each month’s sample is nationally representative, and data are collected continuously from January to December each year. We used the NHIS Sample Adult files, which include more specific and detailed information such as cancer diagnosis or cancer type. NHIS sample size varies from year to year. In the Sample Adult files, for example, the 1997 NHIS contains 36,116 persons with an 80.4% response rate, and the 2014 NHIS contains 36,697 persons with a 58.9% response rate. The National Center for Health Statistics (NCHS) developed public-use versions of the NHIS linked with death certificate records from the NDI. For this study, we used the 1997-2015 pooled NHIS linked mortality file containing 19 years of longitudinal mortality follow-up data from the date of survey participation through December 31, 2015 [22,23].
The study sample was restricted to cancer patients aged 18 and older from the 1997 to 2014 NHIS Sample Adult files (n = 42,767, Fig. 1). Cancer patients were identified by asking respondents whether they were ever diagnosed by a doctor or other health professional as having cancer. Then, cancer patients without mortality status (n = 1,230), those without information on age at first cancer diagnosis (n = 626), and those whose recorded death month was earlier than the interview month (n = 4) were eliminated from the analysis. The final pooled eligible sample size excluding missing values in covariates was 40,291 for all cancer patients and 1,287 for lung cancer patients. Records with missing values for variables where missingness accounted for less than 1% of the study sample, namely race/ethnicity (0.01%), nativity (0.10%), education (0.43%), marital status (0.14%), housing tenure (0.12%), self-reported physical health status (0.16%), and smoking (0.66%), were excluded from the analyses.
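The exclusion steps above can be tallied in a short sketch. The counts are taken from the text; the variable names are illustrative, the three exclusion groups are assumed not to overlap, and the number of records dropped for missing covariates is inferred by subtraction rather than stated in the article.

```python
# Sample-flow arithmetic using the counts reported in the text.
initial = 42_767  # adult cancer patients in the 1997-2014 NHIS Sample Adult files

# Exclusions reported in the text (assumed here to be mutually exclusive).
exclusions = {
    "no mortality status": 1_230,
    "no age at first cancer diagnosis": 626,
    "death month earlier than interview month": 4,
}

final_reported = 40_291  # final analytic sample stated in the text

after_exclusions = initial - sum(exclusions.values())

# Inferred, not stated in the article: records dropped because a covariate
# with less than 1% missingness was missing.
dropped_missing_covariates = after_exclusions - final_reported

print("After eligibility exclusions:", after_exclusions)
print("Dropped for missing covariates (inferred):", dropped_missing_covariates)
```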
For variables with missing values accounting for more than 1% of the total observations, namely poverty status (17.52%), psychological distress (1.94%), regular physical activity (2.96%), body mass index (BMI) (2.99%), and alcohol consumption (1.45%), we created missing categories so as not to lose too many observations in the analysis. Although a knowledgeable proxy can respond for sample adults who are physically or mentally unable to respond to the survey, there might be a potential selection bias since survey respondents were healthy enough to participate in the survey and had survived a substantial period after the cancer diagnosis.
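The two missing-data rules described above (drop the record when a rarely missing variable is absent, keep the record with an explicit "Missing" category otherwise) can be sketched as follows. This is a minimal illustration, not the authors' code: the field names are hypothetical stand-ins for the NHIS variables, and `None` is assumed to mark a missing value.

```python
# Variables with < 1% missingness: records missing any of these are dropped.
DROP_IF_MISSING = (
    "race_ethnicity", "nativity", "education", "marital_status",
    "housing_tenure", "health_status", "smoking",
)

# Variables with > 1% missingness: missing values become their own category.
KEEP_AS_CATEGORY = (
    "poverty_status", "psychological_distress", "physical_activity",
    "bmi", "alcohol",
)


def prepare(records):
    """Apply both missing-data rules to a list of dict records."""
    cleaned = []
    for rec in records:
        # Rule 1: listwise deletion for rarely missing covariates.
        if any(rec.get(var) is None for var in DROP_IF_MISSING):
            continue
        row = dict(rec)
        # Rule 2: explicit "Missing" level for frequently missing covariates.
        for var in KEEP_AS_CATEGORY:
            if row.get(var) is None:
                row[var] = "Missing"
        cleaned.append(row)
    return cleaned


# Hypothetical example records.
complete = {v: "x" for v in DROP_IF_MISSING + KEEP_AS_CATEGORY}
no_poverty = dict(complete, poverty_status=None)   # kept, recoded
no_education = dict(complete, education=None)      # dropped

result = prepare([complete, no_poverty, no_education])
print(len(result), result[1]["poverty_status"])
```

Keeping a "Missing" level preserves sample size for variables such as poverty status, where dropping records would remove roughly one in six respondents.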
We examined all-cause mortality and cancer mortality (ICD-10 codes: C00-C97)-related outcomes: age-adjusted and covariate-adjusted predicted cancer-specific survival rates for all cancer patients and for lung cancer patients.
Follow-up time for individuals who died during the study period was estimated as the number of months from the age at first cancer diagnosis to the month/year of death. For individuals who were alive at the end of follow-up, it was estimated as the number of months from the age at first cancer diagnosis to December 31, 2015. For cancer patients with more than one cancer, we calculated the mean of years since first diagnosis across the multiple cancers. Since the NHIS-NDI database provides only the quarter of death, we assumed that death occurred in the middle of the quarter: February, May, August, or November.
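A minimal sketch of the follow-up calculation is given below. The mid-quarter months and the December 2015 censoring date come from the text. How age at first diagnosis was converted to a calendar start point is not described, so the sketch simply takes a start year and month as given; the function and argument names are hypothetical.

```python
# Mid-quarter month assumed for deaths, as described in the text.
MID_QUARTER_MONTH = {1: 2, 2: 5, 3: 8, 4: 11}

# End of mortality follow-up for survivors.
END_YEAR, END_MONTH = 2015, 12


def followup_months(start_year, start_month, death_year=None, death_quarter=None):
    """Months from the (assumed known) diagnosis date to death or censoring."""
    if death_year is not None:
        end_year = death_year
        end_month = MID_QUARTER_MONTH[death_quarter]
    else:
        end_year, end_month = END_YEAR, END_MONTH
    return (end_year - start_year) * 12 + (end_month - start_month)


# Hypothetical patients: one who died in Q3 2010, one alive at end of follow-up.
died = followup_months(2005, 6, death_year=2010, death_quarter=3)
alive = followup_months(2005, 6)
print(died, alive)
```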
Based on the previous literature, we selected the following determinants of health for model estimation of cancer mortality: age, sex, race/ethnicity, education, poverty status, occupation, housing tenure, marital status, nativity/immigrant status, region of residence, self-reported physical health status, psychological distress, regular physical activity, BMI, smoking status, alcohol consumption, and survey years [4-10]. Definitions of measures are provided in Table S1.
Weighted proportions of all-cause death and cancer death by each covariate were calculated for all cancer patients and for lung cancer patients, using survey weights. For the lung cancer model, a binary measure of the US-born/foreign-born variable was used and alcohol consumption was dropped. Cox proportional hazards models were used to derive predicted all-cause and cancer-specific survival rates among cancer patients, controlling for individual characteristics and year-fixed effects.
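The weighted proportions described above amount to a ratio of weighted sums. The sketch below shows only the point estimate, with hypothetical data; it does not reproduce the design-based variance estimation that a survey package would provide.

```python
def weighted_proportion(events, weights):
    """Share of total survey weight carried by respondents with the event."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w for e, w in zip(events, weights) if e) / total


# Hypothetical example: three respondents, two deaths.
deaths = [1, 0, 1]
survey_weights = [2.0, 1.0, 1.0]
prop = weighted_proportion(deaths, survey_weights)
print(prop)
```

With equal weights the result reduces to the ordinary sample proportion.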
First, we examined Cox models controlling for three different sets of covariates: 1) age and survey year (age-adjusted model); 2) age, survey year, sex, race/ethnicity, education, poverty status, occupation, housing tenure, marital status, nativity/immigrant status, and region (SES-adjusted model); and 3) all covariates in the SES-adjusted model plus self-reported health status, psychological distress, regular physical activity, BMI, smoking status, and alcohol consumption (fully-adjusted model). All Cox regression analyses with hazard ratios are reported in Tables S2 and S3.
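The nesting of the three specifications can be made explicit with a small sketch. The covariate lists follow the text; the identifiers are illustrative names, not NHIS variable names.

```python
# Model 1: age-adjusted.
AGE_ADJUSTED = ["age", "survey_year"]

# Model 2: SES-adjusted (adds sociodemographic covariates).
SES_ADJUSTED = AGE_ADJUSTED + [
    "sex", "race_ethnicity", "education", "poverty_status", "occupation",
    "housing_tenure", "marital_status", "nativity", "region",
]

# Model 3: fully-adjusted (adds health status and health behaviors).
FULLY_ADJUSTED = SES_ADJUSTED + [
    "health_status", "psychological_distress", "physical_activity",
    "bmi", "smoking_status", "alcohol_consumption",
]

MODELS = {
    "age-adjusted": AGE_ADJUSTED,
    "SES-adjusted": SES_ADJUSTED,
    "fully-adjusted": FULLY_ADJUSTED,
}

for name, covariates in MODELS.items():
    print(f"{name}: {len(covariates)} covariates")
```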
Second, the predicted survival rates were estimated for each of the sociodemographic, health, and behavioral characteristics using Stata post-estimation command,
We separately estimated and reported survival rates for each individual characteristic by all-cause and cancer-specific death, by 1-year, 5-year, and 10-year follow-up, and by age-, SES-, and fully-adjusted models for both all cancer patients and lung cancer patients. Cancer-specific survival rates and SES- and fully-adjusted model analyses are reported in Tables S4-S7.
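For readers unfamiliar with how a Cox model yields a predicted survival rate, the sketch below shows the generic relationship between a baseline survival probability and a covariate profile. It illustrates the standard Cox identity only and is not a reconstruction of the post-estimation step used in the study; the baseline value and linear predictor are hypothetical.

```python
import math


def predicted_survival(baseline_survival, linear_predictor):
    """Cox-model survival for a covariate profile: S0(t) ** exp(x'beta)."""
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical inputs: baseline 10-year survival of 0.90.
reference = predicted_survival(0.90, 0.0)           # reference profile
doubled = predicted_survival(0.90, math.log(2.0))   # hazard ratio of 2
print(round(reference, 4), round(doubled, 4))
```

A hazard ratio of 2 squares the baseline survival probability, which is why modest hazard ratios can translate into large absolute gaps when baseline survival is low, as in lung cancer.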
Our approach is different from previous studies using SEER data, in which expected survival rates are derived from life tables by SES, geography, and race developed by the SEER program [2,27,28]. For cancer death, individuals surviving beyond the follow-up period and those dying from causes other than cancer were treated as right-censored observations. The model assumes that hazard rates are a log-linear function of parameters representing the effects of covariates [29,30]. We checked the proportional hazards assumption by inspecting plots of log(-log[survival function]) against survival time t for the various covariate categories, including those for all covariates in the fully-adjusted model. These plots were found to be approximately parallel, and hence the proportionality assumption was taken to be satisfied by the data. All analyses were conducted in Stata 17 (StataCorp LLC, College Station, TX, USA), and the Cox model was fitted using the stcox procedure in Stata. The study was exempt from Institutional Review Board approval as it utilized a de-identified public use dataset.
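In standard notation (the symbols are ours, not the article's), the log-linear hazard assumption and the reason parallel log(-log) curves support proportionality can be written as follows, where h_0 and S_0 denote the baseline hazard and baseline survival function, x the covariate vector, and β the coefficient vector:

```latex
% Cox proportional hazards model: log-linear in the covariates
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Implied survival function
S(t \mid x) = S_0(t)^{\exp\left(\beta^{\top} x\right)}

% Log(-log) transform: covariates enter as a vertical shift that does not depend on t
\log\!\bigl(-\log S(t \mid x)\bigr)
  = \log\!\bigl(-\log S_0(t)\bigr) + \beta^{\top} x
```

Because the covariate term in the last line does not depend on t, the transformed curves for different covariate categories should differ only by a constant vertical offset, which is what approximately parallel plots indicate.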
Our study sample of cancer patients were more likely to be female (57.46%), non-Hispanic white (87.95%), those with a high school diploma (30.85%), those at or above 400% of the poverty level (34.35%), those with professional and managerial occupations (27.33%), homeowners (80.57%), currently married (61.61%), the US-born (93.21%), residents of the South (37.51%), those with excellent/very good/good health status (73.82%), those without psychological distress (42.31%), those with no regular physical activity (42.50%), those with BMI < 25 (36.62%), never smokers (45.29%), and current drinkers (57.40%) (Table 1). Lung cancer patients showed a similar pattern, except that they were more likely to be male (51.98%), unemployed or outside the labor force (31.82%), to report fair/poor health status (52.96%), and former smokers (64.05%). The total number of all-cause deaths and cancer deaths during the 19-year follow-up were 11,840 and 4,229 for all cancers combined, and 798 and 499 for lung cancer patients.
Lower survival was associated with racial minority status and lower SES, as measured by education, income, or occupation, among cancer patients. The estimated difference in the 10-year age-adjusted all-cause survival rate between non-Hispanic black and non-Hispanic white cancer patients was 5.92 percentage points (79.52% vs. 85.44%). The estimated 10-year age-adjusted all-cause survival rate was 81.58% for cancer patients with less than a high school education and 88.38% for cancer patients with a master’s degree or higher (Table 2). The estimated difference in the 10-year age-adjusted all-cause survival rate between cancer patients below the poverty level and those with income ≥ 400% of the federal poverty level (FPL) was 7.38 percentage points (80.63% vs. 88.01%). The 10-year age-adjusted all-cause survival rate for cancer patients with professional and managerial occupations was 89.66%, significantly higher than the survival rate for cancer patients with other occupations. The estimated 10-year age-adjusted all-cause survival rate for cancer patients who rented their homes was 82.65%, significantly lower than the survival rate of 85.80% for homeowners. The 10-year age-adjusted all-cause survival rate for currently married cancer patients was 85.42%, higher than the survival rate for divorced/separated (84.35%) or never-married (82.75%) cancer patients. The estimated 10-year age-adjusted all-cause survival rate for US-born cancer patients was 84.98%, lower than the survival rate for foreign-born cancer patients residing in the US for 15 years or longer (85.94%). The gap in cancer survival narrowed with additional sociodemographic, health, or behavioral adjustment (Table S4).
Lower survival was consistently associated with poor physical and mental health status. The estimated 10-year age-adjusted all-cause survival rate for cancer patients with fair or poor health status was 77.92%, significantly lower than the survival rate of 87.99% for cancer patients with excellent/very good/good health status. The estimated 10-year age-adjusted all-cause survival rate for cancer patients with serious psychological distress was 79.57%, significantly lower than the survival rate of 86.71% for cancer patients with no psychological distress.
The estimated 10-year age-adjusted all-cause survival rate for cancer patients with regular physical activity was 90.18%, significantly higher than the survival rate for cancer patients without regular physical activity (83.24%) or those with less than regular physical activity (87.92%). The 10-year age-adjusted all-cause survival rate for cancer patients with BMI < 25 was 84.49%, significantly higher than the survival rate for cancer patients with severe obesity (BMI ≥ 40; 82.37%) but lower than the survival rate for cancer patients with BMI 25 to 29 (85.93%) or BMI 30 to 39 (84.83%). Among cancer patients, the 10-year age-adjusted all-cause survival rate for non-smokers was 88.31%, substantially higher than the survival rate for former smokers (84.48%) or current smokers (78.92%). Among cancer patients, the 10-year age-adjusted all-cause survival rate for lifetime abstainers was 85.10%, significantly higher than the survival rate for former drinkers (81.78%) but lower than the survival rate for current drinkers (86.49%). As with all-cause survival rates, lower cancer-specific survival rates were associated with lower SES, poor health status, and health-risk behaviors (Table S6).
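The 10-year age-adjusted all-cause survival rates reported in the preceding paragraphs can be placed on a common footing by computing each contrast as a percentage-point gap. The rates below are copied from the text; the labels and the choice of contrasts are ours.

```python
# (higher-survival group rate, lower-survival group rate), in percent.
RATES = {
    "race/ethnicity: NH white vs. NH black": (85.44, 79.52),
    "education: master's or higher vs. < high school": (88.38, 81.58),
    "poverty: >= 400% FPL vs. below poverty": (88.01, 80.63),
    "housing tenure: owner vs. renter": (85.80, 82.65),
    "health status: excellent/very good/good vs. fair/poor": (87.99, 77.92),
    "psychological distress: none vs. serious": (86.71, 79.57),
    "physical activity: regular vs. none": (90.18, 83.24),
    "smoking: non-smoker vs. current smoker": (88.31, 78.92),
}


def gap_pp(higher, lower):
    """Absolute gap in percentage points, rounded to two decimals."""
    return round(higher - lower, 2)


GAPS = {label: gap_pp(*pair) for label, pair in RATES.items()}

for label, gap in GAPS.items():
    print(f"{label}: {gap:.2f} percentage points")
```

These are absolute differences between predicted rates, which is the comparison across covariate categories that the results emphasize.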
The covariate-adjusted survival rates for lung cancer were much lower than the survival rates for all cancers combined, but the pattern of association between individual characteristics and survivorship was consistent with all-cancer survival rates (Table 3, Table S5). The estimated 10-year age-adjusted all-cause survival rate for lung cancer patients with less than a high school education was 46.67%, substantially lower than the survival rate of 57.51% for lung cancer patients with a master’s degree or more (Table 3). The 10-year age-adjusted all-cause survival rate for lung cancer patients below the poverty level was 50.19%, significantly lower than the survival rate of 56.37% for lung cancer patients with income at or above 400% FPL. The 10-year age-adjusted all-cause survival rate for lung cancer patients with professional and managerial occupations was 59.22%, significantly higher than the survival rate of 51.43% for the unemployed. Differences in all-cause survival rates for lung cancer patients by housing tenure, marital status, and nativity status were not statistically significant.
The estimated 10-year age-adjusted all-cause survival rate for lung cancer patients with fair or poor health status was 42.38%, significantly lower than the survival rate of 63.26% for lung cancer patients with excellent/very good/good health status. The estimated 1-year age-adjusted all-cause survival rate for lung cancer patients with serious psychological distress was 40.96%, significantly lower than the survival rate of 58.93% for lung cancer patients with no psychological distress.
The estimated 10-year age-adjusted all-cause survival rate for lung cancer patients with regular physical activity was 67.10%, significantly higher than the survival rate for lung cancer patients without regular physical activity (48.36%) or those with less than regular physical activity (61.57%). The estimated 10-year age-adjusted all-cause survival rate for lung cancer patients with BMI < 25 was 49.88%, lower than the survival rate for lung cancer patients with BMI 25 to 29 (54.45%), BMI 30 to 39 (59.75%), or BMI ≥ 40 (64.33%). Among lung cancer patients, the estimated 10-year age-adjusted all-cause survival rate for non-smokers was 65.00%, significantly higher than the survival rate for former smokers (53.52%) or current smokers (44.79%). We found a similar pattern of association between cancer-specific survival rates and sociodemographic, health, and behavioral characteristics for lung cancer patients (Table S7).
In this large prospective study of 40,291 US cancer patients using a relatively long mortality follow-up of 19 years, we found marked and consistent gradients in age-adjusted all-cause and cancer-specific survival rates by sociodemographic, health, and behavioral characteristics. Even after controlling for socioeconomic and demographic covariates, health-risk behaviors, and health status, significant disparities in survival rates existed but narrowed for the cancer patients. Computation of predicted survival rates for each characteristic, adjusting for other sociodemographic, health, and behavioral characteristics, is a particularly novel feature of our study.
Our findings on survival gaps by sex, race/ethnicity, education, and income were consistent with previous studies in terms of higher survival rates for females, whites, individuals with high education level, high income, or insurance, married individuals, homeowners, or those in Western states, compared with their counterparts [1,2,11-14]. However, in terms of magnitude, we found higher survival rates since we estimated survival rates for prototypical individuals who are likely to have higher survival rates than others given their better SES and health-risk profiles. Our estimates are not directly comparable with the average survival rates for cancer patients, but we recommend interpreting the results by comparing across covariate categories and focusing on the gap in survival, such as between homeowners and renters.
Our study contributes to the existing literature on social determinants of health and cancer health disparities by estimating survival rates by various individual-level sociodemographic, behavioral, and health status factors among cancer patients, which have not been well studied in prior studies. For example, estimating cancer survivorship associated with self-assessed health status, psychological distress, smoking, and BMI is especially new and shows the significance of these factors in determining cancer survival. Showing the large gap in all-cause survival between non-smokers and current smokers and between cancer patients with regular physical activity and those with no regular physical activity, our study findings provided the evidence of smoking status and physical activity as important factors in all-cause survival for cancer patients. Public health programs and policy interventions such as physical activity and tobacco control/cessation campaigns among cancer patients might be effective measures to improve survival rates.
Persistence of marked disparities in cancer survivorship among racial/ethnic and SES groups might reflect healthcare access and treatment disparities and shows the potential for further improvements in cancer survival [5,33]. Future research is needed to examine differentials in survival rates by neighborhood SES or built environmental factors, levels of urbanization/rurality [10,33,34], social supports, or quality of health care [36-38]. Although we focused on individual-level SES factors due to data availability, area-based SES has been strongly associated with cancer survival [10,33,34]. Cancer patients in neighborhoods with lower SES have markedly lower survival rates than those in higher SES neighborhoods [10,33,34]. Furthermore, it would be important to determine if similar individual patient-level socio-behavioral inequalities in survival exist for other major cancer sites such as colorectal, prostate, breast, and cervical cancer. Social isolation has been shown to increase risk of all-cause mortality by 66% and breast cancer mortality by 114%. Considering that our study found significant disparities in survival by psychological distress among cancer patients, it may be worthwhile to examine the effect of mediators, such as social networks or supports, that could alleviate the association between psychological distress and cancer survival. High quality cancer care is also important in improving survivorship among cancer patients. Stratifying follow-up care by cancer patients’ risk, such as supported self-management for patients at low risk, shared care for those at moderate risk, and complex case management for those at high risk, will help improve quality of care [37,38]. Using patient-reported outcome measures integrated into eHealth platforms will help stratify risk among patients [37,38].
Based on risk-stratification, various healthcare professionals, including survivorship clinics, psychosocial and medical experts, and consultants, need to collaborate at all points of the cancer care pathways from cancer diagnosis through end of life care [36-38].
This study has limitations. First, our study only contains the NHIS sample eligible for linkage to the NDI. Excluding samples ineligible for linkage may lead to biased mortality or survival estimates. To address this bias, we used the adjusted original sampling weight to account for the NHIS-NDI mismatches. Second, our findings may be affected by omitted-variable bias. While our Cox regression models controlled for self-reported BMI, fair/poor health status, and psychological distress, there could be other potential confounders among cancer-related information, such as site, stage at diagnosis, and first course of treatment, none of which are available in the NHIS. Third, since the NHIS excludes the institutionalized population, cancer survivors participating in NHIS might be more likely to have been diagnosed with earlier stage, more treatable, and less fatal cancers than cancer survivors as a whole, which would have led to overestimated survival time. Our survey analyses are not generalizable to all cancer patients due to this potential selection bias. Finally, all the covariates in the NHIS-NDI database were time-fixed at the baseline as of the survey date. Several of the covariates such as SES, health status, behavioral risk factors, and psychological distress could have varied over the long mortality follow-up period of 19 years, which would have influenced their estimated impact on cancer mortality and survivorship. In particular, mental health-related indicators and subjective health status can be severely affected by reverse causation, since these indicators tend to be worse at more advanced stages. Future studies need to examine the association using longitudinal data on individual sociodemographic and health-related characteristics.
In a nationally representative study, we found marked and consistent gradients in age-adjusted all-cause and cancer survival rates by socioeconomic, demographic, health, and behavioral characteristics. The disparities in survival rates for the cancer patients narrowed with the additional covariate adjustment. Smoking status and physical activity were the most important risk factors influencing all-cause survival among cancer patients, while self-reported health status and psychological distress showed the largest differentials in cancer-specific survival.
No potential conflicts of interest were disclosed.