Abstract
Background The transition from traditional office work to telework has accelerated significantly since the late 20th century, especially in light of the COVID-19 pandemic. Despite its widespread adoption, the long-term health impacts of telework remain unclear. This study seeks to clarify the telework–health relationship by integrating longitudinal self-reported health data with health-related administrative records.
Methods and analysis An online self-reported longitudinal survey with four follow-ups of 6 months each, starting in November 2024, will be set up and linked with administrative data sources. In total, a non-probabilistic sample of 5000 non-teleworkers and teleworkers will be recruited. This survey will mainly assess the effect of teleworking on mental (eg, depression and anxiety) and physical (eg, pain) health. Administrative data (eg, healthcare consumption and socioeconomic status) will be extracted from Belgian administrative data sources (Statistics Belgium and the InterMutualistic Agency) for the same period. These administrative data will be linked to the survey data using the Social Security ID. The underlying relationships between telework and health will be analysed via regression models and mediation models embedded in the natural effects framework. The analysis will aim to (1) identify the impact of telework on self-reported health and administrative data, (2) identify the moderators and mediators in the telework–health relationship, (3) understand the long-term patterns of telework and health interaction and (4) predict the health outcomes of teleworkers. To mitigate biases associated with non-probabilistic samples and attrition, standardised inverse probability weights will be derived from the data.
Ethics and dissemination This study involves human participants and has been approved by the Ethics Committee of Universitair Ziekenhuis Gent (Nr°. ONZ-2023–0630). The participants will participate in the study after signing an informed consent form. The study will be disseminated in academic journals, on (social) media and on the project website.
- EPIDEMIOLOGY
- PUBLIC HEALTH
- Surveys and Questionnaires
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
STRENGTHS AND LIMITATIONS OF THIS STUDY
The administrative data will allow us to enrich self-reported measures and explore administrative markers of health (eg, healthcare consumption).
The longitudinal character of this study will allow us to investigate the link between telework and health over time.
The study is based on a non-probabilistic sample.
Participants may feel reticent to share their Social Security ID, leading to an unpredictable sample size for the data linkage part of the study.
Introduction
In the past years, the bullpen office has transitioned to hybrid forms of working, moving the traditional workplace beyond the companies’ walls. According to the European Framework Agreement of Telework, telework is defined as “a form of organizing and/or performing work, using information technology, in the context of an employment contract/relationship, where work that could be performed at the employer’s premises is carried out away from those premises on a regular basis”.1 This definition highlights that telework alters the spatial (eg, public spaces, train, home) and temporal (eg, occasional, full-time) dimensions of work. Consequently, these changes require the use of information and communication technologies, such as laptops and smartphones, which modify the way employees engage with their job.2
Telework gained momentum following the COVID-19 pandemic, particularly in Belgium, where the share of the population working from home increased from 18.9% in 2019 to 32.2% in 2023.3 Telework rates vary by gender, job position and occupational sector in Belgium. For instance, 29.9% of men and 34.7% of women work from home. Furthermore, 64.1% of managers work from home, along with 58.4% of intellectual, scientific and artistic employees and 41.3% of administrative personnel. Moreover, the financial and insurance sector has the highest work-from-home rate at 68.4%, compared with only 9.5% in the mining and quarrying sector.4 Given the rise of telework in the past years, researchers and organisations have started to ask if, how and why telework affects health.
The job demands-resources (JDR) framework can help us analyse how telework may affect health and the pertinent mediators and moderators in this relationship. Here, the job characteristics can be classified into two categories: job demands and job resources. Job demands are aspects of an occupation that have a physical and/or a psychological impact (eg, work-life conflict, sedentary behaviour), whereas job resources are the aspects that assist in achieving work goals (eg, social support).5 Studies have indicated that on a longitudinal basis, job demands are associated with negative health outcomes, and job resources are associated with positive health outcomes, such as work engagement.6 The effort-recovery model further elaborates on the underlying patterns that produce negative health outcomes. Here, it is thought that work activates psychophysiological systems, which then return to baseline during a recovery time (eg, a break, vacation). However, returning to baseline does not occur when these systems are constantly active (demands are high), which may lead to chronic stress reactions.7 Job characteristics can thus be thought to be mediators in the telework–health relationship.
Studies have suggested that telework modifies job characteristics, thereby explaining the link between telework and health. Blank et al. (2023) synthesised 96 records in an evidence-based model.8 Similarly to the JDR framework, the model indicates that work factors (eg, workload) have an influence on the telework–health relationship.8 For example, Sardeshmukh et al. (2012) investigated whether job resources (social support, autonomy and feedback) and job demands (time pressure, role ambiguity and role conflict) mediate the relationship between the extent of telework and health outcomes. In their mediation analysis, the researchers suggested that telework modifies the job characteristics, which in turn affect exhaustion and engagement.9 Similarly, Hornung and Glaser (2009) found that autonomy and work-life conflict mediate the relationship between the extent of telework and quality of life.10
In contrast, Vander Elst et al. (2017) investigated the relationship of telework and job characteristics with emotional exhaustion, cynicism, work engagement and cognitive stress complaints. Here, the authors investigated whether the health changes are attributable to the extent of telework, to job characteristics (social support, decision-making, autonomy and work-life conflict) or to both. In their study, they did not find an indirect link between telework and health, with the exception of an indirect effect through social support. They further emphasised that the job characteristics, rather than the extent of telework, predict work-related well-being. For instance, task autonomy (job resource) was found to be negatively related to emotional exhaustion,11 as expected from the JDR framework. Although studies do not agree on whether telework affects well-being and physical and mental health, they do agree that job characteristics are essential to understand this relation. Here, we hypothesise that job characteristics are mediators in the telework–health relationship.
Job characteristics might provide insights into why telework affects health. Nonetheless, it is also important to investigate individual (eg, household composition, gender) and organisational (eg, occupational sector) characteristics. Those characteristics can be thought to be moderators in the telework–health relationship. Let us analyse the example of gender. Traditionally, women perform more unpaid work (eg, caregiving activities) than men.12 Hilbrecht et al. (2008) qualitatively investigated women’s experiences while teleworking. Interestingly, women adapt their schedules to fit their caregiving activities by either working faster or fragmenting their jobs, which may lead to less leisure time.13 Similarly, work–family conflict and family–work conflict are more often reported among women than among men.14 The blurring of boundaries between the home and work environment allows minimal recovery from work inside spaces that are supposed to serve as recovery (either inside or outside the home),7 which may increase the risk of burnout.2 Individual characteristics (eg, gender) may therefore modify the relationship between telework and health. To the knowledge of the authors, multiple moderators have not been systematically addressed by the current body of research. We therefore hypothesise that individual and organisational characteristics, such as gender, occupation and sector, modify the relationship between telework and health.
In summary, we hypothesise that the extent of telework modifies the job resources and demands, which in turn impact the health outcomes, a process that is called mediation. Moreover, the effect of telework on health can be modified by individual characteristics (eg, gender, occupation), a process called moderation (or effect modification). As outlined in figure 1, the conceptual framework illustrates how job demands and resources mediate the telework–health relationship and how other factors modify this effect.
Figure 1. Conceptual model of the telework–health relationship. Illustration of the hypothesised relationship between telework and health, with the moderators and mediators.
Much of the current research on telework was conducted during COVID-19, which calls into question its applicability in postpandemic times. Moreover, many studies either have a small sample size or rely on cross-sectional data, which limits inferences about the telework–health relationship. To address those limitations, this study will implement a longitudinal design with a data linkage, to obtain a more complete picture of why and for whom telework impacts health.
The primary goal of this study is to identify the causal longitudinal patterns of the telework–health relationship, by drawing from the existing occupational health models and focusing on health outcomes. In this paper, a longitudinal questionnaire (four questionnaires, one every 6 months) with an administrative data linkage methodology will be described. A data linkage is a powerful tool to (1) enrich current data sources and (2) obtain administrative health outcomes that are otherwise not available (eg, healthcare consumption). To the authors’ knowledge, Denzer and Grunau (2023) is the only study to have done a data linkage in the context of telework.15 However, the present study will be the first to examine administrative health markers (eg, healthcare consumption).
Here, we will employ mediation analysis under the causal inference framework to investigate factors that influence the relationship between telework and health.
This study aims to:
Assess the direct associations of telework with health outcomes.
Identify the (work and health) moderators and mediators that connect telework and health.
Unravel the longitudinal patterns of the telework–health relationship.
Based on those aims, we hypothesise that:
H1. Telework affects the quality of life, pain, burnout, work engagement, anxiety, depression, healthcare costs and healthcare contacts directly.
H2. Telework affects health outcomes (quality of life, pain, burnout, work engagement, anxiety, depression, health expenditures and health contacts) through work characteristics (autonomy, social support, time pressure and overtime) and health characteristics (physical activity and sedentarism).
H3. The relationship between telework and health is modified by individual and work characteristics such as gender, occupation, education and income.
This study will map the contexts in which telework is harmful or beneficial by leveraging enriched survey data to identify key factors influencing employee health. The results might provide actionable insights to inform the public, organisations and policymakers on how to optimise teleworkers’ environment for improved health outcomes.
Methodology
Study population
In total, we aim to include a convenience sample of 5000 participants. The study participants need to have a job that can be performed somewhere other than the office (ie, ‘teleworkable’). This includes people who telework as well as people who could do their job remotely but are not allowed to telework. Drawing from the report of Sostero et al. (2020) on teleworkability, this is evaluated by the self-rated physical and computer use of a job.16 To participate in the study, the following inclusion criteria will be applied:
18 years of age or older.
Performing a paid professional activity.
Being a screen worker.
Using information and communication technology to work (eg, laptops, telephone, etc).
The work can be performed in a place other than an office.
Working in the same company for at least 6 months.
Working in Belgium (at least 6 months before the start of the study).
Living in Belgium (at least 6 months before the start of the study).
Affiliated to a Belgian intermutualistic agency (at least 6 months before the start of the study).
The following exclusion criteria are applied:
Full-time students.
The work cannot be performed outside the walls of a company (eg, veterinarians, waiters).
The recruitment of the participating companies will be done by occupational health services. The companies will need to meet the following criteria:
Having a physical location in Belgium.
Having employees who perform a screen-based job.
Having known teleworkable jobs.
Survey development and design
To measure the long-term effects of telework on health, self-reported health measures will be collected longitudinally through an online questionnaire at four different time points: at 0 (wave 1), 6 (wave 2), 12 (wave 3), and 18 (wave 4) months. A follow-up of every 6 months has been chosen since that is the minimum period to perform a data linkage (figure 2).
Figure 2. General study timeline. Estimated timelines for the study, from the initiation to the completion of the project.
The main constructs that will be assessed in the questionnaire are general, mental and physical health, along with demographics and occupational context variables (table 1). The survey will be developed based on 12 existing questionnaires and two self-constructed questionnaires. The justification for the inclusion of every variable, based on the current literature, has been added to online supplemental material 2.
Table 1. Overview of the content in the questionnaire. Summary of the main constructs measured in the questionnaires used for the study, including references to the validated questionnaires used.
The questionnaire will be available in English, Dutch and French. The questionnaire will be validated on translation and content. Validated translations of the questionnaires will be used if available. The questionnaires with no validated translation and the self-constructed questions will be linguistically validated through back-translation. In short, two independent translators will translate the original version (English) into the target versions; the translations will be compared with the original version, and a third person will translate the documents back to English. In the end, the back-translated and original versions will be compared with each other.17 Moreover, the face validity of the questionnaire will be assessed by experts in the field (statisticians, occupational health researchers and psychologists).
To increase the response rate of the participants, four reminders, once every week on the first workday of the week, will be sent to the participants. In case a participant misses one wave, the participant will be able to participate in the next wave.
Telework exposure
Telework is rarely an all-or-nothing practice.18 In this study, telework exposure will be quantified as the number of teleworked hours per month. This is derived from the number of hours a participant teleworks per week, multiplied by the average number of weeks per month (52 weeks/12 months≈4.33), and can be related to the number of days worked per month. Composite scores on the extent of telework can then be derived as follows:
For example, if a participant reports teleworking 10 hours per week and working 20 days per month, this translates to 43.33 hours of telework per month (10×52/12≈43.33). To determine the average telework per workday, we divide the monthly telework hours by the number of working days in the month, resulting in an average of 2.17 teleworked hours per workday (43.33÷20≈2.17).
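The worked example above can be reproduced with a short calculation. This is only a minimal sketch: the function and variable names are illustrative assumptions, not part of the study's analysis code.

```python
# Average number of weeks per month, as used in the text (52 weeks / 12 months).
WEEKS_PER_MONTH = 52 / 12


def monthly_telework_hours(weekly_telework_hours: float) -> float:
    """Convert self-reported weekly telework hours into hours per month."""
    return weekly_telework_hours * WEEKS_PER_MONTH


def telework_hours_per_workday(weekly_telework_hours: float,
                               workdays_per_month: float) -> float:
    """Average teleworked hours per workday (hypothetical helper)."""
    if workdays_per_month <= 0:
        raise ValueError("workdays_per_month must be positive")
    return monthly_telework_hours(weekly_telework_hours) / workdays_per_month


# Participant teleworking 10 hours per week and working 20 days per month.
print(round(monthly_telework_hours(10), 2))          # 43.33
print(round(telework_hours_per_workday(10, 20), 2))  # 2.17
```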
Administrative data linkage
To enrich the self-reported survey data, the data of participants will be linked with administrative data based on their Social Security ID (NISS/INSZ). Administrative data sources are especially interesting, given that (1) they enrich self-reported data, (2) they decrease participants’ survey burden and (3) they allow us to explore the associations of telework through variables that are not measured in a reliable way on self-reported questionnaires. Here, a subsample from the longitudinal questionnaire will be taken for the data linkage.
The data linkage will be done every 6 months, between every data wave.
Data variables
The following administrative data sources will be used: InterMutualistic Agency (IMA) and Statistics Belgium (Statbel). Table 2 shows the variables that will be used during the data linkage.
Table 2. Administrative data linkage variables. Overview of the administrative variables coming from Statbel and the InterMutualistic Agency (IMA) planned for the data linkage.
IMA is a platform that collects health-related data from the seven healthcare insurance funds in Belgium to support policymakers and researchers to improve the Belgian healthcare system. The data is gathered on an individual level for all the citizens who are attached to a healthcare insurance fund, which is compulsory for all Belgian residents.19 This data source will be used to extract information about healthcare contacts, reimbursement status, direct healthcare costs and pharmaceutical cost group. A limitation of this database is the delay from the data input to the data availability, which usually takes up to 1–2 years.
Statbel is the Federal statistics agency of Belgium that gathers representative administrative data of the Belgian population. This agency collects data on household composition, income, job situation, energy consumption and other sociodemographic variables for all the Belgian citizens who belong to the Belgian citizen register.20 The variables for this study will entail the level of income in deciles, household composition based on the LIPRO typology (a household composition category),21 occupational codes (International Standard Classification of Occupations 2008, with eight major occupation groups based on skill level and skill specialisation) and the International Standard Classification of Education (classification in eight levels of education).
Data linkage process
To safeguard the participants’ privacy, a data flow will be set up between the occupational health service, the researchers, the administrative data sources and a trusted third party (TTP). The data flow will depict the responsibilities of every data provider and processor, including information on the anonymisation of the data and the access of the different parties to the data. As described in De Pauw et al. (2023),22 the following steps will be followed:
The occupational health service will assign a participant ID, and they will collect the following personal data: the NISS/INSZ-codes, email and informed consent form (ICF) of the participants.
The longitudinal questionnaire data will be collected by the researchers, without personal data.
The TTP will link the administrative data from the administrative data sources and the longitudinal data from the researchers by using the NISS/INSZ code and participant ID received from the occupational health service.
The longitudinal data linked with the administrative data will be hosted on a secured server.
(Online supplemental material 3) shows the data linkage process.
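The separation of roles in the steps above can be sketched as a key-based join. This is an illustrative assumption rather than the actual linkage procedure: all identifiers, field names and values are invented, and the real process is governed by the data flow agreed with the TTP.

```python
def link_records(id_key, survey_rows, admin_by_niss):
    """Join survey rows to administrative rows via the participant ID -> NISS key.

    id_key:        participant ID -> NISS/INSZ code (held by the occupational health service)
    survey_rows:   questionnaire rows keyed by participant ID (held by the researchers)
    admin_by_niss: NISS/INSZ code -> administrative variables (held by the data sources)
    The NISS/INSZ code is used only for matching and is never written to the output.
    """
    linked = []
    for row in survey_rows:
        niss = id_key.get(row["participant_id"])
        admin = admin_by_niss.get(niss)
        if admin is None:
            continue  # no linkage consent or no administrative match
        linked.append({**row, **admin})
    return linked


# Hypothetical toy data (not real identifiers or study variables).
id_key = {"P001": "NISS-A", "P002": "NISS-B"}
survey_rows = [
    {"participant_id": "P001", "wave": 1, "telework_hours_month": 43.33},
    {"participant_id": "P002", "wave": 1, "telework_hours_month": 0.0},
    {"participant_id": "P003", "wave": 1, "telework_hours_month": 86.67},
]
admin_by_niss = {"NISS-A": {"gp_contacts": 2, "income_decile": 7}}

linked = link_records(id_key, survey_rows, admin_by_niss)
print(linked)
```

In this toy example only the participant with both a key entry and an administrative record ends up in the linked dataset, which mirrors why the size of the linked subsample is hard to predict in advance.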
Legal framework and privacy
In accordance with the General Data Protection Regulation (GDPR) (EU) 2016/679 of 27 April 2016 and the Belgian law of 30 July 2018, the data will be protected, and the participants have the right to access their data. The legal basis of this data processing will be based on the explicit consent of the participants according to Article 6, paragraph 1 (a) and is necessary for the purpose of scientific research according to Article 9, paragraph 2 (j) of the GDPR. Moreover, a data protection impact assessment will be conducted by the data controller. This assessment analyses risks in large-scale data processing activity and it aims to address privacy-related operations.23 Furthermore, approval will be sought from the Information Safety Council (ISC), which is the main data governance body in Belgium that approves similar data linkage requests. Following the ISC, a small-cell risk analysis will be evaluated by one of the data providers.
Individual responses will not be sent to the company of the participants.
Statistical data analysis
Sample size determination
A non-probabilistic sample of 5000 employees will be recruited. The sample size calculations are performed assuming that 40% of the Belgian employees who are able to perform teleworking activities also engage in telework. This is further based on a pooled OR of 1.2 for the association between telework and mental and physical health.24 Taking into consideration a nominal significance level of 5% and a statistical power of 80%,25–27 this yielded a sample size of 3870 participants, which has been scaled up to 5000 to counteract potential attrition and to compensate for the uncertainty in the literature-derived estimates (± 25%). The expected attrition rate is based on previous literature. For Belgium, an attrition rate of 21% was found for the Influenzanet study in self-administered questionnaires.28 In high-income countries, this was found to be 26.3% for face-to-face household surveys.29
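To make the inputs of such a calculation concrete, the sketch below uses a Wald-type approximation on the log odds ratio. It is not the authors' calculation: the formula is a swapped-in textbook approximation, the baseline outcome prevalence is a hypothetical input that the protocol does not report, and the result is therefore not expected to reproduce the figure of 3870 exactly.

```python
from math import ceil, log
from statistics import NormalDist


def n_for_odds_ratio(odds_ratio, exposed_share, baseline_prevalence,
                     alpha=0.05, power=0.80):
    """Approximate total sample size to detect an odds ratio (Wald test on log OR).

    baseline_prevalence is the outcome prevalence among the unexposed; it is an
    assumption of this sketch and is not stated in the protocol.
    """
    std_normal = NormalDist()
    z_total = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    odds_exposed = odds_ratio * baseline_prevalence / (1 - baseline_prevalence)
    p_exposed = odds_exposed / (1 + odds_exposed)
    # Variance of the log odds ratio for a total sample of size one.
    unit_variance = (
        1 / (exposed_share * p_exposed * (1 - p_exposed))
        + 1 / ((1 - exposed_share) * baseline_prevalence * (1 - baseline_prevalence))
    )
    return ceil(z_total ** 2 * unit_variance / log(odds_ratio) ** 2)


def inflate_for_attrition(n, attrition_rate):
    """Number to recruit so that about n participants remain after attrition."""
    return ceil(n / (1 - attrition_rate))


# OR of 1.2 and 40% teleworkers as in the text; 50% baseline prevalence is hypothetical.
print(n_for_odds_ratio(1.2, 0.40, 0.50))
# Inflating the reported 3870 for the 21% attrition found in the Influenzanet study.
print(inflate_for_attrition(3870, 0.21))  # 4899
```

Under these assumptions the required sample is of the same order as the reported figure, and inflating 3870 for 21% attrition gives a recruitment target close to the planned 5000.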
To derive population estimates and to account for the selection bias intrinsic to non-probabilistic samples and for non-response, standardised inverse probability weighting will be applied.30 This method helps to make groups within the sample comparable. Demographic variables such as sex, age and education will be used to derive those weights, similar to Höfler et al. (2005).31
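A minimal sketch of how such weights could be derived is shown below, assuming a simple cell-based approach in which each participant is weighted by the ratio of the population share to the sample share of their demographic cell. The protocol does not specify the estimator (a model-based propensity score is equally possible), and the cells and population shares here are invented for illustration.

```python
from collections import Counter


def standardised_weights(sample_cells, population_shares):
    """Cell-based inverse probability weights, rescaled to have mean 1.

    sample_cells:      demographic cell of each participant (eg, a sex-by-age label)
    population_shares: cell -> share of that cell in the target population
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    raw = [population_shares[cell] / (counts[cell] / n) for cell in sample_cells]
    mean_raw = sum(raw) / n
    return [weight / mean_raw for weight in raw]


# Hypothetical sample in which women are over-represented (3 of 4 participants).
cells = ["F", "F", "F", "M"]
weights = standardised_weights(cells, {"F": 0.5, "M": 0.5})
print([round(weight, 3) for weight in weights])
```

Rescaling to a mean of 1 keeps the weighted sample size equal to the observed sample size, which is one common meaning of "standardised" in this context.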
Descriptive data and missing data
Demographics of the sample will be reported, for example, age, gender, occupation and telework frequency. The proportion of missing values will also be reported. Furthermore, available national norm values of the validated questionnaires will be compared against the sample responses.
Missing data will be handled through multiple imputation under the missing at random (MAR) assumption, more specifically through the Fully Conditional Specification (FCS) approach as implemented in the mice package.32 33 The number of imputations and iterations performed per dataset will be reported. Furthermore, the effect sizes and the bootstrapped standard errors will be pooled across the different imputations based on Rubin’s rules.34 Complete case analysis will be compared against the FCS approach to explore the differences in the point estimates.
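The pooling step can be illustrated with Rubin's rules for a single coefficient. This sketch uses Python for illustration only (the study itself plans to use the mice package in R), and the estimates and variances are invented numbers.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its total variance
    T = W + (1 + 1/m) * B, where W is the mean within-imputation variance
    and B is the between-imputation variance of the estimates.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled_estimate = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance with m - 1 in the denominator
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Hypothetical coefficient estimated in three imputed datasets.
estimate, total_variance = pool_rubin([0.10, 0.20, 0.30], [0.04, 0.04, 0.04])
print(round(estimate, 3), round(total_variance, 4))
```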
To assess whether demographic characteristics influence the response rate, respondents and non-respondents will be compared using a generalised linear regression model.
Statistical inference
The main goal of the statistical analyses is to investigate the causal relationships between telework and health.
A total of three main datasets will be analysed in this study: a cross-sectional dataset (wave 1), a longitudinal self-reported dataset (wave 1 to wave 4) and an administrative-linked dataset. The association between telework and health will be assessed using a causal inference framework.35 To this end, a regression model will be constructed for each of the included outcome variables with telework as the exposure variable, using appropriate link functions. To evaluate variables that act as moderators and mediators in the telework–health relationship, mediation and moderation analyses will be done under the counterfactual framework, given the derived regression models. Natural effect models will be used as they simplify the interpretation of model estimates in nested models as described by Lange et al.36 Given that mediation analysis under the causal framework departs from the assumption that either the exposure is randomised35 or all the confounders have been measured (sequential ignorability assumption),37 sensitivity analyses will be reported along with the mediation analyses to understand the impact of unmeasured confounders on the data. To quantify that effect, the e-value approach as elaborated by VanderWeele and Ding can be applied.38
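The E-value mentioned above has a closed form for an estimate on the risk ratio scale, which the following sketch computes. Applying it to an odds ratio, as in the example, is an assumption that is only reasonable when the outcome is rare; the function name is illustrative.

```python
from math import sqrt


def e_value(risk_ratio: float) -> float:
    """E-value for a point estimate on the risk ratio scale.

    Estimates below 1 are inverted first, so the result is always >= 1.
    """
    if risk_ratio <= 0:
        raise ValueError("risk_ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + sqrt(rr * (rr - 1))


# Effect size of 1.2, treated here as a risk ratio (assumption).
print(round(e_value(1.2), 2))  # 1.69
```

An E-value of about 1.69 would mean that an unmeasured confounder associated with both telework and the outcome by a risk ratio of roughly 1.69 each could be enough to explain away an observed association of 1.2.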
Moderation will be added as an interaction effect within the mediation-moderation models. Exploratively, moderated-mediation models will be fitted. This is especially interesting in the context of moderation by gender in the telework–mediator relationship. Diverse R-packages are available to conduct mediation and moderation such as medflex39 and mediation.40 Variables will also be tested on their role as mediators or moderators in the telework–health relationship. The different outcomes, mediators and moderators are shown in table 3.
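For reference, the decomposition that natural effect models target can be written as follows. This is the standard counterfactual formulation on the difference scale, given here as a sketch: $a$ and $a^{*}$ denote two levels of telework exposure, $M(a)$ the mediator under exposure $a$, and $Y(a, m)$ the outcome under exposure $a$ and mediator value $m$; with non-identity link functions the same decomposition holds on the scale of the link.

```latex
\begin{align}
\mathrm{NDE} &= E\left[Y\bigl(a, M(a^{*})\bigr)\right] - E\left[Y\bigl(a^{*}, M(a^{*})\bigr)\right] \\
\mathrm{NIE} &= E\left[Y\bigl(a, M(a)\bigr)\right] - E\left[Y\bigl(a, M(a^{*})\bigr)\right] \\
\mathrm{TE}  &= \mathrm{NDE} + \mathrm{NIE}
              = E\left[Y\bigl(a, M(a)\bigr)\right] - E\left[Y\bigl(a^{*}, M(a^{*})\bigr)\right]
\end{align}
```

The natural direct effect (NDE) changes the exposure while holding the mediator at its value under $a^{*}$, and the natural indirect effect (NIE) changes only the mediator, so that the two sum to the total effect (TE).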
Table 3. Overview of the outcomes, mediators and moderators derived from the survey. Summary of the key constructs from the survey, organised by outcomes, mediators and moderators.
Longitudinal patterns of telework and the health outcomes will be explored using generalised linear mixed models and generalised structural equation models, with time and organisation as random effects. Furthermore, to investigate the long-term role of the mediators and moderators in the telework–health relationship, structural equation models will be fitted to assess whether the latent constructs align with the conceptual model. The R package lavaan is available for this purpose.41 Figure 3 provides an overview of the planned statistical approach for the longitudinal analysis.
Figure 3. Overview of the longitudinal analyses of the telehealth protocol. Summary of the longitudinal analyses in the telehealth protocol. IPW, inverse probability weighting.
All the analyses will be performed in R.42
Data management
The self-reported longitudinal data will be collected in the Research Electronic Data Capture (REDCap) platform, which will be hosted on the Sciensano secured server. REDCap is a secure, web-based software platform designed to support data capture for research studies, providing (1) an intuitive interface for validated data capture; (2) audit trails for tracking data manipulation and export procedures; (3) automated export procedures for seamless data downloads to common statistical packages and (4) procedures for data integration and interoperability with external sources.43 44 Furthermore, the linked longitudinal data will be hosted at one of the administrative data sources on their secure local server.
Datasets will be checked for logical inconsistencies (eg, negative values where positives are expected). Additional variables will be created to reduce the number of categories within the data; for instance, the severity of depression will be reduced to the likely presence or absence of depression.
Sciensano is the data controller of the data whereas UCLouvain and UGent are the data processors. The data will be stored for 10 years on the servers of an administrative data holder.
Limitations
There are some limitations worth noting in this study. First, this study is based on a convenience sample, which means the sample may not be fully representative in terms of the demographic characteristics of the Belgian population. To mitigate this limitation, inverse probability weighting will be used to adjust for non-response and non-representative samples.
Moreover, the participants may feel reluctant to provide their Social Security ID, leading to an unpredictable sample size for the data linkage part of the study. To address this concern, communication strategies were developed to explain to the participants in layman’s language the study goals and objectives. Furthermore, occupational health services are used as trusted intermediaries.
Lastly, data linkage is a complex process that involves multiple stages, leading to delays in data delivery. We have accounted for this by scheduling the data analysis for 2026, allowing enough time for the completion of data collection and data analysis.
Patient and public involvement
Occupational healthcare professionals, statisticians and administrative data providers were involved in the development of the current protocol. Furthermore, the plausibility of the findings will be critically appraised with subject-matter experts. Once available, the aggregated study results will be summarised in lay language for a non-academic audience to promote the understanding of the telework–health relationship through (social) media. Lastly, the research team has set up a web page to keep the stakeholders up to date on the progress of the research.
Ethics and dissemination
This study has been approved by the Ethics Committee of Universitair Ziekenhuis Gent (Nr°. ONZ-2023–0630) on 19 Feb 2024. Prior to filling in the questionnaire, the study participants will read the study information and sign the ICF. Participants can opt-out of participating in the administrative data linkage part of the study. No incentive will be given to the study participants. The study will be disseminated to the scientific community through scientific papers and to the general public via newspapers and the project website.
Acknowledgments
We would like to thank the people who participated in the testing of the survey and DJ for developing the data flow of this research.
References
Footnotes
Contributors RDP, BdG and BC conceptualised the project and acquired the funding. BC supervised the project. EABMdO and LIP designed the methodology. MV and EDW supported the development of the survey methodology. EABMdO wrote the original draft and is the guarantor. All authors reviewed and edited the manuscript.
Funding This work was supported by the Fonds Wetenschappelijk Onderzoek (FWO), and Fonds de la Recherche Scientifique (FNRS) under the Weave grant, under grant number G099523N. FWO is the lead agency.
Competing interests None declared.
Patient and public involvement Patients and/or the public were involved in the design, conduct, reporting or dissemination plans of this research. Refer to the Methods section for further details.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.