The primary approach to the prevention and management of type 2 diabetes mellitus (T2DM) is lifestyle modification, which involves weight reduction and regular physical activity (ADA, 2024). In this context, accurate and standardized assessment of dietary intake is a critical component for evaluating lifestyle-based interventions. Repeated 24-hour dietary recalls (24HR) are highly valuable due to their sensitivity to dietary change and their ability to support detailed analyses of nutrient intake, dietary patterns, and food processing levels (Moshfegh et al., 2008).

However, the implementation of 24HR in multicenter trials presents important methodological challenges, particularly in settings characterized by heterogeneous populations and variability in professional training. These challenges are further amplified in studies conducted in low- and middle-income countries (LMICs), where cultural, socioeconomic, and dietary diversity can compromise data consistency and comparability if not adequately addressed.

The PROVEN-DIA trial exemplifies this complexity (Pagano et al., 2025). Conducted in Brazil, a country marked by regional, cultural, socioeconomic, and dietary diversity, PROVEN-DIA required robust and scalable strategies to ensure the consistency, comparability, and quality of dietary intake data collected across multiple centers (Pagano et al., 2025). Evidence from LMIC settings highlights the need for rigorous standardization of interviewer training, data collection procedures, and quality control processes to adequately capture dietary intake amid cultural diversity and regional variability in food consumption patterns (Fisberg et al., 2016; Gibson & Ferguson, 2008).
Accordingly, adopting standardized protocols is necessary to reduce measurement error and ensure data reliability (Gibson et al., 2017).

To address these challenges, the PROVEN-DIA trial implemented a comprehensive framework that integrates standardized operating procedures, structured training, and continuous data quality monitoring across all participating sites. This framework supports consistent application of the 24HR method and enables robust analyses of nutrient intake, food group consumption, identification of ultra-processed and organic foods, diet quality indicators, and estimation of usual dietary intake.

In this article, we describe the standardized procedures and quality assurance mechanisms adopted for the 24HR data collection and entry in the PROVEN-DIA trial.

The PROVEN-DIA trial (ClinicalTrials.gov: NCT06426277) is a multicenter, parallel-group, randomized controlled clinical trial designed to evaluate the effectiveness of a structured lifestyle modification program in preventing type 2 diabetes mellitus (T2DM) among adults with prediabetes. The trial is conducted at 30 sites across Brazil's five macro-regions and aims to enroll 1,590 participants. The trial comprises three groups: Usual Treatment, PROVEN-DIA (hybrid), and TelePROVEN-DIA (virtual). Recruitment began in November 2024, with completion expected in June 2026 and follow-up continuing until 2029. The PROVEN-DIA trial includes five scheduled assessment visits over the 3-year follow-up period, conducted at baseline and at 6, 12, 24, and 36 months. Trained staff collect clinical, anthropometric, biochemical, and behavioral data using standardized procedures across all study sites. In this article, we focus specifically on the collection and quality assurance procedures for 24HR.
Dietary intake is assessed at each study visit using two 24HR: the first is administered in person during the site visit with the support of a photographic manual, and the second is conducted by telephone on a non-consecutive day, within a maximum interval of seven days (Dodd et al., 2006). In total, participants will complete 10 dietary recalls over the course of the trial. It is important to note that researchers across the 30 participating sites have diverse professional backgrounds, including dietitians, nurses, physical education professionals, physicians, and pharmacists, among others.

In the PROVEN-DIA trial, the participants' 24HR are collected using the five-step Automated Multiple-Pass Method (Moshfegh et al., 2008). This method comprises: (1) a quick list, in which participants report all foods and beverages consumed during the previous 24 hours; (2) probing for commonly forgotten foods or beverages; (3) identification of eating occasions and timing; (4) a detailed description cycle, including brand, preparation method, recipe, quantity, degree of food processing, and organic status; and (5) a final probing step to capture any additional items. The Automated Multiple-Pass Method is designed to improve recall accuracy and reduce measurement error in dietary assessment (Moshfegh et al., 2008). More details are provided below.

After completion of each 24HR, data recorded in the individual case report form (CRF), which serves as the source document, are entered into the Vivanda® system within seven days.
The second 24HR is collected by telephone on a non-consecutive day within a protocol-defined interval and entered within the same seven-day timeframe.

Vivanda® is a secure, centralized, web-based system for dietary data management, where all reported foods, beverages, and energy-contributing supplements are recorded alongside mealtimes, classification of foods as ultra-processed and/or organic, and an indication of whether the recall day was typical or atypical of the participant's usual intake.

To support the standardized implementation of the 24HR, all study sites received a comprehensive set of training and reference materials. These materials included two instructional videos focused on best practices and common errors, as well as a Standard Operating Manual containing a dedicated section on 24HR procedures and the use of the Vivanda® electronic system. In addition, a Guideline for the Standardization of Recipes and Foods was provided, offering standardized information on foods and recipes, including portion sizes (e.g., household measures and gram weights), as a reference tool to support data collection and assist interviewers in resolving uncertainties throughout the 24HR process.

To further support standardized data entry, eleven additional instructional videos were provided to all sites, covering system access, participant registration, and the standardization procedures used in the study.

A centralized harmonization process was implemented to balance cultural representativeness and methodological standardization across sites.
The process included: (i) a structured workflow for registering new foods, recipes, and household measures via a centralized spreadsheet; and (ii) analyst review prior to inclusion to ensure consistency in nomenclature, portion sizes, and preparation assumptions (de Keyzer et al., 2013).

The Guideline for the Standardization of Recipes and Foods also defined reference utensils and volumes, and established default decision rules for incomplete information (e.g., whole milk when unspecified; medium unit when fruit size was not reported). A photographic guide supported participant recall and interviewer portion estimation.

Given the sensitivity of sodium and fat estimates to cooking practices, harmonization of salt and oil content was applied through two standardized recipe profiles for staple dishes: lower-salt/oil (e.g., 1% salt and 2% soybean oil) and higher-salt/oil (e.g., 2% salt and 5% soybean oil). During 24HR interviews, adherence to recommended preparation practices was verified. When adherence was confirmed, the lower-salt/oil profile was selected; otherwise, the higher-salt/oil profile was applied by default. This decision rule was consistently used for commonly consumed foods (e.g., rice, beans, meats, and vegetables).

These harmonization procedures were systematically incorporated into the verification-based and record-based monitoring processes and informed the study's conformity and error-rate indicators, particularly by standardizing assumptions related to food preparation practices.

Data monitoring in the PROVEN-DIA trial is implemented as a continuous, centralized, and collaborative process involving the participating sites and the coordinating center's data management team.
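Because the harmonization rules feed directly into these monitoring processes, it may help to see them written as explicit logic. The following is a minimal Python sketch, not the study's implementation: the function names, dictionary keys, and record layout are hypothetical, and the percentages are simply the example values quoted above (the text does not state what the percentages are relative to).

```python
from typing import Optional

# Example profile values quoted in the text (percent salt, percent soybean oil).
# The dictionary keys are hypothetical names chosen for this sketch.
RECIPE_PROFILES = {
    "lower": {"salt_pct": 1.0, "oil_pct": 2.0},
    "higher": {"salt_pct": 2.0, "oil_pct": 5.0},
}


def select_profile(adherence_confirmed: Optional[bool]) -> str:
    """Pick the lower-salt/oil profile only when adherence was confirmed.

    Anything else (not confirmed, or not recorded) falls back to the
    higher-salt/oil profile, mirroring the default described in the text.
    """
    return "lower" if adherence_confirmed is True else "higher"


def apply_defaults(item: dict) -> dict:
    """Fill unspecified details using the two default rules given as examples."""
    filled = dict(item)  # copy so the original record is left untouched
    if filled.get("food") == "milk" and not filled.get("type"):
        filled["type"] = "whole"   # whole milk when unspecified
    if filled.get("category") == "fruit" and not filled.get("size"):
        filled["size"] = "medium"  # medium unit when fruit size is missing
    return filled


if __name__ == "__main__":
    profile = select_profile(adherence_confirmed=None)
    print(profile, RECIPE_PROFILES[profile])
    print(apply_defaults({"food": "banana", "category": "fruit"}))
```

Treating an unrecorded answer the same as non-adherence is a choice made in this sketch so that the default is never silently skipped.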
Dietary data are stored in a .txt format and exported in full or by study site and period using a standardized, semicolon-delimited structure, enabling automated monitoring routines and subsequent analyses of nutrient intake, food groups, dietary patterns, and degree of food processing. As part of the centralized monitoring strategy, a standardized Data Quality Report is generated periodically to consolidate predefined indicators of data completeness, internal consistency, and protocol adherence. This report serves as the primary operational tool for identifying discrepancies, guiding corrective actions, and providing structured feedback to participating sites.

Data quality was assessed using three central indicators reflecting correctness of data entry, consistency, and protocol adherence: (1) conformity rate (CR), the proportion of correctly entered food items relative to the total number of recorded items; (2) error rate (ER), the percentage of errors identified in the Data Quality Report; and (3) completeness rate (CoR), the proportion of participants with 24HR collected within the protocol-defined time window (Ocké et al., 2014).

The first monitoring component, verification-based monitoring, focuses on assessing conformity between data recorded in the CRF and the information transcribed into the Vivanda® system. Coordinating centers periodically review a sample of records to ensure transcription accuracy and to identify potential deviations or missing information.

Identified discrepancies are documented, and feedback is provided to the respective sites to guide data correction and reinforce adherence to the standardized data collection protocol.

Identified inconsistencies are classified according to their origin, enabling transparent documentation of data quality issues. Two main categories were defined: source document-related inconsistencies and system-related inconsistencies.
Source document-related inconsistencies include lack of recipe standardization (e.g., failure to specify whether foods were home-prepared or commercially obtained, or whether standardized amounts of salt and oil were used), insufficient preparation details (e.g., cooking method, use of seasoning, or added sugar), and missing complementary information (e.g., classification as ultra-processed or organic foods). System-related inconsistencies include incorrect selection of foods or household measures, quantity errors, missing records, or discrepancies between the electronic database and the CRF (Supplementary Data 1).

For each site, an initial CR is calculated based on these inconsistency categories and recalculated after documented corrections, providing a transparent and auditable metric of data quality improvement over time. The CR is derived from the number of identified errors: the total number of errors arising from both source document-related and system-related inconsistencies is divided by the total number of recorded food items, and the result is expressed as a percentage of conformity ranging from 0 to 100%:

$$\text{CR (\%)} = \left( 1 - \frac{\text{Number of errors}}{\text{Total number of recorded food items}} \right) \times 100$$

This indicator reflects the accuracy of data entry and the level of concordance between CRFs and electronic records. Identified discrepancies are documented and discussed during scheduled monitoring meetings. The indicator is recalculated six months after the initiation of follow-up of the first participant through a subsequent verification-based monitoring process, allowing assessment of data quality improvement over time and supporting the reproducibility of the monitoring process (Supplementary Data 2).

The second monitoring component, record-based monitoring, relies exclusively on the electronic database and applies predefined and reproducible indicators of data completeness and internal consistency. A complete export of the electronic database is performed monthly to support automated verification.
These routines are executed using a purpose-built script developed in RStudio (R) to systematically identify potential inconsistencies in the dataset. Automated and standardized routines perform systematic checks for missing values, logical inconsistencies between variables, and outliers in dietary intake data, generating periodic Data Quality Reports. These reports are independently reviewed by the central data management team and shared with participating sites, enabling transparent communication and traceable correction procedures. Based on these analyses, a monthly Data Quality Report is generated, consolidating all issues identified during the evaluation period.

The report includes a list of participants with delayed 24HR and identified inconsistencies. In addition, records with implausible total energy intake values (e.g., <1,000 kcal or >3,000 kcal) are flagged for further review. The report is made available through a dedicated SharePoint® platform for each site, with exclusive access restricted to the respective site. Participants are identified exclusively by unique study IDs to ensure blinding. Research teams are notified via institutional email upon report release and are provided with a predefined deadline to correct pending issues or submit documented justifications, ensuring transparent and traceable data correction procedures.

Following the issuance of the Data Quality Report, the second quality indicator, the ER, is automatically generated within the R script (Supplementary Code 1). The ER is defined as the proportion of identified errors relative to the total number of food records for the evaluated period and is calculated as:

$$\text{ER (\%)} = \left( \frac{\text{Number of errors}}{\text{Total number of food records}} \right) \times 100$$

This indicator quantifies the inconsistencies detected during data verification and serves as a standardized measure of data reliability and precision.
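The record-based checks described above can be illustrated with a short Python sketch. It is not the study's R script (Supplementary Code 1), which is not reproduced here. The column names (`participant_id`, `recall_date`, `energy_kcal`) and the sample rows are hypothetical, and the energy thresholds are the example values quoted in the text rather than confirmed protocol limits.

```python
import csv
import io

# Example plausibility limits quoted in the text ("e.g., <1,000 kcal or >3,000 kcal").
ENERGY_MIN_KCAL = 1000.0
ENERGY_MAX_KCAL = 3000.0

# Hypothetical semicolon-delimited export; the real column names are not given.
SAMPLE_EXPORT = """participant_id;recall_date;food;energy_kcal
P001;2025-01-10;rice;400
P001;2025-01-10;beans;350
P002;2025-01-11;rice;1200
P002;2025-01-11;meat;900
"""


def read_export(text):
    """Parse a semicolon-delimited export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def total_energy_by_recall(rows):
    """Sum energy per (participant, recall date) across food records."""
    totals = {}
    for row in rows:
        key = (row["participant_id"], row["recall_date"])
        totals[key] = totals.get(key, 0.0) + float(row["energy_kcal"])
    return totals


def flag_implausible_energy(totals):
    """Return the recalls whose total energy falls outside the example limits."""
    return sorted(
        key for key, kcal in totals.items()
        if kcal < ENERGY_MIN_KCAL or kcal > ENERGY_MAX_KCAL
    )


def error_rate(n_errors, n_food_records):
    """ER (%) = identified errors / total food records x 100."""
    if n_food_records <= 0:
        raise ValueError("n_food_records must be positive")
    return 100.0 * n_errors / n_food_records


if __name__ == "__main__":
    rows = read_export(SAMPLE_EXPORT)
    totals = total_energy_by_recall(rows)
    print("Flagged recalls:", flag_implausible_energy(totals))
    print("ER (%):", error_rate(n_errors=3, n_food_records=60))
```

In this sketch a flagged recall is only listed for review, in line with the text, which describes implausible values as flagged rather than excluded.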
Sites are responsible for correcting the identified discrepancies within predefined timelines after receiving the Data Quality Report.

Weekly monitoring is conducted to track each site's progress and to identify 24HR that should be available in the system but have not yet been entered. A standardized monitoring spreadsheet is generated and shared with coordinating center monitors, who are responsible for maintaining structured communication with participating sites and providing case-specific support when needed.

Based on the monitoring of delayed 24HR records, the third quality indicator, the CoR, is calculated weekly. The CoR represents the proportion of participants with 24HR collected within the protocol-defined time window and is calculated as:

$$\text{CoR (\%)} = \left( \frac{\text{Number of 24HR records entered}}{\text{Total number of expected 24HR records}} \right) \times 100$$

This indicator reflects the completeness of dietary data entry for each site.

All rates are continuously updated, enabling longitudinal monitoring of site performance and early detection of systematic error patterns (Figure 1). Record-based monitoring combines automated verification, manual review, and continuous communication between participating sites and the coordinating center.

A site with good data quality is expected to present an ER below 5% and both CoR and CR .

This methodology was subsequently adapted for Latin American populations through the GloboDiet system, demonstrating feasibility of standardized approaches in diverse cultural contexts (Bel-Serrat et al., 2017). The Women's Intervention Nutrition Study (WINS) established a quality assurance system including centralized review and feedback to sites, though details on specific metrics and correction procedures were limited (Copeland et al., 2000).
Similarly, the INTERMAP study described built-in quality checks and structured training across four countries, emphasizing the importance of systematic approaches to reduce measurement error (Dennis et al., 2003). The Girls Health Enrichment Multisite Studies (GEMS) evaluated the impact of different review phases (local, coordinating center) on dietary recall data quality, finding that quality control procedures primarily reduced nutrient variances rather than shifting means (Cullen et al., 2004). However, these studies focused predominantly on interviewer training and periodic review rather than continuous, metric-driven monitoring.

Harmonization of recipe and food composition data has received less systematic attention in the literature. Most studies describe the use of standardized food composition databases but provide limited detail on procedures for handling regional variations in food preparation or recipes. More recently, attention has shifted to automated quality checks and digital data capture systems. Guan et al. (2019) applied source data verification to evaluate dietary intake coding quality and identified substantial discrepancy rates between diet history interviews and software outputs, highlighting the need for enhanced support tools and quality procedures. The ASCOT trial documented the need for additional data processing procedures to assess adherence to dietary guidelines, including manual adjustments to portion sizes and food item selection to improve accuracy (Ireland et al., 2025). Similarly, the Nova24h web-based tool implemented standardized procedures to impute missing information about food processing level and preparation methods, using distribution patterns observed in the larger cohort (Neri et al., 2023). In low- and middle-income settings, Gibson et al.
(2017) emphasized that standardized protocols and quality control procedures are essential to minimize random errors in 24-hour recall protocols, though they noted that implementation of such standardization remains rare across many regions.

The PROVEN-DIA trial addresses these challenges and extends previous approaches by implementing a comprehensive framework that combines several elements. Future trials may benefit from automated validation checks, artificial intelligence for anomaly detection, and periodic inter-site reliability assessments. Despite these limitations, this framework offers a systematic, replicable approach to enhancing dietary data quality in complex multicenter settings.

This article aimed to describe the framework, procedures, and monitoring strategies implemented to support the collection and entry of 24HR data in the multicenter PROVEN-DIA trial. Rather than proposing a definitive solution to the well-recognized limitations of dietary assessment, this work documents the practical efforts undertaken to minimize systematic errors and improve transparency, consistency, and traceability of dietary data in a complex research setting.

Dietary intake assessment is inherently prone to measurement error, recall bias, and variability in reporting, particularly in large-scale, culturally diverse, and multicenter studies. Challenges such as heterogeneity in cooking practices, differences in interviewer experience, limitations of participant recall, and operational constraints remain unavoidable and were continuously encountered throughout the study.
Accordingly, the strategies described herein were designed not to eliminate error, but to detect, document, and mitigate its impact through standardized training, harmonized decision rules, digital data capture, and continuous quality monitoring.

Within this context, the PROVEN-DIA trial experience demonstrates that a structured and adaptive framework (combining operational guidance, centralized oversight, and feedback loops) may enhance the reliability and comparability of 24HR data over time. Importantly, the monitoring routines and quality indicators are used iteratively to refine training and data collection practices, reinforcing a process of ongoing improvement rather than a fixed endpoint.

Although these strategies do not overcome all methodological limitations inherent to dietary assessment, they offer a scalable and replicable approach for implementing the 24HR method, including remote and telephone-based collection, in complex research environments. Beyond supporting internal data quality, this framework contributes practical methodological guidance for future nutrition studies and multicenter trials, highlighting the central role of operational planning, transparency, and continuous quality oversight in generating robust and reproducible dietary intake data.

acquisition, resources, software and supervision. RP: project administration. DCF, ABNS, RP and ACBF: validation. ALF, TLVDPO, ABNS and ACBF: visualization. All authors: writing, reviewing, and editing of original text. All authors contributed to the article and approved the submitted version.