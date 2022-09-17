Several filtering steps were applied to both datasets to ensure data quality. Participant and valid minute counts at each filtering step are provided in Table 1. Figure 1 provides a flowchart of the data analysis process with exemplary data.

Table 1 Number of participants and minutes remaining after each step in filtering.

Figure 1 The process by which we classify participants after generating a model for them. Participants are assigned one of five activity intensity profiles based on the shape of their energy expenditure distribution. A participant is considered Non-Vigorous if they have fewer than five data points in the vigorous activity range. If, in the vigorous activity range, the area under the model’s fitted slope deviates significantly from the area under the actual data points, the participant is considered Extremely Active. If the participant has neither a Non-Vigorous nor an Extremely Active profile and has an R² value of less than 0.9, they are classified as an Outlier. The remaining participants are classified as Consistent if there is no significant change between their second and third slopes, or Moderately Active if there is a change.
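The decision sequence in this caption can be sketched as a short function. This is an illustrative sketch only: the `FitSummary` container and its field names are hypothetical, and the two "significant" tests are passed in as precomputed flags because the caption does not state how significance was assessed.

```python
from dataclasses import dataclass


@dataclass
class FitSummary:
    """Hypothetical per-participant summary; field names are assumptions."""
    n_vigorous_points: int           # data points in the vigorous range
    auc_deviation_significant: bool  # model vs. data area in the vigorous range
    r_squared: float                 # R^2 of the piecewise linear fit
    slope_change_significant: bool   # second vs. third slope


def classify_profile(fit: FitSummary) -> str:
    """Apply the checks in the order given in the Figure 1 caption."""
    if fit.n_vigorous_points < 5:
        return "Non-Vigorous"
    if fit.auc_deviation_significant:
        return "Extremely Active"
    if fit.r_squared < 0.9:
        return "Outlier"
    if fit.slope_change_significant:
        return "Moderately Active"
    return "Consistent"
```

The order of the checks matters: a participant with a poor fit is labelled an Outlier only after the Non-Vigorous and Extremely Active profiles have been ruled out.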

The modeling approach proposed here posits that cumulative energy expenditure, as measured by hip-worn accelerometers, can be well represented by a piecewise exponential decay that accounts for individual variability in cut-points and in the rate of decay. By applying the piecewise linear fitting algorithm from ref. 32 to histograms of log-transformed cumulative activity, we can transform the empirical histogram into a parametric distribution. This model corresponds to a piecewise exponential distribution in which the components are analogous to the different activity intensities (sedentary, light, moderate, and vigorous) that are often operationalized in the literature using cut-points based on Actigraph counts (refs. 12, 22, 23, 24).
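To make the idea concrete, the following is a minimal sketch of fitting a continuous piecewise linear function to a log-transformed histogram. It is not the algorithm from ref. 32: it substitutes a plain grid search over candidate breakpoints with a least-squares fit for each candidate set. The function names, the number of candidates, and the reading of "log-transformed" as the log of each bin's probability are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def log_histogram(values, bins=50):
    """Return bin centres and log-probabilities for the non-empty bins."""
    hist, edges = np.histogram(values, bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    keep = hist > 0  # log(0) is undefined, so empty bins are dropped
    return centres[keep], np.log(hist[keep] / hist.sum())


def hinge_design(x, breaks):
    """Design matrix: intercept, x, and one hinge term per breakpoint."""
    cols = [np.ones_like(x), x] + [np.maximum(x - b, 0.0) for b in breaks]
    return np.column_stack(cols)


def fit_piecewise_linear(x, y, n_breaks=3, n_candidates=25):
    """Grid-search breakpoints; solve least squares for each candidate set."""
    candidates = np.linspace(x.min(), x.max(), n_candidates + 2)[1:-1]
    best = None
    for breaks in combinations(candidates, n_breaks):
        design = hinge_design(x, breaks)
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((design @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, np.array(breaks), coef)
    sse, breaks, coef = best
    # coef[1] is the first slope; each hinge coefficient is a slope change.
    slopes = np.cumsum(coef[1:])
    return breaks, slopes, sse
```

Because the fit is linear in the log domain, each segment slope corresponds to the (negative) decay rate of one exponential component, and the three breakpoints play the role of the cut-points between the four activity intensities.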

Under this model, once the cut-points have been defined, the probability of activity at increasing intensity within an activity level decays exponentially at a rate that depends on the individual and the activity level. Piecewise fitting was applied to all 513 INTERACT and 7668 NHANES participants who remained after filtering. Because the model was fit using a piecewise linear algorithm on log-transformed data, the quality of fit can be evaluated by the R² value of the linear fit (shown in Fig. 3). Note that the probability axis is log-scaled, indicating that the distribution of R² values is log-normal. Overall, the model fits exceptionally well for the majority of participants. For the INTERACT data, 90% of the fits have an R² of at least 0.9 and 99% have an R² of at least 0.8, with a median R² of approximately 0.95. For the NHANES data, 99% of fits have an R² of at least 0.9 and 99.9% have an R² of at least 0.8, with a median R² of approximately 0.96. Note that the final bin contains very high R² values, but none that are identically 1. The large number of R² values in excess of 0.9 for both counts/min and MIMS units provides confidence that the proposed piecewise exponential model represents the energy expenditure probability.
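As a sketch of how such fit-quality summaries can be computed, the snippet below evaluates the coefficient of determination for one fit and the share of fits that reach a given threshold. The function names are illustrative, and the inputs are assumed to be the log-domain observations and the fitted values for each participant.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of fitted values y_hat against y."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


def share_at_least(r2_values, threshold):
    """Fraction of participants whose fit reaches the given R^2."""
    return float(np.mean(np.asarray(r2_values, dtype=float) >= threshold))
```

Calling `share_at_least` with thresholds of 0.9 and 0.8 on the per-participant values gives the kind of percentages reported above.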

Four distinct regions, corresponding to sedentary, light, moderate, and vigorous activity, were detected and fit. Results for the INTERACT data, based on Actigraph counts, show the following cut-points detected by the piecewise model: between sedentary and light (mean = 153.24, range 20 to 443.22, standard deviation = 51.54), between light and moderate (mean = 1930.08, range 800 to 2500, standard deviation = 410.43), and between moderate and vigorous (mean = 5444.07, range 1900 to 7000, standard deviation = 878.69). For the NHANES data, the detected cut-points are: between sedentary and light (mean = 0.62, range 0.5 to 1.5, standard deviation = 0.25), between light and moderate (mean = 8.99, range 2 to 20, standard deviation = 3.41), and between moderate and vigorous (mean = 25.60, range 5 to 50, standard deviation = 5.77).

Rarely, the low activity region had a small positive slope, indicating an increasing likelihood of activity, as shown in Fig. 1c. Occasionally, the change in decay rate between low and moderate intensity was small, as shown in Fig. 1b. Sometimes, the data captured an individual who was sufficiently inactive to have never registered any vigorous activity, whose behavior is better represented by a three-piece model (Fig. 1a). Rarely, participants exhibited peaks in vigorous activity, indicating some kind of regular high-intensity physical activity such as high-intensity interval training (Fig. 1d).

Because the examples in Fig. 1 are log-linear plots, every increment in counts/min or MIMS is exponentially less likely to occur. As energy expended during a study is the integral of that exponential decay, the rate of decay has a profound impact on total energy expenditure. In all examples, more than 90% of minutes are classified below vigorous.
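As a simplified illustration of why the decay rate matters, assume (for this sketch only) that a single exponential density with rate \lambda describes the activity rate x over the whole range, rather than the piecewise model fit here. The mean activity rate and the share of time spent above a threshold c are then:

```latex
\begin{align}
p(x) &= \lambda e^{-\lambda x}, \qquad x \ge 0, \\
\mathbb{E}[x] &= \int_0^\infty x \, \lambda e^{-\lambda x} \, dx = \frac{1}{\lambda}, \\
P(x > c) &= \int_c^\infty \lambda e^{-\lambda x} \, dx = e^{-\lambda c}.
\end{align}
```

Under this assumption, halving \lambda doubles the mean activity rate and raises the share of time above c from e^{-\lambda c} to e^{-\lambda c / 2}, so small differences in decay rate translate into large differences in accumulated activity.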

By aggregating the per-participant probabilities, we can create a heatmap showing how likely a participant is to have a given probability of a specific activity rate. This heatmap visualizes the variability across the different activity rate regimes, showing where differences in activity levels between populations are likely to be evident. Figure 2 shows heatmap visualizations for the INTERACT and NHANES data. In both cases, sedentary behavior is largely consistent across participants. As the activity rate increases, the probability decreases, as expected, but the variance across participants increases. Figure 3 indicates that, for the participants examined here, variation will be more apparent at moderate and vigorous activity intensities. Both marginal count distributions indicate an exponential decay with energy expenditure. Although the rate of decay of energy expenditure with activity level varies between individuals, marginalizing over the population yields a single exponential decay. Note that because counts/min and MIMS are not linearly related, we cannot directly compare the decay coefficients.
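One way to build such a heatmap is sketched below. The input layout (one row per participant, one column per activity-rate bin) and the function name are assumptions, as is the choice to normalize each column by the number of participants.

```python
import numpy as np


def probability_heatmap(per_participant_probs, prob_edges):
    """Aggregate per-participant probabilities into a 2D heatmap.

    per_participant_probs: array of shape (n_participants, n_rate_bins),
        each row holding one participant's probability per activity-rate bin.
    prob_edges: bin edges along the probability axis.
    Returns an array of shape (len(prob_edges) - 1, n_rate_bins) giving the
    fraction of participants whose probability falls in each cell.
    """
    probs = np.asarray(per_participant_probs, dtype=float)
    n_participants, n_rate_bins = probs.shape
    heat = np.zeros((len(prob_edges) - 1, n_rate_bins))
    for j in range(n_rate_bins):
        # Values outside prob_edges are ignored by np.histogram.
        counts, _ = np.histogram(probs[:, j], bins=prob_edges)
        heat[:, j] = counts / n_participants
    return heat
```

A narrow vertical spread in a column means participants agree at that activity rate, while a wide spread corresponds to the growing between-participant variance described above. Log-spaced `prob_edges` would match a log-scaled probability axis.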

Figure 2 The distributions of all participants for the NHANES and INTERACT studies. The boxplots in each figure represent the distributions of each breakpoint, with the orange line representing the median, the box illustrating the upper and lower quartiles of each breakpoint, and the whiskers indicating the range of breakpoints.

Figure 3 The distributions of the four slope metrics, the three breakpoint metrics, the R² values, and the tail metrics for both INTERACT (top) and NHANES (bottom).

Because energy expenditure can be well represented by the model, we can examine the fit parameters of the model as descriptors of individual or population behavior. The model is fit to the log-transformed data, with the cut-points and exponential decay rates as its parameters. Figure 3 shows the distribution of cut-points, exponential decay rates, R² values, and maximum values for the INTERACT and NHANES (MIMS) data.

The cut-points produced by our modeling approach are similar to Freedson’s cut-points for Actigraph counts among adults (ref. 22): SED (< 99 counts/min), LA (100–759 counts/min), MA (760–5724 counts/min), and VA (5725–max counts/min); and to the Troiano cut-points for Actigraph counts (ref. 23): SED (0–99 counts/min), LA (100–2019 counts/min), MA (2020–5998 counts/min), and VA (5999–max counts/min). Our modeled cut-points tended to be higher for the transition from sedentary to light activity and lower for the transition from moderate to vigorous activity. Critically, our cut-points are distributed across the population, accounting for individual differences. These differences could be due to inconsistencies between the lab-based protocols used to develop previous cut-points and our free-living samples. They could also be due to demographic differences between lab-based studies and the large population-based samples used here. While individuals may exhibit different decay rates between light and moderate activity intensities, these rates are consistent across the population, as was evident in the marginals in Fig. 3. The greatest variance is observed in the vigorous activity breakpoint and the maximum activity recorded, as individuals varied a great deal in their maximum achievable activity intensity. The variability in cut-points, combined with the consistency in decay, indicates that the increasing variability in activity intensity seen in Fig. 3 may be due to a number of factors. Regardless of an individual’s absolute ability to carry out moderate and vigorous activity, there is limited time, motivation, or ability to do so for extended periods. We explore whether the day of the week and participant demographics are factors in the supplemental materials.
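To illustrate how fixed and individualized cut-points differ in practice, the sketch below labels minutes of Actigraph counts using the Troiano lower bounds quoted above, and optionally a participant-specific set of cut-points. The function and constant names are illustrative, not taken from the study's code.

```python
import numpy as np

# Lower bounds of LA, MA, and VA from the Troiano ranges quoted in the text.
TROIANO_LOWER_BOUNDS = [100, 2020, 5999]
LABELS = np.array(["SED", "LA", "MA", "VA"])


def classify_minutes(counts_per_min, lower_bounds=TROIANO_LOWER_BOUNDS):
    """Label each minute by the intensity band its count falls in."""
    # np.digitize returns how many lower bounds each value has reached.
    idx = np.digitize(counts_per_min, lower_bounds)
    return LABELS[idx]
```

Passing a participant's own modeled cut-points as `lower_bounds`, for example the INTERACT means `[153.24, 1930.08, 5444.07]`, relabels the same minutes: a minute at 120 counts/min is light activity under the Troiano bounds but sedentary under those means.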