Sedentary Behavior in Children by Wearable Cameras: Development of an Annotation Protocol

Introduction: There is increasing evidence that not all types of sedentary behavior have the same harmful effects on children's health. Hence, there has been a growing interest in the use of wearable cameras. The aim of this study is to develop a protocol to categorize children's wearable camera data into sedentary behavior components.

Methods: Wearable camera data were collected in 3 different samples of children in 2014. A development sample (3 children aged 4−8 years) was used to design the annotation protocol. A training sample (4 children aged 10 years) was used to train 3 different coders. The independent reliability sample (14 children aged 9−11 years) was used for independent coding of wearable camera images and to estimate inter-rater agreement. Data were analyzed in 2018. Cohen's κ was calculated for every rater pair on a per-participant basis. Means and SDs were then calculated across per-participant κ scores.

Results: A total of 41,651 images from 14 participants were considered for analysis. Inter-rater agreement over all raters over all the sedentary behavior components was almost perfect (mean κ=0.85, 95% CI=0.83, 0.87). Inter-rater reliability for screen-based sedentary behavior (mean κ=0.72, 95% CI=0.62, 0.82) and nonscreen sedentary behavior (κ=0.69, 95% CI=0.65, 0.72) showed substantial agreement. Inter-rater reliability for location (κ=0.91, 95% CI=0.88, 0.93) showed almost perfect agreement.

Conclusions: A reliable annotation protocol to categorize wearable camera data of children into sedentary behavior components was developed. Once applied to larger samples in children, this protocol can ultimately help to better understand the potential harms and benefits of screen time and sedentary behavior in children.


INTRODUCTION
Relatively little is known so far about the potential harmful or beneficial effects of screen time 1−3 and sedentary behavior (SB). 4−6 One reason for this lack of knowledge is the limitation in the current methods to assess different components of SB, which are also called facets of SB in the Taxonomy of SB. 7 For example, accelerometers are unable to provide reliable information on the activity type, posture, context, or location of SB, and self-report methods have shown insufficient validity in children. 8,9 Wearable camera data provide information on SB components in adults 10−15 and are feasible for data collection in large samples of children. 16−21 However, the process of annotating wearable camera data to identify a certain component of SB is based on subjective decisions made by the data analyst. Previous studies assessing SB or other behavior components by wearable cameras 13,14,16,19 each developed an independent annotation protocol. These studies aimed to categorize physical activity type, 13 SB, 14 travel behavior, 16 and exposure to food marketing. 19 Up to now, no rigorous protocols exist to extract reliable measures of screen time SB, nonscreen time SB, and location from wearable camera data in children, and such protocols are needed.
The aim of this study is to develop a standardized protocol to categorize children's wearable camera data into components of SB by applying existing categories 7 and exploring children's wearable camera data. The SB components of interest were screen- and nonscreen-based leisure-time SB and the location where the SB takes place. A further aim is to evaluate the inter-rater reliability of this annotation protocol in the assessment of screen time, SB, and location in children.

METHODS
Study Population
In 2014, a total of 336 children were asked to participate in a study investigating the contexts of activity behavior in children. The aim was to collect data from 40 children in 2 schools in Basel, Switzerland. The study was reviewed by the Ethics Review Board of the canton Aargau and granted an exemption from requiring ethics approval. To comply with data protection and privacy rules of participants and third parties, the study was reviewed by the Federal Data Protection and Information Commissioner in Switzerland. The procedures of this study adhered to the ethical framework proposed by Kelly et al. 22 for the use of wearable cameras in health-related research. A total of 3 researchers, 2 from Switzerland (JH, SS) and 1 from Cyprus (EC), were involved in the development and application of the annotation protocol using the different samples described in Table 1.

Measures
Data were collected from May to December 2014. SB was assessed by a wearable camera (Autographer). The Autographer captures everyday life activities through a first-person point-of-view perspective. 23,24 It is a lightweight digital camera, worn around the neck, which automatically captures photographs throughout the day. The camera has a claimed battery life of 16 hours of continuous recording if capturing images once per minute and a storage capacity of approximately 32,000 images (8 GB). To capture spontaneous movement patterns of children, 25,26 the image sampling rate was set to the highest rate of approximately 1 image every 7 seconds. Children were asked to wear the camera for 7 consecutive days during leisure time. To respect the privacy of participants, their parents, and third parties, participants were allowed to (1) switch the camera off, (2) close the lens of the camera, or (3) remove the camera at any time. Children were told not to wear the camera during school time, in changing rooms, in common bathrooms, or in other situations where it might be inappropriate to wear a camera and third parties might be disturbed. All children were provided an information leaflet to carry with them during data collection in case they were asked about the camera. This gave third parties the possibility to contact the study investigators to ask questions or have unwanted images deleted. Self-reported screen time and nonscreen SB were assessed by the Adolescent Sedentary Activity Questionnaire. 27
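To make the consequences of the sampling rate concrete, the following minimal sketch works out how many images the stated interval produces per hour and how long it would take to fill the stated storage. The two constants are the approximate figures given in the text; the function names are illustrative only. The calculation covers storage alone and says nothing about battery life, which is a separate constraint.

```python
# Approximate figures taken from the text; everything else is illustrative.
SAMPLING_INTERVAL_S = 7      # roughly one image every 7 seconds
STORAGE_IMAGES = 32_000      # approximate storage capacity in images (8 GB)


def images_per_hour(interval_s: float) -> float:
    """Number of images captured in one hour of continuous wear."""
    return 3600 / interval_s


def hours_until_full(capacity_images: int, interval_s: float) -> float:
    """Hours of continuous capture before the storage is exhausted."""
    return capacity_images * interval_s / 3600


print(f"{images_per_hour(SAMPLING_INTERVAL_S):.0f} images per hour")        # about 514
print(f"{hours_until_full(STORAGE_IMAGES, SAMPLING_INTERVAL_S):.1f} hours")  # about 62.2
```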

Development of the Coder Protocol
The annotation protocol was developed on the basis of the International Taxonomy of SB 7 and the Compendium of physical activities in children. 28 The Taxonomy of SB refers to adult SB. First, 3 researchers collected wearable camera data and screened their own images to identify visual cues for the annotation protocol categories. Annotating self-collected data has the advantage that the researcher is more likely to know which behavior a set of images depicts. Second, children's image data of the development sample were screened to adapt the Taxonomy of SB for child-specific behaviors, choose relevant annotation categories, and adapt visual cues.
The annotation protocol was structured into different annotation passes (Figure 1). An annotation pass is an annotation cycle where a researcher goes through all the images and categorizes only the SB components specific to that pass. Except for the first pass, the uncodable pass, which discards all images that are not codable, each annotation pass referred to an SB subcomponent as described in the Taxonomy of SB. The second annotation pass was a sedentary nonscreen type, which included the subcomponents reading/memorizing, writing, eating/drinking, playing music, spiritual, household, playing quietly, handicraft, relaxing/sitting/talking/lying down, and personal care. The third annotation pass was a sedentary screen type including the subcomponents watching TV, computer use, gaming console use, mobile phone use, tablet use, iPod/MP3 player use, cinema, and other. The fourth annotation pass was location, including the outdoor locations (nature, urban green space, gray space street, and other/mixed) and the indoor locations (home, school, daycare, shops, sports facility, and other).
After a first version of the annotation protocol was developed, raters were asked to annotate data of the development sample. Wearable camera data were annotated using the Oxford wearable camera browser. 29 This browser allows a researcher to view images, create behavioral episodes, and then annotate them as belonging to a certain SB component. Episodes can be annotated by applying the target SB category. The annotation protocol was continuously adapted, improved, and enhanced using the nominal group technique, an interactive cycle of blind coding followed by discussion of all disagreements, which are resolved by group consensus. 30 Visual cues were defined to facilitate and objectify image annotation.
Figure 1 note: Components surrounded by a dashed line are newly included components that were not included in the Taxonomy of SB by Chastin et al. 7 Each annotation pass looks at only 1 component at a time, which should make annotating the images easier for raters: a rater can focus on certain subcomponents and disregard other components for the moment. This structure allows images to be annotated with multiple (sub)components. For example, an image can be annotated for the location in 1 annotation pass and simultaneously for screen-based behavior in another annotation pass. SB, sedentary behavior.
Events of ≥5 consecutive images were annotated rather than single images. This is less time consuming than image-by-image annotation, and longer episodes of SB are not lost if they are interrupted by <5 images. The 5-image rule was defined as follows: activities are split into episodic events, each containing ≥5 images. The start of an event is the first image in a set of 5 (or more) consecutive images that depict the same component or where the researcher is almost certain that the same component is occurring across the images. An event ends when an annotation component is no longer visible/happening or is interrupted by >5 images that show a different component. Appendix Figure 1 (available online) displays an example of how events are split. This means that events of SB of <5 images (approximately 35 seconds) are not captured.
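The minimum-length part of the 5-image rule can be illustrated with a short sketch. It is a simplification under stated assumptions, not the procedure used in the study: the label strings are hypothetical, `None` stands for an image without an annotation in the current pass, and the sketch only keeps runs of at least 5 identical labels. It does not bridge short interruptions inside a longer episode, which the manual protocol allows.

```python
from itertools import groupby

MIN_EVENT_IMAGES = 5  # threshold taken from the 5-image rule


def split_events(labels, min_len=MIN_EVENT_IMAGES):
    """Return (label, start_index, end_index) for each qualifying run.

    A run is a stretch of consecutive images with the same label.
    Runs shorter than min_len and runs labelled None are dropped.
    end_index is inclusive.
    """
    events = []
    position = 0
    for label, run in groupby(labels):
        run_length = len(list(run))
        if label is not None and run_length >= min_len:
            events.append((label, position, position + run_length - 1))
        position += run_length
    return events


# Hypothetical image sequence for one annotation pass.
sequence = ["tv"] * 6 + ["eating"] * 3 + ["reading"] * 5 + [None] * 2
print(split_events(sequence))
# [('tv', 0, 5), ('reading', 9, 13)]
```

The 3-image "eating" run is discarded because it falls below the threshold, which mirrors the statement that events shorter than 5 images are not captured.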

Coder Training
Training of coders should result in a high inter-rater agreement between images annotated independently by different raters. This step included the annotation of the coder training sample of 4 children according to the children's SB wearable camera annotation protocol. After the annotation of every participant, image-by-image inter-rater reliability was calculated by a weighted Cohen's κ statistic. 31 A cross table was created to display disagreement among raters. After discussing disagreements among raters, correcting annotations, and adapting or clarifying the annotation protocol and visual cues, the calculation of Cohen's κ was repeated to see whether the κ statistic improved. This process was repeated until the agreement between raters achieved a score of ≥0.81, which is considered an almost perfect level of agreement. 32 The final protocol is detailed in Appendix Text 1 (available online).
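A cross table of the kind described above can be sketched in a few lines. This is an illustration only: the function names and the example labels are hypothetical, and the sketch assumes that both raters' annotations are available as equally long, image-aligned lists.

```python
from collections import Counter


def cross_table(rater_a, rater_b):
    """Count how often each (label from rater A, label from rater B) pair occurs."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must annotate the same images")
    return Counter(zip(rater_a, rater_b))


def disagreements(table):
    """Keep only the off-diagonal cells, i.e. images labelled differently."""
    return {pair: count for pair, count in table.items() if pair[0] != pair[1]}


# Hypothetical image-by-image annotations from two raters.
rater_a = ["tv", "tv", "reading", "eating"]
rater_b = ["tv", "tablet", "reading", "reading"]

table = cross_table(rater_a, rater_b)
for (label_a, label_b), count in sorted(disagreements(table).items()):
    print(f"A={label_a!r} vs B={label_b!r}: {count} image(s)")
```

Listing the off-diagonal cells points the discussion toward the category pairs that raters confuse most often.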

Independent Reliability Sample
The aim was to evaluate the inter-rater reliability of this annotation protocol in the assessment of screen time, SB, and location in children when 3 different raters annotated each image independently. After coder training was completed, images of the independent reliability sample (n=14) ( Table 1) were independently annotated by 3 different raters.

Statistical Analysis
Data were analyzed in 2018. Cohen's κ was calculated for all combinations of rater pairs (rater A versus B, rater A versus C, and rater B versus C), on a per-participant basis, within and across all passes. Means and SDs of all the 3 rater pairs were then calculated across the relevant per-participant κ scores. Statistics were calculated using SPSS, version 24. Inter-rater reliability was interpreted using the guidelines of Landis and Koch. 32
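The analysis was run in SPSS; the following Python sketch only illustrates the calculation and is not the authors' code. It computes the unweighted Cohen's κ, defined as (p_o − p_e) / (1 − p_e) with observed agreement p_o and chance agreement p_e, for every rater pair within a participant, then takes the mean and SD across all per-participant scores. Two points are assumptions of the sketch: pooling all pair scores into one list before averaging, and returning 1.0 when both raters used a single identical category (where κ is otherwise undefined). The data layout and names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, stdev


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two aligned lists of nominal labels."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("need two non-empty, equally long label lists")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # assumption: single shared category counts as full agreement
    return (observed - expected) / (1 - expected)


def pairwise_kappas(annotations):
    """annotations: dict mapping rater name -> label list for one participant."""
    return {
        (r1, r2): cohen_kappa(annotations[r1], annotations[r2])
        for r1, r2 in combinations(sorted(annotations), 2)
    }


def summarise(per_participant):
    """Mean and SD of kappa over all rater pairs and participants."""
    scores = [
        kappa
        for annotations in per_participant.values()
        for kappa in pairwise_kappas(annotations).values()
    ]
    return mean(scores), stdev(scores)


# Hypothetical annotations for one participant and three raters.
participant = {
    "A": ["tv", "tv", "reading", "reading"],
    "B": ["tv", "tv", "reading", "tv"],
    "C": ["tv", "tv", "reading", "reading"],
}
print(cohen_kappa(participant["A"], participant["B"]))  # 0.5
print(summarise({"p01": participant}))
```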

RESULTS
A total of 53,864 images were collected across 18 days in 14 participants aged 9−11 years. Table 2 shows the demographics, mean number of images collected, and mean wearable camera wear time per day for the training and independent reliability samples.
For the independent reliability sample, 41,651 images were independently annotated by 3 different raters. The number of images annotated by each rater in each annotation category is available in Appendix File 2 (available online). Table 3 shows the number of images annotated by each rater and the rater agreement for each annotation pass and overall. The average agreement across all categories was κ=0.85 (95% CI=0.83, 0.87), which is considered almost perfect agreement. 32
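The Landis and Koch bands used to interpret these values can be written as a small lookup. This is a minimal sketch: the function name is illustrative, and rounding κ to 2 decimals before applying the bands is an assumption made here so that values between 0.80 and 0.81 fall into a defined band.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch agreement category."""
    k = round(kappa, 2)  # assumption: bands are applied to 2-decimal values
    if k < 0:
        return "poor"
    if k <= 0.20:
        return "slight"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "substantial"
    return "almost perfect"


# Values reported in the abstract.
for name, kappa in [("overall", 0.85), ("screen", 0.72),
                    ("nonscreen", 0.69), ("location", 0.91)]:
    print(f"{name}: kappa={kappa:.2f} -> {landis_koch_label(kappa)}")
```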

DISCUSSION
In this study, an annotation protocol to categorize wearable camera data of children into components of SB was applied. Substantial to almost perfect inter-rater reliability was found across 3 different raters for almost all annotation categories when applying the annotation protocol on a sample of 14 children. Substantial to almost perfect rater agreement was also found for the separate annotation passes of the protocol. This annotation protocol provides a tool that systematically guides raters through the categorization of wearable camera data. It was demonstrated that different raters can apply the same annotation protocol and achieve strong rater agreement, which demonstrates face validity. This supports the use of wearable cameras for assessing SB objectively.
A method to extract SB components from children's wearable camera data with high inter-rater reliability was developed. The only annotation passes with κ<0.81 were the sedentary nonscreen pass and the uncodable pass. 32 Self-reports have shown limited validity in the assessment of SB in children, 9 and accelerometers are not able to assess SB components, for example, the type of behavior. Wearable cameras offer a new possibility to assess SB components that are of strong interest because current evidence indicates that not all types of SB have the same negative health effects. 33,34 An annotation protocol was developed by applying the categories of the International Taxonomy of SB 7 and exploring wearable camera images observed in free-living environments.
A strength of this study was that the development of the coding protocol was guided by the structure of the Taxonomy of SB, over which there is a consensus among researchers for the description of SB components. 7 These components were adapted for child-specific behavior by exploring children's wearable camera data. This helped investigators to avoid missing activities that could be of relevance in children's behavior and are not described by the Taxonomy of SB. Another strength of the study is that the protocol was developed on the basis of wearable camera data of slightly younger children than those included in the independent reliability sample. Younger children's SB is more interrupted, and episodes of continuous SB are shorter. Interruptions of longer episodes of SB make wearable camera data more difficult to annotate and therefore offer a good sample to stress test the development of this annotation protocol. The good inter-rater reliability indicates that 2 independent raters obtain a comparable result when applying the coding protocol, independent of the age group. This annotation protocol can be used pragmatically in children aged ≥4 years.

Limitations
Not all SB components were assessed in this paper, such as purpose, posture, and social components. Nevertheless, this annotation protocol could assess different types of leisure, screen time, and environment, which can help improve the understanding of different components of SB. Although the annotation protocol does allow the annotation of multitasking behavior across different annotation passes, it does not facilitate this within an annotation pass. For example, eating while watching TV can be annotated as eating in the nonscreen sedentary pass and simultaneously as TV viewing in the screen time sedentary pass, but using the mobile phone while watching TV cannot be annotated simultaneously because these activities belong to the same pass. In this case, the primary activity has to be identified and annotated. The average wear time in this study was only approximately 6 hours per day because image capture was set to a high frequency, which resulted in shorter battery life. This represents a relatively low wear time, which means that some of the SB components might not be captured fully. Other studies have shown a median wear time of 10−13 hours per day when collecting wearable camera data in children. 19,20,35 Given the large amount of data that wearable cameras produce, it would be helpful to develop automated techniques to replace the labor-intensive manual annotation process. Recent advances in automated techniques for medical image annotation have shown promise, and similar techniques could be developed for wearable camera data. 36

CONCLUSIONS
An annotation protocol for the annotation of SB components in children was developed, and this study showed that it is reliable and has face validity. Once applied to larger samples in children, this protocol can help to better understand the potential harms and benefits of screen time and SB in children.