Articles

Real-Time Monitoring of Kindergarten Safety Using YOLO-11-Based Detection of Children and Adults

Ensuring the safety and well-being of children in kindergartens requires continuous monitoring of their interactions with caregivers and the surrounding environment, as even short periods of inattentiveness can lead to accidents or unnoticed risky behavior. In this work, we present a computer-vision-based monitoring system that uses an improved YOLO-11 object detection model to localize and classify adults and children in surveillance video streams in real time. Based on the detection results, the system infers whether each child is currently supervised or unsupervised, and whether a child is near predefined danger zones (such as exits, staircases, or other restricted areas) within the camera's field of view.

To support this task, a custom dataset was created and annotated with bounding boxes for the “child” and “adult” classes, using both publicly available images and video frames collected from kindergarten-like environments, covering different viewpoints, illumination conditions, and crowd levels. The YOLO-11 model was trained and evaluated using standard detection metrics (precision, recall, F1-score, and mAP) on separate training, validation, and test splits. In addition, a simple geometric reasoning module was implemented on top of the detector outputs to derive high-level safety events, such as “unsupervised child in the room” and “child entering a danger zone.”
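The abstract does not specify how the geometric reasoning module is built, so the following is only a minimal sketch of one way such rules could be written on top of detector outputs. The `Detection` class, the `safety_events` function, the rectangular zone format, the use of the bottom-centre of each box as a standing position, and the `max_adult_distance` pixel threshold are all hypothetical choices made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Detection:
    """One detector output (hypothetical format): class label plus pixel box."""
    label: str  # "child" or "adult"
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def foot_point(self):
        # Bottom-centre of the box, used as a rough proxy for where the person stands.
        return ((self.x1 + self.x2) / 2.0, self.y2)


def in_zone(point, zone):
    """Return True if point (x, y) lies inside an axis-aligned zone (x1, y1, x2, y2)."""
    x, y = point
    zx1, zy1, zx2, zy2 = zone
    return zx1 <= x <= zx2 and zy1 <= y <= zy2


def safety_events(detections, danger_zones, max_adult_distance=300.0):
    """Derive per-frame events as (child_index, event_name) tuples.

    A child counts as supervised if any adult stands within
    max_adult_distance pixels (an assumed, scene-dependent threshold).
    """
    adults = [d for d in detections if d.label == "adult"]
    children = [d for d in detections if d.label == "child"]
    events = []
    for idx, child in enumerate(children):
        cx, cy = child.foot_point
        supervised = any(
            hypot(cx - a.foot_point[0], cy - a.foot_point[1]) <= max_adult_distance
            for a in adults
        )
        if not supervised:
            events.append((idx, "unsupervised_child"))
        if any(in_zone((cx, cy), zone) for zone in danger_zones):
            events.append((idx, "child_in_danger_zone"))
    return events
```

In practice the distance threshold would depend on camera placement and perspective, and danger zones might be polygons rather than rectangles; both are simplified here to keep the rules readable.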

A prototype implementation demonstrates that the proposed approach can robustly separate adults and children, operate at real-time frame rates on GPU hardware, and automatically flag frames where a child remains alone or moves toward restricted areas, thus providing timely cues for caregivers. These preliminary results confirm the feasibility of applying modern YOLO-family detectors to real-time kindergarten safety monitoring and provide a practical foundation for further extensions toward action recognition (e.g., falling, aggression, social isolation), spatio-temporal behavior analysis, and affective state estimation in early childhood education settings.

A Self-Learning Object Detection Method Based on Highly Reliable Sample Mining

Visual object detection is an artificial intelligence technique that locates specific objects in images and is of great significance for practical applications. However, training a general object detection model requires many manually annotated images, which increases labour and time costs. To improve the adaptability of object detection models to changes in the data environment, this paper proposes a self-learning object detection system based on high-reliability sample mining. We first train a SampleNet that can better mine reliable training samples from unlabeled data. We then combine SampleNet with the basic object detection model to build a complementary residual training framework that continuously improves both sample mining ability and object detection performance during training. The experimental results show that SampleNet can stably provide reliable pseudo samples for model training, and that the complementary residual training framework improves the performance of basic object detection tasks.
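The abstract does not describe SampleNet's internals, so the sketch below substitutes a much simpler and commonly used stand-in, confidence-threshold filtering, purely to illustrate what mining reliable pseudo samples from unlabeled data means. The function name `mine_reliable_samples`, the prediction format, and the `min_score` value are assumptions for illustration and are not part of the proposed method.

```python
def mine_reliable_samples(predictions, min_score=0.9):
    """Select unlabeled images whose detections all look reliable.

    predictions maps an image id to a list of (label, score, box) tuples
    produced by the current detector (a hypothetical format). An image is
    kept only if it has at least one detection and every detection scores
    at least min_score; its detections then serve as pseudo labels.
    """
    pseudo_labels = {}
    for image_id, detections in predictions.items():
        if detections and all(score >= min_score for _, score, _ in detections):
            pseudo_labels[image_id] = [(label, box) for label, _, box in detections]
    return pseudo_labels
```

The selected images and their pseudo labels could then be added to the training set for the next round of detector training. A learned selector such as the SampleNet described above would replace the fixed threshold with a trained reliability judgment.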