1. Introduction
Economic and social progress in the Republic of Korea has raised the standard of living, which in turn has led to enormous amounts of waste from the growing consumption of goods. This rise in garbage levels creates a significant societal issue and also harms the environment [1]. Additionally, discarded household items, garbage, and construction waste produce foul odors and pollutants, ruining the urban landscape and threatening citizens’ health. To address this issue and develop a clean, garbage-free environment, the government implemented a volume-rate waste disposal system in 1995. In contrast to the previous program, which imposed incremental fees based on the size of the house or the rate of property tax, the new program uses a pricing model in which people bear a cost proportional to the volume of their garbage, encouraging them to voluntarily reduce waste and maximize the separate disposal of recyclable items [
2].
Waste eligible for volume-rate disposal corresponds to municipal waste generated by households and small enterprises. Standardized volume-based bags must be purchased to dispose of waste. By serving both as a motivation to minimize the effect of pollutants on health and the environment and as an economic incentive to improve waste disposal and raise citizens’ awareness, the volume-based garbage disposal system aims to convey the need to reduce illegal garbage dumping and to encourage citizens’ cooperation and participation [
3]. The method can lessen the burden and cost associated with collecting, transporting, and processing waste. However, illegal garbage dumping occurs regularly because of the inconvenience of having to purchase the standard garbage bags oneself and the difficulty of handling bulky waste. The number of uncovered cases of illegal garbage dumping in Seoul rose from 99,098 in 2014 to 128,144 in 2020, increasing year on year, and it is one of the numerous social problems that must be overcome [
4]. Notable in particular are the rising instances of unlawful rubbish disposal in non-standard bags, such as white disposable delivery plastic bags or black disposable plastic bags, as more take-out food deliveries take place. Such illicit dumping is steadily increasing in the absence of aggressive prosecution, necessitating different measures.
Watchpersons or government officials occasionally patrol to find illegal dumping, but covering wide areas in this way requires a large labor force. Closed-circuit television (CCTV) cameras have recently been installed in locations where unlawful dumping is concentrated, and they record video. However, the lack of manpower to conduct ongoing surveillance or review every recording makes it difficult to bring charges for illegal dumping [
5]. Another comparable technique employs CCTV and human-body detection sensors to play an audio warning to passers-by to promote awareness, but the alert is not triggered by actual illegal dumping, so it causes noise disturbances through frequent needless broadcasts. This approach may temporarily deter illegal dumpers psychologically but has limited impact in ending illegal dumping.
Figure 1 shows a currently deployed illegal dumping monitoring system, equipped with CCTV and audio broadcasts, surrounded by various forms of unlawfully dumped rubbish. This demonstrates the limitations of the current illegal dumping monitoring system despite the significant initial investment in it.
Recently proposed methods combine widely used deep-learning object detection technology with camera-based monitoring to detect illegal dumping. This approach can address the limitation of existing methods, which require significant manpower, and has the benefit of reducing unnecessary noise by lowering the false alarm rate. Min and Lee [
6] proposed a way of catching illegal garbage dumping using a deep neural network trained on the joints of persons that are collected by image processing. By separating dumping postures from the other non-dumping postures, their system determines whether dumping is legal or illegal. Bae et al. [
7] used the real-time object detection model You Only Look Once (YOLO) to learn the act of illegal dumping itself and defined observation and non-observation zones in order to lower the system’s false alert rate. The trained model detects an act of dumping and then identifies it as illegal only when the coordinates of the activity are within the observation zone. Jeong et al. [
8] used the Gaussian Mixture Model to examine object changes based on histogram differences. Their approach is based on the idea that, at the moment of dumping, the trash separates from the dumper. Kim et al. [
9] proposed a system that detects illegal dumping using probabilistic analysis of the object trajectory.
In summary, several techniques exist to track unlawful dumping using object detection and video analysis technologies based on convolutional neural networks (CNNs), as well as detection sensors. Nevertheless, Refs. [6,7] treat an act as illegal dumping whenever a non-dumping posture resembles a dumping posture, even in the absence of garbage in hand, thus raising frequent false alarms. To reduce such alarms, Refs. [7,9] designated an observation zone for illegal dumping. As a result, their systems cannot detect illegal dumping when it occurs outside of the surveillance zone and are susceptible to numerous missed detections. Moreover, Refs. [6,7,8,9] merely identify characteristic changes of a dumper or only differentiate standard from non-standard garbage bags, which may raise a false alarm even when garbage is in a standard bag. These issues remain to be addressed. Consequently, a more comprehensive monitoring system for unlawful dumping is required, one that goes beyond the dumping act itself or isolated, small surveillance zones.
This study proposes augmented illegal dumping monitoring (AIDM), a strategy that determines the distance between the dumper’s wrist and the garbage bag. Estimating the dumper’s wrist joint requires Single Person Pose Estimation, a method for estimating combinations of spatial dependence between body parts, which is largely divided into approaches based on a tree-structured graphical model [
10,
11] and a non-tree model [
12,
13]. Afterward, CNN was applied to increase the reliability of joint estimation [
14,
15]. However, when two people are detected on one screen, the precise joint of each person cannot be extracted, so research on Multi-Person Pose Estimation [
16,
17] has been actively conducted. Among them, the OpenPose [
18] model has been used in many fields and is adopted in this study because it extracts joint points at a relatively high speed, and the amount of computation does not increase significantly even as the number of people increases.
The proposed method uses the OpenPose model [
18] that can determine the articulation points of a person to extract the wrist joint and then uses the YOLO method [
19] to classify four types of garbage bags. Additionally, to reduce errors from the unwarranted calculation of the distance of the wrist joint to the already dumped garbage bag or the issue of not identifying the same garbage due to the change in frames, we implement a Simple Online Realtime Tracking with A Deep Association Metric (DeepSORT) [
20] that can keep track of multiple objects for tracking the garbage bag identifiers (IDs). We suggest an algorithm that can identify illegal dumping by keeping track of garbage bags that have already been dumped and those that are still to be dumped separately and deciding when the distance between the dumper’s wrist and the bag of trash is more than a certain threshold. The test findings demonstrate that our method of determining illegal dumping based on the distance of the actual dumper’s wrist to the garbage bag has better efficacy than other recently published methods that are based on behavior recognition or dumping zone designation. This research has the following contributions:
With improved detection performance, the proposed monitoring system for illegal dumping can reduce the noise caused by unnecessary audio warnings that result from the inaccuracies of the existing illegal dumping broadcasting system;
Using the object detection model YOLO, the proposed system can differentiate the standard bags that are legal for garbage dumping from non-standard bags. Also, the proposed technique can minimize errors of falsely recognizing dumping-like behavior as illegal dumping through OpenPose, which can extract the articulation points;
Our suggested method tracks the objects throughout the entire video without the use of specifically designated observation zones to evaluate whether illegal dumping happened;
By introducing the object tracking model DeepSORT, we give IDs to already dumped garbage and garbage held in a dumper’s hand and track the objects to detect illegal dumping, thus lowering the missed detection rate.
In this Section, we discussed the need for an illegal dumping monitoring system and the goal of the study. In
Section 2, we introduce the components of our illegal dumping monitoring system. In
Section 3, we describe the design process of the proposed system. In
Section 4, we describe the experimental conditions, testing, and results for the evaluation of the proposed system’s performance. In the last section,
Section 5, we conclude our research.
3. Proposed Architecture Design
In this section, we describe the detailed procedure for designing the proposed monitoring system that detects illegal dumping based on the distance between the potential dumper’s wrist joints found using OpenPose and the garbage bag location obtained through YOLO and DeepSORT. The block diagram in
Figure 3 shows the system schematically.
3.1. Extraction of the Articular Points of the Wrist Using OpenPose
The articular points of the person’s wrist are retrieved from the video I(t) of a possible dumper walking into the observation zone while holding the trash. To accomplish this, we input the given image to VGG-19 in OpenPose to generate a feature map, which is then used to generate a confidence map showing the locations of the joints and an affinity field representing the association between body parts. We detect illegal dumping based on the point in time when one of the extracted joints separates from the garbage. Although the fingers are closest to the trash, their joint coordinates often cannot be extracted because they are obscured by other objects, so the next closest joint, the wrist, is selected. As a result, of the 18 joint coordinates that are retrieved, we use only the wrist and the elbow and shoulder connected to it on each arm, and we disregard the remaining 12 coordinates that are beyond the area of interest. The shoulder, elbow, and wrist joints are displayed on the screen separately for the left arm and the right arm. Finally, the joint coordinates of the left wrist and of the right wrist are estimated.
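The joint selection described above can be sketched as follows. This is a minimal illustration, not the authors’ implementation: it assumes the 18-keypoint COCO index layout commonly output by OpenPose (right shoulder, elbow, wrist at indices 2 to 4 and left shoulder, elbow, wrist at indices 5 to 7), and the function name `extract_arm_joints` and the confidence cut-off `min_conf` are hypothetical.

```python
# Assumed index layout of the 18-keypoint COCO output (not stated in the text).
ARM_JOINTS = {
    "right": {"shoulder": 2, "elbow": 3, "wrist": 4},
    "left": {"shoulder": 5, "elbow": 6, "wrist": 7},
}


def extract_arm_joints(keypoints, min_conf=0.1):
    """Keep only the six arm joints from 18 (x, y, confidence) triples.

    Returns {side: {joint: (x, y) or None}}; a joint is None when its
    confidence is below min_conf (a hypothetical cut-off).
    """
    if len(keypoints) != 18:
        raise ValueError("expected 18 keypoints")
    arms = {}
    for side, joints in ARM_JOINTS.items():
        arms[side] = {}
        for name, idx in joints.items():
            x, y, conf = keypoints[idx]
            arms[side][name] = (float(x), float(y)) if conf >= min_conf else None
    return arms


# Example: one person whose right wrist is visible and left wrist is occluded.
person = [(0.0, 0.0, 0.0)] * 18
person[4] = (100.0, 200.0, 0.9)   # right wrist, confidently detected
person[7] = (50.0, 60.0, 0.05)    # left wrist, below the cut-off
arms = extract_arm_joints(person)
```

The remaining 12 keypoints are simply never read, which mirrors the text’s choice to disregard joints outside the area of interest.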
3.2. Tracking the Garbage Bag Using YOLO and DeepSORT
To identify the garbage bag held by the potential dumper, we employ the real-time object detection model YOLO to obtain the bounding box of the garbage bag as the identified object. Then, from the bounding box, we extract the top centroid. Furthermore, to identify illegal dumping in real time, we employ DeepSORT to determine whether the object in the previous frame and the object in the current frame are the same. Here, the Kalman filter, the matching cascade, and the IoU matching [20] are conducted recursively to determine the similarity between objects. The IoU matching finally assigns an ID to the object using three states: matched tracks, for objects that are being tracked continuously; unmatched detections, for designating a newly discovered object as the final object; and unmatched tracks, for assigning a temporary status when a tracked object is not found and tracking cannot continue. Here, the ID contains the type of the detected object and the order in which the objects are detected. This enables the continuous recognition of the same garbage bag even when it is occluded by other obstacles. Moreover, it is possible to suppress the ID switching that may occur when multiple garbage bags, rather than a single bag, are moving. Accordingly, a detected garbage bag keeps the same ID even after it is dumped, making a judgment on illegal dumping possible.
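The top centroid and the ID composition described above can be sketched as follows. The box format `(x_min, y_min, x_max, y_max)`, the reading of the top centroid as the midpoint of the box’s top edge, and the `type-order` ID string are all assumptions made for illustration; the names `top_centroid` and `BagIdAssigner` are hypothetical and are not part of YOLO or DeepSORT.

```python
from itertools import count


def top_centroid(box):
    """Midpoint of the top edge of a box given as (x_min, y_min, x_max, y_max).

    Image coordinates are assumed, with y growing downward, so the top edge
    is the one with the smaller y value.
    """
    x_min, y_min, x_max, _ = box
    return ((x_min + x_max) / 2.0, float(y_min))


class BagIdAssigner:
    """Hands out IDs that record the bag type and the order of first detection."""

    def __init__(self):
        self._order = count(1)

    def new_id(self, bag_type):
        return f"{bag_type}-{next(self._order)}"


centroid = top_centroid((10, 20, 30, 60))
assigner = BagIdAssigner()
first_id = assigner.new_id("trashBLK")
second_id = assigner.new_id("trashAUT")
```

In a full pipeline the tracker would decide when a detection is new; this sketch only shows how an ID could carry both the class label and the detection order.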
3.3. Discriminator for the Determination of Illegal Dumping
As described above, to determine the illegality of the garbage bag held by the potential dumper, we compute the Euclidean distance between each of the left and right wrist joint coordinates obtained from OpenPose and the top centroid of the bounding box obtained from YOLO.
As the final step, we check whether the left-wrist or right-wrist distance calculated per frame exceeds the pre-defined threshold to evaluate whether the garbage bag being tracked is dumped illegally. When the wrist distances are below the threshold, we set the object ID to 1 to indicate that the potential dumper has the garbage bag. The ID remains 1 as each frame is examined until the point of garbage bag dumping. By contrast, for garbage bags that are already dumped, both wrist distances surpass the threshold. As a result, we set the object ID to zero (0) to indicate that the garbage bag is not held by the dumper. Thus, immediately after the garbage bag is dumped, that is, when the distance exceeds the threshold, a judgment is made that the object is dumped, the ID changes from 1 to 0, and the alarm is triggered. Furthermore, as the already dumped garbage bags are detected and set to 0, they are not falsely identified as being held by the dumper even when the dumper’s wrist gets close to the garbage bag.
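The decision rule in this subsection can be sketched as a small per-bag state machine. This is an illustrative reading of the text rather than the authors’ code: the class name `DumpDiscriminator`, the pixel threshold of 50, the choice to treat a bag as held when at least one wrist is within the threshold, and the choice to skip frames in which no wrist is detected are all assumptions.

```python
import math

HELD, DUMPED = 1, 0  # the two ID states described in the text


def wrist_bag_distance(wrist, centroid):
    """Euclidean distance between a wrist joint and the bag's top centroid."""
    return math.hypot(wrist[0] - centroid[0], wrist[1] - centroid[1])


class DumpDiscriminator:
    def __init__(self, threshold):
        self.threshold = threshold
        self.state = {}  # track ID -> HELD or DUMPED

    def update(self, track_id, wrists, centroid):
        """Process one frame for one bag; return True only when it was just dumped."""
        dists = [wrist_bag_distance(w, centroid) for w in wrists if w is not None]
        if not dists:
            return False  # assumption: no visible wrist means no decision this frame
        near = any(d <= self.threshold for d in dists)
        previous = self.state.get(track_id)
        if previous is None:
            # First sighting: held if a wrist is close, otherwise already dumped.
            self.state[track_id] = HELD if near else DUMPED
            return False
        if previous == HELD and not near:
            self.state[track_id] = DUMPED  # state changes from 1 to 0
            return True
        return False  # DUMPED is sticky, even if a wrist comes close again


disc = DumpDiscriminator(threshold=50.0)
wrists = [(100.0, 100.0), (300.0, 100.0)]
held = disc.update("trashBLK-1", wrists, (110.0, 110.0))      # bag in hand
dumped = disc.update("trashBLK-1", wrists, (110.0, 300.0))    # bag left behind
again = disc.update("trashBLK-1", [(110.0, 290.0)], (110.0, 300.0))
```

Keeping the dumped state sticky reflects the text’s point that bags already on the ground should not be mistaken for held bags when a wrist passes nearby.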
4. Experimental Results
To assess the performance of the proposed illegal dumping monitoring method, we considered eight scenarios resembling actual instances of illegal dumping, including dumping with one hand, dumping with both hands, dumping without bending the waist, and carrying garbage in hand without yet dumping it, and gathered data for each. Furthermore, to compare performance against existing garbage dumping monitoring techniques, we included the approach [7] that learns dumping postures to decide on illegal dumping and the method Post+det, which learns the garbage bags as well as the dumping postures.
4.1. Experimental Environment
The proposed illegal garbage dumping monitoring system was implemented on a machine with an NVIDIA GeForce GTX 1060 Ti GPU and an Intel Core i7-8700 CPU. To train YOLOv4 for real-time object detection, we collected illegal dumping videos for each scenario using a Logitech C920 PRO HD camera. The dataset consists of videos simulating actual illegal dumping scenes, with 30 videos of about 10 s each for every scenario.
Commonly dumped garbage includes black plastic bags, white plastic bags, and paper bags containing general garbage, as well as volume-based bags that are recommended to be used. We selected four types of bags that are dumped the most, as shown in
Figure 4a, to simulate actual dumping scenes under the environment in
Figure 4b. We labeled the black plastic bag trashBLK, the white plastic bag trashWHT, the paper bag trashPBG, and the standard bag trashAUT. For the YOLOv4 training, we utilized a total of 12,891 images, with the image size set to
, the batch size to 8, and the maximum number of batches to 15,000. A single image may contain several objects: across all images, there are 13,186, 16,147, 15,611, and 11,711 instances of trashBLK, trashWHT, trashPBG, and trashAUT, respectively.
4.2. Evaluation of Object Detection Performance
We used the average precision (AP) as the performance indicator for assessing the object detection model YOLOv4, which is trained on the different types of collected garbage bags. To denote the model’s performance as a single numerical value, we utilized the precision-recall curve and the accuracy to evaluate the confidence of the objects identified by the model. Precision is the rate of the correctly detected objects among the detected objects, recall is the rate of the detected objects among all the objects that should be detected, and accuracy is the rate of the correctly detected objects among all the objects, as demonstrated below [6]:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad \mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$

where the True Positive (TP) means the object that should be identified is correctly detected, the False Positive (FP) means the object that should not be detected is wrongly detected, the False Negative (FN) means the object that should be detected is not detected, and the True Negative (TN) means the object that should not be detected is not detected. Seven hundred and ninety-eight images were used to evaluate object detection, and the results are shown in Table 1.
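The three indicators defined above can be computed directly from the four counts. The following is a minimal sketch; the function name `detection_metrics` and the example counts are hypothetical and are not taken from Table 1.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts.

    A ratio with an empty denominator is reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts, for illustration only.
precision, recall, accuracy = detection_metrics(tp=8, fp=2, fn=1, tn=9)
```

With these example counts, precision is 8/10, recall is 8/9, and accuracy is 17/20.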
As illustrated in the table, when the IoU is 0.5, the detection performance indicator, AP, for each class is mostly above 99%, while the average indicator, meanAP (mAP), for all classes is 99.38%, indicating that the model can classify all four objects with high accuracy. However, trashBLK indicates a lower precision than the other types of garbage bags due to the occasional false recognition of a person’s black hair or shoes.
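The IoU criterion of 0.5 mentioned above decides whether a predicted box counts as a correct detection of a ground-truth box. A minimal sketch follows, assuming boxes in `(x_min, y_min, x_max, y_max)` format; the function names `iou` and `is_true_positive` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, iou_threshold=0.5):
    """A prediction matches the ground truth when the IoU reaches the threshold."""
    return iou(pred_box, truth_box) >= iou_threshold


overlap = iou((0, 0, 2, 2), (1, 0, 3, 2))    # half-overlapping boxes
identical = iou((0, 0, 2, 2), (0, 0, 2, 2))
disjoint = iou((0, 0, 1, 1), (5, 5, 6, 6))
```

Two equal-sized boxes that overlap by half of their width share one third of their union, so they would not count as a match at the 0.5 threshold.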
4.3. Evaluation of the Illegal Dumping Monitoring Performance
The data gathered for the evaluation covers the four types of garbage bags described previously. As shown in
Figure 5, we developed eight different dumping scenarios, S1 through S8, which resemble real instances of garbage dumping.
The proposed AIDM determines illegal dumping based on the distance (
) between the wrist joints of a dumper and the detected object, not the dumping posture. To achieve this, we established a threshold (
) to
, taking into account the installation angle and the distance between the camera and the visible object. To verify the utility of the proposed method, we performed a comparison against the existing monitoring techniques: the technique [
7] that determines whether illegal dumping has occurred solely based on a dumping posture with the body bent forward, and the technique, Post+det, that monitors illicit dumping through the detection of garbage and dumping postures. The test results are reported in
Table 2 in terms of the reliability of the determination of illegality at the site of dumping using the scenarios S1 to S8.
As can be seen from the comparison, [7] recorded a lower accuracy in scenarios S1, S4, S5, and S7 because it determines whether dumping is legal by learning the shapes of the dumpers rather than the garbage bags, in contrast to the Post+det and the AIDM, which can identify the standard bags that can be legally dumped. Furthermore, the Post+det demonstrates a higher detection performance overall than [7]. However, it occasionally failed to detect suspicious dumping actions, leading to lower accuracy in scenarios S2, S3, and S6. Particularly for S7, it failed to detect anything since the garbage dumping occurred without bending the body. In contrast, the proposed model achieved at least 93% accuracy in identifying illegal dumping in all the scenarios, indicating that it is a stable illegal dumping monitoring system. On the whole, the average accuracies of [7], the Post+det, and the AIDM for detecting illegal dumping are 0.43, 0.63, and 0.97, respectively. Therefore, the proposed AIDM has a more robust and improved detection performance than the existing methods.
Figure 6 shows the test results for scenario S4, where a legal volume-based waste bag is dumped with one hand. From top to bottom, the results are taken from each time point of
-,
-,
-, and
-seconds. At
s, the dumper is shown walking with the garbage in hand to the designated dumping site. In
Figure 6a, there is no change since the dumper has to bend his body for the dumping to be detected as such. In
Figure 6b, the system found the legal standard bag trashAUT, and in
Figure 6c, it concurrently located the person’s joints and detected trashAUT. The dumper dropped the trash bag at the 3T/4-s point. [
7] detected the dumping posture only and not the type of garbage bag, identifying it as illegal and indicating the red alarm. On the other hand, the Post+det and the AIDM can differentiate the standard bag, showing the green alarm after detecting the dumping action and deeming it legal. The
-second mark is the moment right before the dumper departs the site after dumping the garbage. The alarm was no longer displayed in [
7] and the Post+det for garbage dumping as the dumper stopped bending their body, whereas the AIDM kept the green alarm as the garbage bag discarded by the dumper had a unique ID.
Figure 7 additionally demonstrates the test results for scenario S7, where the dumper dumps the non-standard garbage bags without bending their body. Similar to the above instances, at
s, [
7] did not identify anything, while the Post+det detected three types of garbage bags, trashBLK, trashWHT, and trashPBG. The AIDM found the person’s articular points and, like the Post+det, detected all three types of garbage bags. At the
-second mark, in which the garbage is dumped, [
7] and the Post+det failed to detect a dumping action as the dumper did not bend their body. On the other hand, the AIDM identified the non-standard garbage bag and determined that the distance from the wrist to the bag was above the threshold, thus deeming it unlawful and showing the red alarm.