Essay
Transfer Learning in the Transformer Model for Thermal
Comfort Prediction: A Case of Limited Data
Xin Zhang 1 and Peng Li 2, *
1 College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China;
xinzhang@haut.edu.cn
2 Beijing Institute of Technology, Beijing 100081, China
* Correspondence: lipeng360@bit.edu.cn
Abstract: The HVAC (Heating, Ventilation, and Air Conditioning) system is an important component
of a building’s energy consumption, and its primary function is to provide a comfortable thermal
environment for occupants. Accurate prediction of occupant thermal comfort is essential for improv-
ing building energy utilization as well as health and work efficiency. Therefore, the development of
accurate thermal comfort prediction models is of great value. Deep learning based on data-driven
techniques has excellent potential for predicting thermal comfort due to the development of artificial
intelligence. However, the inability to obtain large quantities of detailed thermal comfort labeling
data from residents presents a substantial challenge to the modeling endeavor. This paper proposes
a building-to-building transfer learning framework to make deep learning models applicable in
data-limited interior building environments, thereby resolving the issue and enhancing model pre-
dictive performance. The transfer learning method (TL) is applied to a novel technology dubbed
the Transformer model, which has demonstrated outstanding performance in data trend prediction.
The model exploits the spatiotemporal relationship of data regarding thermal comfort. Experiments
are conducted using the source dataset (Scales project dataset and ASHRAE RP-884 dataset) and the
target dataset (Medium US office dataset), and the results show that the proposed TL-Transformer
achieves 62.6% accuracy, 57% precision, and a 59% F1 score, and the prediction performance is better
than other existing methods. The model is useful for predicting indoor thermal comfort in buildings
with limited data, and its validity is verified by experimental results.
Keywords: HVAC; thermal comfort; buildings; energy efficiency; deep learning; transfer learning
Citation: Zhang, X.; Li, P. Transfer Learning in the Transformer Model for Thermal Comfort
Prediction: A Case of Limited Data. Energies 2023, 16, 7137. https://doi.org/10.3390/en16207137
Academic Editors: Hom Bahadur Rijal, Manoj Kumar Singh and Sally Shahzad
Received: 20 September 2023
Revised: 14 October 2023
Accepted: 16 October 2023
Published: 18 October 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Buildings account for roughly 40% of the globe’s energy consumption and 30% of its greenhouse gas
emissions [1]. In the built environment, heating, ventilation, and air conditioning (HVAC) systems
are crucial in producing comfortable and healthy living conditions [2], even in environments
without HVAC systems, and the choice of heating system is particularly important for energy and
thermal comfort [3]. They are a significant contributor to building energy consumption. Despite
their high energy consumption, HVAC systems frequently fail to provide residents with ideal thermal
comfort [4]. In various building areas, occupants may experience vastly different localized
temperatures due to differences in personal preferences and environments. Even under identical
thermal conditions, occupants’ thermal preferences can vary. There is a significant divergence
between energy consumption and occupant satisfaction. Not only that, but thermal storage systems in
buildings can have a significant impact on thermal comfort and energy consumption [5]. For example,
utilizing latent heat storage in phase change materials (PCMs) can provide thermal comfort and
achieve sufficient energy savings even in the absence of an HVAC system [6]. Indeed, inefficient
HVAC control systems are now in place to regulate environmental comfort [7]. However, the
implementation of such tactics necessitates a substantial energy expenditure.
Thermal comfort prediction modeling for HVAC control fills the comfort and energy efficiency gap.
In actuality, thermal comfort prediction modeling uses occupant comfort as a criterion for the
energy output saved by controlling HVAC equipment [8]. In addition, accurate comfort prediction
increases residents’ productivity and health indices. Extensive experiments have been performed to
predict the thermal comfort of building occupants, with Fanger et al.’s Predicted Mean Vote (PMV)
model being one of the most prominent [9]. The PMV model is devised based on the theory of human
heat balance and a large corpus of experimental data, incorporating six variables to calculate the
state of human heat balance (see Figure 1).
The Transformer model is widely applicable and is regarded as one of the most promis-
ing models recently proposed. Wen et al. [20] explored the performance of Transformer in
time series forecasting tasks. They discovered that the Transformer, which relies on large
amounts of training data, can handle long-range dependencies and is readily extensible to
other tasks.
In this study, a model called Transfer Learning Transformer (TL-Transformer) is pro-
posed that captures temporal and spatial correlations in the data and uses transfer learning
to train the model on the source dataset to improve the performance of thermal comfort
prediction on the target dataset when the target dataset is limited. The ASHRAE RP-884
and Scales Project datasets were selected as source datasets, while the Medium US Office
dataset [18] was chosen as the target dataset. The advantages of the proposed method are
the low cost of prediction (low-cost data) and easy scalability to built environments with
limited data sets. Numerous tests on each of these publicly available datasets demonstrate
that the proposed thermal comfort model outperforms well-known knowledge-based and
data-driven models. The following are some contributions to this study:
1. This is the first time a Transfer Learning-based Transformer (TL-Transformer) model
has been used to accurately predict the thermal comfort of a building in the presence
of limited modeling data from diverse climate zones.
2. The TL-Transformer model employs a particular data succession strategy to capture
the temporal and spatial relationships within the input data, thereby facilitating
effective modeling.
3. We investigate the effect of different feature sets on thermal comfort modeling. The
most accurate predictive performance for the transfer learning-based thermal comfort
model was achieved by combining personal factors, outdoor environmental factors,
and the six PMV model factors.
4. The experiments demonstrate that the proposed TL-Transformer model for predicting
thermal comfort surpasses the knowledge-driven and data-driven models and can be
applied to buildings with less labeled thermal comfort data.
2. Related Work
In this section, we review the relevant prior literature on traditional thermal
comfort modeling approaches, transfer learning applications, and Transformer model
applications that demonstrate the value of this study.
the actual needs of the occupants while also assisting in the establishment of individualized
indoor thermal comfort environments for diverse individuals. Similarly, Peng et al. [25]
created a platform to mimic distinct thermal feelings in occupants and developed a hybrid
SVM-LDA thermal comfort classifier. They compared their classifier to various machine
learning methods for thermal comfort prediction, and the findings revealed that the model
performed the best. One reason for this conclusion is that SVMs perform well with small
sample sizes, which is one of the key reasons why SVMs are so popular. Zhang et al. [26]
proposed an improved random forest algorithm-based indoor thermal comfort model for
offices. They used data from an intelligent building monitoring system to assess thermal
comfort in various thermal environments. The researchers used the K-means algorithm
to select decision trees with low similarity and combined them into a new random forest
model based on the experimental results. This new model performed well on the test
set, and the random forest-based prediction model outperformed other machine learning
algorithm models regarding generalization. Hu et al. [27] successfully applied emerging
machine learning techniques and extensive IoT-based sensing technologies to create a
black-box MLP neural network for thermal comfort modeling. Excitingly, this network
demonstrated superior predictive performance compared to PMV and traditional white-box
machine learning models. The reason for this success can be summarized by the fact that
the MLP neural network has a powerful expressive capability that can be trained by back-
propagation algorithms, which enables automatic learning of features and patterns, further
enhancing its effectiveness in thermal comfort prediction. However, the aforementioned
thermal comfort modeling approaches require occupants’ private information, such as
physiological and environmental data, which may result in a privacy breach. Nevertheless,
compared with conventional PMV models, machine learning algorithms predict thermal
comfort more accurately and are more adaptable.
Deep learning models have distinct advantages over machine learning in thermal
comfort prediction. Deep learning can automatically learn features in complex thermal
comfort environments, handle complex unstructured data, adapt to multiple tasks and
data distributions, and have a highly flexible model structure that can efficiently handle
large-scale data and achieve high-precision predictions. Deep learning can now excel at
predicting thermal comfort.
Artificial Neural Networks (ANN), as a non-traditional algorithm to learn and repre-
sent data via a multi-level neural network structure, excel at predicting thermal comfort.
Sanjeev et al. [28] proposed a new Bayesian optimization algorithm based on neighborhood
component analysis for developing a heat index prediction model. The study’s findings
show that the artificial neural network (ANN) model accurately and reliably predicts heat
perception in real-time environments. Furthermore, because building thermal comfort data
exhibits spatiotemporal relationships, time series prediction models in deep learning have
become a powerful tool for building thermal comfort prediction. Chennapragada et al. [29]
designed the thermal preference prediction task as a multivariate, multi-class classification
problem and used deep learning and time series methods for prediction via an LSTM net-
work. The results show that this approach outperforms state-of-the-art machine learning
methods for heat preference prediction when applied to the same task. This highlights the
significant advantages of deep learning time series models when dealing with spatially and
temporally related building thermal comfort data. Similarly, Furkan et al. [30] introduced a
Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) architecture that
combines the specific feature extraction of the convolutional layer with the ability of Long
Short-Term Memory (LSTM) to learn sequential dependencies, with significantly improved
prediction performance compared to the comparison model.
crucial to performance. Obtaining comprehensive and accurate training data can be chal-
lenging in thermal comfort prediction. Transfer learning can reduce the need for large
amounts of labeled data by transferring knowledge from existing tasks to new ones. Utiliz-
ing data and models from the source domain can lead to better performance on the target
domain, even if the target domain has relatively little data. With transfer learning, a model
can learn common feature representations or patterns from one or more related tasks, thus
improving generalization on new tasks. Transfer learning can help models better compre-
hend and capture shared structure in the data, thereby improving predictive performance.
Hansaem et al. [31] developed a deep learning model based on transfer learning to address
the issue of poor generalization performance due to insufficient datasets of individual target
subjects. The study’s findings show that the model performs better on the target dataset
after training on the source dataset. Furthermore, the transfer learning-based model can
maintain high prediction accuracy even when the target building’s dataset is imbalanced.
To address the difficulty of generalizing their developed model to other building occupants,
Das et al. [32] proposed a transfer learning framework that employs Adversarial Domain
Adaptation (ADA) techniques to develop thermal comfort predictors personalized for the
target occupants in an unsupervised manner. The study’s findings show that the model
trained using transfer learning can accurately predict thermal comfort for most populations.
Similarly, Somu et al. [33] proposed a transfer learning-based Convolutional Neural
Network-Long Short-Term Memory network (TL CNN-LSTM) to model
spatio-temporal relationships in thermal comfort data. They also used the transfer learning
technique to address the issue of insufficient data. The study’s findings show that this
model has good accuracy in thermal comfort prediction, highlighting the effectiveness
of transfer learning in dealing with spatio-temporal relationships and data insufficiency
issues in thermal comfort data.
the problem of poor model performance due to insufficient data, which is a significant inno-
vation. Second, the Transformer model is used innovatively to capture the spatio-temporal
relationship of thermal comfort data, improving prediction accuracy and reliability. Finally,
the model uses data that is more readily available, less expensive to collect, and takes full
account of user privacy, making it more feasible and acceptable for practical applications.
3. Methodology
3.1. Problem Statement
Thermal comfort refers to the level of comfort or discomfort the human organism
perceives. The thermal sensations of residents are measured on a 7-point scale, which
includes ‘Very cold’ (−3), ‘Cold’ (−2), ‘Slightly cool’ (−1), ‘Neutral’ (0), ‘Slightly warm’
(+1), ‘Hot’ (+2), and ‘Very hot’ (+3) [36]. Because votes of −3 and +3 occur only rarely,
we merged them with −2 and +2, respectively, reducing the thermal comfort scale from
7 to 5 points. This transformation seeks to enhance the precision of thermal measurements
and equalize the distribution of data across categories.
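The merging step can be illustrated with a minimal sketch. The function name `merge_to_five_point` and the sample votes are hypothetical and serve only to show how the two extreme categories collapse into their neighbors; they are not taken from the datasets used in this paper.

```python
from collections import Counter


def merge_to_five_point(vote):
    """Map a 7-point thermal sensation vote (-3..+3) onto the 5-point range (-2..+2)."""
    if not -3 <= vote <= 3:
        raise ValueError("vote must lie on the 7-point scale (-3..+3)")
    # -3 is folded into -2 and +3 into +2; all other votes are unchanged.
    return max(-2, min(2, vote))


# Hypothetical votes, for illustration only.
votes = [-3, -2, -1, 0, 0, 1, 2, 3]
merged = [merge_to_five_point(v) for v in votes]
counts = Counter(merged)  # class frequencies after merging
```

After merging, the counts of the outermost classes include the rare extreme votes, which is what reduces the class imbalance at the ends of the scale.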
After numerically representing the thermal comfort sensation, it is possible to reduce
the thermal comfort prediction problem to a numerical classification problem. The thermal
comfort prediction model for transfer learning [37] proposed in this paper consists of
multiple source domains $D_s(d_{s1} \ldots d_{sn})$ and a single target domain $D_t$, with the source
and target domains expressed as follows:
$$d_s = \{a, P(X)\} \quad (1)$$
Figure 2. Architecture of transfer learning.
ASHRAE RP-884 dataset: Comprises 25,000 raw thermal comfort records containing
data from various climates and buildings in various regions of the world.
Scales project dataset: This dataset includes thermal comfort ratings from 8225 participants
in 57 localities and 30 countries. It contains 137 variables related to thermal comfort.
Medium US Office dataset: This dataset contains thermal comfort information collected
from twenty-four people at the Friend Center office building in Philadelphia,
Pennsylvania, United States. The data were recorded three times daily (morning, noon,
and afternoon) over two weeks for each of the four seasons.
Figure 3 displays the thermal sensation distribution of the datasets. The three aforementioned
datasets have a similar distribution of thermal sensations, with occupants feeling
comfortable or only slightly uncomfortable with the thermal environment the majority of
the time and uncomfortable very infrequently, which is consistent with our lived experience.
The Scales Project dataset and the ASHRAE dataset exhibit greater diversity compared to
the Medium US Office dataset due to their inclusion of a broad range of data originating
from various climatic zones across the globe. In contrast, the dataset pertaining to the
Medium US Office comprises exclusively data derived from a solitary building.
The air temperature is among the most influential factors in residents’ thermal comfort.
Figure 4 illustrates the correlation between thermal sensation votes and indoor air
temperature: higher indoor air temperatures correspond to higher thermal sensation ratings.
Other factors also influence thermal sensation, including age, gender, relative humidity,
and outdoor weather. Therefore, we should use as much information as
possible to build more accurate and reliable models for predicting thermal comfort.
According to the analyses of the above datasets, the Medium US Office dataset differs
significantly from the ASHRAE RP-884 and Scales Project datasets. This is because the buildings
in each of the datasets are situated in different climatic regions (including temperate,
tropical, subtropical, and Mediterranean climate zones),
and these different climatic regions result in different perceptions, activities, and personal
preferences, which all influence how thermally comfortable a building’s occupants feel.
Figure 4. Boxplots of thermal sensation and the indoor temperature: (a) ASHRAE RP-884; (b) The
Scales Project; (c) Medium US Office.
(1) Indoor environmental factors: wind speed, air temperature, relative radiant temperature,
and relative humidity are the indoor factors that influence thermal comfort
most directly. Wind speed is the airflow rate relative to a stationary point in
a room; air temperature is the mean temperature of the air in a room; relative
humidity is the ratio of the air’s water vapor pressure to the saturated water
vapor pressure at the same temperature; and relative radiant temperature is a
comparison of the radiant temperature of the human body’s surface to the ambient
temperature, which is used to assess the body’s ability to regulate its temperature.
(2) Outdoor environmental factors: The outdoor air temperature and relative humidity
substantially impact the thermal comfort of indoor occupants. For instance, during
the frigid winter months, the outdoor environment has a direct effect on the thermal
comfort of a building.
(3) Individual factors: age, gender, clothing insulation coefficient, and metabolic rate
are crucial factors that affect thermal comfort; they have a substantial effect on the
regulation of the human body’s thermal balance and the generation of thermal sensation.
People of different ages and genders have different metabolic rates and, according
to personal preference, wear clothing with different insulation coefficients. These
individual differences have an impact on thermal comfort.
Figure 5. Principles of the Adaptive Synthetic Sampling (ADASYN) algorithm.
Initially, an evaluation was conducted to determine the extent of category imbalance
by calculating the proportion between the number of minority group samples and the
number of majority category samples. If the degree of imbalance exceeds a pre-established
threshold, the quantity of synthetic data generated is decided to achieve data balance.
After calculating the ratios between samples belonging to each minority category and
samples belonging to the majority category, these ratios are normalized to generate a
density distribution. The determination of the quantity of synthetic data to be generated
for each sample is achieved by integrating the normalized ratios with the overall amount
of synthetic data. During the generation phase, synthetic data is produced by calculating
a weighted difference vector. This is achieved by picking samples closest to the minority
category samples, thereby maintaining the essential attributes of the original data. This
approach has the potential to mitigate the uneven distribution of data and offer more
comprehensive data for the model, hence enhancing the model’s generalization capability
and performance.
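The steps described above can be sketched for a single minority/majority pair as follows. This is a simplified, pure-Python illustration rather than the implementation used in the experiments: the function name `adasyn_sketch`, the neighborhood size `k`, the imbalance threshold `d_th`, the balance level `beta`, and the Euclidean distance are all assumptions introduced here, and the multi-class setting of the paper would repeat the procedure for each minority category.

```python
import math
import random


def adasyn_sketch(minority, majority, k=3, d_th=0.75, beta=1.0, seed=0):
    """Return synthetic minority samples (lists of floats) for a two-class problem."""
    rng = random.Random(seed)
    m_s, m_l = len(minority), len(majority)
    # Degree of imbalance: minority count over majority count.
    if m_s < 2 or m_s / m_l >= d_th:
        return []  # balanced enough (or too few points to interpolate)
    # Total number of synthetic samples needed to reach the desired balance.
    total = round((m_l - m_s) * beta)
    labelled = [(p, 0) for p in minority] + [(p, 1) for p in majority]
    # For each minority sample: share of majority points among its k nearest neighbours.
    ratios = []
    for i, x in enumerate(minority):
        others = sorted(
            ((math.dist(x, p), lab) for j, (p, lab) in enumerate(labelled) if j != i),
            key=lambda t: t[0],
        )
        ratios.append(sum(lab for _, lab in others[:k]) / k)
    norm = sum(ratios)
    if norm == 0:
        return []  # no minority sample lies near the majority class
    # Normalise the ratios into a density distribution and allocate per-sample counts.
    counts = [round(r / norm * total) for r in ratios]
    synthetic = []
    for i, x in enumerate(minority):
        neighbours = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(x, p),
        )[:k]
        for _ in range(counts[i]):
            nb = rng.choice(neighbours)
            lam = rng.random()
            # Weighted difference vector between the sample and a minority neighbour.
            synthetic.append([a + lam * (b - a) for a, b in zip(x, nb)])
    return synthetic
```

Because each synthetic point is an interpolation between two existing minority samples, it stays within the region already occupied by that category, which is how the essential attributes of the original data are preserved.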
3.5. Evaluation Criteria
This experiment explores using a Transformer based on transfer learning for model-
ing thermal comfort prediction. Three evaluation metrics, including accuracy, precision,
and F1 score, are employed to compare the performance of the proposed model with
other models [38].
Accuracy: Accuracy is one of the most frequently employed evaluation metrics in
classification prediction. It is the proportion of samples correctly predicted by
a model and measures how well the model performs on the classification task as
a whole. However, accuracy is less informative about minority categories when dealing
with unbalanced datasets.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (3)$$
where TP represents the number of true positive predictions, TN represents the number of
true negative predictions, FP represents the number of false positive predictions, and FN
represents the number of false negative predictions.
Precision: Precision measures the proportion of true positive predictions among the
samples predicted as positive. It helps identify the occurrence of false positives in the
model’s positive predictions. Precision is an important evaluation metric in scenarios where
reducing false positives is crucial.
$$\text{Precision} = \frac{TP}{TP + FP} \quad (4)$$
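As a small worked illustration of these metrics, the sketch below computes accuracy, precision, and F1 score for a multi-class prediction task. The function name `classification_metrics` is hypothetical, the F1 score uses its standard definition as the harmonic mean of precision and recall, and averaging the per-class values with equal weight (macro averaging) is an assumption made here; the averaging scheme used in the experiments is not stated in this section.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro F1) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, f1_scores = [], []
    for c in labels:
        # One-vs-rest counts for class c.
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        f1_scores.append(f1)
    return accuracy, sum(precisions) / len(labels), sum(f1_scores) / len(labels)


# Toy labels, for illustration only.
acc, prec, f1 = classification_metrics([0, 0, 1, 1], [0, 1, 1, 1])
```

In the toy example, one sample of class 0 is misclassified as class 1, so class 1 gains a false positive (lower precision) while class 0 gains a false negative (lower recall), and the F1 score reflects both effects.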
Figure 6. Architecture of the Transformer model.
Encoder: The encoder architecture consists of an input layer, a positional coding layer,
and four coding layers that are identical. Initially, the input layer transforms the symbols
or tokens (e.g., words, characters, etc.) of the input sequence into a compact, continuous
vector representation. In this experiment, the input data is the data collected by the sensor,
and the letter ‘t’ in the figure denotes the input data, where the application of a
multi-attention mechanism is necessary to completely capture the multifaceted characteristics
of the input data. The position encoding layer then transmits the position information
by embedding a set of position encoding vectors within the embedded representation of
the input sequence, where the dimensions of these vectors correspond to sine and cosine
functions of varying frequencies. Four encoder layers are supplied with the
position-encoding vectors that are generated. Internally, each encoder layer comprises a
multi-head self-attentive sublayer and a feedforward, entirely connected sublayer. Each
sublayer is followed by a normalization layer and residual connectivity to assure information
flow and training stability. The encoder ultimately outputs a vector of model dimensions for the
decoder to further process. This structure demonstrates a high level of hierarchical logic
and sequence data processing specialization.
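The sine and cosine position encoding mentioned above can be sketched as follows. The text only states that the encoding dimensions follow sine and cosine functions of varying frequencies; the specific frequency schedule with base 10000 is the one commonly used for Transformer models and is assumed here, as are the function name `positional_encoding` and its parameters.

```python
import math


def positional_encoding(seq_len, d_model, base=10000.0):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    table = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # Each pair of dimensions shares one frequency; higher pairs vary more slowly.
            angle = pos / base ** (i / d_model)
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


# Encodings for a sequence of 4 time steps with a model dimension of 6.
pe = positional_encoding(4, 6)
```

Each row of the table would be added to the embedded representation of the corresponding time step, so that otherwise identical sensor readings at different positions receive distinct inputs.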
Decoder: The decoder comprises an input layer, four stacked decoder layers, and
an output layer. The initial input is derived from the last data point in the encoder output
sequence. The embedding layer maps the input of the decoder to a vector representation of
the model’s dimensions. Each decoder layer comprises three sub-layers. Two of them, the
self-attention layer and the feed-forward neural network layer, are shared with the encoder
layer; the third is an additional sub-layer that attends to the encoder output and captures
pertinent information during the generation process. The output layer maps the output of the
final decoder layer to the time series of interest. A look-ahead mask restricts attention at
each time step to the current and earlier positions, and a one-position offset between decoder
inputs and target outputs ensures that
future data points are not used for time series prediction. Such a design allows the decoder
to generate the desired sequence efficiently.
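The effect of the look-ahead mask can be shown with a minimal sketch. The helper names `look_ahead_mask` and `masked_softmax` are hypothetical, and the uniform attention scores are placeholders chosen only to make the masking visible; in the model the scores would come from the query-key products of the self-attention layer.

```python
import math


def look_ahead_mask(n):
    """mask[i][j] is True when time step i may attend to time step j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax over raw attention scores, ignoring masked-out positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        peak = max(s for s, ok in zip(row, allowed) if ok)
        # Masked positions contribute exactly zero weight.
        exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


mask = look_ahead_mask(3)
# Placeholder scores: every position looks equally relevant before masking.
attention = masked_softmax([[0.0, 0.0, 0.0] for _ in range(3)], mask)
```

With uniform scores, the first time step attends only to itself, the second splits its attention between the first two steps, and only the last step sees the whole sequence, so no prediction can draw on later data points.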
4. Experiment
4.1. Data Processing
In this research, a collection of publicly accessible datasets was utilized. The datasets
were processed as follows: (1) Data preparation: columns of non-key features in the
datasets were deleted, followed by the deletion of rows containing null data; values in the
gender column were replaced with ‘1’ for males and ‘2’ for females; and
four classifications of age (18, 19–30, 31–45, and 45+) were created. In addition,
−3 (very cold) was merged with −2 (cold) and +3 (very hot) with +2 (hot) in the thermal
sensation vote column. (2) Outlier management: Z-Score is a commonly used outlier
detection method that determines whether a data point is an outlier by measuring how many
standard deviations the data point lies from the mean. The Z-Score method can be used
to detect outliers in both univariate and multivariate data. Standardization is a frequently
employed data scaling method that rescales the data according to the mean and standard
deviation so that each feature has a mean of 0 and a standard deviation
of 1. Standardization eliminates magnitude differences between distinct features,
rendering them comparable.
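The preprocessing steps can be sketched with the standard library as follows. Several details are assumptions made for illustration and are not stated in the text: the input labels `'male'` and `'female'`, the reading of the first age class as "18 or younger", the outlier threshold of three standard deviations, and all function names.

```python
import statistics

# Gender coding from the text: 1 for males, 2 for females (input labels are assumed).
GENDER_CODES = {"male": 1, "female": 2}


def age_group(age):
    """Index of the age class among (18, 19-30, 31-45, 45+); first class assumed to mean <= 18."""
    if age <= 18:
        return 0
    if age <= 30:
        return 1
    if age <= 45:
        return 2
    return 3


def z_scores(values):
    """Distance of each value from the mean, in units of the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def drop_outliers(values, threshold=3.0):
    """Keep values whose absolute Z-Score does not exceed the (assumed) threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) <= threshold]


# Hypothetical indoor temperatures with one implausible reading.
readings = [20.0] * 20 + [100.0]
cleaned = drop_outliers(readings)
standardized = z_scores([1.0, 2.0, 3.0])  # mean 0, standard deviation 1
```

Standardizing a feature is the same Z-Score computation applied to every value, which is why a single helper serves both the outlier check and the scaling step in this sketch.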
the most essential, and the precision rate is used to quantify false alarms among positive
predictions, that is, how reliable the model is when it predicts a positive. The
F1 score is used as one of the performance metrics to achieve a better equilibrium
between precision and recall; it balances the model’s false alarms
against underreporting.
In this study, the PMV model, several machine learning models, and deep learning models
are chosen as baselines to compare with the proposed method. The PMV model is the most
well-known thermal comfort model, and it has been extensively implemented in a variety
of indoor environmental design disciplines. KNN is simple
to implement, making it well suited to multi-class classification tasks. Support Vector Machine is
appropriate for high-dimensional datasets, while random forest is more efficient for
large-scale datasets due to its relatively rapid training time. AdaBoost is quick,
user-friendly, and requires little parameter tuning. The machine learning algorithms have
more input features than the PMV model and are better able to account for individual
differences. Similarly, we compare our model to the neural network models LSTM and
CNN-LSTM, which are applied to time series data. We also compare the proposed
TL-Transformer model to the Transformer model without transfer learning, using the Scales
Project and ASHRAE RP-884 datasets as source domains and the Medium US Office dataset as
the target domain.
Table 3 demonstrates that the PMV model outperforms the SVM algorithm in machine
learning regarding accuracy and F1 score metrics. However, the SVM model is more
accurate than the PMV model because machine learning employs more features. In addi-
tion, all machine learning algorithms outperformed the PMV model, with random forest
performing the best, consistent with previous findings that random forest is one of the
best classification algorithms for limited datasets [39]. The prediction performance of deep
learning models is superior to that of machine learning models because LSTM and CNN-
LSTM can capture the spatial and temporal relationships in the thermal comfort dataset [33].
However, the deep learning model that incorporates transfer learning performs better,
primarily because it can preserve and transfer the source domain’s higher-order
relationships to the target domain. In conclusion, the TL-Transformer model outperforms
the PMV model and the machine learning models on the performance metrics. The primary
reasons are its ability to handle long series data and capture long-term dependencies, and
its use of pre-trained knowledge and parameters to accelerate training, reduce data
requirements, and improve generalization in the target domain [40].
Table 4. Prediction performance on random forest and deep learning models with different feature sets.
For feature set FS1, the PMV model has better accuracy than Transformer but poorer
precision and F1 score, and TL-Transformer achieves the highest precision. The data-driven
models perform better on feature set FS2 than on FS1, indicating that personal information
(age and gender) can be utilized to enhance thermal comfort prediction. Random forest,
Transformer, and TL-Transformer performed best on feature set FS3 when compared to feature
sets FS1 and FS2, highlighting the importance of personal factors and external
environmental factors in accurate thermal comfort modeling.
In addition, the confusion matrices of the TL-Transformer model, the Transformer model,
and the best-performing machine learning model (random forest) were analyzed (see Figure 7)
to further investigate the performance of the TL-Transformer.
Figure 7. Confusion matrix: (a) random forest, (b) Transformer, and (c) Transfer learning
(TL)-Transformer.
The random forest model predicts a probability of 0.58 for labels in the “0” category,
whereas the Transformer model predicts probabilities of 0.59 and 0.40, respectively, for
labels in the “0” and “−2” categories. However, there is a high probability that both
models will classify the remaining three labeled data points as “0” labels.
In contrast, TL-Transformer has high accuracy for each category, with 49% accuracy
for category “−2”, 59% accuracy for category “−1”, and 66% accuracy for category “0”.
Category “1” has an accuracy of 61%, and category “2” has an accuracy of 55%, which is
consistent with common sense as the prediction accuracy of the majority category is slightly
higher than the minority category.
The Receiver Operating Characteristic (ROC) curves for the TL-Transformer model,
random forest model, and Transformer model are depicted in Figure 8. Panels (a) to (e) show
the ROC curves for the five prediction labels. The ROC curve plots the False Positive Rate
(FPR) on the horizontal axis against the True Positive Rate (TPR) on the vertical axis. AUC
denotes the area under the ROC curve, which is mainly used to measure the classification
performance of the model; the larger the AUC, the better the classifier. Based on the ROC
curves, TL-Transformer has higher AUC values for each class compared to the random forest
and Transformer models, which indicates that TL-Transformer distinguishes all categories
more reliably.
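To make the ROC and AUC definitions concrete, the following Python sketch builds the (FPR, TPR) points for one label by sweeping the decision threshold over the predicted scores and integrates them with the trapezoidal rule. It assumes a one-vs-rest treatment of each label, which is a common convention for multi-class ROC curves but is not confirmed as the procedure used for Figure 8; the function names and the scores are hypothetical.

```python
def roc_points(scores, is_positive):
    """Return (FPR, TPR) pairs; assumes at least one positive and one negative sample."""
    pos = sum(is_positive)
    neg = len(is_positive) - pos
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule (FPR on x, TPR on y)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical true labels and predicted scores for the "0" class (illustrative only).
y_true = [0, 0, 1, 0, -1, 2]
score_for_0 = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
is_zero = [t == 0 for t in y_true]  # one-vs-rest: label "0" against all others
auc_zero = auc(roc_points(score_for_0, is_zero))
```

Here eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9; a value of 1 would mean every “0” sample is scored above every other sample.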
Figure 8. Receiver Operating Characteristic (ROC) curves of random forest, Transformer, and
Transfer learning (TL)-Transformer, panels (a) to (e).
4.3.3. Impact of the Size of the Target Dataset
We have examined Transformer and Transfer Learning-Transformer (TL-Transformer) in depth
on target datasets of varying sizes. Figure 9 shows the accuracy of Transformer and
TL-Transformer on different proportions of the training set. The accuracy of TL-Transformer
is consistently higher than that of Transformer across these proportions, indicating that
the Transformer based on transfer learning performs better. This finding supports the
suitability of the TL-Transformer for thermal comfort modeling of new buildings and
demonstrates the potential for enhanced building management efficiency.
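An experiment of this kind can be organized by drawing nested subsets of the target training data, so that each larger proportion contains all samples of the smaller ones. The Python sketch below shows only this sampling step; the function name, the sample count, and the fractions are hypothetical, since the exact proportions used for Figure 9 are not listed in the text.

```python
import random


def nested_training_subsets(n_samples, fractions, seed=0):
    """Map each fraction to sample indices; smaller subsets are prefixes of larger ones."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)  # fixed seed keeps the subsets reproducible
    subsets = {}
    for fraction in sorted(fractions):
        size = max(1, round(fraction * n_samples))
        subsets[fraction] = order[:size]
    return subsets


# Hypothetical target training set of 200 samples and illustrative fractions.
subsets = nested_training_subsets(n_samples=200, fractions=[0.2, 0.4, 0.6, 0.8, 1.0])
```

Each model would then be trained on `subsets[f]` and evaluated on the same held-out test set, so that differences in accuracy reflect the amount of target data and not a change in the test samples.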
Figure 9. Performance of Transformer and Transfer learning (TL)-Transformer on different
percentage training sets.
5. Conclusions
Thermal comfort has developed into a prominent field of study, and credible thermal
comfort prediction is necessary for optimizing building design, enhancing indoor
environmental comfort, increasing productivity, reducing energy costs, and enhancing the
user experience. However, building design and construction, individual differences in
occupants, and the difficulty of obtaining high-quality data present several challenges in
predicting thermal comfort indices in buildings. Therefore, the proposed TL-Transformer
model makes an important contribution to the field of thermal comfort prediction.
1. The proposed Transformer model based on deep transfer learning provides relatively
accurate thermal comfort predictions, with higher accuracy than traditional PMV models
and machine learning models.
2. This study tackles the issue of inadequate data leading to models that cannot be
adequately trained, affecting model performance. Additionally, the study compared
various feature combinations in the dataset to find the best feature combination.
3. In the meantime, the category imbalance issue was addressed using the ADASYN
(Adaptive Synthetic Sampling) method. Resolving these issues offers trustworthy
solutions and approaches to get around the target dataset’s limitations and raise the
predictive accuracy of thermal comfort models.
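For readers unfamiliar with ADASYN, the idea can be sketched in a few lines of Python: minority samples that are surrounded by more majority samples receive more synthetic neighbours, which are created by interpolating between minority points. The sketch below is a simplified two-class version written for illustration only; it is not the authors' implementation, the thermal comfort task has more than two categories, and the function name, parameters, and points are hypothetical.

```python
import math
import random


def adasyn(minority, majority, k=3, beta=1.0, seed=0):
    """Simplified binary ADASYN; needs at least two minority points. Returns new points."""
    rng = random.Random(seed)
    points = [(p, True) for p in minority] + [(p, False) for p in majority]
    target = (len(majority) - len(minority)) * beta  # total samples to synthesize

    # Step 1: r_i is the share of majority points among the k nearest neighbours of x_i.
    ratios = []
    for i, x in enumerate(minority):
        others = [q for j, q in enumerate(points) if j != i]
        nearest = sorted(others, key=lambda q: math.dist(x, q[0]))[:k]
        ratios.append(sum(1 for _, is_minority in nearest if not is_minority) / k)
    total = sum(ratios)
    if total == 0:
        return []  # no minority point borders the majority class

    # Step 2: create round(r_i / total * target) points near each x_i by interpolation.
    synthetic = []
    for i, (x, r) in enumerate(zip(minority, ratios)):
        count = round(r / total * target)
        mates = [m for j, m in enumerate(minority) if j != i]
        mates = sorted(mates, key=lambda m: math.dist(x, m))[:k]
        for _ in range(count):
            z = rng.choice(mates)  # a nearby minority point
            lam = rng.random()     # interpolation weight in [0, 1)
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic


# Hypothetical two-dimensional feature vectors (illustrative values only).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(1.0, 1.0), (2.0, 2.0), (3.0, 2.0), (2.0, 3.0),
            (3.0, 3.0), (4.0, 4.0), (5.0, 5.0)]
synthetic = adasyn(minority, majority)
```

Because the per-sample counts are rounded, the number of generated points can differ slightly from the target (three instead of four in this toy example). Resampling of this kind is normally applied to the training split only.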
Experiments on three datasets confirmed that the TL-Transformer model is superior to the
existing algorithms considered, and evaluation metrics, including accuracy, F1 score, and
precision, demonstrated our model’s performance and dependability. This study demonstrates
the prospect of predicting thermal comfort using small datasets by increasing the scale of
the training data through the transfer learning technique and using data from regions with
similar climates to model the thermal comfort of target buildings. In the context of new
construction and limited sensor equipment, this model utilizes easily accessible and
low-cost data to achieve accurate thermal comfort predictions, thus providing users with a
better indoor comfort experience. This approach not only helps to improve thermal comfort
for users but can also help improve energy efficiency.
Nevertheless, there are limitations to our study: (1) The characteristics of our
experimental data are insufficiently varied, and the data size is insufficient, which may
influence the model’s performance. (2) The target regional climate for this experiment was
“temperate climate”. Future research should be conducted on buildings in more diverse
climate zones. (3) In the case of imbalanced thermal comfort data, we have relatively low
predictive accuracy for minority categories (e.g., −2 (cold) and 2 (hot)). In future
research, we need to delve deeper into the impact of more characterization data on model
accuracy, and in addition, investigating the effect of more data imbalance techniques on
model predictions is an area worth exploring. We also need to further investigate data
transfer between climates and buildings to better understand its impact on thermal comfort
predictions.
Author Contributions: Conceptualization, X.Z. and P.L.; methodology, X.Z.; software, X.Z.; vali-
dation, X.Z. and P.L.; formal analysis, X.Z.; investigation, X.Z.; resources, P.L.; data curation, X.Z.;
writing—original draft preparation, X.Z.; writing—review and editing, X.Z.; visualization, X.Z.;
supervision, P.L.; project administration, X.Z.; funding acquisition, P.L. All authors have read and
agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: The data used in this study come from the Dryad repository.
Acknowledgments: This experimental research project was supported by the Institute of Intelligent
Building Research at Henan University of Technology, and the open access to experimental data is
gratefully acknowledged.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Huang, H.; Wang, H.; Hu, Y.-J.; Li, C.; Wang, X. The development trends of existing building energy conservation and emission
reduction—A comprehensive review. Energy Rep. 2022, 8, 13170–13188. [CrossRef]
2. Che, W.W.; Tso, C.Y.; Sun, L.; Ip, D.Y.; Lee, H.; Chao, C.Y.; Lau, A.K. Energy consumption, indoor thermal comfort and air quality
in a commercial office with retrofitted heat, ventilation and air conditioning (HVAC) system. Energy Build. 2019, 201, 202–215.
[CrossRef]
3. Szczepanik-Scislo, N.; Scislo, L. Dynamic Real-Time Measurements and a Comparison of Gas and Wood Furnaces in a Dual-Fuel
Heating System in Order to Evaluate the Occupants’ Safety and Indoor Air Quality. Buildings 2023, 13, 2125. [CrossRef]
4. Wu, Z.; Li, N.; Peng, J.; Cui, H.; Liu, P.; Li, H.; Li, X. Using an ensemble machine learning methodology-Bagging to predict
occupants’ thermal comfort in buildings. Energy Build. 2018, 173, 117–127. [CrossRef]
5. Musiał, M.; Lichołai, L.; Katunský, D. Modern Thermal Energy Storage Systems Dedicated to Autonomous Buildings. Energies
2023, 16, 4442. [CrossRef]
6. Momeni, M.; Fartaj, A. Numerical thermal performance analysis of a PCM-to-air and liquid heat exchanger implementing latent
heat thermal energy storage. J. Energy Storage 2023, 58, 106363. [CrossRef]
7. Castilla, M.; Álvarez, J.; Berenguel, M.; Rodríguez, F.; Guzmán, J.; Pérez, M. A comparison of thermal comfort predictive control
strategies. Energy Build. 2011, 43, 2737–2746. [CrossRef]
8. Lin, C.J.; Wang, K.-J.; Dagne, T.B.; Woldegiorgis, B.H. Balancing thermal comfort and energy conservation—A multi-objective
optimization model for controlling air-condition and mechanical ventilation systems. Build. Environ. 2022, 219, 109237. [CrossRef]
9. Fanger, P.O. Thermal Comfort: Analysis and Applications in Environmental Engineering; McGraw-Hill: New York, NY, USA, 1970.
10. Luo, M.; Xie, J.; Yan, Y.; Ke, Z.; Yu, P.; Wang, Z.; Zhang, J. Comparing machine learning algorithms in predicting thermal sensation
using ASHRAE Comfort Database II. Energy Build. 2020, 210, 109776. [CrossRef]
11. Höppe, P. Different aspects of assessing indoor and outdoor thermal comfort. Energy Build. 2002, 34, 661–665. [CrossRef]
12. Du, X.; Cai, Y.; Wang, S.; Zhang, L. Overview of deep learning. In Proceedings of the 2016 31st Youth Academic Annual
Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; pp. 159–164.
13. Gorbachev, Y.; Fedorov, M.; Slavutin, I.; Tugarev, A.; Fatekhov, M.; Tarkan, Y. Openvino deep learning workbench: Comprehensive
analysis and tuning of neural networks inference. In Proceedings of the IEEE/CVF International Conference on Computer Vision
Workshops, Seoul, Republic of Korea, 27–28 October 2019.
14. Raja, I.A.; Nicol, J.F.; McCartney, K.J.; Humphreys, M.A. Thermal comfort: Use of controls in naturally ventilated buildings.
Energy Build. 2001, 33, 235–244. [CrossRef]
15. Wagner, A.; Gossauer, E.; Moosmann, C.; Gropp, T.; Leonhart, R. Thermal comfort and workplace occupant satisfaction—Results
of field studies in German low energy office buildings. Energy Build. 2007, 39, 758–769. [CrossRef]
16. Scislo, L.; Szczepanik-Scislo, N. Air quality sensor data collection and analytics with iot for an apartment with mechanical
ventilation. In Proceedings of the 2021 11th IEEE International Conference on Intelligent Data Acquisition and Advanced
Computing Systems: Technology and Applications (IDAACS), Cracow, Poland, 22–25 September 2021; pp. 932–936.
17. Rodriguez-Galiano, V.; Mendes, M.P.; Garcia-Soldado, M.J.; Chica-Olmo, M.; Ribeiro, L. Predictive modeling of groundwater
nitrate pollution using Random Forest and multisource variables related to intrinsic and specific vulnerability: A case study in an
agricultural setting (Southern Spain). Sci. Total Environ. 2014, 476, 189–206. [CrossRef] [PubMed]
18. Feng, Y.; Liu, S.; Wang, J.; Yang, J.; Jao, Y.-L.; Wang, N. Data-driven personal thermal comfort prediction: A literature review.
Renew. Sustain. Energy Rev. 2022, 161, 112357. [CrossRef]
19. Martins, L.A.; Soebarto, V.; Williamson, T. A systematic review of personal thermal comfort models. Build. Environ. 2022,
207, 108502. [CrossRef]
20. Wen, Q.; Sun, L.; Yang, F.; Song, X.; Gao, J.; Wang, X.; Xu, H. Time series data augmentation for deep learning: A survey. arXiv
2020, arXiv:2002.12478.
21. de Dear, R.J.; Brager, G.S. Developing an adaptive model of thermal comfort and preference. ASHRAE Trans. 1998, 104, 145–167.
22. Čulić, A.; Nižetić, S.; Šolić, P.; Perković, T.; Čongradac, V. Smart monitoring technologies for personal thermal comfort: A review.
J. Clean. Prod. 2021, 312, 127685. [CrossRef]
23. Fard, Z.Q.; Zomorodian, Z.S.; Korsavi, S.S. Application of machine learning in thermal comfort studies: A review of methods,
performance and challenges. Energy Build. 2022, 256, 111771. [CrossRef]
24. Xiong, L.; Yao, Y. Study on an adaptive thermal comfort model with K-nearest-neighbors (KNN) algorithm. Build. Environ. 2021,
202, 108026. [CrossRef]
25. Peng, B.; Hsieh, S.-J. Data-driven thermal comfort prediction with support vector machine. In Proceedings of the International
Manufacturing Science and Engineering Conference, Los Angeles, CA, USA, 4–8 June 2017; p. V003T004A044.
26. Zhang, H.; Yang, X.; Tu, R.; Huang, J.; Li, Y. Thermal Comfort Modeling of Office Buildings Based on Improved Random Forest
Algorithm. In Proceedings of the 2022 IEEE 11th Data Driven Control and Learning Systems Conference (DDCLS), Chengdu,
China, 3–5 August 2022; pp. 1369–1376.
27. Hu, W.; Wen, Y.; Guan, K.; Jin, G.; Tseng, K.J. iTCM: Toward learning-based thermal comfort modeling via pervasive sensing for
smart buildings. IEEE Internet Things J. 2018, 5, 4164–4177. [CrossRef]
28. Kumar, T.S.; Kurian, C.P. Real-time data based thermal comfort prediction leading to temperature setpoint control. J. Ambient
Intell. Humaniz. Comput. 2023, 14, 12049–12060. [CrossRef]
29. Chennapragada, A.; Periyakoil, D.; Das, H.P.; Spanos, C.J. Time series-based deep learning model for personal thermal comfort
prediction. In Proceedings of the Thirteenth ACM International Conference on Future Energy Systems, Virtual, 28 June–1 July 2022;
pp. 552–555.
30. Elmaz, F.; Eyckerman, R.; Casteels, W.; Latré, S.; Hellinckx, P. CNN-LSTM architecture for predictive indoor temperature
modeling. Build. Environ. 2021, 206, 108327. [CrossRef]
31. Park, H.; Park, D.Y. Prediction of individual thermal comfort based on ensemble transfer learning method using wearable and
environmental sensors. Build. Environ. 2022, 207, 108492. [CrossRef]
32. Das, H.P.; Schiavon, S.; Spanos, C.J. Unsupervised personal thermal comfort prediction via adversarial domain adaptation. In
Proceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation,
Coimbra, Portugal, 17–18 November 2021; pp. 230–231.
33. Somu, N.; Sriram, A.; Kowli, A.; Ramamritham, K. A hybrid deep transfer learning strategy for thermal comfort prediction in
buildings. Build. Environ. 2021, 204, 108133. [CrossRef]
34. Sun, K.; Qaisar, I.; Khan, M.A.; Xing, T.; Zhao, Q. Building Occupancy Number Prediction: A Transformer Approach. Build.
Environ. 2023, 244, 110807. [CrossRef]
35. Wang, C.; Wang, Y.; Ding, Z.; Zheng, T.; Hu, J.; Zhang, K. A transformer-based method of multienergy load forecasting in
integrated energy system. IEEE Trans. Smart Grid 2022, 13, 2703–2714. [CrossRef]
36. Rijal, H.; Humphreys, M.; Nicol, J. Adaptive model and the adaptive mechanisms for thermal comfort in Japanese dwellings.
Energy Build. 2019, 202, 109371. [CrossRef]
37. Torrey, L.; Shavlik, J. Transfer learning. In Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods,
and Techniques; IGI Global: Hershey, PA, USA, 2010; pp. 242–264.
38. Sokolova, M.; Japkowicz, N.; Szpakowicz, S. Beyond accuracy, F-score and ROC: A family of discriminant measures for
performance evaluation. In Proceedings of the Australasian Joint Conference on Artificial Intelligence, Hobart, Australia,
4–8 December 2006; pp. 1015–1021.
39. Hu, W.; Luo, Y.; Lu, Z.; Wen, Y. Heterogeneous transfer learning for thermal comfort modeling. In Proceedings of the 6th
ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, New York, NY, USA,
13–14 November 2019; pp. 61–70.
40. Jin, Z.; Kim, J.; Yeo, H.; Choi, S. Transformer-based map-matching model with limited labeled data using transfer-learning
approach. Transp. Res. Part C Emerg. Technol. 2022, 140, 103668. [CrossRef]