Smartphones and smartwatches have contributed significantly to fitness monitoring by providing real-time statistics,
thanks to accurate tracking of physiological indices such as heart rate. However, their estimates of calories burned
during exercise are inaccurate and cannot be used for medical diagnosis. In this work, we present JoulesEye,
a smartphone thermal camera-based system that can accurately estimate calorie burn by monitoring respiration rate.
We evaluated JoulesEye on 54 participants who performed high intensity cycling and running.
The mean absolute percentage error (MAPE) of JoulesEye was 5.8%, which is significantly better than the MAPE of
37.6% observed with commercial smartwatch-based methods that only use heart rate. Finally, we show that an
ultra-low-resolution thermal camera that is small enough to fit inside a watch or other wearables is sufficient
for accurate calorie burn estimation. These results suggest that JoulesEye is a promising new method for accurate
and reliable calorie burn estimation.
Main Idea
We used a thermal camera attachment for phones to accurately estimate EE (Figure 1). Breathing causes temperature changes
in the nostrils, which appear as variations in pixel intensity. We used a classical region-tracking
approach, the Channel and Spatial Reliability Tracking (CSRT) filter [14], to retrieve the pixel intensity in the nostrils. This intensity
signal over time serves as a proxy for the breathing signal. We also used temperature and heart rate data
to improve the results. To extract temperature, we monitored multiple points on the forehead. The forehead
is an area of bony prominence where the probability of observing workout-induced temperature change
is high [11]. While prior work has estimated respiration rate using thermal images [2, 6], those experiments did
not include participants moving vigorously while exercising. Motion is also a challenge for
wireless-signal-based respiration monitoring, where removing motion artifacts has been a longstanding problem. Our
algorithms work even when the user is cycling or running vigorously. The respiration rate and temperature sensed
from the thermal data, together with heart rate information, are fed into a deep learning model to estimate energy expenditure.
Fig. 1. JoulesEye estimates Energy Expenditure (EE) from respiration rate. In a) the participant is riding a cycle with the thermal
camera and phone fixed on the handlebar. b) shows a frame of the thermal video. c) shows the respiration rate detection
pipeline during motion, combined with the deep learning architecture that predicts energy expenditure.
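To make the tracking step concrete, the sketch below shows how a nostril ROI could be tracked with OpenCV's CSRT tracker and how a breathing-rate estimate could be derived from the resulting intensity signal. It is a minimal illustration rather than the authors' exact implementation: the video path, the initial bounding box, and the 0.1–1.5 Hz breathing band are assumptions, while the 8.6 fps sampling rate matches the FLIR One Pro specification described later.

```python
# Minimal sketch (not the paper's exact code): track the nostril ROI with OpenCV's CSRT
# tracker, take the mean ROI intensity per frame as a breathing proxy, and estimate the
# respiration rate from the dominant frequency in a plausible breathing band.
import cv2
import numpy as np

def extract_breathing_signal(video_path, init_bbox):
    """init_bbox = (x, y, w, h) around the nostrils; assumed to come from a one-time
    manual selection or an RGB landmark detector."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    if not ok:
        raise IOError("Could not read thermal video")
    tracker = cv2.TrackerCSRT_create()          # Channel and Spatial Reliability Tracker
    tracker.init(frame, init_bbox)
    signal = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        ok, bbox = tracker.update(frame)
        if not ok:                              # tracking lost (e.g. occlusion)
            signal.append(np.nan)
            continue
        x, y, w, h = map(int, bbox)
        signal.append(float(frame[y:y + h, x:x + w].mean()))
    cap.release()
    return np.array(signal)

def respiration_rate_bpm(signal, fs=8.6, band=(0.1, 1.5)):
    """Dominant frequency of the intensity signal within a breathing band, in breaths/min."""
    signal = np.nan_to_num(signal - np.nanmean(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum = np.abs(np.fft.rfft(signal))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return 60.0 * freqs[mask][np.argmax(spectrum[mask])]
```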
JoulesEye Setup
Our goal is to determine how many calories a person has burned while exercising by measuring their respiration
rate. The breathing (respiratory) rate is detected from the temperature fluctuations caused by airflow in
the nasal cavity. The physical phenomenon is based on the radiative and convective heat transfer during
the breathing cycle, which results in a periodic increase and decrease of the temperature of the tissues around
the nasal cavity. These temperature fluctuations are quantifiable in a thermal video as pixel intensity
variations in the nostril ROI [16]. In this section, we first describe the setup of our system and then the algorithm for obtaining
the temperature and respiration rate. The setup is divided into two major categories of systems: Ground Truth Devices and the JoulesEye System.
Ground Truth Devices
First, we discuss the components used to collect ground truth data, followed by the other components used in the
design of JoulesEye.
Indirect Calorimeter
We used the Fitmate Pro [18], an indirect calorimeter, to collect VO2 (volume of oxygen) data during sub-maximal
as well as maximal exercise by measuring the volume of oxygen consumed and the volume of carbon dioxide
produced. Submaximal exercise is performed at a level below the maximum capacity of an individual. During
physical assessment, only sub-maximal exercises should be performed by participants in the absence of a clinical
physician [8]. The main components of the Fitmate Pro are:
Oxygen Sensor: Measures the oxygen consumption and the carbon dioxide expulsion from the body. The
concentration of oxygen and carbon dioxide is directly proportional to the energy expenditure.
Flow Sensor: Measures the volume of air breathed in and out by the user.
Microprocessor Unit: Analyzes the data from the sensors and calculates the energy expenditure based on
proprietary algorithms.
Display Screen: It is shown in Figure 2(d). It displays the results of the energy expenditure calculation,
including the number of calories burned, in real-time.
Mouthpiece or Mask: Attaches to the face and connects to the calorimeter, allowing the measurement of
inhaled and exhaled air.
Respiration Belt
The ground truth respiration rate is available from the indirect calorimeter. We also
collected the respiration rate using the Vernier GoDirect [7] respiration belt (Figure 2(e)). We collected
respiration rate from two sources because it is impossible to use the calorimeter and JoulesEye simultaneously.
Thus, instead of comparing JoulesEye with the gold standard output of the calorimeter, we use the chest belt
as the reference measurement of respiration. The belt consists of a flexible, stretchable material that is worn
around the chest, and it contains a sensor that detects the pressure changes caused by breathing. It has a
measurement range of 0-100 breaths per minute with an error of ±1 breath per minute and a sampling rate of 0.1 Hz.
Fig. 2. JoulesEye is composed of a thermal camera retrofitted to an iPhone, as shown in a). JoulesEye can also be used in a
smartwatch, as shown in b). The camera in b) is a low-resolution (32x24) thermal camera. c) and e) show the ground truth
data collection procedure with the indirect calorimeter while running and biking. d) shows a screen grab from the indirect
calorimeter recording the energy expenditure during an exercise session.
JoulesEye System
JoulesEye consists of a thermal camera that records the respiration of a person. The thermal camera (Figure 2(a,
b)) is used to retrieve estimates of respiration rate and temperature, which are then used to estimate EE. We used a
FLIR One Pro [9] smartphone-attachment thermal imaging camera. To take thermal videos, the device needs to
be attached to an iPhone and connected to the FLIR ONE mobile application. A user can select the video mode
and start recording. The thermal video will be recorded in real-time, showing temperature differences and heat
patterns in the scene. The camera has a sampling rate of 8.6 frames per second with a temperature range of -20°C
to 120°C. The combined unit of the smartphone and the thermal camera was securely mounted on the handgrip
of an ergometer or affixed near the display screen of a treadmill in order to capture thermal video data of the
face. We also developed a wristband prototype of JoulesEye, as shown in Figure 2(b).
Data Collection
Anyone between 18 and 70 years of age without any prior heart ailment could become a participant in
the study. In total, 54 volunteers participated in an approximately 45-minute study session (Table 1). The entry
survey consisted of a questionnaire where the participants self-declared their age, weight, sex, time of the last
meal and recent illnesses. Our data collection method followed the best practice validation protocol mandated by
the Network of Physical Activity Assessment (INTERLIVE) [3]. The participants were shown how to wear the
indirect calorimeter mask. All participants participated in two back-to-back data collection sessions.
Table 1. Demographic information for the participants
Total participants (n): 54
Participants who performed cycling on ergometer: 41
Participants who performed running on treadmill: 13
Female (n, %): 24 (44.4%)
Age in years (mean, range): 28.4 (25–54)
Session 1: Data Collection Using JoulesEye
Participants cycled on a stationary bike or ran on a treadmill in
both sessions. In the first session, the participant exercised for three minutes at a high intensity (4-5 miles/h for running
and 2.5-3 miles/h for cycling). We limited the high-intensity session to three minutes keeping the participant's
comfort in mind. Figure 3(a) shows a frame of the face during this session. The following data are collected during
this session:
Thermal video data of the upper body with the frame covering the face. This data is later processed to
extract respiration rate.
Respiration rate from the chest belt.
Fig. 3. In a) the participant has not donned the indirect calorimeter mask, so the region tracking algorithm is able to
keep track of the nostrils (nostrils also shown in the inset image). In b) the nostrils are covered by the mask, making respiration
detection impossible. We refer to a) as the JoulesEye data collection. During a) we could not collect the indirect calorimeter data
in parallel, as the nostrils would otherwise be occluded. Here, we use the respiration data from the chest belt as the reference
values. Thus, we could quantitatively evaluate the performance of JoulesEye's respiration rate pipeline against ground truth
respiration data from the belt. We later used the respiration rate from the JoulesEye data to estimate energy expenditure.
Session 2: Data Collection Using Indirect Calorimeter
In this session, the participant donned the indirect
calorimeter mask along with the chest belt and performed cycling or running for 15 minutes, comprising High-Intensity
Interval Training (HIIT). The thermal camera recorded the face of the person during this session as well.
Figure 3(b) shows a thermal frame of this session where the participant has donned the mask. Note that the nostrils
are not visible and this thermal data cannot be used to extract respiration rate. The following data are collected
during this session:
Thermal data with frame covering the upper body including the face. The nostrils are now occluded by the
mask.
Respiration rate from the chest belt.
Energy Expenditure, volume of exhaled air and respiration rate from the indirect calorimeter.
Additional Data: Heart Rate and Temperature
We evaluated how temperature and heart rate data affect the energy expenditure estimation. We are
interested in heart rate because it is one of the most common proxies for energy
expenditure, and combining it with respiration rate can improve the estimates.
Extracting Temperature
Our pilot experiments showed that temperature changes occur in regions of the
face with bony prominences, such as the forehead, jawline, and nose tip, when a person is cycling on an ergometer or running
on a treadmill. It is known that physical activity increases the metabolic rate and generates heat in the body.
This increased heat is transmitted through the blood vessels and nerves in the bony regions, leading to an increase
in skin temperature in these regions [5]. We extracted temperature information from the forehead.
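As a rough illustration of this step, the sketch below averages small patches around a few forehead points in a thermal frame; the conversion of the frame to degrees Celsius and the choice of points are assumptions, since the exact procedure is not detailed here.

```python
# Hedged sketch: sample forehead temperature from a thermal frame. `frame_celsius` is
# assumed to be a 2-D array already converted to degrees Celsius, and `forehead_pts`
# a list of (row, col) points on the forehead.
import numpy as np

def forehead_temperature(frame_celsius, forehead_pts, window=3):
    half = window // 2
    temps = []
    for r, c in forehead_pts:
        patch = frame_celsius[r - half:r + half + 1, c - half:c + half + 1]
        temps.append(patch.mean())          # average a small patch around each point
    return float(np.median(temps))          # median across points is robust to outliers
```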
Heart Rate
To make a fair comparison of the energy expenditure (EE) estimates produced by our approach,
it was important to compare them with the currently accepted EE estimates produced by smartwatches. Heart
rate data from an Apple Watch was collected continuously during cycling and running, providing a continuous
measurement of the individual's heart rate. This heart rate data was used as an additional, optional input to our
energy expenditure estimation approach, in combination with other physiological signals such as respiration
rate and temperature. We also used the energy expenditure data from the Apple Watch as the reference for
comparison with the energy expenditure estimates produced by our approach. By using the energy expenditure
data from the Apple Watch as well as from our own model, it was possible to make a fair comparison of the
accuracy of the energy expenditure estimates with respect to the ground truth.
Modeling
Energy Expenditure (EE) is represented in cal/min, deduced from VO2. We aim to estimate Energy
Expenditure (EE) from Respiration Rate (RR). We do this in two phases:
We will first estimate the volume of exhaled air (\(v\)) from RR.
Next, we will use the estimated volume information (\(v\)) to estimate the oxygen consumed in a breath,
or VO2.
The inspiration for using this two-phased approach comes from the indirect calorimeter, which measures the
oxygen concentration (O2) in a breath. The O2 extracted from a breath varies depending on factors like gas
exchange efficiency and body composition. By modeling the relationship between the amount of O2
consumed and the volume of exhaled air (\(v\)), we can account for efficiency and body composition. But \(v\) is
not readily available without the indirect calorimeter. Therefore, our first objective is to estimate \(v\) from RR
data. Using RR alone to estimate VO2 can lead to inaccuracies because it does not take into account individual
differences in lung capacities and breathing patterns. We expect our model to learn these factors to estimate \(v\)
from RR alone. Our second model would then learn the transfer function and estimate unmeasured factors that
would determine VO2 from \(v\).
Predicting Volume from RR
Both our models are adaptations of the Temporal Convolution Network with
residuals (TCN) [4]. A TCN leverages causal convolutions and dilation. Causal convolution enforces a unidirectional
information flow, while dilation allows the model to capture long-range dependencies of the input. In our work, the model
tries to learn a function \(f_1\) that best predicts the volume \(v_t\) at time stamp \(t\) such that
\[
v_t = f_1(v_{t-k:t-1}, RR_{t-k:t})
\]
The model iterates over multiple samples of input and output to learn the function \(f_1\). During prediction,
subsequent samples (\(v_{t+1}, v_{t+2}, \ldots\)) are predicted autoregressively, i.e.
\[
v_{t+1} = f_1(v_{t-k+1:t}, RR_{t-k+1:t+1})
\]
where the predicted volume is used as an input to the next model, which predicts VO2.
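The sketch below is one way such a residual TCN could be written in PyTorch, following the description above and the architecture summarized in Figure 4 (dilated causal 1D convolutions, ReLU, dropout, residual connections). The number of channels, levels, kernel size, and dropout rate are illustrative assumptions, not the authors' hyperparameters.

```python
# Hedged PyTorch sketch of a residual TCN that predicts v_t from stacked channels of
# past volume and respiration rate. Hyperparameters here are illustrative.
import torch
import torch.nn as nn

class CausalConv1d(nn.Conv1d):
    """1-D convolution padded on the left only, so outputs never see the future."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__(in_ch, out_ch, kernel_size, dilation=dilation)
        self.left_pad = (kernel_size - 1) * dilation

    def forward(self, x):
        return super().forward(nn.functional.pad(x, (self.left_pad, 0)))

class ResidualBlock(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, dilation, dropout=0.2):
        super().__init__()
        self.conv = CausalConv1d(in_ch, out_ch, kernel_size, dilation)
        self.relu = nn.ReLU()
        self.drop = nn.Dropout(dropout)
        self.skip = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        out = self.drop(self.relu(self.conv(x)))
        return self.relu(out + self.skip(x))       # residual connection

class VolumeTCN(nn.Module):
    """f1: maps a window of (past volume, RR) channels to the next volume sample."""
    def __init__(self, in_channels=2, hidden=32, levels=4, kernel_size=3):
        super().__init__()
        blocks = []
        for i in range(levels):                     # first layer has dilation 1 (no dilation)
            blocks.append(ResidualBlock(in_channels if i == 0 else hidden,
                                        hidden, kernel_size, dilation=2 ** i))
        self.tcn = nn.Sequential(*blocks)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                           # x: (batch, channels, time)
        h = self.tcn(x)
        return self.head(h[:, :, -1])               # predict v_t from the last time step
```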
Predicting VO2 from Volume
The approach to modeling VO2 from volume is similar to the previous modeling
approach where we use the TCN network, but this time we only use the volume information to predict VO2, i.e.
\[
vo_{t+p} = f_2(v_{t:t+p-1})
\]
where \(p\) is the number of samples of volume. Therefore, to predict the first sample of VO2, we need in total \(k + p\)
samples of respiration rate.
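Putting the two models together, the following is a hedged usage sketch of this two-stage prediction, assuming `VolumeTCN`-style models from the sketch above as \(f_1\) and \(f_2\), and equal-length windows of \(k\) samples for volume and RR (a simplification of the indexing in the equations):

```python
# Illustrative two-stage inference: roll f1 forward autoregressively to build a window of
# p volume estimates, then feed that window to f2 to predict VO2. Names are assumptions.
import numpy as np
import torch

def predict_vo2(f1, f2, v_init, rr, k, p):
    """v_init: last k known volume samples; rr: RR samples, where rr[t:t+k] is the window
    used for the t-th predicted volume sample."""
    v_hist = list(v_init)
    f1.eval(); f2.eval()
    with torch.no_grad():
        for t in range(p):                                   # autoregressive volume rollout
            window = np.stack([v_hist[-k:], rr[t:t + k]])    # (2, k): volume + RR channels
            x = torch.tensor(window, dtype=torch.float32).unsqueeze(0)
            v_hist.append(f1(x).item())
        v_window = torch.tensor(v_hist[-p:], dtype=torch.float32).view(1, 1, p)
        return f2(v_window).item()                           # VO2 / energy-expenditure estimate
```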
Fig. 4. We build a deep learning network similar to the Temporal Convolution Network (TCN) with residuals to estimate
volume as a function of respiration rate and volume, i.e. \(v_t = f_1(v_{t-k:t-1}, RR_{t-k:t})\). Additionally, we also evaluated
the performance of the model with additional covariates, namely heart rate (HR) and temperature (T) collected from the
forehead. On using HR and T, the equation becomes \(v_t = f_1(v_{t-k:t-1}, HR_{t-k:t}, T_{t-k:t}, RR_{t-k:t})\). The residual blocks
are composed of a 1D dilated causal convolution (the first layer has no dilation), a ReLU activation, and dropout [10]. A similar
convolutional network is later used to predict VO2 (calorie or energy expenditure) from volume.
Using Heart Rate (HR) and Temperature Data (T)
During data collection, we retrieved heart rate data from both a chest belt and a smartwatch. Additionally, we obtained approximate temperature data
from thermal readings. To enhance our analysis, we incorporated Heart Rate (HR) data from the smartwatch and
forehead temperature (T) data as additional covariates. These supplementary variables enabled us to evaluate
the performance of estimating \(v\) using different combinations of covariates, including HR alone, RR alone, a
combination of RR and HR, and a combination of RR, HR, and T. For example, with an input of RR, HR and T, the
equation to estimate \(v\) becomes
\[
v_t = f_1(v_{t-k:t-1}, HR_{t-k:t}, T_{t-k:t}, RR_{t-k:t})
\]
Figure 4 illustrates the corresponding TCN model for this combination of inputs. By modifying one (e.g., using
only T and RR or T and HR) or two covariates (using only RR), we adjusted the input dimension of the model,
necessitating corresponding adaptations to the kernel dimension while maintaining the dimensions of the tensors
within the residual network unchanged.
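In terms of the sketch above, changing the covariate set simply changes the number of input channels; an illustrative mapping (assuming past volume is always stacked as one channel, as in the `VolumeTCN` sketch) is:

```python
# Illustrative channel counts for the different covariate combinations (an assumption
# about the exact stacking, not the authors' configuration).
model_rr      = VolumeTCN(in_channels=2)   # [v, RR]
model_rr_hr   = VolumeTCN(in_channels=3)   # [v, RR, HR]
model_rr_hr_t = VolumeTCN(in_channels=4)   # [v, RR, HR, T]
```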
Results and Discussion
In this section, we first discuss the performance of respiration detection from thermal video when compared to
ground truth from a respiration belt. Next, we discuss the performance of energy expenditure estimation from
respiration, heart rate, and temperature data.
Result on Estimating Respiration Rate
With the data from the first session, we observed that the error in
respiration rate detected from thermal data, compared to the respiration belt, is 2.1% (Figure 5(A)). Furthermore,
from the data collected in the second session, we quantified that the error between respiration rate from the
indirect calorimeter and the respiration belt is 1.68%.
Fig. 5. The data obtained during the first session (A) serves the purpose of quantifying the discrepancy between the respiration
signal extracted from the thermal video and the signal obtained from the belt. This quantification holds significance for
subsequent insights, as depicted in Figure 6. The data collected during the second session (B) showcases the error in
estimating VO2 or EE when employing the respiration signal from an indirect calorimeter or a chest belt. Notably, it is
important to recall that using an indirect calorimeter obstructs the view of the nostril from the thermal camera.
Both these numbers (2.1% and 1.68%) are better than previous work [1], which
uses electrocardiogram and photoplethysmogram signals to calculate respiration rate. Since respiration rate and
energy expenditure are on different scales, using MAPE gives us a good idea of how changing one modality
impacts the other.
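For reference, MAPE here is the standard definition; a small helper is shown for clarity (variable names are illustrative):

```python
# Mean Absolute Percentage Error, in percent (standard definition).
import numpy as np

def mape(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```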
Fig. 6. JoulesEye EE estimation pipeline: The calorimeter’s mask obstructs direct thermal-based respiration retrieval by
blocking the camera’s view of the nostrils. To replicate thermal-based respiration, we added noise to the belt-derived
respiration signal, introducing a 2.1% error to simulate the difference between the reference respiration from the belt and the
thermal video. The resulting noisy respiration rate (RR) signal was then input into the first TCN model for volume estimation.
The estimated volume was subsequently passed to the second TCN model to predict VO2 or energy expenditure.
Result on Estimating Energy Expenditure
Using True and Reference Respiration Rate
Figure 5(B) shows the pipeline for estimating energy expenditure
or VO2 from the ground truth respiration rate from the calorimeter and the reference respiration rate from the chest
belt. The first TCN model is used to estimate the volume of exhaled air from respiration rate data. The estimated
volume of exhaled air is then used as input to the second TCN model, which estimates energy expenditure
or VO2. Figure 5 shows that the best result of 5% Mean Absolute Percentage Error (MAPE) was obtained when the
ground truth respiration rate was input to the model. Using the belt's respiration rate as input gives a
MAPE of 5.2%. To put these numbers into context, we compared the performance of using respiration rate as a
predictor versus heart rate and temperature. We also compared against the result obtained from the Apple Watch. In Figure 7,
the following inputs are shown:
True HR: This is the heart rate obtained from the chest band used with the indirect calorimeter.
Estimated HR: This is the heart rate obtained from the Apple Watch.
True RR: This is the respiration rate obtained from indirect calorimeter.
Estimated RR: This is the RR generated by adding noise to the RR from the respiration belt.
Estimated RR and HR: This means the estimated RR data and Apple Watch HR data.
Estimated RR, HR and T: This means the estimated RR data, Apple Watch HR data and the temperature
data collected during session 2.
Fig. 7. Comparison of true and estimated HR/RR with MAPE analysis.
Using Proxy Respiration Rate
We described earlier why the respiration rate obtained from thermal
data is not available alongside the ground truth energy expenditure. But from the comparison of Session 1 data (Figure 5),
we know that the error between respiration rate obtained from thermal data and the belt is 2.1%. Thus, we can take
the belt respiration data from Session 2 (Figure 5) and add noise to it to generate new respiration
data that has an error of 2.1%. We use this respiration data to predict energy expenditure as shown in Figure 6.
Mathematically,
\[NoisyRR_i = BeltRR_i + \epsilon_i\]
\[\epsilon_i \sim N(\mu = 0.44, \sigma = 0.35)\]
where NoisyRR is the noisy respiration data and BeltRR is the respiration rate from the belt. We refer to this
noisy respiration data as a proxy for the estimated respiration data. The mean and standard deviation
were chosen such that the error between the noisy respiration rate and the ground truth respiration rate is 2.1%. A
quantile-quantile probability plot confirmed that the respiration data from the belt follows a normal distribution,
and hence we chose to generate the noise from a normal distribution.
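A minimal sketch of this proxy construction, using the stated noise parameters (the seed and array handling are illustrative):

```python
# Generate proxy ("noisy") respiration data from the belt signal, as described above:
# Gaussian noise with mean 0.44 and standard deviation 0.35 is added so the proxy
# differs from the belt RR by roughly the 2.1% observed for thermal-derived RR.
import numpy as np

def make_proxy_rr(belt_rr, mu=0.44, sigma=0.35, seed=0):
    rng = np.random.default_rng(seed)
    belt_rr = np.asarray(belt_rr, dtype=float)
    return belt_rr + rng.normal(mu, sigma, size=belt_rr.shape)
```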
Fig. 8. Error Trends in estimation of EE using HR and RR while Cycling and Running
We compare the estimates from the Apple Watch heart rate data with the estimates from the estimated respiration rate in
Figure 8. For the cycling activity, the energy expenditure estimated from the heart rate data is relatively inaccurate,
as apparent from the noisy data in the lower portion of Figure 8(a). The same trend is not observed in Figure 8(b),
where respiration rate is used as a predictor. It is important to note that, for each participant, we updated the
demographic information in the Apple Health app before data collection.
Effect of Occlusion
Figure 9 (a) demonstrates that when three or more frames are consecutively occluded, the respiration rate
estimation exhibits a high Mean Absolute Error of 20.1, while one or two frame occlusions result in a significantly
lower error rate. The occurrence of prolonged occlusion, lasting three or more frames, leads to the loss of nostril
tracking by the Region of Interest (ROI) tracker, causing alterations in the mean intensity signal, as illustrated in
Figure 9 (b). As a consequence, this deviation in the intensity signal adversely affects the accuracy of respiration
estimation. However, in such instances, we activate the RGB camera to re-establish the tracking of nostrils
through landmark detection. By continuously tracking the nostril with the RGB camera, we can successfully
retrieve the correct mean intensity signal, as shown in Figure 9 (c).
Fig. 9. Impact of Occlusion and RGB-based nostril tracking enhancement.
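One plausible way to implement this fallback is sketched below using MediaPipe FaceMesh on the RGB frame; the landmark index, the box size, and the assumption that the RGB and thermal frames are spatially aligned are all illustrative choices, not necessarily the authors' implementation.

```python
# Hedged sketch of the RGB fallback: when the CSRT tracker loses the nostrils for three
# or more thermal frames, detect the nose on the RGB frame with MediaPipe FaceMesh and
# re-initialise the tracker around it on the thermal frame.
import cv2
import mediapipe as mp

NOSE_TIP_IDX = 1          # approximate nose-tip landmark index in FaceMesh (assumption)
face_mesh = mp.solutions.face_mesh.FaceMesh(static_image_mode=True)

def reinit_tracker_from_rgb(bgr_frame, thermal_frame, box_size=20):
    h, w = bgr_frame.shape[:2]
    result = face_mesh.process(cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB))
    if not result.multi_face_landmarks:
        return None                               # no face found; keep waiting
    lm = result.multi_face_landmarks[0].landmark[NOSE_TIP_IDX]
    cx, cy = int(lm.x * w), int(lm.y * h)         # assumes RGB and thermal are aligned
    bbox = (cx - box_size // 2, cy - box_size // 2, box_size, box_size)
    tracker = cv2.TrackerCSRT_create()
    tracker.init(thermal_frame, bbox)
    return tracker
```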
Discussion
According to the literature [12, 13, 17], respiration rate
information helps explain body composition or adiposity, which is an important determinant of EE. Body
composition plays a significant role in determining energy expenditure because each type of tissue in the body
requires a different amount of energy to maintain. Muscle tissue is more metabolically active than fat tissue,
meaning it requires more energy to sustain. To analyse whether body composition affects the energy expenditure estimates, we split our data into people with
normal and overweight Body Mass Index (BMI). Table 2 shows the MAPE of energy expenditure from Apple
Watch and the MAPE of energy expenditure estimates from respiration rate.
Table 2. The error of EE estimates from the Apple Watch is higher for people with high Body Mass Index (BMI) and relatively lower for people with normal BMI.

                             All Participants   Normal BMI   Overweight BMI
Error (Apple Watch)               37.6%            29.7%          51.8%
Error (JoulesEye) with RR          5.8%             5.2%           6.9%
Another reason why respiration rate explains the change in EE can be deduced from Figure 10, which shows that
heart rate, respiration rate, and EE are all well correlated; however, the correlation between heart rate and energy
expenditure is lower (Pearson correlation = 0.78) than that between respiration rate and EE (0.93). Figure 10 suggests
that the high-frequency information in the EE signal is captured by the respiration rate and not by the heart rate. The heart
rate signal is smoother, so the frequent changes observed in the respiration rate and EE signals are not present in it.
Fig. 10. Correlation between Calorie, Respiration Rate and Heart Rate.
Result with Reduced Video Resolution
The FLIR thermal camera needs to be retrofitted to an iPhone, and its
video recordings are saved at 1440x1080 pixel resolution without access to any raw data. But for JoulesEye to be
practical, we envision that a smartwatch might come with a low-resolution thermal camera. The primary advantages of using a low-resolution thermal camera are reduced power consumption and fewer privacy concerns. As shown in Figure 11(b), we
designed a 32x24 pixel resolution thermal imaging system based on the MLX90640. It also has an RGB camera beside it.
The RGB camera helps initially locate the nostrils, and thereafter the CSRT algorithm keeps track of them.
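Reading frames from such a sensor is straightforward with the common Adafruit driver; the sketch below is a generic example of pulling 32x24 frames (the driver, refresh rate, wiring, and fixed ROI are assumptions, and the prototype in the paper may use a different setup).

```python
# Generic MLX90640 read loop using the Adafruit CircuitPython driver (an assumption; the
# paper's prototype may differ). Each frame is a flat list of 768 temperatures in degrees
# Celsius that can be reshaped to the 24x32 sensor grid.
import time
import board, busio
import numpy as np
import adafruit_mlx90640

i2c = busio.I2C(board.SCL, board.SDA, frequency=800000)
mlx = adafruit_mlx90640.MLX90640(i2c)
mlx.refresh_rate = adafruit_mlx90640.RefreshRate.REFRESH_4_HZ

frame = [0.0] * 768
while True:
    try:
        mlx.getFrame(frame)                       # blocks until a full frame is read
    except ValueError:
        continue                                  # occasional bad frames; just retry
    thermal = np.reshape(frame, (24, 32))         # 24 rows x 32 columns of temperatures
    roi_temp = thermal[10:14, 14:18].mean()       # illustrative fixed nostril-region ROI
    print(f"mean ROI temperature: {roi_temp:.2f} C")
    time.sleep(0.1)
```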
We evaluated our low-resolution thermal system for respiration rate
detection on 5 participants. These participants were asked to run on a treadmill at 4 miles per hour for a minute
with the constraint that they look into the JoulesEye smartwatch thermal camera by extending their hand, akin
to looking into a smartwatch.
Fig. 11. In a)
we show what our future smartwatches can look like. In (b) we show our first prototype wristband thermal camera which is
composed of a low resolution thermal camera and a RGB camera.
When compared to ground truth respiration rate data collected via the belt, we
observed that the MAPE of estimating respiration rate is 8.1%. This higher error arose because we were not able to
achieve a high frame rate with this thermal camera. The current frame rate is 3 frames per second, which is fine for
slow or no movement but causes a dithering effect when there is too much movement from the participant.
We repeated the procedure (discussed earlier) of adding noise to the respiration belt data so that the new
data has an error of 8.1%. Using this data, we obtained an energy expenditure estimation error of 15.4%. While 15.4% is higher
than the error of the estimates from the watch's heart rate data alone (using our algorithms and not the Apple Watch),
which is 12% (Figure 7), combining this respiration rate data with heart rate data reduces the error to 10.1%. This
shows that even though the frame rate of the wristband prototype is low, leveraging thermal data together with heart rate
data from the smartwatch can estimate energy expenditure more accurately than heart rate data alone. The
results are summarised in Table 3.
Table 3. Estimation of EE using a low-resolution thermal camera in combination with heart rate data yields an error of 10.1%,
showcasing its superiority over using heart rate data alone. These results demonstrate that even with a very low-resolution
thermal camera, EE estimation can be enhanced.
Table (a): The error (MAPE) in RR estimation varies with changes in thermal video resolution.
Resolution               Error on estimated RR
1080p thermal camera     2.1%
24p thermal camera       8.1%

Table (b): Reduced thermal video resolution leads to increased error (MAPE) in EE estimation.
Input Data                        Error on estimated EE
RR from 1080p thermal             5.4%
RR from 24p thermal               15.4%
RR from 1080p thermal and HR      5.3%
RR from 24p thermal and HR        10.1%
Impact of Changing Time Resolution
Our result for JoulesEye shown in Figure 7 is based on an input data sampling interval of 90 s, where 60 s is required
to estimate the first sample of volume and a further 30 s of data is required to estimate the first sample of
VO2. Figure 12 shows how the percentage error changes when we gradually decrease the input chunk length for respiration rate estimation.
We observe that
using 15 s of respiration data is enough to predict energy expenditure with better performance than
heart rate alone. This implies that after exercising, a user would have to look into the watch for 15 s + 30 s for her
energy expenditure to be predicted by the model. We believe that further work is needed to reduce this
interval and make the system even more practical.
Fig. 12. Using 60 s of respiration rate data gives us the best performance in estimating energy expenditure. 30 s of respiration
data is enough to predict energy expenditure with better performance than heart rate alone.
Limitations and Future Work
We now discuss the limitations of our present work and plans for addressing them in the future.
Smartphone/Smartwatch Integration: Our objective is to retrofit a smartphone/smartwatch with a low-resolution
thermal camera [15]. Although we prototyped JoulesEye, the
engineering challenge of obtaining a higher frame rate remains unsolved. Our initial results are
promising, but our system is not yet real-time: the video processing and deep learning
pipeline must be run after data recording.
Usability of the Smartwatch Prototype: Although we developed a prototype smartwatch for JoulesEye, we did
not conduct a usability study with it. Currently, a usability study would not yield the desired
results, as each participant would need to look into the watch continuously for at least 45 seconds to
obtain any energy expenditure estimate. Such an extended glance
at the watch is impractical. Further research is required to significantly reduce this time interval, so that
a quick glance at the watch can provide accurate energy expenditure values.
Uncertainty in Estimation: Our current methods for estimating energy expenditure give a point estimate.
In the future, we plan to incorporate uncertainty in our estimation. Incorporating such uncertainty will
be particularly important as various sensing modalities will be affected differently owing to differences
in external conditions. As an example, the algorithms for heart rate estimation will likely not suffer even
when the surroundings are dark, but the algorithms to estimate nostril position from RGB will suffer. Thus,
in the future, we plan to implement a principled uncertainty based approach, where uncertainties in the
different parts of the pipeline (estimating respiration rate, temperature; estimating energy expenditure
using machine learning model) are considered while estimating energy expenditure.