Rev Bras Fisiol Exerc. 2024;23(3):e235548
ORIGINAL ARTICLE
Applicability of vibrational spectroscopy in the analysis of liquid biopsy
after a cycling time-trial test
Aplicabilidade da
espectroscopia vibracional na análise de biópsia líquida após teste de contra-relógio no ciclismo
Leandro dos Santos1, Marcia Helena Cassago Nascimento2, Leonardo Barbosa Leal2,
Ian Manhoni Baiense2, Ana Luiza de Castro Lopes3, Amanda Piaia Silvatti4,
Richard Diego Leite2, Valerio Garrone Barauna2
1Universidade Federal Rural de Pernambuco, Serra Talhada, PE, Brazil
2Universidade Federal do Espírito Santo, Vitória, ES, Brazil
3Universidade Estadual de Campinas, Campinas, SP, Brazil
4Universidade Federal de Viçosa, Viçosa, MG, Brazil
Received: March 30, 2024; Accepted: October 30, 2024
Correspondence: Leandro dos Santos, leandro.santos79@gmail.com
How to cite
Santos L, Nascimento MHC, Leal
LB, Baiense IM, Lopes ALC, Silvatti AP, Leite RD, Barauna VG. Applicability of vibrational
spectroscopy in the analysis of liquid biopsy after a cycling time-trial test. Rev Bras Fisiol Exerc. 2024;23(3):e235548.
doi: 10.33233/rbfex.v23i3.5548
Abstract
Introduction: Fourier-transform infrared spectroscopy with attenuated
total reflectance (ATR-FTIR) is a technique that analyzes biochemical changes
and monitors physiological responses; however, chemometric methods are required
for its analysis. Aim: To investigate whether ATR-FTIR, combined with
multivariate analyses, can be used to characterize and distinguish the
biochemical profile of athletes before and after a cycling test. Methods:
Cross-sectional study with 10 cyclists performing a 20 km time trial. Results:
The results revealed that ATR-FTIR, in conjunction with pattern recognition
approaches, allowed the identification of biochemical differences between pre-
and post-test moments. After the removal of two outlier samples, principal
component analysis (PCA) revealed a distinct separation in the fingerprint
region of the spectrum. Monte Carlo sampling combined with genetic
algorithm-based linear discriminant analysis (MC-GA-LDA) identified
specific spectral regions related to these differences, indicating that the
athletes’ physiological variations were reflected in the spectra. The most
relevant bands were 1338-1308 cm-1 (asymmetric C-H stretching, assigned to
amide III) and 1125-1108 cm-1 (asymmetric C-O stretching, assigned to
lactate). These results demonstrate the sensitivity of ATR-FTIR
in detecting metabolic changes and suggest its applicability as a tool for
monitoring physiological responses. The technique can be useful in personalized
training load monitoring and the identification of specific performance,
fatigue, or physiological stress markers. Conclusion: ATR-FTIR technique
combined with multivariate analyses can be a promising approach to characterize
and distinguish the biochemical profile of athletes in response to physical
stimuli.
Keywords: bicycling; spectroscopy; athlete; FTIR; chemometrics
Resumo
Introduction: Fourier-transform infrared spectroscopy with attenuated
total reflectance (ATR-FTIR) is a technique used to analyze biochemical
changes in biological samples. However, chemometric methods and artificial
intelligence (AI) tools are required to analyze these data. Aim: To
investigate whether ATR-FTIR can be used to characterize and distinguish the
biochemical profile of athletes before and after a cycling test. Methods:
Cross-sectional study with 10 cyclists performing a 20 km time trial. Results:
ATR-FTIR data, combined with pattern recognition approaches, allowed the
identification of biochemical differences between the pre- and post-test
moments. After the removal of two outlier samples, principal component
analysis (PCA) revealed a distinct separation of pre- and post-test samples in
the fingerprint spectral region (1800-900 cm-1). Monte Carlo sampling combined
with a genetic algorithm and discriminant analysis (MC-GA-LDA) identified
specific spectral regions related to these differences, indicating the most
relevant biochemical variations (bands at 1338-1308, asymmetric C-H
stretching, amide III; and 1125-1108, asymmetric C-O stretching, lactate).
These results demonstrate the sensitivity of ATR-FTIR in detecting metabolic
changes and suggest its applicability as a tool for monitoring physiological
responses in sports activities. The technique may be useful for personalized
training load monitoring and for identifying specific markers of performance,
fatigue, or physiological stress. Conclusion: The ATR-FTIR spectroscopic
technique, combined with chemometrics, may be a promising approach to
characterize and distinguish the biochemical profile of athletes in response
to physical stimuli.
Keywords: cycling; vibrational spectroscopy; ATR-FTIR; chemometrics
Introduction
Infrared (IR) spectroscopy is a method that measures the
absorption of radiation in the IR region depending on the specific functional
groups of the molecules present in the sample. IR radiation excites
vibrations in these molecules, and the frequency of each vibration corresponds
to the frequency of the absorbed light. The theoretical basis of IR is described
in detail in several reviews [1,2,3], and the ability to identify functional
groups is one of the advantages of this technique. The analysis is rapid,
requires minimal sample preparation, and can be applied to any biofluid
in less than 1 minute. It is considered a viable option for analyzing chemical
changes in biological processes and evaluating metabolites in biofluids [4].
Although infrared spectroscopy is not as specific as other techniques, it
analyzes the sample as a whole, capturing all the macromolecules
present (carbohydrates, proteins, lipids, DNA, RNA, and others) and forming a
type of metabolic fingerprint of the sample [2,3].
Attenuated Total Reflectance-Fourier Transform Infrared
(ATR-FTIR) is a type of infrared spectroscopy that has been used for a wide
variety of health studies. Recent applications include analysis of whole blood
[5], tears [6], and specific isolates such as exosomes [7]. ATR-FTIR has also
proven useful in monitoring oxidative stress under conditions of chronic
psychological stress in rat mononuclear cells [8]. Bujok et al. [9] used
ATR-FTIR to assess protein oxidation in the blood plasma of horses after
physical exercise. The results obtained from the analysis of the ATR-FTIR
spectra were similar to those obtained from the gold standard carbonyl
spectrophotometric assay using DNPH, thus suggesting ATR-FTIR as a cheaper and
faster tool for the study of exercise-induced protein oxidation.
High-performance athletes rely on specialized training to
achieve the highest levels of efficiency in their sports. In the pursuit of
better results, excessive training can lead these athletes to a state known as
overtraining, which produces the opposite of the intended effect, such as loss
of performance [10]. Identifying the point in training that leads to better
results or to a drop in performance is difficult. Therefore, evaluating these
individuals at the end of their activities, whether competitions or training,
is essential to monitor their responses. There are many methodologies to assess
the biochemical responses of athletes. It is common to measure
cardiorespiratory variables, such as oxygen consumption, and metabolites in
serum and urine (lactate, urea, creatinine, creatine kinase, and ketone bodies)
[11]. These metabolites are related to metabolic responses during activity.
However, they require methods that are sometimes expensive and time-consuming,
and a separate method is needed for each metabolite analyzed.
The standard technique for evaluating these metabolites,
such as urea, creatinine, glucose, and ketone bodies in urine, is based on
colorimetric measurements (absorption). In this process, a specific reagent
reacts with the molecule of interest, which has absorption at a particular
wavelength, and this is used to identify and quantify the desired component.
The major disadvantage is that the results can take hours or days and are
nonspecific for some analytes (mainly proteins). There is an ongoing search for
a fast, minimally invasive technique with high sensitivity and specificity that
can be used in this situation. One of the advantages of using ATR-FTIR in this
context is that it allows the analysis of the modifications of all these
substances (macromolecules) at once instead of analyzing them individually
[12]. The ATR-FTIR spectrum contains a large amount of information, so
artificial intelligence and chemometric tools are essential for its analysis.
Thus, this study aimed to verify whether ATR-FTIR
spectroscopy, together with AI and chemometric analyses, can characterize and
distinguish the biochemical profile of athletes
before and after a cycling test.
Methods
Participants
Ten male recreational cyclists in the master category (42 ± 6
years, 75 ± 7 kg, 174 ± 7 cm) who had been cycling for 20 ± 10 years were
invited to participate in the study. They competed in cycling races
and trained for an average of 11 ± 2 hours per week. Participants
were instructed to abstain from strenuous activities for at least 72 hours
before the 20 km cycling time trial (TT20), avoid any analgesic or
anti-inflammatory medications, and maintain their regular dietary intake and
lifestyle habits throughout
the study. A written informed consent form was provided, and all subjects
completed a clinical history questionnaire. The procedures were approved by the
Human Research Ethics Committee (59773616.0.0000.5153). Male cyclists with at
least five years of experience in regional-level competitive sports activities
were included in the study. Exclusion criteria were the use of any anabolic
steroid, drugs of abuse, or medications with a potential effect on sports
performance.
Cycling Test (TT20)
The cyclists performed a 10-minute warm-up with free
pedaling at their own pace, followed by a 5-minute rest. Then, the participants
performed an individualized 20 km time trial using their bicycles coupled to a
CompuTrainer ProLab 3D (Racermate), which measured performance during the test.
All participants were instructed to finish the TT20 as quickly as possible.
Verbal encouragement was provided throughout the test, but participants were
blinded to feedback such as time, cadence, power, and heart rate, as these could
influence their effort. The course was configured in the
CompuTrainer 3D software with automatic control of the constant load mode and
an individual weight (bicycle + cyclist). The cycling test was performed at
the Laboratório de Força e Condicionamento (Strength and Conditioning
Laboratory, LAFEC) of the Federal University of Espírito Santo, with the
temperature controlled between 20°C and 22°C.
Heart rate analysis
Heart rate (HR) was monitored during the test using the H7 Bluetooth heart rate
transmitter (Polar, USA), worn around the chest below the pectoralis major
and connected to the HRV® software. The recorded data were subsequently analyzed
using a computer program (Kubios HRV Standard 3.3.0®), which allows
the selection of specific periods of the recording. Maximum heart rate (HRmax)
was determined as the highest HR achieved and maintained for 30 seconds during
the test.
Subjective perception of effort
At the end of the TT20, participants were asked about
their subjective perception of effort using the Borg scale (1-10), with 1
corresponding to "no effort" and 10 to "maximal/extreme
effort" [13].
Vibrational spectroscopy
The instrumentation for mid-infrared vibrational
spectroscopy comprised a spectrometer (Cary 630 FTIR, Agilent Technologies)
coupled with an ATR accessory with a diamond crystal. Spectra were
recorded over the wavenumber range of 650 to 4000 cm-1, using 32
scans for both the background and the sample. Each spectrum contains 1798
data points (spectral resolution of 1.86 cm-1).
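The relation between the spectral range, the number of data points, and the quoted 1.86 cm-1 can be checked with a short calculation. The sketch below assumes a uniformly spaced axis that includes both endpoints (650 and 4000 cm-1); the instrument's actual sampling grid is not described in the text, so the function name and defaults are illustrative only.

```python
import numpy as np

# Acquisition parameters quoted in the text.
WN_MIN, WN_MAX = 650.0, 4000.0  # cm-1
N_POINTS = 1798


def wavenumber_axis(wn_min=WN_MIN, wn_max=WN_MAX, n_points=N_POINTS):
    """Evenly spaced wavenumber axis (assumption: uniform sampling, endpoints included)."""
    return np.linspace(wn_min, wn_max, n_points)


axis = wavenumber_axis()
spacing = float(axis[1] - axis[0])  # distance between adjacent data points
print(f"{spacing:.2f} cm-1")
```

Under these assumptions the spacing between adjacent points agrees with the value reported for the spectra.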
To perform the spectroscopy, 10 µL of
plasma from the pre-intervention (PRE) and post-intervention (POST) moments
were pipetted three times onto the shiny side of a sheet of aluminum foil; the
samples were left at room temperature overnight to dry. After drying, the
samples were analyzed directly on the crystal, in triplicate, using the
equipment's press, which exerts constant and equal pressure on all samples.
At the end of each analysis, the crystal was cleaned with deionized water and
70% alcohol to remove residues from the previous sample.
Figure 1 shows the raw and pre-processed spectra before
and after the TT20. Figure 1A represents the average of the raw spectra at the
PRE and POST test moments, and Figure 1B represents the average of the
pre-processed spectra with baseline correction, Savitzky-Golay smoothing, and
vector normalization.
Figure 1 - Representation of plasma spectra before TT20 (PRE) and immediately
after (POST). (A) Raw average spectrum. (B) Average spectrum after
preprocessing (baseline correction, Savitzky-Golay smoothing, and vector
normalization) with identification of the high wavenumber (4000-2800 cm-1) and
fingerprint (1800-900 cm-1) regions. Each spectrum consists of 1798 data
points.
Statistical analysis
Biological data are presented as mean and standard
deviation. MATLAB 2020 software was used for AI and chemometric analyses. All
spectra were preprocessed with baseline correction, Savitzky-Golay smoothing,
and vector normalization.
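The three preprocessing steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not state the baseline model, window size, or polynomial order, so the straight-line baseline, the 11-point window, and the second-order polynomial below are assumptions, and all function names are hypothetical. The Savitzky-Golay weights are derived directly from a least-squares polynomial fit so that the example depends only on NumPy.

```python
import numpy as np


def linear_baseline(spectra):
    """Subtract the straight line joining the first and last point of each spectrum."""
    first, last = spectra[:, :1], spectra[:, -1:]
    ramp = np.linspace(0.0, 1.0, spectra.shape[1])  # 0 at first point, 1 at last
    return spectra - (first + (last - first) * ramp)


def savgol_smooth(spectra, window=11, polyorder=2):
    """Savitzky-Golay smoothing of each row (window and order are assumed values)."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Least-squares polynomial fit over the window: row 0 of the pseudo-inverse
    # holds the weights that evaluate the fitted polynomial at the window centre.
    vander = np.vander(offsets, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(vander)[0]
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    return np.stack([np.convolve(row, weights[::-1], mode="valid") for row in padded])


def vector_normalize(spectra):
    """Scale each spectrum to unit Euclidean (L2) norm."""
    return spectra / np.linalg.norm(spectra, axis=1, keepdims=True)


def preprocess(spectra):
    """Baseline correction, then smoothing, then vector normalization."""
    return vector_normalize(savgol_smooth(linear_baseline(spectra)))


# Synthetic demonstration data (not study data): one band on a sloping baseline.
rng = np.random.default_rng(0)
x = np.linspace(650.0, 4000.0, 1798)
band = np.exp(-((x - 1650.0) / 30.0) ** 2)
raw = band + 1e-4 * x + 0.01 * rng.standard_normal((4, x.size))
processed = preprocess(raw)
```

Each row of `processed` has unit norm, which removes differences in overall intensity (for example, from film thickness) before multivariate analysis.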
Principal Component Analysis (PCA), an unsupervised pattern
recognition method, was used to identify possible anomalous samples, visualize
similarities and possible natural groupings between samples, and analyze the
behavior and dispersion of spectral variables [14].
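As an illustration of how PCA scores and explained variance are obtained, the sketch below computes them from a singular value decomposition of the mean-centered data matrix. The data are synthetic (two groups of ten "spectra" that differ in one band) and the function name is hypothetical; it does not reproduce the study's spectra or software.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA via SVD of the mean-centered matrix X (rows = samples, columns = variables)."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates on each PC
    explained = (S ** 2) / np.sum(S ** 2)            # fraction of variance per PC
    return scores, Vt[:n_components], explained[:n_components]


# Synthetic demonstration data (not study data): 10 PRE-like and 10 POST-like rows.
rng = np.random.default_rng(1)
X = 0.01 * rng.standard_normal((20, 200))
X[10:, 50:60] += 1.0  # the second group differs in one band
scores, loadings, explained = pca_scores(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives the kind of score plot used to look for natural groupings and anomalous samples.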
For variable selection, the genetic algorithm method
based on discriminant analysis with Monte Carlo sampling (MC-GA-LDA) was used,
which is an association of the Monte Carlo sampling method (MC) with the
genetic algorithm based on discriminant analysis (GA-LDA).
The MC method is a statistical method for solving various
problems through random sampling using the probability distribution of the
sample set. GA-LDA uses Fisher's ratio as a metric for selecting subsets of
variables that maximize separation between classes [15,16]. The association of
MC sampling with GA-LDA was applied in the present study to choose variables
that discriminate physiological variations between PRE and POST TT20 samples
and that are reflected in the infrared spectrum of the samples. For this
purpose, the data set with all average spectra was subjected to 800 random
samplings, each followed by variable selection with GA-LDA. At each iteration,
GA-LDA selected the variables that maximize separation between classes. The
relative selection frequency of each variable was calculated at the end of the
iterations (N = 800). The final selection corresponds to the variables with the
highest relative frequency values.
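The idea of counting how often each variable is selected across repeated random samplings can be sketched as below. This is a deliberately simplified stand-in, not the MC-GA-LDA procedure itself: instead of a genetic algorithm search, each iteration simply keeps the variables with the highest univariate Fisher ratio. The sampling fraction, the number of variables kept, the function names, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fisher_ratio(X, y):
    """Per-variable two-class Fisher ratio: (mean difference)^2 / (sum of class variances)."""
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0, ddof=1) + b.var(axis=0, ddof=1)
    return num / (den + 1e-12)


def mc_selection_frequency(X, y, n_iter=800, frac=0.7, n_keep=5, seed=0):
    """Relative frequency with which each variable is selected over n_iter samplings."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[1])
    idx0, idx1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    for _ in range(n_iter):
        # Monte Carlo step: draw a random, class-balanced subset of the samples.
        s0 = rng.choice(idx0, size=max(2, int(frac * idx0.size)), replace=False)
        s1 = rng.choice(idx1, size=max(2, int(frac * idx1.size)), replace=False)
        sub = np.concatenate([s0, s1])
        # Simplified selection (stand-in for the GA search): keep the n_keep
        # variables with the highest Fisher ratio on this subset.
        ratio = fisher_ratio(X[sub], y[sub])
        counts[np.argsort(ratio)[-n_keep:]] += 1
    return counts / n_iter


# Synthetic demonstration data (not study data): 8 PRE-like and 8 POST-like rows.
rng = np.random.default_rng(2)
X = 0.05 * rng.standard_normal((16, 100))
y = np.repeat([0, 1], 8)
X[y == 1, 40:45] += 1.0  # five variables carry the class difference
freq = mc_selection_frequency(X, y, n_iter=800)
```

In this toy example the five informative variables are selected in almost every iteration, while the remaining variables are rarely selected; a genetic algorithm would additionally evaluate combinations of variables rather than ranking them one at a time.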
Results
Initially, we needed to ensure that the volunteers had
exerted themselves to their maximum during the TT20. The ten subjects in the
study completed the test with an average time of 33.4 ± 1.7 minutes. The
subjective perception of effort, assessed by the Borg scale, at the end of the
TT20 was 9.1 ± 0.9, and the maximum HR reached was 182 ± 13 bpm, which is
equivalent to 102 ± 3% of the maximum HR estimated by age (220 - age). These
data suggest that the subjects exerted maximum effort during the TT20.
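The percentage of age-predicted maximum HR follows directly from the (220 - age) estimate. The short sketch below applies it to the group means quoted in the text (42 years, 182 bpm) as a consistency check; the reported 102 ± 3% was presumably computed per individual, so the group-mean calculation is only an approximation, and the function names are illustrative.

```python
def age_predicted_hrmax(age_years):
    """Age-predicted maximum heart rate (bpm) using the 220 - age estimate."""
    return 220 - age_years


def percent_hrmax(observed_hr, age_years):
    """Observed HR expressed as a percentage of the age-predicted maximum."""
    return 100.0 * observed_hr / age_predicted_hrmax(age_years)


# Group means quoted in the text: mean age 42 years, mean maximum HR 182 bpm.
print(round(percent_hrmax(182, 42), 1))  # 102.2
```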
The first analysis was an unsupervised PCA to determine
whether ATR-FTIR can differentiate the samples from the PRE and POST TT20
moments based only on the characteristics of the plasma infrared
spectrum. The analysis was performed considering all 1798 features of the
spectrum (Figure 2A, 4000-900 cm-1) and in specific regions called
the high wavenumber region (Figure 2B, 4000-2800 cm-1) and the
fingerprint region (Figure 2C, 1800-900 cm-1). In none of these cases was there
a clear separation between the PRE and POST moments. However, inspection of the
individual samples showed that, in all three cases, samples 3 and 9 of
the POST moment were the ones that prevented a complete separation of the
moments. In other words, the PCA scores of these two individuals did
not match the rest of their group (arrows in Figures 2A, 2B, and 2C).
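Restricting the analysis to a spectral region amounts to selecting the columns whose wavenumbers fall inside the region limits. The sketch below uses the region boundaries given in the text; the uniformly spaced axis, the placeholder data matrix, and the function names are assumptions for illustration.

```python
import numpy as np

# Region limits in cm-1, as given in the text.
REGIONS = {
    "high_wavenumber": (2800.0, 4000.0),
    "fingerprint": (900.0, 1800.0),
}


def region_mask(wavenumbers, low, high):
    """Boolean mask of the variables whose wavenumber lies within [low, high]."""
    return (wavenumbers >= low) & (wavenumbers <= high)


def split_regions(spectra, wavenumbers, regions=REGIONS):
    """Return one sub-matrix per region (rows = samples, columns = selected variables)."""
    return {
        name: spectra[:, region_mask(wavenumbers, low, high)]
        for name, (low, high) in regions.items()
    }


# Placeholder data (not study data): 20 spectra on an assumed uniform axis.
wavenumbers = np.linspace(650.0, 4000.0, 1798)
spectra = np.zeros((20, wavenumbers.size))
parts = split_regions(spectra, wavenumbers)
```

Each sub-matrix in `parts` can then be passed to the same PCA routine as the full spectrum.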
Therefore, as the next step, these samples were removed,
and the PCA was repeated (total spectrum, Figure 2D; high wavenumber, Figure 2E
and fingerprint, Figure 2F). With the total spectrum (Figure 2D), we observed
the separation between the groups (PC1 axis explaining 58.8% of the variance of
the samples).
The next question was which region of the spectrum
is responsible for this separation, so the spectrum was analyzed again as
two separate parts (high wavenumber and fingerprint). The
separation was observed only in the fingerprint region (Figure 2F, PC2 axis,
27.3% of the variance) and not in the high wavenumber region (Figure 2E). It is
important to highlight this result since the fingerprint region (1800-900
cm-1) is the region that contains the greatest amount of information in
biological samples [3].
Figure 2 - Graph of PC1 versus PC2 scores of PCA models from full spectral
variables (A and D), the high wavenumber region (B and E), and the
fingerprint region (C and F). (A), (B), and (C): models from spectra of all 10
subjects; (D), (E), and (F): models from spectra after removing samples 03 and
09 (n = 16). In blue: PRE moment; in red: POST moment.
After this series of unsupervised exploratory analyses,
we investigated, through MC-GA-LDA, which of the 1798 variables (spectral
wavenumbers) were responsible for the distinction between the PRE and POST TT20
moments. MC-GA-LDA is an AI technique that combines two algorithms:
GA, an optimization technique inspired by natural selection, in which candidate
solutions evolve through mutation and recombination to find the best model, and
LDA (linear discriminant analysis), a supervised method that finds the linear
combination of variables that best separates predefined classes.
The MC method repeats the selection on many random samplings of the data to
estimate how relevant each spectral variable is. The
more iterations are performed, the more accurate the estimate will be. In our
study, we performed 800 iterations, and the two most prominent
spectral regions (highest relative frequency values) were ~1338-1310 cm-1
and ~1125-1108 cm-1. The other regions, with intermediate relative
frequency values (~0.4 and 0.3), were ~1041-1026 cm-1 and ~965-922 cm-1
(Figure 3). Table I shows the chemical assignments of these identified regions.
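Turning per-variable selection frequencies into reported bands means grouping neighboring variables that exceed a chosen frequency level. The sketch below shows one way to do this; the threshold, the 1 cm-1 axis, and the frequency values are invented for illustration (only the band limits echo those reported in the text), and the function name is hypothetical.

```python
import numpy as np


def contiguous_bands(wavenumbers, frequency, threshold):
    """Group consecutive variables with frequency >= threshold into (high, low) bands."""
    selected = np.flatnonzero(frequency >= threshold)
    if selected.size == 0:
        return []
    # Split the selected indices wherever two neighbours are not adjacent.
    breaks = np.flatnonzero(np.diff(selected) > 1) + 1
    runs = np.split(selected, breaks)
    return [(float(wavenumbers[r].max()), float(wavenumbers[r].min())) for r in runs]


# Illustrative frequencies (not study data) on an assumed 1 cm-1 grid.
wavenumbers = np.arange(900.0, 1801.0, 1.0)
frequency = np.zeros_like(wavenumbers)
frequency[(wavenumbers >= 1310) & (wavenumbers <= 1338)] = 0.9
frequency[(wavenumbers >= 1108) & (wavenumbers <= 1125)] = 0.8
bands = contiguous_bands(wavenumbers, frequency, threshold=0.5)
print(bands)  # [(1125.0, 1108.0), (1338.0, 1310.0)]
```

Lowering the threshold would also bring in regions with intermediate selection frequencies, so the choice of cut-off determines how many bands are reported.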
Figure 3 - Relative frequency of variable selection by the
Monte Carlo GA-LDA method with 800 iterations
The most prominent regions (highest relative frequency
values) are 1338 to 1310 cm-1 and 1125 to 1108 cm-1. The
other regions that stand out are those from 1041 to 1026 cm-1 and 965 to 922 cm-1 (Table I).
Table I - Chemical assignment of the main selected FTIR
regions by the Monte Carlo GA-LDA with 800 iterations
Discussion
In this study, the infrared vibrational spectroscopy technique, ATR-FTIR,
identified differences in the biochemical profile of the samples between the
PRE and POST TT20 moments. Furthermore, the AI and chemometric analyses
indicated that these differences are predominantly in the
fingerprint region of the spectrum, more specifically at the wavenumbers of
1338 to 1310 cm-1 (CH2), 1125 to 1108 cm-1 (C-O), 1041 to 1026 cm-1
(C-OH), and 965 to 922 cm-1 (PO4-).
In 2003, Petibois and Déléris [17] used ATR-FTIR to
obtain a global analysis of the energy metabolism of swimmers during a 400 m
race by analyzing plasma every 100 m. The authors concluded
that FTIR allowed for a global description of changes in blood content during
the race. One of the significant advantages of using FTIR is that it can be
performed using capillary blood collection (blood collected from the fingertip
rather than via venipuncture), which respects the athlete's comfort and allows
successive analyses to be obtained in short periods. The authors reported that
the region of greatest change was 1300-900 cm-1, similar to
the data found in the present study, a region also known to represent the
majority of circulating bioenergetic molecules such as sugars.
Khaustova et al. [18] used saliva to monitor
physiological stress in 48 conditioned athletes (VO2max = 58.9 ± 10.1
ml.kg-1.min-1). FTIR spectra of saliva were obtained
before, immediately after, and 30 minutes after a maximal step test. The
authors showed that the method allows the concentrations of
substances present in saliva to be determined more cheaply, without sample
preparation or reagents, from a minimal sample volume, and (almost)
immediately after sample collection. That study analyzed changes
in the concentration of total proteins, cortisol, alpha-amylase, immunoglobulin
A, urea, and phosphate and compared them with the gold standard methods.
Similar to our findings, Caetano Junior et
al., applying FTIR to saliva from 13 rugby athletes, identified two individuals
whose post-test spectra could not be discriminated from their pre-test spectra.
These two individuals
had lower HR responses than the group average, which suggested that they
exerted less effort during the test than the other volunteers [19].
In the present study, the individual data of
the two individuals (03 and 09) also showed
characteristics that were distinct from the rest of the group: individual 03
was the oldest (54 years old), had a lower BMI (20.6 vs. average of 25.2), had
trained longer (40 years vs. average of 22 years), and finished the test with
the lowest %HRmax (91% vs. average of 102%). Individual 09 had the
most discrepant result, with the worst performance among the 10
volunteers, finishing the test in 36.5 minutes (study average of 33.5 minutes).
Thus, we hypothesize that their chemical profile at the POST moment did not
differ from the PRE moment in the PCA because these individuals had not
reached their limit. This points to a possible application of the
technique. Were these athletes unmotivated for the test or overtrained?
Although we do not have enough data to answer this question, FTIR
identified these two individuals with only 10 µL of plasma in less than 1 hour.
In 2022, Chrimatopoulos et al. [20] used ATR-FTIR
coupled with AI (PCA and PLS-DA) to determine biochemical changes after
exercise using spectra obtained in the saliva of athletes with different
fitness levels. The authors also identified regions similar to ours in this
work, with 921 cm-1 (membrane lipids/phospholipids/carbohydrates)
and 1080 cm-1 (sugars) being the main ones modified after physical
exercise. The authors suggest that ATR-FTIR analysis of saliva samples will be
able to distinguish the fitness level of athletes accurately.
More recently, in 2024, Souza et al. [21] used
ATR-FTIR to distinguish biochemical changes induced by different types of
exercise: high-intensity interval training, continuous exercise, and strength
training. The authors used more robust machine learning algorithms such as
Naive Bayes, Random Forest, K-NN, AdaBoost, Support Vector Machine, Neural
Network, and Logistic Regression to interpret the spectra. The authors observed
that the biochemical components changed specifically according to each type of
exercise. Thus, spectral vibrational modes were identified as potential
biomarkers for each type of exercise.
Our research group has previously used
ATR-FTIR to identify pathological conditions such as
iron overload, COVID, and sepsis [6,22,23,24,25], but this study was our first
to use the tool in a physiological condition such as physical exercise.
This study demonstrated that with FTIR data and an
unsupervised multivariate analysis, it was possible to distinguish the PRE- and
POST-TT20 moments. In addition, it was possible to identify the regions of the
spectrum responsible for distinguishing the two moments,
which lie in the fingerprint region and not in the high wavenumber region.
Therefore, the study opens the possibility of applying FTIR as a
personalized training load monitoring tool or even as a marker of a specific
biological response such as performance, fatigue, damage, or physiological
stress, since the tool proved sensitive enough to detect individual variations
from moment to moment. Once a spectral signature has been identified for an
overtraining condition, a simple collection of capillary blood or saliva,
for example, followed by no more than 15 minutes of analysis on the equipment
and by computational analysis, could make it possible to assess the
individual's condition almost instantly.
Conclusion
The results showed that it is possible to biochemically
differentiate and classify the physiological state of athletes undergoing
physical training by ATR-FTIR using PCA and GA-LDA. In addition, through
multivariate analysis, it is possible to identify the peaks of the spectra that
changed after physical stress, and these changes are related
to variations in organic molecules due to the change in physiological
state. These results demonstrate the sensitivity of the technology in detecting
overall changes in metabolism and suggest that it could
be used to monitor adaptations in an individual athlete throughout training.
However, further studies with larger sample sizes are needed to better
evaluate the patterns in the spectra associated with improvement or worsening
in performance, greater or lesser muscle damage, or better or worse
cardiovascular response, for example, in people who exercise regularly.
Acknowledgments
We want to thank LabPetro for providing us with the
ATR-FTIR for the analyses and the entire technical team of the Instrumentation
Laboratory
Conflicts of interest
The authors declare no conflict of interest
Funding sources
Barauna VG is a PQ-CNPq research productivity fellow
(2023). The study was funded by the following notices: PROFIX-FAPES
(#711/2022), UNIVERSAL-FAPES (#979/2023), PRONEM-FAPES (#019/2022), and
IA2-CNPq (#54/2022)
Authors' contributions
Conception and research design: Barauna VG, Silvatti AP,
Leite RD; Data collection: Leal LB, Baiense IM, Lopes ALC; Data analysis and
interpretation: Leal LB, Baiense IM, Lopes ALC, Santos L, Nascimento MH;
Statistical analysis: Nascimento MH, Leal LB; Manuscript writing: Santos L,
Barauna VG; Critical revision of the manuscript for important intellectual
content: Leal LB, Baiense IM, Lopes ALC, Santos L, Nascimento MH, Barauna VG,
Silvatti AP, Leite RD