The International Journal of Conformity Assessment
2026 | Volume 5, Issue 1
I would like to offer several
technical observations regarding the
article “Evaluation of the Capability
of Generative AI to Interpret and
Provide Guidance on the Application
of the ISO/IEC 17025 Standard.”
The paper presents a relevant and
innovative approach by examining
the use of generative artificial
intelligence in the interpretation
of ISO/IEC 17025:2017. The
methodology is generally well
structured, and the results are
presented with clarity. However,
certain aspects of the study
may affect the objectivity and
generalizability of its conclusions
and therefore merit further
consideration.
There appears to be a potential
conflict of interest arising from
the dual role of the author as
both developer and evaluator
of the customized AI model (L
Squad). While this circumstance
does not invalidate the findings, it
introduces a risk of bias that would
benefit from explicit disclosure and
appropriate mitigation measures.
Additionally, aspects of the
experimental design may have
inadvertently favored the evaluated
model. The L Squad system
was specifically configured and
trained to respond in alignment
with the ISO/IEC 17025 standard,
whereas the comparator tools are
general-purpose models without
equivalent domain-specific
fine-tuning. Similarly, some evaluation
criteria, most notably those related
to criteria-based comprehension,
closely align with the capabilities for
which the customized model was
optimized. Furthermore, limited
detail regarding the assessment
instrument (including question
selection, coverage of the standard,
and difficulty level) restricts the
reproducibility of the study.
The absence of independent or
blind evaluation, together with
a lack of fully equivalent testing
conditions across the compared
models, suggests that the results
should be interpreted with
appropriate caution.
From a perspective consistent
with the principles of impartiality,
transparency, and risk management,
future work in this area would
be strengthened by explicitly
declaring potential conflicts of
interest, incorporating independent
evaluation mechanisms, ensuring
equivalent comparative conditions,
and providing full transparency
regarding the assessment
instruments used.
In conclusion, the article represents
a valuable and timely contribution
to the discussion on the application
of generative AI in technical and
regulatory contexts. Nonetheless,
the considerations outlined above
suggest that the findings should be
viewed as exploratory and
context-dependent, and that additional
research is warranted to further
substantiate the conclusions.
Thank you for the opportunity to
provide these comments.
Sincerely,
M.Sc. Jonathan Tuya Salas
Lima, Peru

Thank you, Jonathan, for your
thoughtful feedback and for taking
the time to review the article.
The use of L Squad’s specific
configuration was intentional and
central to the study’s objective.
The research aimed to assess
whether a model tailored to
ISO/IEC 17025 could produce
responses more closely aligned
with the normative framework
than general-purpose tools; the
tailoring was therefore the object
of the study rather than an
unintended advantage.
With respect to reproducibility,
all models were assessed using
the same 40-question instrument
under consistent prompting
conditions. I agree, however,
that providing the full question
set and greater detail on scope
and difficulty would enhance
transparency and reproducibility in
future studies.
Regarding impartiality, my role
as both the developer of L Squad
and the author of the study
presents a potential source of bias
that should be acknowledged.
While this does not invalidate the
results, it does warrant careful
interpretation. To mitigate this
risk, reference answers were
developed through consensus
among five subject matter experts,
and identical evaluation conditions
were applied across all models.
Nonetheless, future work would
benefit from independent or
blinded evaluation.
Thank you again for your valuable
observations, which are helpful
in clarifying the study’s scope and
informing future research.
Diego Uribe
Lima, Peru