IJCA Vol 4 Issue 1
The International Journal of Conformity Assessment
Evaluation of the Capability of Generative AI to
Interpret and Provide Guidance on the Application of the
ISO/IEC 17025 Standard
By Diego Alejandro Uribe Polo, Laboratory Assessor and Independent Consultant, LAB-SQUAD
DOI: 10.55459/IJCA/v4i1/DU
ABSTRACT
This study evaluates the ability of generative artificial intelligence models to interpret and provide guidance on
the ISO/IEC 17025 standard, with a focus on L-Squad, a customized ChatGPT model. Through a 40-question
exam assessing literal, inferential, and criterial comprehension—evaluating how well models can justify
or reason through decisions based on standards—the performance of four AI tools (Meta AI, ChatGPT 4.0
Free, ChatGPT o1, and L-Squad) was compared using the consensus of a panel of experts in laboratory
accreditation as a reference. The results showed that L-Squad achieved the highest overall score, excelling in
criterial comprehension due to its customized configuration and reinforcement learning with human feedback
(RLHF). However, all models exhibited strong literal and inferential understanding, with a 77.5% agreement
in responses. Despite these advancements, the findings emphasize the need for model customization and
human oversight when leveraging generative AI in standardization contexts such as ISO/IEC 17025. This
research underscores both the potential and the limitations of generative AI to support the application of
technical standards.
Keywords: Generative Artificial Intelligence, ISO/IEC 17025, Conformity Assessment, Customized ChatGPT, Laboratory Accreditation,
Reinforcement Learning from Human Feedback (RLHF), Technical Standards, AI Risk Management, Normative Interpretation
Introduction
Artificial intelligence (AI) is transforming the way technical and standardization processes are
addressed across various sectors. Generative models, such as ChatGPT, have proven to be powerful tools
for interpreting and applying complex standardization requirements. However, the effectiveness of these
tools depends on several factors, including the use of appropriate prompts, the technological proficiency
of users, and the quality of the available information.
According to the Latin American Artificial Intelligence Index (ILIA), there is a significant gap in
technological competencies between Latin America and the Global North (CENIA, 2024). This gap limits
the use of advanced tools but also creates an opportunity to strengthen generative AI capabilities,
particularly in countries like Chile and Uruguay, which lead in AI research and adoption. ILIA
underscores the necessity of high-quality data and robust infrastructure in training models capable of
accurately interpreting technical information.
Reinforcement Learning from Human Feedback (RLHF) is a methodology with notable potential in
configuring customized GPTs. This approach integrates optimization techniques, such as Proximal Policy
Optimization (PPO), with reward modeling based on human feedback (Naik, Naik, & Naik, 2024). Through
iterative learning cycles, the model refines its responses to align with user expectations, enhancing
its accuracy and relevance in specific contexts.
The acceptance of AI systems in regulatory environments also relies on their transparency and
verifiability. Information published by ISO/IEC JTC 1 SC 42 (2024) and the OECD (2023) highlights the
importance of managing associated risks, such as biases and privacy, to ensure generative models are
trustworthy. Similarly, the ISO/IEC 42001:2023 standard provides clear guidelines for documenting and
managing risks in AI systems, ensuring decision traceability.
In the educational domain, the integration of AI presents opportunities to personalize learning and
enhance research while also posing ethical and technical challenges (Pedreño Muñoz et al., 2024). This
article evaluates how well a customized ChatGPT model (L-Squad) understands ISO/IEC 17025 requirements
and its ability to interpret and provide guidance on applying the standard in testing and calibration
laboratory management systems.
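The RLHF approach mentioned in the introduction pairs Proximal Policy Optimization (PPO) with a reward signal derived from human feedback. As a rough illustration of the PPO side only, the sketch below computes the standard clipped surrogate objective for a single sampled response. The function name, the scalar inputs, and the default clipping value of 0.2 are illustrative assumptions. The article does not describe how L-Squad was trained, so this should not be read as its method.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantage, epsilon=0.2):
    """PPO clipped surrogate objective for one sampled response.

    new_logp / old_logp: log-probability of the response under the updated
    and the previous policy (hypothetical scalar inputs).
    advantage: how much better the response scored than a baseline; in RLHF
    this is typically derived from a reward model trained on human feedback.
    epsilon: clipping range (0.2 is an assumed, commonly used default).
    """
    # Probability ratio between the updated and the previous policy.
    ratio = math.exp(new_logp - old_logp)
    # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Taking the minimum removes the incentive for overly large policy updates.
    return min(ratio * advantage, clipped_ratio * advantage)
```

In training, this quantity would be averaged over many sampled responses and maximized. The clipping limits how far a single batch of feedback can move the model, which is one way the iterative refinement described above is kept stable.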