IJCA Vol 4 i1 2025 webmag - Flipbook - Page 25
2025 | Volume 4, Issue 1
Experimental plan
Table 5 presents the values for the experimental design of the PoC for the optoelectronic system.
Table 5: Experiment plan optoelectronic system
| Camera distance (a) | Camera angle (b) | Lamp angle (c) | Illuminance (lx) | OCR model |
|---|---|---|---|---|
| 110 cm | 90° | 20° | 400 lx | Tesseract |
| | 45° | 40° | 675 lx | EasyOCR |
| | | 60° | 950 lx | PaddleOCR |
The selected values are based on the requirements profile. The experiment investigates the relevant factors:
the camera angle (90°, 45°), the lamp angle (20°, 40°, 60°, and 80°), the illuminance (400 lx,
675 lx, and 950 lx), and three OCR models.
Three OCR models are used because all three are currently approved at BMW
and are applied in different areas.
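
The factor levels above can be expanded into a list of individual test runs. The following is a minimal sketch that assumes a full factorial design (every level combined with every other); the text does not state which design type is used, and the names `FACTORS` and `build_runs` are hypothetical. The lamp angles follow the paragraph above, which lists four values, whereas Table 5 shows only three.

```python
from itertools import product

# Factor levels taken from the text. ASSUMPTION: the lamp angle list uses the
# four values named in the paragraph (Table 5 itself shows only 20°, 40°, 60°).
FACTORS = {
    "camera_distance_cm": [110],
    "camera_angle_deg": [90, 45],
    "lamp_angle_deg": [20, 40, 60, 80],
    "illuminance_lx": [400, 675, 950],
    "ocr_model": ["Tesseract", "EasyOCR", "PaddleOCR"],
}


def build_runs(factors):
    """Return one dict per run for a full factorial design (assumed design type)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


runs = build_runs(FACTORS)
print(len(runs))   # 1 * 2 * 4 * 3 * 3 = 72 combinations
print(runs[0])     # first combination: first level of every factor
```

Keeping the levels in a single dictionary makes it straightforward to drop the 80° lamp angle, or to add a level, without touching the enumeration logic.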
ICR (GPT-4v)
As part of the PoC, the detection technology ICR is evaluated using the approved foundation model GPT-4v
from OpenAI (Olesia, 2023). GPT-4 is an advanced AI-based natural language processing technology
developed by OpenAI (Kaushik, 2024). Because of the secure internal environment at the BMW Group,
GPT-4 Vision is the only multimodal model permitted for use (BMW Group, 2023). The same experimental
parameters as in the experimental design in Table 5 are used for detecting the component IDs
(homologation labels). The GPT-4 workflow is illustrated in Figure 6.
Figure 6: Workflow GPT-4v
The depicted workflow is divided into four steps:
1. Component image capture: A camera captures images of the components, on which the component IDs
(homologation labels) are visible.
2. Use of OCR technology: For the actual text detection, OCR models such as Tesseract, EasyOCR, or PaddleOCR
are used. These models specialize in identifying text in images and converting it into machine-readable text (Bugayong et al., 2003; Sarhan et al., 2024; Bagaria et al., 2024; BMW Group, 2023).
3. Post-processing by GPT-4v: After text detection, GPT-4v is used to further process and extract the recognized text data.
4. Integration into systems: The information processed by GPT-4 is integrated into internal BMW systems
to perform target/actual comparisons. GPT-4 analyzes the actual state, while the target state comes from
BMW's internal homologation system (Approve) (BMW Group, 2022). A function in Microsoft Power Apps is
used to perform the target/actual comparison (Microsoft, 2024; Jayapandian, 2022).
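
The four steps can be sketched as a small pipeline. This is an illustrative sketch only: the OCR step and the GPT-4v post-processing step are passed in as plain callables because the actual interfaces are not described in the text, and the comparison is written in Python as a stand-in for the Microsoft Power Apps function. All names (`run_workflow`, `normalize_id`, `ComparisonResult`) and the example label string are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class ComparisonResult:
    target_id: str   # target state, e.g. from the homologation system
    actual_id: str   # actual state, as extracted from the image
    match: bool


def normalize_id(text: str) -> str:
    """Uppercase the text and keep only letters and digits (assumed rule)."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def run_workflow(
    image: bytes,
    ocr: Callable[[bytes], str],
    postprocess: Callable[[str], str],
    target_id: str,
) -> ComparisonResult:
    raw_text = ocr(image)               # step 2: OCR text detection
    actual_id = postprocess(raw_text)   # step 3: post-processing (GPT-4v in the PoC)
    match = normalize_id(actual_id) == normalize_id(target_id)  # step 4
    return ComparisonResult(target_id, actual_id, match)


# Usage with stub callables and a made-up label string.
result = run_workflow(
    image=b"",                                  # step 1: captured image (placeholder)
    ocr=lambda img: " e1 10r-04 1234 \n",       # stub standing in for an OCR model
    postprocess=lambda text: text.strip(),      # stub standing in for GPT-4v
    target_id="E1 10R-04 1234",
)
print(result.match)  # True
```

Passing the OCR and post-processing steps as callables keeps the comparison logic independent of which of the three OCR models is being evaluated in a given run.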
When using GPT-4, so-called bots are created. Table 6 provides a list of all the settings of these bots.