Computer Vision: an industrial case developed using AWS services

Computer vision, or artificial vision, is a field of AI that studies algorithms and techniques that enable computers to extract specific semantic information from images, videos, and other types of visual data.

In particular, computer vision finds a concrete and increasingly widespread application in predictive maintenance, where systems are monitored through photographs or recordings in order to identify anomalies or early signs of failure and intervene promptly.

Think especially of large infrastructures such as railways or the power distribution network; these are complex systems composed of various elements that interact with each other for proper functioning, but at the same time are subject to deterioration due to events such as rusting, corrosion, degradation, breakage, and presence of foreign bodies.

It is therefore necessary to constantly verify the health status of the infrastructural components in order to identify degradation situations and prevent possible failures and consequently service disruptions.

Since such components are often difficult to reach and are distributed over very large geographical areas, manual inspection presents operational difficulties and high costs. These can be drastically reduced through machine learning, deep learning, and computer vision techniques that automatically recognize infrastructural failures and assist the client’s operators in inspection activities.

Automatic Failure Detection in Computer Vision

Automatic failure detection, in software engineering, refers to the ability of a system to automatically detect a fault or malfunction without direct human intervention. This functionality is essential to ensure the reliability and availability of systems, especially in critical environments such as data centers, communication networks, and cloud infrastructures.

Automatic fault detection techniques may vary depending on the type of system and specific requirements.

Some common approaches include:

  1. Continuous monitoring: Using tools that constantly track system performance and status metrics, such as CPU load, memory usage, and network traffic. Exceeding certain limits or observing anomalies may indicate the presence of a fault.
  2. Integrity testing: Performing periodic or continuous tests to verify the operational status of the system. These tests may include data integrity checks, service availability checks, network connectivity testing, etc.
  3. Error log detection: Analyzing system logs for errors, warnings, or abnormal reports that may indicate an impending or ongoing fault.
  4. Pattern detection techniques: Using machine learning algorithms and models to recognize patterns or anomalous behaviors in monitoring data, signaling potential faults or malfunctions.

Once a fault or malfunction is detected, the system can automatically trigger troubleshooting procedures, such as restoring the service, starting diagnostic processes, or alerting system administrators for manual intervention.
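
As a minimal sketch of the continuous-monitoring approach listed above, the following Python snippet flags metric samples that deviate sharply from their recent history. The function name `detect_anomalies`, the window size, and the z-score threshold are illustrative assumptions, not part of any specific monitoring tool.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, z_threshold=3.0):
    """Return the indices of samples that deviate from the preceding
    `window` samples by more than `z_threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU load samples (percent); the spike at index 7 is flagged.
cpu_load = [41, 43, 42, 40, 44, 42, 43, 95, 42, 41]
print(detect_anomalies(cpu_load))  # prints [7]
```

In practice the flagged indices would feed an alerting or remediation step; a fixed threshold on the raw metric is a simpler alternative when normal operating limits are known in advance.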

Analysis Methodology

Revelis uses the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology for data analysis and for training Machine Learning models. It is a standard method for the interactive and iterative execution of data mining and analysis processes, characterized by six activities as illustrated in the following figure.

data mining life cycle

Phase 1

The preliminary assessment phase aims to support the client in estimating, evaluating, and designing the data analysis solution. This phase comprises the following activities:

  • Business Understanding: This activity analyzes the client’s application objectives through the following tasks: analysis of the “as-is” scenario; definition of the expected results, as set by the client’s decision-makers; definition of the criteria used to determine the outcome of the data analysis process.
  • Data Understanding: This activity aims to understand the data and is essential to avoid unforeseen problems during the Data Preparation phase. It comprises the following tasks: identification of internal and external sources; identification of the entities and attributes significant for the analysis; descriptive analysis of the available data, through basic statistical functions and/or graphs and aggregations; evaluation of data quality, identifying any problems related to format errors and/or the representation of information.

Phase 2

Phase 2 involves the “Data Acquisition and Preparation” activity. It serves to connect to the data sources identified in Phase 1 and to prepare and clean the acquired data.

The following tasks are performed in this phase:

  • Data Filtering: This task aims to select data relevant to achieving the client’s objectives. To this end, data and attributes are “filtered” through sampling techniques and/or removal of “dirty” data.
  • Cleaning: Within this task, statistical techniques are applied to reconstruct missing data, manage errors in the data, correct measurement errors, and handle sparse data.
  • Creation of Synthetic Data: Often raw data is insufficient to effectively carry out the predetermined analysis. In these cases, new “plausible” data can be created from existing ones, using techniques for deriving attributes or generating records.
  • Data Integration: Data from different sources that refer to the same entities can be integrated through “merging” (enrichment of the entity through a greater number of attributes) or “joining” (enrichment of the number of available entities).
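
The filtering, cleaning, and integration tasks above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the record layout (`asset_id`, `rust_score`, `line`), the helper names, and the choice of mean imputation for missing values are all hypothetical and not taken from the project described here.

```python
from statistics import mean


def filter_records(records, required_key):
    """Data Filtering: drop 'dirty' records that lack a required attribute."""
    return [r for r in records if r.get(required_key) is not None]


def impute_mean(records, field):
    """Cleaning: replace missing values with the mean of the observed ones."""
    observed = [r[field] for r in records if r.get(field) is not None]
    if not observed:
        return records  # nothing to learn the fill value from
    fill = mean(observed)
    return [
        {**r, field: r[field] if r.get(field) is not None else fill}
        for r in records
    ]


def integrate(records, extra_by_id, key="asset_id"):
    """Data Integration: enrich each entity with attributes from a second source."""
    return [{**r, **extra_by_id.get(r[key], {})} for r in records]


# Hypothetical inspection records and asset registry.
inspections = [
    {"asset_id": "A1", "rust_score": 0.25},
    {"asset_id": "A2", "rust_score": None},   # missing measurement
    {"asset_id": None, "rust_score": 0.9},    # unusable: no asset reference
    {"asset_id": "A3", "rust_score": 0.75},
]
registry = {"A1": {"line": "north"}, "A2": {"line": "south"}}

prepared = integrate(
    impute_mean(filter_records(inspections, "asset_id"), "rust_score"),
    registry,
)
for row in prepared:
    print(row)
```

Each step returns new records instead of modifying its input, which keeps the raw data available for later comparison.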

Phase 3

Phase 3 involves two distinct macro-process steps:

  1. Formulation and Implementation of the Analysis Model: This step aims to define (through inductive statistical techniques) and evaluate (using standard metrics) one or more data mining models. It involves the following activities:
  • Development of connectors: Connectors are software interfaces that allow importing data from external systems to be subjected to Data Mining algorithms or exporting results to external systems/platforms.
  • Modeling: Involves applying algorithms to an appropriate subset of available data called the training set. The data in the training set will undergo appropriate preprocessing activities to enable processing by statistical algorithms. The result is a decision support model, which can be aimed at purposes such as: (i) Classification; (ii) Regression; (iii) Segmentation; (iv) Summarization; (v) Association analysis; (vi) Sequence analysis. The algorithms are based on machine learning and deep learning techniques.
  • Evaluation: This activity focuses on measuring the model’s performance through appropriate metrics among those defined in the scientific literature. This allows for an objective evaluation of the quality of the induced models, and consequently verifying the satisfaction of defined business objectives. Performance evaluation occurs on an appropriate subset of available data called the test set, disjoint from the training set used in the model training phase.
  2. Conducting the analysis solution: This step involves the “Deployment” activity, i.e., implementing an executable version of the induced model that can be used within the client’s application infrastructure.
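
To make the Modeling and Evaluation activities more concrete, here is a minimal sketch of a disjoint training/test split and of two standard classification metrics, precision and recall. The function names, the 80/20 split, and the toy labels are assumptions chosen for illustration; the article does not specify which metrics were used.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and split it into disjoint train and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


train, test = train_test_split(list(range(10)))
print(len(train), len(test))  # prints 8 2

# Toy labels: 1 = anomaly present, 0 = no anomaly.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(round(precision, 3), round(recall, 3))
```

Keeping the test set disjoint from the training set is what makes the measured metrics an estimate of performance on unseen data.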

Application of the CRISP-DM Methodology for Failure Detection on AWS

The described methodology is agnostic regarding the software tools used. In this article, we will show how to use AWS SageMaker technology for developing machine learning solutions.

AWS SageMaker is a fully-managed service for machine learning that enables the creation and deployment of machine learning models in a scalable and secure environment.

AWS SageMaker implements a three-stage modeling process:

  • Data Acquisition and Preparation
  • Model Training and Tuning
  • Model Deployment and Monitoring

Data Management

The first step in machine learning is to build a dataset that is representative of the operational context, which is essential for training an ML model. Data scientists dedicate considerable time and energy to exploring and preparing such data. Data sources are the objects that store the data and its related metadata.

SageMaker uses Amazon S3 to store data. S3 is AWS’s cloud object storage, where object storage refers to a type of data storage in which data is stored as objects, rather than as files or blocks. In an object storage system, each object consists of three main components: the data, a unique identifier, and metadata. In S3, objects are stored within logical containers called buckets.

In an anomaly detection application, high-resolution images of the infrastructure are acquired through aerial surveys. Photographic data and georeferenced data of assets are uploaded to S3.
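
As a hedged illustration of how an image and its georeferenced data might be stored together as one S3 object, the sketch below attaches the coordinates as user-defined object metadata. The bucket name, key layout, and coordinates are hypothetical. The function expects a client exposing `put_object`, such as the one returned by `boto3.client("s3")`; a small stand-in client is used here so the example runs without AWS credentials.

```python
class FakeS3Client:
    """Offline stand-in that records put_object calls (illustration only)."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body, Metadata):
        self.objects[(Bucket, Key)] = {"Body": Body, "Metadata": Metadata}
        return {"ETag": "fake"}


def upload_survey_image(s3_client, bucket, key, image_bytes, latitude, longitude):
    """Store an image as an S3 object: data, unique key, and metadata."""
    # User-defined S3 metadata values are strings.
    metadata = {
        "latitude": f"{latitude:.6f}",
        "longitude": f"{longitude:.6f}",
    }
    s3_client.put_object(
        Bucket=bucket, Key=key, Body=image_bytes, Metadata=metadata
    )
    return metadata


# With AWS credentials configured, pass boto3.client("s3") instead.
client = FakeS3Client()
stored = upload_survey_image(
    client,
    bucket="example-inspection-bucket",      # hypothetical bucket name
    key="surveys/2024/tower-001.jpg",        # hypothetical key layout
    image_bytes=b"\xff\xd8",                 # placeholder bytes, not a real image
    latitude=39.298,
    longitude=16.253,
)
print(stored)
```

Storing coordinates as object metadata is only one option; a separate index keyed by the object key is an equally plausible design when the georeferenced data is large or needs to be queried.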

The images are then pre-processed for subsequent processing stages using AWS Batch jobs.

AWS Batch is the AWS service for running hundreds of batch jobs on AWS quickly and efficiently. A batch job is a processing task that runs until it either completes successfully or fails. AWS Batch dynamically provisions the optimal amount of computing resources based on the volume and specific requirements of the submitted batch jobs.
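
A pre-processing run of this kind could be submitted as one job per S3 prefix. The sketch below only builds the request parameters for the AWS Batch `submit_job` call; the queue name, job definition name, prefixes, and the `INPUT_PREFIX` environment variable are hypothetical and would have to match resources that actually exist in the account.

```python
import re


def build_batch_job_request(s3_prefix, job_queue, job_definition):
    """Build the keyword arguments for one AWS Batch submit_job call."""
    # Job names may contain only letters, numbers, hyphens, and underscores.
    safe_name = re.sub(r"[^A-Za-z0-9_-]", "-", s3_prefix.strip("/"))
    return {
        "jobName": f"preprocess-{safe_name}"[:128],
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "containerOverrides": {
            # Tell the container which images to process (assumed convention).
            "environment": [{"name": "INPUT_PREFIX", "value": s3_prefix}],
        },
    }


prefixes = ["surveys/2024/line-a/", "surveys/2024/line-b/"]
job_requests = [
    build_batch_job_request(p, "image-preprocessing-queue", "image-preprocessing-job")
    for p in prefixes
]
for request in job_requests:
    print(request["jobName"])

# With boto3 and AWS credentials available, each request could be sent with:
#   boto3.client("batch").submit_job(**request)
```

Building the requests separately from sending them makes the job parameters easy to inspect and test before anything is submitted.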

Model Training

In this phase, models are trained to search for patterns within the data. AWS SageMaker provides Jupyter notebook instances that give easy access to data sources for exploratory analysis. It also allows setting up inference pipelines that combine data preprocessing, prediction, and post-processing tasks packaged in Docker containers.

In our anomaly detection application, preprocessed images can be used to train:

  • object detection models, which deal with detecting objects within the image; the sought-after information includes both the object’s category (there may be more than one in the image) and the rectangular region (bounding box) containing it.
  • classification models, which are tasked with classifying the bounding boxes identified by object detection tasks based on the presence or absence of a specific anomaly.
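
The two kinds of models above form a two-stage pipeline: detection first, then classification of each detected region. The following sketch shows only that control flow. The `toy_detect` and `toy_classify` functions are hypothetical stand-ins for trained models, and the image is a small grayscale grid, not a real photograph.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Tuple


@dataclass
class Detection:
    category: str
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def crop(image, box):
    """Extract the pixels inside a bounding box from a row-major image."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def inspect(image, detect, classify):
    """Stage 1: detect objects. Stage 2: classify each bounding box."""
    findings = []
    for detection in detect(image):
        patch = crop(image, detection.box)
        findings.append((detection, classify(detection.category, patch)))
    return findings


def toy_detect(image):
    # Stand-in detector: always reports one insulator in the image center.
    return [Detection(category="insulator", box=(1, 1, 3, 3))]


def toy_classify(category, patch):
    # Stand-in classifier: dark patches are treated as anomalous.
    return mean(pixel for row in patch for pixel in row) < 100


image = [
    [200, 200, 200, 200],
    [200, 50, 40, 200],
    [200, 60, 30, 200],
    [200, 200, 200, 200],
]
findings = inspect(image, toy_detect, toy_classify)
for detection, is_anomalous in findings:
    print(detection.category, detection.box, is_anomalous)
```

Passing the detector and classifier as arguments keeps the pipeline independent of how each model is actually served.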

Evaluating the trained models online makes it possible to measure their accuracy.

model training

Model Deployment

At the end of the training phase, models are ready to be deployed and used in inference processes on new data.

AWS SageMaker allows creating inference jobs in two different modes:

  • batch, suitable for predictive analysis applications that do not have real-time requirements and/or that need to be applied to data blocks;
  • real-time, for applications that require real-time processing of individual data items. In this case, the inference job is published as a REST service that can be invoked as needed.
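
For the real-time mode, a client typically sends one image to the deployed endpoint and reads back the prediction. The sketch below assumes a client exposing `invoke_endpoint`, as the one returned by `boto3.client("sagemaker-runtime")` does. The endpoint name and the JSON response format are hypothetical, since the response body depends entirely on the deployed model; a stand-in client makes the example runnable offline.

```python
import io
import json


class FakeRuntimeClient:
    """Offline stand-in for a SageMaker runtime client (illustration only)."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        # A real endpoint returns whatever the deployed model emits;
        # this JSON shape is an assumption made for the example.
        payload = json.dumps({"anomaly": True, "score": 0.93}).encode("utf-8")
        return {"Body": io.BytesIO(payload)}


def classify_image(runtime_client, endpoint_name, image_bytes):
    """Send one image to a real-time endpoint and decode the JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/x-image",
        Body=image_bytes,
    )
    return json.loads(response["Body"].read())


# With AWS credentials configured, pass boto3.client("sagemaker-runtime").
result = classify_image(
    FakeRuntimeClient(),
    endpoint_name="anomaly-classifier-endpoint",  # hypothetical endpoint name
    image_bytes=b"\xff\xd8",                      # placeholder bytes
)
print(result)
```

Batch mode would instead process a whole set of stored objects in one job, with no endpoint left running between runs.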

The creation of machine learning models is a continuous process that involves deploying the model, monitoring inferences, and evaluating the model to identify performance drifts. To increase model accuracy, periodic retraining on new data becomes necessary.
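
One simple way to detect a performance drift is to compare the accuracy measured on recently verified predictions against the accuracy recorded at deployment time. The sketch below is a minimal version of that check; the function name and the 5-point tolerance are assumptions, and production monitoring would normally use more robust statistics.

```python
def needs_retraining(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Return True when recent accuracy falls more than `tolerance`
    below the accuracy measured when the model was deployed.

    `recent_outcomes` holds one boolean per verified prediction:
    True if the prediction was correct, False otherwise.
    """
    if not recent_outcomes:
        return False  # no evidence yet, keep the current model
    recent_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return baseline_accuracy - recent_accuracy > tolerance


# Hypothetical operator feedback on the last ten predictions.
stable = [True] * 9 + [False]
degraded = [True] * 7 + [False] * 3

print(needs_retraining(0.90, stable))    # prints False
print(needs_retraining(0.90, degraded))  # prints True
```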

Model Orchestration

The entire inference pipeline can be orchestrated through AWS Step Functions. AWS Step Functions is an orchestration service for creating and managing cloud workflows that combine various AWS services; it is based on state machines and tasks.

A state machine is simply a workflow, i.e., a series of event-based steps. Each step in a workflow is called a state.

A Task state represents a unit of work performed by another AWS service, such as AWS Lambda. Through service integrations, a Task state can call other AWS services and their APIs.
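
State machines are defined in the Amazon States Language, a JSON format. As a sketch, the snippet below builds a linear workflow of Task states that each invoke a Lambda function. The state names and Lambda function names are hypothetical; the actual pipeline described in this article may use different states and integrations.

```python
import json


def lambda_task(function_name, next_state=None):
    """Build a Task state that invokes a Lambda function."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::lambda:invoke",
        "Parameters": {"FunctionName": function_name, "Payload.$": "$"},
        "OutputPath": "$.Payload",  # pass the function result to the next state
    }
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


def build_definition(steps):
    """Chain (state_name, lambda_function_name) pairs into a state machine."""
    states = {}
    for index, (name, function_name) in enumerate(steps):
        is_last = index + 1 == len(steps)
        next_name = None if is_last else steps[index + 1][0]
        states[name] = lambda_task(function_name, next_name)
    return {
        "Comment": "Illustrative inference pipeline",
        "StartAt": steps[0][0],
        "States": states,
    }


definition = build_definition([
    ("PreprocessImages", "preprocess-images"),      # hypothetical functions
    ("DetectObjects", "detect-objects"),
    ("ClassifyAnomalies", "classify-anomalies"),
])
print(json.dumps(definition, indent=2))
```

The resulting JSON document is what would be supplied as the definition when creating the state machine.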

Through the Step Functions management and monitoring dashboard, you can track each workflow execution in real time and collect statistics upon completion.

model orchestration

Author: Achille Abritta