DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2021
Improving quality control in automation projects using simulation systems
VIKTOR LÖNNROTH
KTH SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE
Abstract
Automation systems are becoming more and more integrated in today's society. The complexity of these systems is increasing and, with it, the demand for high quality during the development phases. This poses a challenge for companies developing such systems. One solution emerging from this issue is the use of simulations and virtual commissioning. In this thesis, the process of quality control and the effective use of simulations in automation system development projects are studied. The focus lies on the software part of the systems. The study was conducted as an interview study of personnel at an automation systems development company. After conducting the interviews, the information was analyzed. This was combined with general theory regarding quality control and testing methods in software development. The test methods of the quality control process were then combined with what is required of a simulator in order to perform them. The result of the thesis is a detailed picture of the quality control process. Systems are tested twice with the same testing hierarchy, first during development and then during commissioning. The difference is that during development, the final physical hardware and process system are not connected to the software. This impacts the software system's functionality and limits its testability. Using simulations during development can reduce the disparity between the systems before and after deployment, improving quality. Considerations regarding the extent of simulation need to be made in order for the value of the simulator's error-removing potential to be higher than the cost of developing the simulator.
Keywords
Quality control, Industrial control systems, Virtual commissioning, Simulations, Automation projects
Sammanfattning
Automation systems are becoming more and more integrated in today's society. The complexity of the systems is increasing and, with it, the demands for high quality during development. This becomes a challenge for the companies that develop the systems. One emerging solution to this is the use of simulation and virtual commissioning. In this report, the process of quality control and the effective use of simulation in automation system development projects are studied. The focus lies on the software part of the systems. The study was conducted as an interview study of personnel from an automation development company. After the interviews, the material was analyzed and combined with general theories on quality control and testing in software development. The tests of the process were then combined with what is required of a simulator in order to perform them. The result of this work is, first of all, a detailed picture of the quality control process. It shows that the systems are tested twice with the same test hierarchy, first during development and then during commissioning. The difference is that during development, the real hardware and the process system are missing, which affects the functionality of the software and limits its testability. By using simulation during development, the difference between the systems before and after deployment can be reduced, which increases quality. Considerations must be made regarding the level of simulation needed, so that the value of the simulation's potential to remove errors becomes greater than the cost of developing the simulation.
Keywords
Quality control, Industrial control systems, Virtual commissioning, Simulation, Automation projects
Table of Contents
1 Introduction
1.1 Background
1.2 Problem
1.3 Benefits, Ethics and Sustainability
1.4 Method
1.5 Delimitations
1.6 Outline
2 Theoretical Background
2.1 Quality control theory
2.1.1 Verification and Validation
2.1.2 Testing
2.1.2.1 Static Testing
2.1.2.2 Dynamic testing
2.1.2.3 Acceptance testing
2.2 Automation systems
2.2.1 Automation system implementation
2.2.1.1 Hierarchical system
2.2.1.2 Control Software system implementation
2.2.2 Automation projects
2.2.2.1 Conceptual design
2.2.2.2 Functional design
2.2.2.3 Procurement and Engineering
2.2.2.4 Deployment
2.2.2.5 Commissioning
2.2.2.6 Operation
2.2.3 Simulators used to develop industrial control systems
2.2.4 Simulator implementation classification
2.2.4.1 Interconnection level
2.2.4.2 Model complexity
2.2.4.3 Hardware realization level
2.2.5 Simulator costs
2.2.5.1 Base costs for a simulation environment
2.2.5.2 Cost for developing simulation systems
3 Method
3.1 Data collection
3.2 Initial qualitative summary
3.2.1 Summarizing the interviews
3.2.2 Coding the data
3.2.3 Searching for and defining themes
3.3 Analysis Simulations, Testing and Quality Control
3.4 Cost and benefit analysis of the simulator
3.5 Thesis quality considerations
4 Interview results and initial analysis
4.1 Identified quality control elements
4.2 Errors
4.2.1 Error occurrence and removal
4.3 Summary of interview data
5 Analysis
5.1 Quality levels during the project lifecycle
5.2 Tests used
5.2.1 Static testing
5.2.2 Dynamic testing
5.2.2.1 Unit tests
5.2.2.2 Integration tests
5.2.2.3 System tests
6 Discussion
6.1 Advantage of simulations compared to test code
6.2 Potential of improved quality control
6.3 Choosing the correct model implementation level
6.4 Impact on error categories using simulations
6.5 Simulator benefit vs cost discussion
6.6 Applicability of thesis in the industry
7 Conclusions
7.1 Future Work
7.1.1 Other quality control methods
7.1.2 Developing standard cost models for simulation development
References
Appendix A Example of an interview guide
Appendix B Coding tables for errors and QC
Appendix C Mapping of the host company project and QC model
1 Introduction
Validating that different products or systems are working correctly is important in all engineering disciplines. The automation industry is no exception, and the more that is automated, the more important it becomes. Failures of these systems can today be very serious. In industries like infrastructure or energy, a failure could cause major damage to society [1]. Any errors that are made by the companies developing or designing the systems must be corrected before the systems are commissioned. Solutions like virtual commissioning using simulation tools are emerging in the industry [2]. This provides a challenge for the systems developers. Firstly, it is technically challenging to develop simulators that match and can be used to evaluate the automation systems. There is also a business challenge for the companies developing the systems. The engineering workflow and business solutions need to adapt to the new approaches in the industry, like virtual commissioning [3].
1.1 Background
Automation systems are any systems that can, with little or no human interaction, manage and control a product, process or system [4]. Automation is present in many areas like healthcare, buildings and security [5]. The origins of automation are in the manufacturing industry. This means that when the term automation system is used, it often refers to industrial automation systems or industrial control systems [6]. Automation is important in today's society [5]. As the systems grow in size and complexity [7], it is important that the quality stays at its highest. This puts more and more demands on the companies that are designing and constructing the systems. As systems become more automated, the responsibility for stable operation is moved from the human operators to the designers and programmers of the automation systems [5]. This means that the programmer needs more insight into the process that the system is controlling in order to know that it is working. With the introduction of the Internet of Things and Industry 4.0 [8] and the new interfaces between IT and OT [9], complexity and interconnections grow even further. All this combined means that companies may need to improve their quality control process in order to keep up with the changes and the increase in complexity that is happening [10]. Automation system projects are often part of larger projects, and issues causing time plans to fail can have dire consequences in both costs and reputation for the contracting companies. In large projects, the design and construction of the systems can often require several months of work, which then needs to be commissioned in a very short and inflexible timeframe [11]. Because of this, there is little time to fix errors during commissioning [12]. To evaluate that the system is implemented correctly, some form of quality control is used [13]. As the project moves forward, the system becomes more fully implemented and is quality controlled during this development. During the initial quality
control, it can be hard to verify that the system is behaving as intended, and later there is usually little time to make changes if problems are discovered. This has been a problem for automation contractors, causing extra costs and risks for them. One solution that is emerging in the industry is the use of simulation. Simulation can be used in all aspects of the engineering process [14], from the design of the process itself to the design of the mechanical and electrical systems. It is today possible to do a so-called virtual commissioning [12], where the system is tested and verified against a full digital twin of the system that is to be constructed. Despite the many benefits, simulations are not a perfect solution for solving the quality control problem in all projects. Constructing a full digital twin of a system can be very expensive and time consuming. The customer might not be willing to pay for the construction of a full-scale simulator. The contracting company needs to be able to find a balance between the cost of setting up and using a simulation and the benefits it gives in terms of getting the product verified at an early stage.
1.2 Problem
One problem faced by the industry today is that, as automation systems grow in size and complexity, the process of quality control becomes harder. Customers also have very tight time schedules, and typically there will be large consequences if they are not met. Failure to meet time schedules can delay production, causing great losses. Because of this, contractors risk large fines for causing delays in a project. One typical cause is errors introducing rework late in development. These things combined put more demands on the quality control. The methods used for quality control need to be as good as possible. This gives this project its two research questions:
RQ1: What are the main difficulties in the quality control process for contracting companies in the automation industry?
RQ2: For which quality control difficulties can simulations be used as an effective tool?
1.3 Benefits, Ethics and Sustainability
The main benefit of this thesis will be for automation companies. They will be able to improve their quality control methods and thus save money and time. Better quality control will also lower risk, since the companies can be assured that errors will be discovered. For the people working with implementing the systems, better quality control will improve their working conditions. Commissioning of systems is an intense process that requires a lot of work being done in short time frames. If errors are present at the time of commissioning, there is usually limited time to fix them.
This requires implementers to work overtime fixing them, which can be very stressful and unhealthy. Improved quality control and the possibility of virtual commissioning will allow systems to be more ready than before. In some industries, like energy and chemical plants, optimizing the process for environmental reasons is important. Traditionally, this must be done once the plant has been started up, and while the process is being optimized, unnecessary emissions can occur. Using virtual commissioning, this can be reduced. Better quality control also reduces the risk of plant failure during startup. Simulations have been used for a long time in the oil and gas industry and in nuclear energy, since failures there have such dire consequences. In industries like energy and chemicals, plant failure can also have catastrophic consequences for the environment. Improved quality control reduces the risk of this happening.
1.4 Method
The project conducted a literature review in order to get a better definition of the problem and a better understanding of the field. The workflow in this project was based on the DMAIC [15] method of quality improvement used in Six Sigma [16]. The project is not a pure Six Sigma or DMAIC project, but the task structure was based on the workflow in these methods. For the scope of the project, only the Define, Measure and Analyze parts of the method were used. The main method used in the project was data triangulation [17] with the help of semi-structured interviews [18] of host company personnel. The project interviewed several persons with the same and different roles in automation projects. The data gathered from these interviews were compared and summarized using triangulation, and this formed the basis of the conclusions.
1.5 Delimitations
The thesis mainly looks at automation projects from a developer's point of view. This thesis is a case study of one automation company, Midroc Automation, henceforth referred to as the host company. The thesis only investigates the problems of the quality control methods and suggests for which of these problems simulations can be useful. It does not propose how simulations should be implemented. The thesis does not investigate any other methods for solving quality control problems except simulations. The thesis focuses on the software development part of an automation project. The other parts, like mechanical and electrical design, are mentioned but not studied.
1.6 Outline
The thesis is organized in the following way. Chapter 2 describes the theoretical background for quality control, testing, automation systems and simulator implementation. Chapter 3 looks at the methods that were used in the thesis, mainly focusing on how the interviews were conducted and how their results were used in the analysis. In Chapter 4, the results of the interviews are presented. Chapter 5 analyses the different tests that are used in the quality control process, how simulations can be used in the tests and the cost of the simulator. In Chapter 6, the results and analysis are discussed. Finally, in Chapter 7, the conclusions are outlined and future work is discussed.
2 Theoretical Background
This chapter provides a theoretical background with regard to the two primary topics in this thesis. The first topic is quality control theory, and the material described in this study is based on the computer science literature. The second topic is automation systems.
2.1 Quality control theory
Quality control is typically defined in the ISO 9000 standard of quality management [19]. It involves determining whether a product or system meets the quality requirements of the customer. Quality control includes the concepts of verification, validation and testing. One or more of these can be used to fulfill the needs of quality control. The theory described in the sections below is primarily taken from the fields of software and computer science. However, no direct methods are prescribed, only concepts, which can also be applied to other engineering disciplines, such as automation. Automation also has a large software component in the engineering process [11]. The terminology and approaches used by different companies, disciplines and projects can differ markedly. Testing is typically a part of verification and validation, and there is no direct way to say that a test type belongs to verification or validation. This categorization depends on how the test is used and what requirements the test is conducted against. It is, however, good to know the correct terminology and what the different methods do, in order to describe what is achieved by different tests and where errors can and cannot be found.
2.1.1 Verification and Validation
Verification and validation are terms that are used to check if a system or component is working correctly. The terms are sometimes used loosely; however, there are strict definitions for computer systems according to the IEEE [20]. Verification is the process of determining whether a system or component satisfies the demands that were imposed at the start of the phase in which it was produced. Validation is the process of determining whether an implementation of a system or component meets the end requirements to which it is subjected. These requirements can be set by the same abstraction level or from the level above.
Figure 1. Example of a bottom-up verification and validation process. The system abstraction levels are verified both against the same level and the level above. All levels are validated against the user requirements.
The primary difference between verification and validation is that the verification process must be predetermined before the implementation is done. This is because verification involves checking how something is implemented, while validation only checks that the requirements of the system are met [21]. Verification is primarily used by the person or company implementing the system to ensure that the system is being built correctly. The validation process is of interest both to the implementer and to the end user, who wants to know that the completed system is working as intended [21]. For the implementer, verification can also be used to track how a project is going in nontechnical areas. If, for example, a system is broken down into smaller tasks, these tasks can be individually verified. A plan can be developed for when each task should be verified [22]. By ensuring that the tasks are verified in time, the project can be completed on time, which also allows other methods, such as earned value management, to be applied, making the project even more efficient [23]. When verifying a system during design and implementation, the implementer receives more feedback that the implementation is proceeding correctly. Finding and correcting errors during the verification process is also cheaper than during the validation process because verification occurs first [24].
2.1.2 Testing
Testing is defined by the IEEE [20] as an activity that checks a component or system under specific conditions and observes and compares the results to a given specification. Testing can be divided into several different types and methods. Examples of these are shown in Figure 2. Different companies can, however, have different interpretations and definitions.
Figure 2. Example of a test hierarchy in which different tests are categorized under different labels.
2.1.2.1 Static Testing
Static testing comprises forms of testing where the system being developed is checked without it operating [25]. Examples of static testing include parameter checking, code reviewing and static time analysis. Parameter checking involves reviewing the implemented system's parameters and comparing them to predetermined lists. Code reviewing [25] checks if the system's code follows a standard structure. Static time analysis takes the source code, analyses it and produces an estimate of how long it will take to run [26].
2.1.2.2 Dynamic testing
Dynamic testing involves running the system to evaluate whether the system is correctly implemented. Because the code must run, either the entire system must be completed, or test benches for units or subsystems must be used. Dynamic testing is performed either in a white-box or black-box way [27]. White-box testing is performed when knowledge of how the system is or should be implemented is available. White-box testing typically involves a coverage model [28] that systematically checks all parts of the system's structure. Test cases are determined by analyzing the system's structure. White-box testing can provide good results but can be demanding if the system is large or otherwise complex. Black-box testing is performed when there is little or no knowledge of the internal structure of the system and is typically focused on the function of the system. Validation is often a form of black-box testing because validation often only looks at user requirements and not the internal structure of the system [21]. Black-box testing can be necessary if the system is large and complex, or if obtaining good coverage in white-box testing requires too much time or effort. Dynamic testing is often performed at different levels, which are typically unit testing, integration testing and system testing [22]. This hierarchical testing strategy is often coupled with a design and development strategy, such as the V model [29]. In the V model, design is performed top-down, and testing is performed bottom-up. Because each testing level encompasses more of the system, the testing process becomes increasingly complex. Thus, testing tends to move from a white-box test during unit testing to a black-box test during system testing.
Performing the testing bottom-up means that the components that are tested first are tested individually in unit tests; the connections between units are then tested in integration tests; and finally, the entire system is tested in the system test. With a bottom-up testing approach, one advantage is that as low-level components are tested and verified or validated, they can be used for verifying and validating higher-level components. If an error is discovered during integration testing, it can be assumed that the error is in the connection between units. If the unit tests have yielded good results and were performed with care, the internal workings of those components are already verified, and the error must be in the connections. When the system tests are performed, the complete system is verified and ready to be deployed. An example of the test approach is shown in Figure 3.
Figure 3. Illustration of how a system consisting of interconnected units can be dynamically tested. Three unit tests, two integration tests and one system test are performed.
As shown in Figure 3, the higher the testing complexity is, the fewer tests need to be performed. This concept is known as the testing pyramid [30]. Testing using the testing pyramid approach can be cost effective because performing smaller, less complex tests takes less time. However, the testing pyramid relies on the assumption that once a test level is completed, no or few errors remain at that step in the hierarchy. For example, errors that could be found during integration testing are not looked for in system testing because they are assumed to have already been found.
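To make the unit and integration levels concrete, the sketch below shows how a unit test and an integration test could look for a small interlock function; the Python unittest framing, the function names and the pressure limit are illustrative assumptions and are not taken from any specific control platform or from the interview material.

```python
import unittest

# Illustrative units: an interlock that only allows a valve to open when
# pump pressure is available, and a valve command that applies the interlock.
def pressure_ok(pressure_bar, limit_bar=1.0):
    """Unit 1: returns True when the measured pressure exceeds the limit."""
    return pressure_bar > limit_bar

def valve_command(open_request, pressure_bar):
    """Unit 2: the valve may only open if requested AND the interlock allows it."""
    return open_request and pressure_ok(pressure_bar)

class UnitTests(unittest.TestCase):
    # Unit tests: each function is checked in isolation (white-box style).
    def test_pressure_ok_above_limit(self):
        self.assertTrue(pressure_ok(2.5))

    def test_pressure_ok_below_limit(self):
        self.assertFalse(pressure_ok(0.3))

class IntegrationTests(unittest.TestCase):
    # Integration tests: the connection between the two units is checked.
    # If the unit tests passed, a failure here points to the interconnection.
    def test_valve_blocked_without_pressure(self):
        self.assertFalse(valve_command(open_request=True, pressure_bar=0.0))

    def test_valve_opens_with_pressure(self):
        self.assertTrue(valve_command(open_request=True, pressure_bar=2.0))

if __name__ == "__main__":
    unittest.main()
```

The same bottom-up idea carries over to the higher levels: once these tests pass, a system test would exercise the interlock only as part of the complete, connected behavior.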
2.1.2.3 Acceptance testing
Acceptance testing is defined in IEEE 610 [12]. It requires that the system being developed meets certain requirements that are typically set by the customer, and it is typically some form of validation. Acceptance tests can be both static and dynamic depending on the user requirements, and they are often dynamic black-box tests because they are typically used for validation. Acceptance tests can have different levels. Internal acceptance tests (IAT) are set up and conducted by the developer and check that the system will meet the requirements at an early stage. After the IAT, factory acceptance tests (FAT) are typically performed by the developer while the system is being developed. The customer will likely set an agenda and approve the test results. However, it might not be possible to test all features of the system at this stage. The environment in which the system is developed might be different from the environment in which it will finally operate. The last test is typically a site acceptance test (SAT), where the system should be fully implemented at the customer's location. During the site acceptance test, the customer checks that everything is working according to the design specifications and approves the project. After the SAT, the customer takes control of the system. System warranties can still apply if a problem develops; however, the customer retains operational responsibility.
2.2 Automation systems
Industrial automation systems were first used in the manufacturing industry and have evolved into having a prominent role in the process industry and within public utilities and infrastructure. The primary role of automation systems is to make processes more efficient and cost-effective. There are also benefits in reducing the need for human labor in hostile and dangerous environments such as mining [31].
2.2.1 Automation system implementation
Automation systems function on the basic control principle of a controller influencing a plant. The controller obtains an input from a user corresponding to the desired output of the plant. To determine the current state of the plant, the controller uses some type of sensor, and the plant is affected by actuators to achieve the desired output. A simple schematic of this process is shown in Figure 4.
Figure 4. Basic model of an automation system consisting of three subsystems.
A full automation system is implemented using three interconnected subsystems. These systems are the plant process system, a control system and a collection of instrumentation systems [32]:
The plant process system is controlled and monitored by the remainder of the automation system. What this is varies depending on the type of system. For example, in manufacturing, this can be a line of robot cells that can perform different tasks, such as welding or clamping. In process industries, tanks and aggregates can mix different substances. The function of an automation system is to ensure that the process plant system functions correctly and produces acceptable output. The inputs and outputs for this system are physical signals including force, movement, temperature or pressure.
The control system decides what the process system should do, and control is performed either continuously or discretely. For example, in the manufacturing industry, it is common to use a discrete sequential interlocking approach to control a robot's actions. In the process industries, it is more common to use continuous systems with PID regulators. The control system consists of both hardware and software components; the software is run on some type of hardware platform, which varies between the different systems. It typically consists of processors and networks in several layers. More details about the control system are discussed in Section 2.2.1.2.
Instrumentation systems act as interfaces between the physical process plant and the control system. Instrumentation typically consists of components that monitor the process, sensors, and components that command the process, actuators. This is the so-called field layer that will be discussed in a later section.
2.2.1.1 Hierarchical system
A plant-wide control system consists of a hierarchical system. A common model is the one used in ISA 95/IEC 62264-3 [33]. An illustration of the standard is shown in Figure 5.
Figure 5. Representation of the ISA 95 hierarchical model of automation systems.
Level 0 is called the field level and includes sensors and actuators. Typical sensors measure temperature, pressure and position; however, in theory, anything that can be measured and converted into an electrical signal can be sensed. Actuation is performed by devices such as motors, pumps or valves.
Level 1 is called the control level; here, sensors and actuators are connected to some type of controller. The connections between controllers and sensors and actuators are made via fieldbus networks [34] and hard-wired connections. Controllers are typically programmable logic controllers (PLCs) that control a set of actuators and sensors that manage some sub-component of the plant. In manufacturing, this component is typically a cell that handles one part of the production chain.
Level 2 is called the supervising level, where several controllers are supervised and controlled to ensure that the different controllers in level 1 operate properly together. Technologies such as supervisory control and data acquisition (SCADA), human machine interfaces (HMI) or distributed control systems (DCS) are typically used in this level. All components are connected via some type of industrial network, such as industrial Ethernet. This level also includes the primary low-level human interactions.
Levels 3 and 4 are used for plant and enterprise planning and strategies. In level 3, issues such as scheduling, quality management and maintenance are managed. Level 4 addresses enterprise-level issues such as finance and logistics. These levels are not discussed in detail in this thesis because they are typically
not involved in directly controlling the plant in real time, which is the primary focus of this thesis.
2.2.1.2 Control Software system implementation
The nature of the software used in industrial control systems is based on its origins in purely electrical systems [35]. The earliest software used ladder programming, a form of graphical programming that was understandable for a person who knew how to read and write relay logic. This was useful because ladder programming replaced relay-based systems when software-based control systems were developed. Today, there is a standard for the most commonly used programming languages, IEC 61131-3:2013 [36]. This standard defines types of languages that all have different advantages and disadvantages. A mix of different programming languages can be used when developing software. For example, when using function block diagrams (FBDs) [36], the inner workings of the blocks can be programmed using ladder blocks or structured text, and sequences can be created using sequential function charts, even though the remainder of the program is made with FBDs.
In programming today, there is a drive to reuse code that has already been created and validated [37] to reduce implementation time and produce a more validated system. The customer or owner of the plant can also have standards that are used to make the system less dependent on a single contractor or integrator. With FBDs, code reuse occurs naturally but can also be achieved with other languages. This is often achieved with the help of the development environment used to program the software systems [38]. Types for different objects are created in the software platform, which also has tools to generate objects of various types efficiently. The system can be developed in several steps using a bottom-up method, which is supported by the programming languages [39, p. 1]. The terms used in this study are specific to the host company of this thesis:
Types or typicals. The first step is to develop types that can be used in programming. Types can be functional, like controllers, or can be related to objects (e.g., a type for a motor or sensor interface). Within the types, configurations and signal connections are set up so that each type behaves as desired. Each type can also be pre-validated by the customer.
Instances. Based on the types, instances are created. An instance is a realization of a type with a particular configuration. Instances are the building blocks of the system and can have physical counterparts, such as motors or sensors, or can be purely software objects, such as sequences or regulators. The most basic functions are performed within instances.
Process area's internal connections. A process area is a set of interconnected instances. Instances exchange data with each other, and these interconnections create more advanced functions. An example of this can be a continuous closed-loop control circuit with a sensor,
actuator and controller. Another can be a robot cell that performs different actions when it detects that it has received material.
Process area's external connections. Process areas can also be interconnected, creating even more advanced functions. Different parts of the system must interact with each other; for example, a production line of several robot cells must be coordinated because material can only move from one cell when the next is ready.
System. The final layer is the system, where everything is brought together. The system level can interact with other types of systems, such as a recipe handler. The system level also manages issues such as safety or networking, ensuring that the process can operate safely and securely.
A schematic view of an example system is shown in Figure 6. The same system is described as a piping and instrumentation diagram (P&ID) in Figure 7.
Figure 6. Example of an automation system consisting of interconnected instances. The three subsystems are represented with different shapes.
Figure 7. Piping and instrumentation diagram of the system shown in Figure 6.
In the example system, the software system is used to control the temperature after the hot side of the heat exchanger, using a PID regulator to control the flow to the cold side of the heat exchanger. The flow is actuated using a valve. A pump provides pressure for the cold water system. The software prevents the valve from opening if there is no pressure. The final component is a sequence that starts the pump and the regulator. Figures 6 and 7 do not show the connections to the remainder of the software and plant systems. In each of the software layers, there are also connections to the hardware system, the control system's hardware platform, the instrumentation system and the plant process system. Connections to these systems are made during the system deployment phase of the project. Thus, when a level is being developed, the underlying level will not be completely finished and can be subject to change. A schematic view of how the system is developed is shown in Figure 8.
Figure 8. Illustration of how a software system is developed during the different phases of an automation project.
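To make the layered terminology above concrete, the sketch below mirrors the example system in Figures 6 and 7 as plain Python; all class names, signal names and the simplified pump and valve behavior are illustrative assumptions and do not correspond to any particular IEC 61131-3 platform or to the host company's code.

```python
# Illustrative sketch of the bottom-up software structure described above,
# using made-up names; a real system would be built in an IEC 61131-3 tool.

class ValveType:
    """Type/typical: generic valve interface with an open/close command."""
    def __init__(self, name):
        self.name = name
        self.opened = False
    def command(self, open_request, interlock_ok):
        # The valve may only open when the interlock allows it.
        self.opened = open_request and interlock_ok

class PumpType:
    """Type/typical: generic pump with a running state and a pressure status."""
    def __init__(self, name):
        self.name = name
        self.running = False
        self.pressure_ok = False
    def start(self):
        self.running = True
        self.pressure_ok = True  # simplification: pressure follows the pump

class PidType:
    """Type/typical: placeholder regulator acting on a temperature error."""
    def __init__(self, name, setpoint, gain=1.0):
        self.name, self.setpoint, self.gain = name, setpoint, gain
    def output(self, measured_temperature):
        return self.gain * (measured_temperature - self.setpoint)

# Instances: realizations of the types with a particular configuration.
cooling_valve = ValveType("COOLING_VALVE_01")
cooling_pump = PumpType("COOLING_PUMP_01")
temperature_pid = PidType("TEMP_PID_01", setpoint=60.0)

# Process area (internal connections): the regulator output drives the valve,
# and the pump pressure acts as an interlock on the valve.
def process_area_scan(measured_temperature):
    open_request = temperature_pid.output(measured_temperature) > 0.0
    cooling_valve.command(open_request, interlock_ok=cooling_pump.pressure_ok)

# System level: a simple start sequence that starts the pump, then the control.
def start_sequence(measured_temperature):
    cooling_pump.start()
    process_area_scan(measured_temperature)

start_sequence(measured_temperature=75.0)
print(cooling_valve.opened)  # True: temperature above setpoint and pump running
```

The point of the sketch is only the structure: types are defined once, instances are configured realizations of those types, and the process-area and system levels add functionality by interconnecting instances.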
2.2.2 Automation projects
One possible lifecycle model for automation projects is shown in Figure 9, and the different phases are described in [40]. Automation projects are often part of larger multidisciplinary projects, where plans must be made in conjunction with other disciplines. This is particularly true during the design and deployment stages. The design must be created in such a way that all parts will function together. During deployment, everything must proceed in a given order (e.g., mechanics before instruments). This thesis primarily focuses on software development. In these sections, not only software development is considered, in order to provide a better picture of the extent of automation projects.
Figure 9. Example of the phases that can exist in an automation project (conceptual design, functional design, procurement and engineering, deployment, commissioning, operation). The phases and names can differ between projects, developers or customers.
2.2.2.1 Conceptual design
In this step, the basic requirements and concepts for the plant are designed; however, few technical details are decided in this step.
2.2.2.2 Functional design
In this step, the overall functional design and requirements are determined, and certain important documents are produced.
Process schematic: These documents schematically describe the process and how different objects are interconnected in the physical system.
Functional design description: This description details how different plant functions are to be achieved.
Object lists: These lists specify all objects in the system, including software, hardware or process objects.
Technical requirements: These requirements dictate how the system will operate and can be related to performance, safety requirements or applicable standards.
Today, modern plant engineering tools such as COMOS [41] can be used to manage all the data and documents generated during the functional design phase, which helps keep track of all objects and systems, and allows for better revision handling.
2.2.2.3 Procurement and Engineering
After the functional design is completed, all components required for the system can be obtained, which includes both process components and the components of the control systems, such as PLCs and network components.
The engineering of control systems often occurs in three stages:
System development involves designing and setting up the system architecture that will hold the control system. System development considers the requirements that were developed in the previous step and translates them to the system. This includes designing network layouts, allocating tasks to controllers and setting up the signal distribution.
Programming involves creating all functions of the system in code, which can apply to controllers, the SCADA system or graphics for the HMI. Programming also includes coding typical code (types) for objects.
Electrical design covers the design and creation of the electrical equipment required for the control system. Electrical drawings for the power and signal distribution circuits of the system are created in this step, and the cabinets and racks containing the equipment are built in workshops and then shipped to the project site.
System development is performed first because it is required for both electrical design and programming. When developing control systems, there is an incentive to create typical circuits for objects that can be reused. This is done in both programming and electrical design, and yields many advantages, such as less effort required for verification. This phase is concluded with an FAT, where the customer approves the engineering design.
2.2.2.4 Deployment
In this step, the system is assembled at the project site. Signal cabinets, controllers, networks and servers are set up, and deployment is linked to the remainder of the mechanical and civil engineering work. This means that the timeframe during which deployment must be performed is fixed, which increases the demand for the remaining steps to be performed on time.
2.2.2.5 Commissioning
Commissioning involves starting up the system to check that it is operating correctly. Before the system is started, it is first site-tested, which builds on the same hierarchical idea as the remainder of the automation systems. First, signals are tested, and then objects, functions and finally the system. An initial tuning of the system is also performed in this step. This phase can end with a SAT, where the customer approves the system's operation. After commissioning is complete, the system is handed over to the customer.
2.2.2.6 Operation
The customer is now responsible for the operation and maintenance of the system. Depending on the type of system, the system may be improved during
operation. For example, if the system is a production facility, new production lines might be added.
2.2.3 Simulators used to develop industrial control systems
There are commercially available simulators for control system development. Examples include the ABB 800xA Ability process simulator [42], Siemens SIMIT [43] and Emerson Mimic [44]. When looking at the manuals and reference documentation for the different simulators, their basic principles are all similar. Control system simulators that are available today consist of two primary parts. The first part is a link or coupling to the control system [45] that allows the simulator to connect to the control system and exchange signals. This linking is performed differently between platforms and systems. There are also different levels of hardware realization, installation or emulation, that can be used. A more detailed review of the different types of hardware realization is given in Section 2.2.4.3.
The other primary part of the system is the behavior model, which describes the behaviors of the different simulated systems and includes the hardware and the process plant systems described in Section 2.2. The model is programmed in a similar way to the automation system. Programming is performed with systems and APIs that resemble the IEC 61131-3:2013 standard of programming that is used in automation systems. Another similarity to programming the control system is the possibility of auto-generating code using templates or types in the simulation development platform and object lists.
The simulator is an interface between the hardware and process plant systems and the software system. The simulator is also a testbench that enables dynamic testing of the system at an earlier step in the development process [46]. When the hardware and process plant systems have not been implemented, the simulator must be constructed using the reference documents that describe the system. This enables so-called virtual commissioning [47] by having the hardware and process implemented in the simulated environment. In theory, the same tests that are going to be run during the real commissioning can now be run before the system is deployed.
There can also be other features available in the tools, such as animation and visualization options, which are commonly used in simulations of manufacturing processes where robots and machines can be simulated [48]. Another feature is the possibility of using scripting and automatic testing [44]. These features increase the usability of the simulations and allow the simulator to be used more effectively during testing or other activities, such as scenario exploration and operator training.
Today's tools for developing simulators are similar to those used to program the automation system. Previously, simulation tools were used only by experienced
engineers and required many hours of labor. Now, engineers that are used to programming control systems can more easily use simulation tools. This is an idea that has been used before in the automation industry: introducing new technology in a way that is recognized and accepted by current developers.
The simulator system can also be divided up in the way that was described for the other systems: instances and process areas. This indicates that simulator systems can be developed in a way that is similar to what was described in Section 2.2.1.2. In Figure 10, the same software system that was described in Figure 6 in Section 2.2.1.2 is shown; however, the components of the hardware and process plant systems have in this case been replaced with simulated components. All hardware components have been replaced individually; the same components that were in the real system exist in the simulated system. The process plant components have been abstracted into simpler components. Different model complexities and realizations are discussed in Section 2.2.4.2.
Figure 10. Example of one possible simulation implementation of the system that is shown in Figures 6 and 7. The process model can, for example, be a linear dependency.
2.2.4 Simulator implementation classification
A simulator can be developed to different sizes and levels of complexity. In this section, a classification of these different levels is established. It is also considered how the levels affect development costs. The levels can be described in different categories.
Three different categories were considered:
Interconnection level
Model complexity
Hardware realization level
These three categories can be applied to a simulation system or to a certain part of a simulation system, where different levels can exist in different parts.
2.2.4.1 Interconnection level
The simulator interconnection level is the level at which different objects are connected and interact with each other. The interconnection level of the simulator is measured in the same way as the development levels for the software, as described in Section 2.2.1.2. The names of the levels are also the same:
Type
Instance
Process area's internal connections
Process area's external connections
System
The definitions of the different levels are also the same as in Section 2.2.1.2; thus, they will not be repeated here. The implementation cost of increasing the interconnection level increases geometrically, which is supported by general theory about how costs grow in ICT development [49] and how the number of connections grows as a system becomes more interconnected [50].
2.2.4.2 Model complexity
Model complexity measures how complex the implementation of the parts that make up the model is. The parts can be different objects and connections that are in the process plant and hardware systems. In [47], four levels of complexity are mentioned; in the classification below, there are five levels. The difference is that the model-with-dynamics level in [47] is divided into two parts, one linear and one nonlinear, because the difference in effort and knowledge required to set up a linear or a nonlinear model is sufficient to warrant different complexity levels.
Only manual responses: Signals can only be manipulated by the person using the simulation. No automatic behavior is programmed into the model.
Automatic discrete responses: The simulator responds to outputs from the control system or to manual simulator inputs discretely. There can be time delays in the signal response and temporary continuous signals that stabilize to a fixed value after a fixed amount of time. If no new inputs are given to the simulator, the behavior is static and
unchanged. The simulator will always respond to a change in the same way.
Continuous linear responses: The response of the simulator is always continuous. The simulator behavior is dynamic and can change without a change in the control system output or the simulation user input. The instances, connections and individual subsystems all have linear models.
Continuous nonlinear responses: As in the above category, responses can vary but, in this case, nonlinearly. The nonlinear behavior can come either from direct programming or from multiple linear subsystems interacting in a way that becomes nonlinear.
Fully realistic: This category is similar to the above category; however, the models behave in the exact same way as they would in the real plant.
The cost of increasing complexity grows geometrically, with the same motivation as for the interconnection level in Section 2.2.4.1.
2.2.4.3 Hardware realization level
The hardware realization level determines how the simulator connects to the software system. In the development platforms for software systems, there is currently the option of testing code on a soft controller that is available in the control software development platform [51]. In the simulation platforms, there are also options to emulate controllers, devices and networks to build up a virtual hardware environment within which the system can run. These two options are known as software in the loop [12]. There is also the possibility to run real hardware coupled to both the simulation system and the software control system, which is known as hardware in the loop [12]. Within software and hardware in the loop, there are also variants and hybrids, and it is possible to have certain parts of the system at one realization level and other parts at a different realization level. For the analysis performed in this thesis, four different levels are considered.
Software running in the development platform. The simulation connects to the development platform. The software might be modified by the development platform in order to run, for example by setting simulation flags on signals and removing signal status checking.
Software running on an emulated controller in the simulator. The software is downloaded to an emulated controller that is running in the simulation environment. The software is not modified; however, all features might not behave the same due to the differences between the emulated controller and the real hardware.
Software running on a real controller, with hardware connections emulated in the simulation environment. The simulator emulates the other hardware objects, such as input/output cards or other devices, and connects to the controller over a network interface. The simulator sends soft signals to the controller.
Software running on a real controller, with real devices connected to the controller. The system is similar to the real system that will be
deployed on site. If the simulator is to be used, it must connect where the real sensors and actuators would connect, for example with electrical or physical signals.
2.2.5 Simulator costs
The total cost of the simulator will consist of three major parts: the fixed base cost for the simulation platform, a running engineering cost of setting up and installing the hardware system, and a running engineering cost of implementing the behavior models.
2.2.5.1 Base costs for a simulation environment
The base cost consists of two costs. The first is a licensing cost for using the simulation tool. Different vendors have different approaches to licensing costs. Certain vendors (e.g., Siemens) have a fixed cost for certain versions and sizes of the simulation software. There can also be additional costs for using additional packages (e.g., advanced premade models). The other base cost is a hardware investment. Required hardware can include additional servers, controllers, network interfaces or other hardware devices. The hardware cost will depend on the realization level that is to be used and on the interconnection level. The cost relating to the hardware realization level behaves like a step function. For each level of increase, more hardware is required, and the cost will increase. The increase per step might not be linear and will depend on the interconnection level of the simulator. As described in the following section, an increase in the interconnection level will geometrically increase the costs, which is also true for the hardware costs. For example, if different instances that are connected in the software are running on different controllers [52], this will likely occur. There will also be additional hardware costs for the networking devices that must connect the controllers together if the software is running on real controllers. If the software were to run in the software development system instead, the number of connections required in the development platform would decrease, and there would be no need for additional hardware.
2.2.5.2 Cost for developing simulation systems
There are two primary ongoing costs when developing a simulator: creating the behavior model, and setting up the hardware realization platform. The cost for these components accumulates during development as engineers work to develop the simulator. The final cost of the different parts depends on the size of the simulator system, the interconnection level of the simulator, the complexity of the model and the realization level of the hardware. Increasing complexity generally increases costs nonlinearly [49]. This increase can be geometric, such as the increase in edges when increasing the number of nodes in an interconnected system [50]. In Figure 10, an example system is shown that is the same system as shown in Section 2.2.3. In this example, the interconnection
level of the simulation system is internal, and the model complexity is continuous. The actuation object has linear dependencies on the sensing object. This way of modeling only requires two parameters for the first-order equation. In reality, the pressure from a pump comes from a more advanced nonlinear equation [53]. Using these more advanced functions requires more investigation time, more parameters that must be set correctly and more connections that must be considered. Stepping up the model level in this way will increase the resources needed in a nonlinear way. In Figure 6, the interconnection level is the process area's external connections or the system level. In the simulation example shown in Figure 10, the level is only the process area's internal connections. Increasing it to the external level would add a connection between the two linear dependencies, which would also add additional paths, for example between the pump and the temperature, that must be considered. This change increases the engineering effort geometrically, in the same way as increasing the objects and edges in a connected network.
Figure 11. Graph showing how costs can increase with different geometric growth depending on the chosen interconnection level and model complexity. The vertical axis shows the cost factor in times the base cost.
Figure 11 shows how the cost factor can behave when based on two geometric cost distributions:
cost factor = (interconnection level)^2 × (model complexity)^2
Figure 11 shows that if a complex model is used, then the cost of scaling the simulator at the interconnection level will be high. Conversely, for a low interconnection level, increasing complexity will not dramatically increase cost.
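As a worked illustration of this cost factor, the short sketch below evaluates the formula for all combinations of the five interconnection levels and the five model complexity levels, numbered 1 to 5; the numeric scale is only the relative factor produced by the formula, not a real project cost.

```python
# Minimal sketch: relative cost factor = (interconnection level)^2 * (model complexity)^2,
# with both categories numbered 1..5 as in Sections 2.2.4.1 and 2.2.4.2.
interconnection_levels = ["Types", "Instances", "Process area internal",
                          "Process area external", "System"]
model_complexities = ["Manual", "Discrete", "Continuous linear",
                      "Continuous nonlinear", "Fully realistic"]

def cost_factor(interconnection_level, model_complexity):
    return interconnection_level ** 2 * model_complexity ** 2

for i, ic_name in enumerate(interconnection_levels, start=1):
    row = [cost_factor(i, m) for m in range(1, len(model_complexities) + 1)]
    print(f"{ic_name:>22}: {row}")
# The factor ranges from 1 (Types, Manual) to 625 (System, Fully realistic),
# which matches the scale of a few hundred times the base cost in Figure 11.
```

The sketch makes the asymmetry in Figure 11 explicit: moving one step up in either category is cheap at the low end of the other category, but the same step multiplies an already large factor at the high end.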
3 Method
This thesis used an explorative, inductive method, where the focus was placed on the problems and inadequacies surrounding the quality control process in automation projects. A case study of the process at one automation company was conducted. The reasoning behind only looking at one company for data collection is that this allows the thesis to go into more depth rather than giving a more generalized picture [54]. The common thread running throughout the thesis has been how testing is used and can be improved. The goal of the thesis was to determine how simulations could be used to improve the current quality control process.
The first step of this research was to establish the theoretical background with regard to quality control in software, automation and simulations within automation. Using this information, interviews were prepared. The theoretical background was used to determine what was relevant to search for in the interviews and the following analysis, and was used to create the interview guides and coding tables. After the interviews, a qualitative summary was created using the coding tables. The reasoning behind this step was to obtain a better picture of the quality control process and how it is performed in practice. Following the summary, the material was analyzed with a focus on determining which tests were used and how they compared to the theory from the background. In this step, the simulations were also related to the tests that were used. In the final discussion, the simulations were brought together with a focus on what considerations need to be made for the simulations to provide value in testing without increasing costs to a point where they decrease the profitability of the projects. An overview of the approach, its steps and what was gained from each step is shown in Figure 12.
Figure 12. Overview of the method applied in this thesis and what was achieved in each step: literature review (quality control, automation, simulations); preparations (interview guide, coding tables); interviews (summaries); qualitative summary (QC process, errors, simulations); analysis (tests used, simulator requirements); and discussion (costs for simulations, value from simulations, cost vs. value).
3.1 Data collection
Primary data were collected via a set of semi-structured interviews. In total, 8 interviews were performed with the host company's employees, with the intention of describing how the quality control process works and what problems exist in the process. The other intention was to gather data about how simulations are used at the host company today and the personnel's view of them. Semi-structured interviews are guided by major questions and topics that are predetermined; however, the details and additional probing questions are not planned in advance [18]. Before the interviews, a guide was constructed; an example of the interview guide used for interview R8 is shown in Appendix A. The interviews were performed following the guide but with room for more in-depth discussion to include all important information. The questions were divided into three themes, each with three question types. The themes were errors in automation projects, the quality control process within the projects,
and the simulations. The three question types were general questions that were asked to everyone, role-specific questions and person-specific questions.
Interviewees were selected via purposive sampling [55]. A discussion was held between the company supervisor and the author of the thesis to find suitable persons for the interviews. To obtain several perspectives on the development process, we sought interviewees with a mix of different job positions, and three role types were chosen: business managers, project managers and systems developers. The second criterion was that the interviewees should have worked in projects where simulations were used. The third criterion was to obtain representation from several industrial sectors. Given the organizational structure of the company, this requirement meant that the interviewees should come from different departments, due to department specialization. Covering several industrial sectors and departments decreased the potential for bias from the interviews only describing the same or similar projects.
Table 1. Summary of interviewees
ID | Position | Role type | Date | Duration | Type
R1 | Deputy CEO and head of sales and projects | Business manager | 20.02.2020 | 60 min | Online conference
R2 | Project Manager | Project Manager | 20.02.2020 | 60 min | Online conference
R3 | Project Manager | Project Manager | 21.02.2020 | 50 min | Online conference
R4 | Systems developer | Systems developer | 25.02.2020 | 50 min | Online conference
R5 | Systems developer | Systems developer | 26.02.2020 | 30 min | Face to face
R6 | Systems developer | Systems developer | 04.03.2020 | 30 min | Face to face
R7 | Department and project manager | Business manager / Project manager | 05.03.2020 | 45 min | Face to face
R8 | Chief Technology Officer | Business manager | 05.03.2020 | 50 min | Face to face
3.2 Initial qualitative summary
The initial analysis was performed by coding the interviews to find three things: the elements in the quality control process, the errors that can occur in automation projects, and how simulations have been used previously at the company. The primary analysis method used was thematic analysis [23], which consists of finding themes that are present in the collected data. Themes are
patterns or meanings that share the same quality. In this case, a theme could be an element in the QC process or an error category. The analysis was performed in five steps: (1) summarizing the interviews; (2) coding the data; (3) searching for the themes and subthemes; (4) defining the themes and subthemes; and (5) reevaluating all the codes against the defined themes. The steps were not performed linearly; task execution sometimes overlapped (e.g., searching for themes and coding were performed in parallel).

Figure 11. Overview of the analysis steps of the interview material: summarizing, coding, searching for themes, defining themes, reevaluation of codes.

3.2.1 Summarizing the interviews

The interviews were not transcribed word for word but were summarized in a block of text. These summaries were then sent to the interviewed person to check that the summary represented what the person had said and that no misinterpretations had been made. The interviews were summarized in this way so that written material was available for the following analysis steps and so that it was possible to reflect on what had been said during the interview.

3.2.2 Coding the data

Coding was performed using two coding tables. Sentences and quotes that described errors or QC elements in the summaries were extracted and put into the tables, creating codes. The coding tables were constructed to include features of the QC elements and errors that could be used to categorize them into themes. Features of QC elements could include which project phase the code belongs to, which part of the system was tested, or who conducted and observed the QC element. For the errors, the features could be where the error occurs and is removed, or who or what had caused the error. The first version of the coding tables was created before the interviews started. During coding, the tables were modified inductively to match the data from the interviews. Features that were not initially considered were added during coding, and details that turned out not to be interesting for the analysis were removed. The final layout of the coding tables is shown in Appendix B.

3.2.3 Searching for and defining themes

As the coding was performed, each code (a quality control element or an error) was given a name. If the code was similar to another code, the same name was used; otherwise, a new name was created. Categories, or themes, were created by grouping names with similar coding-table features together (e.g., if names in the error table occurred for similar reasons).
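To make the coding and theme-searching steps concrete, the short Python sketch below groups a few hypothetical codes by shared coding-table features. The Code structure, the code names and the feature values are illustrative assumptions and do not reproduce the actual coding tables in Appendix B.

```python
from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Code:
    """One coded statement extracted from an interview summary."""
    name: str        # name given to the code; reused for similar codes
    interview: str   # which interview the statement came from (R1-R8)
    phase: str       # coding-table feature: project phase the code belongs to
    caused_by: str   # coding-table feature: who or what caused the error

# Hypothetical codes from the error coding table; the values are illustrative
# examples inspired by the interview summaries, not the real table rows.
codes = [
    Code("forgotten signal inversion", "R5", "engineering", "developer"),
    Code("incorrect syntax", "R8", "engineering", "developer"),
    Code("wrong functional description", "R7", "functional design", "customer documentation"),
]

# Step (3), searching for themes: group codes whose coding-table features match.
candidate_themes = defaultdict(list)
for code in codes:
    candidate_themes[(code.phase, code.caused_by)].append(code.name)

for features, names in candidate_themes.items():
    print(f"Candidate theme {features}: {names}")
```

In this toy example, the two developer-related engineering codes end up in the same candidate theme, which mirrors how similar interview statements were grouped into error categories such as careless mistakes.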
When all themes were created, definitions were written for each theme based on what was said in the interviews and the common elements of the codes. When the definitions were created, the initial categorizations of certain codes had to be changed because those codes fit better into another category than initially thought.

3.3 Analysis: Simulations, Testing and Quality Control

Potential simulator implementation strategies were developed using different sources of data. First, a literature review related to the use of simulators and virtual commissioning in the automation industry was conducted. Next, reference documentation and manuals for currently available system simulators were studied. The simulator platforms investigated were the ABB 800xA Ability process simulator [42], Siemens SIMIT [43] and Emerson Mimic [44]. These sources were combined with what was described in the interviews about how simulations are used in practice today at the host company.

The focus of this thesis was to determine how simulations can be used effectively in the current quality control (QC) process in automation projects. A theoretical and practical connection was made between the QC process and the capability of the simulator, and the element used to connect the QC process and the simulations was the type of test performed in the current QC process. This connection was selected because simulations are used as a tool during testing [56].

The analysis was performed in two steps. First, qualitative descriptions of the test types found in the material were established by relating what was said in the interviews to the theoretical background on different types of tests. The terminology for the tests used in the analysis comes from definitions used in computer science [20]; the analysis connected that terminology to the information from the interviews. The second step was to establish what is required from a simulator to perform a test type used in the current QC process. The requirements were explored using the implementation classification defined in Section 2.2.4.

3.4 Cost and benefit analysis of the simulator

To determine the effectiveness of using simulations, a cost-benefit approach based on qualitative reasoning was used [57]. Because simulations have not been used extensively in host company projects, there were limited data on the real costs and values of simulations. Thus, the analysis was made using qualitative reasoning only, and the result is an indication of the basic principles of how costs and benefits can be expected to behave when using simulations in automation projects.
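As a rough illustration of the kind of qualitative cost-versus-value comparison made here, the sketch below contrasts a simulator's expected error-removal value with its development cost. The function name, the linear cost model and all numbers are hypothetical placeholders rather than data from the host company; they only show the form of the trade-off.

```python
def simulator_net_value(fixed_cost, cost_per_signal, n_signals,
                        expected_error_cost_avoided):
    """Net value of a simulator under a very simplified, hypothetical cost model.

    fixed_cost: cost carried by every simulator (licenses, base setup), assumed.
    cost_per_signal * n_signals: ongoing development cost that grows with the
        extent of the simulation, assumed linear for illustration.
    expected_error_cost_avoided: estimated value of errors found before
        commissioning instead of after, assumed.
    """
    total_cost = fixed_cost + cost_per_signal * n_signals
    return expected_error_cost_avoided - total_cost

# Illustrative numbers only: the simulator is worthwhile when the value of the
# errors it removes early exceeds the cost of building and maintaining it.
print(simulator_net_value(fixed_cost=50_000, cost_per_signal=200,
                          n_signals=400, expected_error_cost_avoided=180_000))
```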
To determine cost, an investigation into the creation of a simulator was made, including what would drive cost in this process. Both fixed costs required for all simulators and ongoing costs during development were considered. Data came from looking at different options from manufacturers and from considering how development costs grow in interconnected systems.

The benefit analysis was also performed qualitatively. The difference in value between simulators primarily comes from how many different tests a simulator can be used to perform and the error-finding potential of these tests. For simulations to provide value, they must have a higher error-finding potential than the methods currently used in the QC process. When investigating the error-finding potential, the errors that were identified in the interviews were used.

3.5 Thesis quality considerations

In qualitative studies, four measurements of quality must be considered: credibility, confirmability, transferability and dependability [58].

Credibility
Credibility is a measure of how well the results describe reality. To increase credibility, the triangulation methods used in [17] were applied in this research. Several job positions at the host company were investigated in order to obtain several perspectives on development. The interviewees came from different departments of the company and had different backgrounds, so that their experiences would not come from the same projects and situations.

Confirmability
Confirmability describes how objective the authors of a document are. The interviews were not transcribed word for word but only summarized, which reduces the confirmability of this research. This process was used because transcription is time-consuming, and the reduction in confirmability was considered justified to save time. To minimize the reduction in confirmability, the summaries were written immediately after the interviews using an audio recording of the interview. Each summary was then sent for confirmation to the person who had been interviewed.

Transferability
To have good transferability, results and conclusions must be applicable in other contexts. Because this thesis was a case study of only one company, the first layer of transferability is whether the results can be applied to another, similar company. The interviews were performed with people who had relevant experience from other companies, and the simulators studied were not limited to the one used at the company. These facts increase the transferability of this research to other companies. Another layer of transferability is whether the results can be applied to fields other than automation system development. The general theory used in this research is based on quality control from computer science and is not solely based on
literature from the automation industry. This fact increases the transferability of the thesis to fields in computer science and software development.

Dependability
The dependability of a study is given by how well the results can be replicated. The methods and approach of this thesis are described openly, including the summaries that were made, the interview guides and the coding tables that were used.
4 Interview results and initial analysis

In this chapter, the data collection and analysis are described. The primary source of data was the interviews, and this chapter describes and summarizes their results. As described in Section 3.2, the primary tool for analyzing the interviews was the code tables, which consist of relevant elements from the interviews. Two code tables were created: one for quality control elements and one for errors in automation projects. The code tables were tools used to work with the data in a structured way. This chapter summarizes the results and performs an initial analysis. The full code tables are shown in Appendix B. This chapter focuses primarily on the first research question, RQ1, shown in Section 1.2.

4.1 Identified quality control elements

The theory behind quality control (QC) is described in Section 2.1. In this analysis, a QC element is defined as a project step or activity that fulfills the theory set up in Section 2.1. QC elements become unique and separated by looking at the project phase, the part of the automation system that is checked and who is checking it. The project phases reference the background theory described in Section 2.2.2, and the part of the system tested references Section 2.2.1. An overview of the elements found is shown in Table 2.
Table 2 Elements gathered from the quality control coding table

QC element | Phase | System part tested | Test performer | Mentioned in interview
Requirement validation / Functional design verification | Concept/functional/system documentation | Entire system | Developer | R8, R7
Self-check | Engineering | Soft full system, bottom up | Developer | R3, R5, R8, R4, R7
Internal Acceptance test (IAT) | Verification and validation | Soft full system, bottom up | Internal resource that is not the developer | R3, R6, R8, R4, R5, R7
Factory Acceptance test (FAT) | Verification and validation | Soft full system, bottom up/focus on system level | Customer | R2, R3, R6, R7, R4, R5
Signal check | Commissioning | Signals and instances | Commissioning engineer | R2, R5, R7
Manual operation | Commissioning | Instances and small process areas | Commissioning engineer/customer | R2, R5
Sub system test | Commissioning | Process areas | Commissioning engineer/customer | R5, R2
Full system test | Commissioning | Full hard system | Commissioning engineer/customer | R2, R5
Site Acceptance test (SAT) | Commissioning | Full hard system | Customer | R3, R7
Operational debugging | Operations | Full hard system | Developer/Commissioning engineer/customer | R4, R8

The first QC element was an overview and check of the reference documents that were used in the projects. This element was described in R7 and R8. They
describe that before the engineering work begins, the documentation that is going to be used as a reference must be checked. This documentation is often supplied by the customer, and the automation company checks that the designed solution can be implemented in an automation system.

Quality control during and in connection with the engineering, verification and validation phases is described in all interviews except R2. The elements that are described are a self-check and an internal acceptance test (IAT). As shown in Section 2.2, automation systems are implemented using a bottom-up approach. According to the interviews, this is also the case for the QC. For example, after an instance or process area is developed, it must be checked and, if possible, tested. The distinction between the self-check and the IAT was described in some but not all interviews; the difference lies in who performs the QC and when it is performed. R3 and R8 described the self-check and IAT as separate elements, where the developer performs the self-check and a separate person performs the IAT. R5 only mentioned the self-check performed by the developer, and R6 only mentioned the IAT performed by another person. R4 and R7 both discussed internally checking and testing the system but did not make the distinction between self-check and IAT.

After development but before deployment, there is a factory acceptance test (FAT), described as the customer coming to the developer to check that the system works as intended and meets the requirements. This test was described in all interviews except R8. Similar to the self-check and IAT, the FAT is described as including both checking and testing the system. The FAT was described as differing depending on the project or customer: it can be a repeat of the IAT with the customer present or a continuation that focuses on the higher system level and its functions.

Several elements related to commissioning were described in the interviews. The first was a signal check, described by R2, R5 and R7 as a test to verify that electrical signals correctly translate into software signals. After the signal check is performed, the objects or functions are operated manually. R2 and R5 both described this activity; R5 distinguished between objects and subsystems, while R2 discussed manual operation of functions. This process is followed by some form of full system test. R2 and R5 talked about a test that examines the full system, and R3 and R7 mentioned the site acceptance test (SAT), in which the system is tested against the customer requirements. The final element mentioned in the interviews was some form of debugging during operations; R8 and R4 mentioned debugging when talking about how simulations can be used.

During the interviews, several interviewees implied that the system is checked and tested in a bottom-up way. R2, R4 and R5 talked about looking at blocks and the connections between them. R3 compared these blocks to building with Lego, starting with small components and connecting them together to form larger ones. R8 included the concept of types, instances and process areas, and how these have their own self-checks and IATs.
The QC elements found in the codes are summarized in Table 3, along with the name of the QC element, a description of the element, the project phase in which it is performed and the test types used. An illustration of the process used in this thesis is shown in Appendix C.

Table 3 Definitions of the quality control elements mentioned in the interviews

QC element | Description | Test types used
Requirement validation/functional verification | Static checking of reference documents for viability | Static document review
Self-check | Code checked and tested by the person developing it | Static code review/dynamic unit, integration or system test
Internal Acceptance test (IAT) | Code checked and tested by another employee | Static code review/dynamic unit, integration or system test
Factory Acceptance test (FAT) | Code checked and tested by the customer | Static code review/dynamic unit, integration or system test
Signal check | Signals are tested from the process plant or hardware system to the software system | Dynamic unit test
Manual operation | Objects are tested individually to see that they work on their own | Dynamic unit/integration tests
Sub system test | Subsystems are tested individually | Dynamic integration/system test
Full system test | The system is tested in full operation | Dynamic system test
Site Acceptance test (SAT) | The customer approves that the system works according to specification | Dynamic system test
Operational debugging | The system is in operation and errors are discovered | Dynamic system test
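The connection between QC elements and test types in Table 3 can also be expressed as a simple lookup that indicates which QC elements a simulator with a given testing capability could support, in the spirit of the analysis described in Section 3.3. The sketch below restates the table as a mapping; the query function and the example capability set are illustrative assumptions, not part of the analysis method itself.

```python
# Test types used by each QC element, restated from Table 3.
qc_element_tests = {
    "Requirement validation/functional verification": {"static document review"},
    "Self-check": {"static code review", "dynamic unit test",
                   "dynamic integration test", "dynamic system test"},
    "Internal Acceptance test (IAT)": {"static code review", "dynamic unit test",
                                       "dynamic integration test", "dynamic system test"},
    "Factory Acceptance test (FAT)": {"static code review", "dynamic unit test",
                                      "dynamic integration test", "dynamic system test"},
    "Signal check": {"dynamic unit test"},
    "Manual operation": {"dynamic unit test", "dynamic integration test"},
    "Sub system test": {"dynamic integration test", "dynamic system test"},
    "Full system test": {"dynamic system test"},
    "Site Acceptance test (SAT)": {"dynamic system test"},
    "Operational debugging": {"dynamic system test"},
}

def elements_supported(simulator_test_types):
    """Return QC elements whose dynamic tests are all covered by the simulator.

    simulator_test_types is a hypothetical capability description; static
    reviews are assumed to be performed without a simulator.
    """
    covered = []
    for element, tests in qc_element_tests.items():
        dynamic = {t for t in tests if t.startswith("dynamic")}
        if dynamic and dynamic <= simulator_test_types:
            covered.append(element)
    return covered

# Example: a simulator that can only drive unit- and integration-level tests
# would cover the signal check and manual operation elements.
print(elements_supported({"dynamic unit test", "dynamic integration test"}))
```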
4.2 Errors

In the interviews, questions were asked about what errors can occur in automation projects. In total, 39 descriptions of errors were given in the interviews, and 27 were considered unique and were thus given names. These 27 errors were then categorized into six categories.

The category mentioned in all interviews was miscommunication or misunderstanding about functions. R2 and R6 both mentioned that the system or functions can be misinterpreted at the start of a project, moving the analysis in an incorrect direction. Several interviewees, including R3 and R6, mentioned communication between the developer and the customer as a source of errors. In general, all interviewees described some form of functional misunderstanding.

The second most commonly mentioned error was errors in the reference documents that are used to develop the system. R6, R7 and R8 reported that the documents that define the functional descriptions can contain mistakes and thus lead to a system design that cannot meet the requirements. R4, R6, R7 and R8 also indicated that the final process might not be the same as the system design. R2 said that the control system might need to fix mechanical problems, and R4 mentioned an example of material bouncing in front of a sensor and ruining synchronization. These interviewees all reported that the physical system might not match what was described in the documentation.

The third most commonly mentioned error was careless mistakes made by the developers. These were mentioned in R2, R5, R6, R7 and R8, and generally referred to smaller mistakes. For example, R7 mentioned that careless mistakes can often occur due to an unstructured way of working and things being forgotten. R5, R6 and R8 described these errors as writing errors, incorrect syntax or a forgotten inversion of a signal.

The fourth most commonly mentioned error was some type of hardware issue. R2, R5 and R6 all reported that a hardware platform can be implemented or designed incorrectly so that it does not match what is developed in the software system. R6 mentioned that this type of error can include a lack of inputs and outputs or insufficient processing power.

The fifth most commonly mentioned error was some type of signal error. This type of error can be hardware-related (e.g., a wrong connection) and was mentioned by R5, R6 and R7. It can also be software-related; R5 and R6 talked about scaling, and R7 and R2 mentioned alarm handling and signal monitoring.

The sixth and least commonly mentioned error was some type of compatibility issue. R3 reported that a major issue can be communication and versions between systems. R3 referenced interfaces between systems as a keyword in
development. R2 and R7 stated that versions of systems can be a problem, and R7 mentioned projects that used development platforms that did not match what was on site, which created problems during commissioning.

Table 4 Error categories and descriptions from the interviewees

Error category | Description
Error in reference documentation | An error is made when developing the reference documentation
Miscommunication or misunderstanding about functions | The reference documents are misunderstood by the developer, who then makes an error
Careless mistake | The developer makes a mistake out of carelessness
Hardware issues | The hardware system does not behave as expected or is not set up as expected
Compatibility issues | The intended components, SW or HW, are not compatible with each other
Signal errors | A signal is not received correctly at its destination

4.2.1 Error occurrence and removal

From the data gathered during the interviews, a timeline of when errors are introduced and removed was created. The analysis of the interview results showed that errors from the same category could be added in different phases and could also be removed in several different phases. A summary of these results is shown in Figure 13.
Figure 13. Possible error lifetimes: diagram of where errors from different categories can be added and removed, according to the interviewees.

Four out of six error types can survive throughout the entire project lifecycle. If the distribution of the addition and removal of errors is proportional to the number of possible additions and removals, a cumulative error count can be calculated, as shown in Figure 14.
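The reasoning behind this cumulative count can be illustrated with a short calculation. In the sketch below, the addition and removal phases per error category are illustrative placeholders rather than the exact data behind Figure 13, and spreading each category evenly over its possible phases is the stated simplifying assumption, not measured data.

```python
# Project phases roughly following Section 2.2.2.
phases = ["concept", "functional design", "engineering",
          "verification and validation", "commissioning", "operations"]

# For each error category: phases where it can be added and where it can be
# removed. These sets are placeholders, not the data from Figure 13.
lifetimes = {
    "careless mistake": (["engineering"],
                         ["engineering", "verification and validation", "commissioning"]),
    "error in reference documents": (["concept", "functional design"],
                                     ["functional design", "engineering", "commissioning"]),
}

# Assume each category contributes one unit of error, added evenly over its
# possible addition phases and removed evenly over its removal phases.
cumulative, count = [], 0.0
for phase in phases:
    for added, removed in lifetimes.values():
        if phase in added:
            count += 1 / len(added)
        if phase in removed:
            count -= 1 / len(removed)
    cumulative.append((phase, round(count, 2)))

print(cumulative)  # peaks during engineering for these placeholder lifetimes
```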
Figure 14. Possible distribution of errors: graphs showing where errors can be added and removed, and the possible number of errors during each phase, for the categories signal errors, hardware issues, compatibility issues, misunderstanding or miscommunication, errors in reference documents and careless mistakes.

Figure 14 shows that the largest number of cumulative errors occurs during the engineering phase, while the largest number of error removals occurs during the commissioning phase. All types of errors can survive to the commissioning phase. These numbers are only distributed according to what was deemed possible in the interviews; they are not based on any real distribution of added errors or on the actual effectiveness of the QC process.

4.3 Summary of interview data

Interviews were performed with people from different departments at the company. There were no major contradictions between the interviewees that required further consideration during the analysis. The differences that did exist were primarily related to the interviewees' perspectives and not to facts. A general difference was that business and project managers focused more on a large-scale perspective, while systems developers focused more on technical details. When discussing the quality control process, the interviewees focused on different things. Some, primarily developers, focused on what was actually performed during the tests, while others, primarily managers, focused more on what was expected to be found during the tests. There was also a focus on how and when simulations could be used.

The error types were spread out among the interviews. No interviewee described all errors, and no two interviewees described exactly the same set of errors. Each error category was mentioned at least three times. All errors were described by all roles except compatibility issues, which were only described by project managers. The errors were described in a similar way by all interviewees.
5 Analysis

This chapter analyzes the data from the initial qualitative summaries of the interview material and connects what was found in the interviews to the general theory about testing and simulator implementation. Continuing from Chapter 4, this chapter answers RQ1. In Chapter 4, only interview data were considered; here, the data are combined with the general theory from Chapter 2, which yields a more in-depth and applicable answer to RQ1. This chapter also starts to answer RQ2 from Section 1.2; the analytical methods used combine the knowledge from RQ1 with the simulator requirements.

5.1 Quality levels during the project lifecycle

Four levels of quality were identified using data from the interviews and the background information. The term object in the descriptions can refer to any system level: an instance, a process area, etc. Defining these quality levels highlights that the same tests and QC elements are performed twice:

Implemented. An object is developed by the engineer and is ready for testing.

Soft verified. An object has been tested and verified using the reference documents and the underlying verified system level. The verification is done by the developing engineer or a dedicated tester.

Soft validated. The object is validated against the reference documents and the overlying (if available) and underlying verified system objects. The validation is done against the user requirements.

Hard validated. The objects are tested and validated when deployed in the real hardware and process plant systems. The objects are verified against the implemented system and the user requirements.

A graph representing how these levels grow during the project phases is shown in Figure 15.
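A minimal sketch of how an object's quality level could be tracked against the QC elements is given below. Treating each QC element as a single level transition is a simplification made for illustration, based on Table 3 and the level definitions above; it is not how the process is formally defined at the host company.

```python
from enum import IntEnum

class QualityLevel(IntEnum):
    IMPLEMENTED = 1     # developed and ready for testing
    SOFT_VERIFIED = 2   # verified against reference documents before deployment
    SOFT_VALIDATED = 3  # validated against user requirements before deployment
    HARD_VALIDATED = 4  # validated on the real hardware and process plant

# Simplified, assumed view of which QC elements raise an object to which level.
transitions = {
    "Self-check": QualityLevel.SOFT_VERIFIED,
    "Internal Acceptance test (IAT)": QualityLevel.SOFT_VERIFIED,
    "Factory Acceptance test (FAT)": QualityLevel.SOFT_VALIDATED,
    "Site Acceptance test (SAT)": QualityLevel.HARD_VALIDATED,
}

def apply_qc_element(current: QualityLevel, element: str) -> QualityLevel:
    """Raise the quality level if the passed QC element targets a higher level."""
    return max(current, transitions.get(element, current))

level = QualityLevel.IMPLEMENTED
for element in ["Self-check", "Internal Acceptance test (IAT)",
                "Factory Acceptance test (FAT)", "Site Acceptance test (SAT)"]:
    level = apply_qc_element(level, element)
    print(element, "->", level.name)
```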
Figure 15. The progression of the system quality level during the project lifecycle.

During hard validation, there are specific QC elements that represent the testing and validation of the different system development levels. During soft verification/validation, no specific QC elements are named for the different levels; however, the interviews indicate that the levels are still checked and tested in practice. Thus, the system must be tested twice, which is supported by the fact that the process plant, the hardware system and the components added in the deployment phase are connected to all system levels.

In traditional software development, the deployment or release of software [59] does not affect the functionality of the system. Issues can be found; however, the inner workings of the code are the same. Issues can, for example, occur on the system level if the new software is a subsystem that integrates with a supporting system [60]. In that case, there is no need to redo the unit and integration testing; the focus can instead be on system-level testing [61]. In industrial control systems, this process is not feasible. During commissioning of the entire system, the software and the other aspects of the system must be considered together. One of the interviewees said that it is common that software is used to fix mechanical problems, which indicates that even if the software has been tested and no bugs or errors are present, the system might still malfunction during commissioning.

5.2 Tests used

The tests that are used can first be divided into the two primary categories described in Section 2.1.2: static and dynamic testing. In the next two sections, the tests are described in a way that relates to how they are used at the host company. The terms used here were not specifically mentioned by the interviewees; however, what was described matched the terms. Static and dynamic testing, for example, are terms used in