The ARW provides a venue for individuals from accelerator communities worldwide to meet and share their experiences of operating reliable facilities. The workshop fulfills the need to improve information exchange on technical issues and equipment reliability. It gives individuals the opportunity to share their problems and solutions with peers from other facilities worldwide.
Local Organizing Committee and International Organizing Committee welcome and opening remarks.
CERN / SNS / BNL / ALBA / JLab / TRIUMF
Reliable operation of accelerators requires reliable infrastructure support systems. This session explores the interface of these systems with accelerator systems, as well as the issues and optimization associated with them. Areas of interest include interfaces between cryogenics and plant operation, cooling and ventilation, electrical infrastructure and power quality, diesel and UPS backup power systems, and power converters. Maintenance, disruption or malfunction of the utilities can have a large impact on the accelerators in terms of downtime, reliability, and performance if no end-to-end approach is taken. In this session, feedback is welcome, as are means and strategies to make the accelerator components more resilient to support system disruptions (redundancy, backup systems, harmonic filtering, UPS, training, etc.) and to shorten recovery time.
The Spallation Neutron Source (SNS) started operations in 2007 after extensive equipment commissioning. Electrical Power Conversion (EPC) equipment was originally specified or designed by different national laboratories and manufactured by various vendors. The equipment includes Low Energy Beam Transport (LEBT) chopper pulsers, High Voltage Converter Modulators (HVCMs), DC magnet power supplies, and injection and extraction kicker power supplies. Miscellaneous issues caused lengthy downtime in the beginning, but these issues have been resolved through many prioritized upgrades and improvements. EPC equipment availability reached 98% in 2011 and has remained high. This presentation will discuss the past operation history and the current equipment status. Ongoing and future upgrades will also be discussed.
CERN operates and maintains several large cryogenic systems with the complex architecture necessary for the operation of the Large Hadron Collider (LHC). Continuous operation of the accelerator during Run 2, together with the implementation of availability calculation tools and the use of a fault tracking system on cryogenics equipment, has allowed the CERN cryogenics team to gain valuable experience and data to drive maintenance and consolidation. Major maintenance tasks and a comprehensive consolidation program were planned for Long Shutdown 2 based on a data-driven approach. In this talk we will review the methodology, approach and main issues addressed to improve the reliability of the cryogenic infrastructure, with the aim of keeping the cryogenic system availability for the accelerator complex above 98% in the coming years. We will report on the initial operation results achieved and will also give some perspective on the new timeline for cryogenics operation during the extended Run 3, including the impact of energy preservation scenarios.
TRIUMF is heavily reliant on supplies of liquid helium and liquid nitrogen to maintain operations. The cyclotron uses bulk LN2 and a Linde 1630 helium refrigerator to support the vacuum system. There are also two superconducting linacs at TRIUMF. Both use LN2 to cool the necessary 80 K thermal shields. Three helium refrigeration cold boxes supply LHe for the cryomodules. The ongoing helium shortage, the Covid-19 pandemic and natural disasters have created restrictions and interruptions to the cryogenic fluid supply that have impacted operations. This presentation will discuss the impact, our response and possible future plans to mitigate the impact of these issues on lab operations.
All accelerators have reliability expectations. A data driven approach to maintenance has become a necessity as new accelerators increase in size and complexity. This can become more complex over the lifecycle of accelerator facilities when upgrades mix new and old components.
• What approaches work best for maintaining high reliability at your facility?
• Have these approaches been shown to improve reliability and availability?
• How do you handle obsolescence issues?
• What are the biggest struggles with optimizing maintenance activities and scheduling repairs?
• Are there different approaches for new facilities vs. existing or upgraded facilities?
Construction, commissioning and validation of the Linear IFMIF Prototype Accelerator (LIPAc) are carried out in the framework of the IFMIF/EVEDA project. In its current configuration the LIPAc operates with D+ and consists of a 100 keV injector, a 5 MeV RFQ accelerator, medium- and high-energy beam transport lines and a beam dump. In its final configuration it will include a Half-Wave Resonator SRF (HWR-SRF) linac, with the target of commissioning a 125 mA D+ beam in continuous wave (CW) at 9 MeV. A temporary transport line currently replaces the SRF linac.
In 2019 the beam commissioning campaigns succeeded in accelerating a 125 mA D+ beam through the RFQ at a low duty cycle of ~0.1%. Beam operations were carried out until December 2021. In 2021 and 2022 extensive experimental campaigns were carried out on the injector and RFQ, targeting CW operation. The injector campaign aims at identifying the best configuration for operating at CW and nominal beam current. The RFQ campaign aims at conditioning the RFQ at CW at the nominal voltage of 132 kV.
This paper will focus on the reliability of injector and RFPS-RFQ during the respective conditioning campaigns and recent efforts made to improve future LIPAc reliability.
The Canadian Light Source (CLS), a 3rd generation light source, has operated since 2004. It comprises a 250 MeV linac dating from 1964 and a booster and storage ring built in 2003; its last beamline (BL), #22, was completed in 2019. For many years the focus on BLs put the maintenance of the machine in a dire position. With the completion of the last BL, the Machine division can reclaim long-overdue heavy maintenance to ensure both high-quality and reliable beam for CLS users. I will present the status of the machine and its needs for improvement, as well as the CLS maintenance strategy to ensure that no system is left without a responsible group.
Authors: Mr Nikolas Paneras, Dr Yasser Mafi-Nejad.
Presenter: Mr Nikolas Paneras
Institute: Australian Nuclear Science and Technology Organisation, Centre for Accelerator Science.
Presentation Title: "STAR Accelerator High Voltage Generator driver replacement"
The STAR Accelerator is a 2 MV Tandetron manufactured by High Voltage Engineering Europa. STAR is located in the Centre for Accelerator Science, Australian Nuclear Science and Technology Organisation, Australia. Commissioned in 2004, it has been reliably used for Ion Beam Analysis and Atomic Mass Spectrometry ever since.
In 2022, aging and obsolete electronics are posing a reliability threat to the STAR Accelerator. The high voltage generator driver is one such example. The analogue frequency synthesiser and phase-locked loop driver board is populated with obsolete components that are no longer available. The driver had been repaired incrementally; however, the store of replacement parts has now been depleted and a new solution is required.
This presentation will focus on the journey of re-designing the high voltage generator driver, including problem solving, technical hurdles and assumptions that had repercussions for the project.
Electrostatic Accelerator, Tandetron, High Voltage Generator, FPGA, Analogue Electronics, LabVIEW.
Limited to IOC Committee Members
Machine Learning has already been applied in some instances to particle accelerators; recent advances (e.g. deep learning) and heightened interest in the community indicate that it will be an increasingly valuable tool for meeting new demands on beam energy, brightness and stability.
The intent of this session is to introduce how problems in accelerator science and operation (in particular accelerator reliability) can be approached using ML techniques, and how these have the potential to provide benefits over classical approaches.
Enormous efforts are expended creating high-fidelity simulations of accelerator beamlines. While these simulations provide an initial starting point for operators, there is a gap between the ideal simulated entity and the real-world implementation. Bridging that gap requires a brute-force and time-consuming task known as beam tuning. This project develops a data-driven approach to beam tuning in the CEBAF injector, which leverages deep learning over structured data (graphs). Specifically, we use graphs to represent the injector beamline at any arbitrary date and time and invoke a graph neural network to extract a low-dimensional representation that can be visualized in two dimensions. By analyzing historical operational data from the CEBAF archiver, good and bad regions of parameter space can be mapped out. Initial results demonstrate the validity of the concept. We then suggest how this framework can serve as a real-time tool to aid beam tuning, which represents the dominant source of machine downtime, as well as address issues of reproducibility and stability in the machine.
With the many advances in machine learning in recent years, adopting this technology for accelerator operation offers promising perspectives. At CRYRING@ESR we implemented an operator-grade machine automation application with beam optimization support based on a Genetic Algorithm. This tool was used for optimizing the beam intensity via several machine sub-systems such as the injection, the ion source and the local injector beam line.
One challenge for automatically optimizing the machine is the reliability of the injected beam in terms of pulse-by-pulse intensity variation. To mitigate this beam-variation issue, our recent work aims at developing software discrimination signals to identify "bad" beam pulses based on time-series data. For this, our team established a rather lightweight framework of tools, based on the InfluxDB and Grafana software packages, for online data monitoring and mid-term data storage in parallel to the official FAIR Archiving system. Besides reporting on the technology stack for this framework, we will provide a status update on the development of our "bad pulse" detection, which as a by-product allows user-friendly availability calculation and online monitoring.
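As an illustration only, the idea of a software discrimination signal and the availability figure derived from it could be sketched as follows. The rule used here (a pulse is "bad" when its intensity drops below a fraction of the median of recent good pulses) and the names `flag_bad_pulses`, `window` and `min_fraction` are assumptions made for this sketch, not the criterion used at CRYRING@ESR.

```python
from statistics import median


def flag_bad_pulses(intensities, window=5, min_fraction=0.5):
    """Return one boolean per pulse, True where the pulse is flagged as bad.

    Hypothetical rule: a pulse is bad when its intensity is below
    `min_fraction` times the median of the last `window` good pulses.
    """
    flags = []
    good_history = []
    for value in intensities:
        if len(good_history) < window:
            # Not enough history yet: accept the pulse and keep collecting.
            flags.append(False)
            good_history.append(value)
            continue
        reference = median(good_history[-window:])
        is_bad = value < min_fraction * reference
        flags.append(is_bad)
        if not is_bad:
            # Only good pulses update the reference level.
            good_history.append(value)
    return flags


def pulse_availability(flags):
    """Fraction of pulses that were not flagged as bad."""
    if not flags:
        raise ValueError("no pulses recorded")
    return 1.0 - sum(flags) / len(flags)


if __name__ == "__main__":
    pulses = [10, 10, 10, 10, 10, 2, 10, 9, 1, 10]
    flags = flag_bad_pulses(pulses)
    print(flags, pulse_availability(flags))
```

Excluding bad pulses from the reference history keeps a run of poor pulses from dragging the reference level down; whether that is desirable in practice depends on how slowly the real beam intensity drifts.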
The deployment of Machine Learning (ML) applications in a production environment requires verification, validation, assurance, and trust. ML models are notoriously difficult to maintain in these environments where data and systems may evolve over time and long-term maintenance is required. The models require active management for (1) reproduction or replication of model weights, (2) monitoring data drift, (3) tracking model performance, and (4) updating models. A Machine Learning Operations (MLOps) framework that will ensure a sustainable develop-deploy-monitor paradigm for accelerator control systems will be presented along with an overview of R&D to enable ML capabilities for accelerator operations. The R&D is being initiated by the Accelerator Controls Operations Research Network (ACORN) DOE O413.3b project to modernize Fermilab’s accelerator control system in preparation for operations with megawatt particle beams.
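Of the four management tasks listed above, monitoring data drift lends itself to a short sketch. The example below compares a reference sample (for instance, a feature as seen at training time) with a current sample using the two-sample Kolmogorov-Smirnov statistic. The choice of this statistic, the function names and the threshold value are assumptions for illustration; the abstract does not state which drift measure the framework uses.

```python
def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the empirical cumulative
    distribution functions of the two samples (0 = identical, 1 = disjoint).
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    i = j = 0
    max_gap = 0.0
    while i < len(ref) and j < len(cur):
        x = min(ref[i], cur[j])
        # Step past every value equal to x in both samples (handles ties).
        while i < len(ref) and ref[i] == x:
            i += 1
        while j < len(cur) and cur[j] == x:
            j += 1
        gap = abs(i / len(ref) - j / len(cur))
        max_gap = max(max_gap, gap)
    return max_gap


def drift_detected(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a hypothetical threshold."""
    return ks_statistic(reference, current) > threshold


if __name__ == "__main__":
    training = [1.0, 1.1, 0.9, 1.05, 0.95]
    live = [1.6, 1.7, 1.5, 1.65, 1.55]
    print(ks_statistic(training, live), drift_detected(training, live))
```

In a deployed setting the threshold would normally be derived from the sample sizes and an accepted false-alarm rate rather than fixed by hand.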
In order to improve day-to-day particle accelerator operations and maximize the delivery of science, new analytical techniques are being explored for anomaly detection, classification, and prognostics. We describe the application of an uncertainty-aware machine learning (ML) method using a Siamese neural network model to predict upcoming errant beam pulses at the Spallation Neutron Source (SNS) accelerator. Predicting errant beam pulses reduces downtime and can prevent potential damage to the accelerator. The uncertainty-aware machine learning model was developed to be able to detect upcoming errant beam pulses not seen before. Additionally, we developed a gradient class activation mapping for our model to identify the relevant regions within a pulse that make it anomalous, and we use these regions for clustering fault types. We describe the accelerator operation, related ML research, the prediction performance required to abort beam while maintaining operations, the monitoring device and its data, the uncertainty-aware Siamese method with its results, and fault type clustering results.
Environmental factors create many challenges for reliable operation of accelerators. Most accelerators operate over large physical areas and must maintain high precision and safe operations. Challenges that must be considered include many factors outside the direct control of the operator, such as varying temperatures, ground settling and earthquakes. Accelerators themselves produce radiation and heat during operation, which affect equipment selection, precision and lifespan.
• Presenters are asked to share impacts on their systems from the harsh environments they operate in,
• How they address the issues
• Lessons learned.
The fourth industrial revolution, the current trend of automation and data interconnection in industrial technologies, is becoming an essential tool to boost maintenance and availability for space applications, warehouse logistics, particle accelerators and harsh environments in general. The main pillars of Industry 4.0 are the Internet of Things (IoT), Wireless Sensors, Cloud Computing, Artificial Intelligence (AI), Machine Learning and Robotics, and we are finding more and more ways to interconnect existing processes using technology as a connector between machines, operations, equipment and people. Facility maintenance is becoming more streamlined with earlier notifications, simplifying the control and monitoring of operations. Core to success and future growth in this field is the use of robots to perform complex tasks, particularly repetitive, unplanned or dangerous ones, which humans either prefer to avoid or are unable to carry out due to hazards, size constraints, or the extreme environments in which they take place. This presentation describes the status of the Industry 4.0 IoT and robotic activities performed at CERN by the BE-CEM group.
Several robotics and AI solutions have been applied in the past years at CERN, as well as custom-made robotic devices. New ideas and solutions could arrive in the near future to increase the safety of CERN personnel. The current and future research and development in robotics done at CERN are described, as well as the results from the commissioning of various novel robotic controls, and how this knowledge is being consolidated into a set of best practices to improve machine availability.
Experience gained in the FY22 RHIC Run at BNL has been informative for dealing with common weather challenges. In particular, power interruptions and their negative effects will be discussed here, as well as ways to avoid them. Typically, when weather conditions seem likely to cause power interruptions, weather standdowns are initiated. These conditions are detected with an AccuWeather account set up to notify operators of nearby lightning. The standdowns target vulnerable equipment that could be damaged permanently by being tripped off by a power interruption. There were several instances of unexpected power outages that exposed weaknesses in the system and demonstrated the usefulness of the weather standdowns, which will be discussed here.
The J-PARC 3 GeV Rapid Cycling Synchrotron (RCS) aims to provide a very high-power proton beam for neutron experiments and the main ring synchrotron. We have continued beam commissioning, and the output power from the RCS has been increasing. In recent years, we have been attempting continuous supply of a 1 MW high-intensity beam, which is the design value, to a neutron target. We tried to operate continuously for over 40 hours in June 2020. However, some trouble occurred and the operation was frequently suspended. In June 2021, we tried 1 MW operation again, but it was suspended due to deterioration of the cooling water performance. During the last summer shutdown period, we recovered the performance of the cooling water system and retried this June. In this last attempt, the outside temperature became extremely high. We could not maintain 1 MW power, but a 600 kW beam was delivered stably.
Radiofrequency cavities are critical for accelerating beams at CERN, and their efficiency is in part due to the inner surface quality, which may be affected by various mechanical procedures such as welding. A detailed inspection is required to ensure the quality of manufacturing, but also for post-mortem analysis. Ideally, this inspection should be non-destructive, and a new robotic system has been developed at CERN to perform this task. The Automated Robotic Inspection System (ARIS) autonomously meets, for the first time in the world, the needs of CERN to visually inspect the entire inner surface of radiofrequency (RF) cavities in LHC, Linac and FCC, and to detect any anomalies at short distance. This system is equipped with a liquid lens able to overcome depth of field (DOF) limitations, a high-resolution camera which ensures excellent quality photos regardless of the distance within the entire cavity, and an anti-collision mechanism that can immediately stop the inspection system if required. ARIS is capable of performing three different scan modes in three different cavity types through a specialised user interface currently in development. This system is controlled using the CERN Robotic Framework, a robust, modular software framework used on all CERN in-house robotic platforms.
We present initial results from a proof-of-concept “smart alarm” for the CEBAF injector. Because of the injector's large number of parameters and possible fault scenarios, it is highly desirable to have an autonomous alarm system that can quickly identify and diagnose unusual machine states. Our approach leverages a trained neural network to not only identify an anomalous machine state, but also to identify the root-cause by pinpointing the specific element or region responsible. We developed an inverse model trained on data collected during normal operations. Using the inverse model, measurements from the machine are used to compute machine settings, which are then compared to EPICS setpoints. Instances when predictions differ from EPICS setpoints by a user-defined threshold are flagged as anomalies, and the user is alerted to the issue. We present the results of our data collection efforts, model training and performance, and initial performance metrics.
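The comparison step described above (settings computed by the inverse model versus EPICS setpoints, with a user-defined threshold) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the neural network itself is omitted, the predicted settings are taken as given, and the element names and threshold value are hypothetical.

```python
def flag_anomalies(predicted, setpoints, threshold):
    """Return {element: deviation} for elements flagged as anomalous.

    `predicted` holds the settings inferred by an inverse model from
    measurements; `setpoints` holds the recorded setpoints. An element is
    flagged when the two differ by more than `threshold`.
    """
    anomalies = {}
    for name, setpoint in setpoints.items():
        if name not in predicted:
            continue  # the model makes no prediction for this element
        deviation = abs(predicted[name] - setpoint)
        if deviation > threshold:
            anomalies[name] = deviation
    return anomalies


if __name__ == "__main__":
    # Hypothetical element names and values, for illustration only.
    predicted = {"MAG1": 1.02, "MAG2": 3.5}
    setpoints = {"MAG1": 1.00, "MAG2": 3.0}
    print(flag_anomalies(predicted, setpoints, threshold=0.1))
```

Because the deviation is reported per element, the same comparison that raises the alarm also points to the element most likely responsible, which is the root-cause aspect the abstract emphasises.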
Although many systems at BNL's C-AD are well instrumented, some legacy and specialist systems are not. Air conditioning units, for example, do not report their status at all. Catching anomalous or off-normal signals can help prevent extended downtime. Efforts to leverage Machine Learning techniques to assist prompt analysis are beginning to show potential.
Availability of the GSI Accelerator Complex is tracked with the OLOG (Operation LOGbook) tool, developed internally. In 2020, we founded an Availability Work Group (AWG). Its activity drastically improved the quality of the failure data handled in OLOG. Additionally, the internal acceptance of the failure statistics has grown.
Complex parallel operation of the facility poses another challenge for statistical accounting. Relatively high accelerator availability is supported by agile adjustment of the beam time plan.
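One accounting difficulty of this kind can be shown with a small sketch: when several faults overlap in time, adding up their durations one by one overstates the downtime, so the intervals are merged first. This is a generic illustration under assumed conventions (times in hours from the start of the scheduled period); it does not describe how OLOG computes its statistics.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) fault intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def availability(scheduled_hours, fault_intervals):
    """Availability = (scheduled time - merged downtime) / scheduled time."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled_hours must be positive")
    downtime = sum(end - start for start, end in merge_intervals(fault_intervals))
    return (scheduled_hours - downtime) / scheduled_hours


if __name__ == "__main__":
    faults = [(0, 2), (1, 3), (10, 11)]  # the first two faults overlap
    print(merge_intervals(faults), availability(100, faults))
```

For a facility serving several beams in parallel, the same calculation would be run once per beam with only the faults that affected that beam.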
The North Area Consolidation Project takes a performance- and resource-oriented approach to improving the reliability and availability of the North Experimental Area facilities at CERN-SPS. The North Area provides various beams for fixed-target experiments and test beams for the R&D programs of the CERN/LHC experiments, as well as for present and future projects at CERN or other laboratories.
The project covers the consolidation needs of beamlines, experimental areas, infrastructure and services. The consolidation is planned on the basis of an analysis of the downtime caused by the degraded condition of the facility, which is amplified by the obsolescence of various equipment. Failure prediction and preventive maintenance are of prime importance in reducing the overall operation and maintenance costs of running the North Area facility.
The project goal of increased performance is linked to the “Physics Beyond Colliders” future requirements and energy optimizations. It is adapted to serve the current and future Particle Physics community needs.
There are close to 3M pieces of equipment registered in the equipment database for CERN’s accelerators and their infrastructures that cover a very large range of technologies. The challenge of the technical groups in charge of the equipment is not only to maintain the reliability of the equipment, but to reach an ever higher performance, to increase efficiency of its operation, to increase its lifetime and last, but not least, to optimise cost.
To assist the equipment owners in these tasks, CERN has continuously developed its portfolio of asset and maintenance management applications in particular in view of asset performance.
This talk will provide an overview of the applications that CERN has put in place over the years for managing physical equipment and for tracking activities related to it. It will highlight how the strong integration of the different functionalities facilitates the sharing of data across technology frontiers, the integration of information and the streamlining and integration of processes, all with the objective of developing new ways of improving the availability and efficiency of the accelerators.
Some of the functionalities that shall be introduced are
• the functionality for asset generation and for asset management
• the functionality for preparing and scheduling work
• the functionality for recording and managing activity information used for installation, testing and maintenance tasks
• the checklists functionality used for predictive maintenance and for commissioning and testing tasks
• the SCADA bridge functionality for usage-based task generation
• the functionalities for accelerator component storage and for spare part management
• the logbook application
• the application for tracking potentially radioactive equipment
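As a rough illustration of what usage-based task generation means in the list above, the sketch below raises a maintenance task once an asset's usage counter (for example, operating hours read from a control system) has advanced by a set interval since its last service. The function name, asset names and intervals are hypothetical and do not reflect CERN's applications.

```python
def due_tasks(counters, last_serviced, intervals):
    """Return the sorted names of assets that are due for maintenance.

    counters:      current usage reading per asset (e.g. operating hours)
    last_serviced: usage reading at the last service (0 if never serviced)
    intervals:     usage allowed between services, per asset
    """
    due = []
    for asset, interval in intervals.items():
        used = counters[asset] - last_serviced.get(asset, 0)
        if used >= interval:
            due.append(asset)
    return sorted(due)


if __name__ == "__main__":
    counters = {"pump-A": 5200, "valve-B": 900}
    last_serviced = {"pump-A": 1000}
    intervals = {"pump-A": 4000, "valve-B": 1000}
    print(due_tasks(counters, last_serviced, intervals))
```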
Over the last several years, maintenance of the Canadian Light Source (CLS) vacuum system has shifted from a reactive approach to a preventative maintenance approach. The catalyst of this shift is the aging facility and infrastructure. A comprehensive review and analysis of vacuum system components has given clear direction on preventative maintenance activities. Components that pose the greatest risk of failure are determined using the CLS enterprise risk management (ERM) framework. The ERM method often finds vacuum components to be in the high-risk category because of long lead times for parts and the recovery time needed for vacuum conditioning. Risks to the vacuum system are strategically addressed using the risk quantification determined by the ERM analysis. Through the ERM framework and analysis, the status and health of the vacuum system are now better understood, and necessary component spares and replacements are strategically determined. Design, fabrication, and procurement of strategic spares have increased the ability to respond quickly to failures, thereby reducing potential downtime. The reliability of the machine is increased as components posing the greatest risk to the vacuum system are replaced in planned shutdowns.
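The reasoning above (long lead times and conditioning recovery time push vacuum components into the high-risk category, and strategic spares reduce that risk) can be made concrete with a toy scoring scheme. The scoring formula, the 1 to 5 likelihood scale and the component data are assumptions for illustration; the CLS ERM framework is not described in enough detail here to reproduce it.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    failure_likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)
    lead_time_weeks: float   # time to procure a replacement part
    recovery_days: float     # vacuum conditioning time after replacement


def risk_score(component, has_spare=False):
    """Likelihood times expected downtime in days (hypothetical formula)."""
    wait_days = 0 if has_spare else component.lead_time_weeks * 7
    return component.failure_likelihood * (wait_days + component.recovery_days)


def rank_by_risk(components, spares=()):
    """Sort components from highest to lowest risk score."""
    return sorted(
        components,
        key=lambda c: risk_score(c, has_spare=c.name in spares),
        reverse=True,
    )


if __name__ == "__main__":
    parts = [
        Component("gate valve", 2, 20, 5),
        Component("ion pump", 3, 4, 3),
    ]
    print([c.name for c in rank_by_risk(parts)])
    print([c.name for c in rank_by_risk(parts, spares={"gate valve"})])
```

In this toy model a spare on the shelf removes the procurement wait, so holding a spare for the long-lead-time gate valve drops it below the ion pump in the ranking.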
In recent years, the Experimental Areas (EA) Group at CERN has enhanced its configuration management strategy for the experimental areas (North and East Area, AD and ELENA Complex, HiRadMat Experiment) to have a clear and coherent picture of any beamline at a given point in time. As an essential tool for quality management, configuration management increases physics time at the facilities by optimizing the reliability and availability of the beamline components.
The implementation is being deployed in the consolidation/renovation projects led by EA that aim to improve the conditions of the experimental areas (NACONS, ADCONS, East Area Renovation, etc.), where systems such as the Accelerator Fault Tracking (AFT) make it possible to register and follow the operational faults of beamline equipment and provide essential information for future upgrades.
In parallel to the performance-focused design, the preventative maintenance and the design of new equipment are strongly oriented toward remote handling compatibility, in line with the ALARA principle and because of the access limitations imposed by radiation protection constraints, which require constant optimization. The operability and maintainability of the systems have improved considerably in recent years thanks to the use of the Infor database, automatic generation of work orders, the track tool, … in the framework of the configuration management strategy.
As accelerators become more complex over their lifespan, it becomes increasingly important to document and plan for the removal of obsolete equipment as well as the installation of new components. The Electron Ion Collider at Brookhaven National Laboratory is no exception: the removal and installation work must be coordinated in the years prior to beam commissioning. This presentation will discuss the planning and considerations for these removals and installations in the coming years.
The Lujan Center at the Los Alamos Neutron Science Center utilizes multiple tungsten spallation targets to generate intense bursts of pulsed neutrons for academic, national security, and industrial research. These targets are cooled by a closed loop, deionized water system. As part of the safety basis of the facility, safety rated flow switches are implemented to ensure sufficient cooling is delivered to the targets. An upgraded target system was installed in 2022 that contains a new third tungsten target. This new target system required a redesign of the water cooling system. As part of this rebuild, new custom flow switches were developed, tested, and installed to replace the previous style that suffered multiple failures over the previous years. The new flow switches are adjustable to meet different flow rate requirements and do not contain tight tolerances that caused the previous flow switches to fail.
Reliable particle accelerators require reliable control systems to operate and protect the different subsystems of the accelerator. The software reliability of these critical control systems is crucial to guarantee a safe and optimal operation.
Software reliability is a very challenging aspect of the overall reliability analysis. Formal methods and formal verification are well-proven techniques for verifying that critical software components meet their specification requirements. These techniques are very popular in critical industries such as aeronautics and aerospace, but are less common in the particle accelerator domain.
At CERN, we have developed an open-source tool called PLCverif (www.cern.ch/plcverif), which allows model checking (a formal verification technique) to be applied to PLC (Programmable Logic Controller) programs. PLCs are among the most popular control devices in the process industry, and many of them are therefore found in the industrial facilities of particle accelerators (e.g. access control systems, cryogenics plants, cooling and ventilation systems, etc.).
PLCverif has been applied to many critical PLC programs at CERN, at other international organizations and at private-sector companies.
This presentation discusses the benefits and challenges of using formal methods and formal verification. It shows how PLCverif improves the reliability of the PLC programs of the accelerator subsystems. Finally, it concludes by presenting some of the results of real-life use cases at CERN.
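To give a feel for what checking a property against control logic means, the sketch below exhaustively tests a tiny, stateless interlock function against a safety requirement and returns a counterexample if one exists. It is a deliberately simplified illustration in Python: real model checking of PLC programs also explores internal state across scan cycles, and nothing here reflects PLCverif's actual interface. The interlock logic and all names are hypothetical.

```python
from itertools import product


def interlock(door_closed, key_present, beam_request):
    """Hypothetical access-interlock logic: permit beam only when safe."""
    return beam_request and door_closed and key_present


def beam_implies_door_closed(inputs, beam_permit):
    """Safety requirement: beam is never permitted while the door is open."""
    door_closed, _key_present, _beam_request = inputs
    return (not beam_permit) or door_closed


def check_property(logic, prop, n_inputs):
    """Try every boolean input combination.

    Returns the first input tuple that violates `prop` (a counterexample),
    or None if the property holds for all inputs.
    """
    for inputs in product([False, True], repeat=n_inputs):
        if not prop(inputs, logic(*inputs)):
            return inputs
    return None


if __name__ == "__main__":
    print(check_property(interlock, beam_implies_door_closed, 3))

    # A faulty variant that forgets to check the door.
    def faulty(door_closed, key_present, beam_request):
        return beam_request and key_present

    print(check_property(faulty, beam_implies_door_closed, 3))
```

The counterexample is what makes this style of verification useful in practice: rather than a pass/fail verdict, the engineer gets a concrete input combination that breaks the requirement.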
PETRA III is a 3rd generation synchrotron light source that started operation in 2009. The annual availability usually exceeds 97%. This level of availability requires a highly reliable pre-accelerator. The pre-accelerator complex for PETRA III includes a linac (LINACII), an accumulator ring (PIA) and a booster synchrotron (DESYII). With the constant-current ("top-up") operation mode of PETRA III, the pre-accelerators must deliver beam every few minutes. As a consequence, they have to run continuously, and maintenance periods are rather limited. At the same time, the reliability and operational stability must be very high. To fulfill these goals, several technical systems are designed with redundancy and the complex runs in a completely automated operation mode.
With over 2 decades of operations at the RHIC, the generally improving trends in availability and reliability do include a number of running periods with anomalies; it is difficult to analyze coarse statistics in fine detail. Early trends can be attributed to improvements in general areas such as system upgrades or operational efficiency; later years involve ever greater concern for aging systems. Recent years are additionally complicated. This presentation looks at overall trends with focus on recent issues as we look to close out RHIC runs and plan reliability for the EIC.
TRIUMF’s accelerator facilities are expanding, with the new radioactive or rare isotope beam facility, the Advanced Rare Isotope Laboratory (ARIEL), set to come online in 2026. The 520 MeV Cyclotron has been operational for 45 years and will produce one of the driver beams for ARIEL. Several of the cyclotron’s support and subsystems are (near) original and have reached the end of life. These include emergency power, cooling water, HVAC systems, and magnet power supplies. These systems cause considerable downtime and no longer meet the requirements of the laboratory. There are efforts underway to upgrade or refurbish many of these systems. This poster presents a few of these projects and how these projects impact the reliability of beam delivery at TRIUMF.
The SNS superconducting linac consists of 81 superconducting RF (SRF) cavities placed in 23 cryomodules, accelerating the H- beam from 186 MeV to 1 GeV. Its reliable operation plays an important role in delivering a beam power of 1.4 MW on the target. A high availability of 99% has been routinely achieved in the SNS SRF systems. With the current accelerator upgrade to a beam power capability of 2.8 MW and the planned addition of a second target station, many more years of scientific life are foreseen at SNS. To sustain the excellent availability track record, a major effort is now focused on the aging of existing components in the SRF linac. In this contribution, we will describe a new method being developed for in-situ leak detection in cryomodule isolation gate valves. Ensuring their reliability and integrity is critical for SRF cavity protection in case of both accidental vacuum failures and planned maintenance. Initial results of the analysis will be presented. We will discuss its potential use for predictive maintenance at SNS and perhaps at similar SRF accelerators where cryomodules are interleaved with warm beamline sections.
This poster describes the LINAC Tank 4 upgrade and the history of the original tank, through to the end of commissioning of the new RF tank.
The poster will briefly explain the 30-year history of the original Linac RF Tank 4, including its origin and the reasons why it needed to be replaced. Furthermore, the poster will contain details of the upgrades and changes made in the design of the new Tank 4.
Finally, the poster will illustrate the highlights and lowlights of the Tank 4 upgrade and commissioning: what went well, what didn't, and whether any further improvements remain.
Particle accelerators are arguably the most complex scientific instruments ever built. Given the enormous amount of data generated by these facilities, and the increasing demands on system performance, we recognize the need to leverage advances in machine learning (ML). In the last few years there have been several ML projects at Jefferson Lab with a common focus to optimize the operation of superconducting RF (SRF) cavities. This multi-faceted approach aims to (1) identify and classify types of faults from C100 cavities, (2) extend the work to provide real-time fault prediction for C100 cavities, (3) minimize radiation levels due to field emission in the linacs, and (4) develop tools to automate cavity instability detection in legacy cryomodules. We give a brief description of each project along with model performance. We also highlight challenges in working with real-world data and challenges for deploying models.
"The Machine Protection Risk Management Lifecycle at the European Spallation Source ERIC will be presented.
Reliability and availability requirements are taken into consideration before and during the design of the Machine Protection Systems at ESS. This is done by systematically identifying, assessing and mitigating damage risks to equipment.
The machine protection risk management lifecycle has the following phases:
1. Identification of concept and scope
2. Risk identification
3. Risk assessment
4. Risk mitigation
5. Requirement specification
6. Design and implementation
7. Verification and validation
In the first phase it is determined which systems will have a detailed Machine Protection analysis and which systems are excluded.
The outcome from the risk identification, risk assessment and risk mitigations is documented in a machine protection analysis. The analysis is performed in workshops, with the related system owners, technical experts and system engineers. The methods used for the identification, assessment and mitigations will be described in more detail.
Based on the analysis, the Machine Protection requirement specification is developed, which identifies the systems required to fulfil Machine Protection related functions.
The requirements are fulfilled through the design and implementation process and are then confirmed through verification and validation."
The Spallation Neutron Source (SNS) is the highest power pulsed proton linac in the world routinely delivering 1.44 MW of beam power. For high-power accelerators, like the SNS, the main operational areas of focus are minimizing beam losses and reducing trip frequency. Since beginning high-power operation SNS has been optimizing beam losses and recording beam trip statistics of varied lengths. The operational focus on beam trips and the forethought to install an abundance of diagnostics has allowed for a steady decrease in trip rates year over year. The decrease can be attributed to utilizing alarms and interlocks, downtime trending, equipment diagnostics, and the ability of operations resources to provide system engineers with narrow areas for focused troubleshooting. The focus here will be on the specifics for how SNS has continued to systematically reduce beam trip frequencies across a range of time durations.
*ORNL is managed by UT-Battelle, LLC, under contract DE-AC05-00OR22725 for the U.S. Department of Energy.
Accelerator complexes rely on multiple sensors to be able to reach desired operational conditions, including safety. For example, Beam Loss Monitor sensors are used both for machine safety and to help operators steer the beams. These critical devices are installed periodically along the beam lines; for example, over 3500 of these sensors are installed in CERN's Large Hadron Collider. To ensure their smooth and safe operation, the sensor performance should be validated before each run. In the last long shutdown at CERN, part of this validation was undertaken using a robotic system for the first time. The system consisted of a 9DOF arm mounted on the Train Inspection Monorail robot, which removed a radioactive source from shielding installed on the robot and brought it close to the sensor, triggering the ionisation chamber, before checking that the results were correctly generated. The operations were performed remotely via tele-manipulation, and future developments will focus on automating part of the measurements under the supervised control and checks of remote operators. This work presents the operation and the modular software framework that has been developed to manage this validation campaign and robotic mission.
The Canadian Light Source (CLS) is a 2.9 GeV third-generation synchrotron light source providing photons to 22 beamlines. In the last 18 years of operation there have been ongoing efforts not only to maintain but also to improve the reliability of the CLS. Over the past few years a more comprehensive strategy to rationalize our preventive maintenance through risk analysis and budget constraints has been developed.
I will present the contribution of the operator group to providing operational statistics as a part of this maintenance strategy through the understanding of our machine reliability over the years. This poster will reflect how the fault statistics helped the facility to improve the synchrotron reliability.
The change in reliability related to the change in injection modes -- from decay mode, where injection was done every eight hours, to a top-up mode, where injection is done every 2-5 minutes -- will also be highlighted.
H. Terry, C. Peters, R. Saethre, R. Taylor, Spallation Neutron Source, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, USA
The Spallation Neutron Source (SNS) Enterprise Asset Management (EAM) system has been in production operation since 2006. From the beginning, the purpose of EAM was to have a tool that would track the work that was happening, keep an inventory of assets, and help predict when assets could be close to failure. However, due to the lack of focus on the proper EAM software configuration, and minimal resources to analyze the data collected, the original effort has not reached full potential. Today the focus is on reducing administrative load on engineers and technicians by utilizing the software for inventory management, automated reliability analysis, predictive maintenance scheduling, and labor and parts cost tracking for budgeting purposes. Particular importance is being placed on ensuring the data collected is properly filtered for ease of analysis.
*ORNL is managed by UT-Battelle, LLC, under contract DE-AC05-00OR22725 for the U.S. Department of Energy.
Programmable Logic Controllers (PLCs) are among the most popular control devices used in the process industry, and therefore many of them can be found in the industrial facilities of particle accelerators (e.g. access control systems, cryogenics plants, cooling and ventilation systems, etc.).
A bug in a PLC program can have catastrophic consequences in terms of personnel safety, machine protection or even the availability of the particle accelerators. For that reason, it is essential to apply methods that improve the reliability of these PLC programs.
Model checking is a well-proven technique to verify that critical software components meet the specification requirements. Although these techniques are very popular in critical industries like aeronautics or aerospace, they are not widely used in the process industry and particle accelerator domains.
At CERN, we have developed an open-source tool, called PLCverif (www.cern.ch/plcverif), which allows model checking to be applied to PLC programs. This poster presents the benefits of using model checking compared to the traditional testing techniques. It also introduces the PLCverif methodology and how it can improve the reliability of the PLC programs of the accelerator subsystems. Finally, it presents some of the results of real-life use cases at CERN.
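As an illustration of how model checking differs from testing, the following minimal sketch exhaustively explores every reachable state of a toy interlock controller and checks a safety property in each state. The controller, its variables (door_open, beam_on) and all function names are hypothetical; they are not taken from PLCverif, which works on real PLC code with dedicated verification engines.

```python
from collections import deque

# Hypothetical toy controller: one scan cycle of an interlock program.
# State = (door_open, beam_on); inputs = (door_sensor, beam_request).
def scan_cycle(state, inputs):
    door_sensor, beam_request = inputs
    door_open = door_sensor
    # The beam may only be on when requested and the door is closed.
    beam_on = beam_request and not door_open
    return (door_open, beam_on)

def buggy_scan_cycle(state, inputs):
    door_open, _ = state
    door_sensor, beam_request = inputs
    # Bug: the decision uses the door state from the previous cycle.
    beam_on = beam_request and not door_open
    return (door_sensor, beam_on)

def safe(state):
    """Safety property: the beam is never on while the door is open."""
    door_open, beam_on = state
    return not (door_open and beam_on)

# Every combination of the two Boolean inputs.
ALL_INPUTS = [(d, r) for d in (False, True) for r in (False, True)]

def check(step, initial, prop):
    """Breadth-first search over all reachable states.

    Returns None if prop holds in every reachable state, otherwise the
    sequence of states leading to the first violation found.
    """
    queue = deque([(initial, [initial])])
    seen = {initial}
    while queue:
        state, trace = queue.popleft()
        if not prop(state):
            return trace
        for inputs in ALL_INPUTS:
            nxt = step(state, inputs)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, trace + [nxt]))
    return None

print(check(scan_cycle, (False, False), safe))
print(check(buggy_scan_cycle, (False, False), safe))
```

A test suite would only find the bug if it happened to include the offending input sequence, whereas the exhaustive search covers all input combinations in all reachable states and returns a counterexample trace when the property is violated.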
Particle accelerators, used in foundational research and cancer treatment, are complex machinery comprising many different components. In this work, the ion source CAPRICE ECRIS is examined with 48Ca operation. The most common problem with 48Ca at the ECR ion source is the instability of the created plasma inside the source, which leads to increased material consumption and lower quality of the resulting beam. Beamtime data of 2020 and 2021, which consists of multiple device settings and readings, was accumulated and labeled with normal or anomalous state accordingly. For automatically detecting plasma instabilities, 1D-convolutional neural networks are investigated to perform time series classification. The results show the effectiveness of convolutions, which leads to sensitivity of 0.74 and specificity of 0.79. A visual evaluation of the prediction shows good detection of longer anomalous sequences, but the model struggles with smaller anomalies. Given the small nature of the dataset, more data is needed to improve classification performance. Furthermore, better metrics for anomaly detection with time series have to be investigated to perform high-quality evaluations.
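For readers less familiar with the two metrics quoted above, the sketch below shows how sensitivity and specificity are obtained from labeled predictions. The label encoding (1 = anomalous, 0 = normal) and the example data are assumptions made for illustration only and do not reproduce the beamtime dataset.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels.

    Assumed encoding: 1 = anomalous, 0 = normal.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal kept normal
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical example: 4 anomalous and 6 normal time windows.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity is the fraction of truly anomalous samples that are flagged, while specificity is the fraction of normal samples that are left unflagged.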
The Electron Ion Collider (EIC) will be built at the Brookhaven National Laboratory site in Upton, New York, USA. It will occupy much of the tunnel and support building locations presently in use for the Relativistic Heavy Ion Collider (RHIC). Reliability considerations play a key role in planning for the site-wide modifications and upgrades necessary to support the EIC. This presentation will discuss some of the designs and plans for this undertaking.
The Relativistic Heavy Ion Collider at Brookhaven National Laboratory in Upton, New York, USA normally runs during the winter and spring seasons, when the weather on Long Island is cool and power is steady, readily available and cost effective. Due to recent projects, upgrades and scheduling issues, it will be necessary to run during the harsh summer months in 2023 and possibly in 2024. In this presentation, I will discuss the steps that are being taken and the preparations that will be made to maximize uptime and reliability during this period.
The multiple pulse analysis of BNL's AGS extraction septum allows for not only more reliable performance, but also insight into the overall health of the device. Most voltage readbacks, either in their raw state or recorded through oscilloscopes, are unable to be used in a predictive manner as they require filtering, comparison, and complex analysis. This is particularly important for devices that directly interact with beam over multiple pulses, as is the case with the AGS extraction septum. An overview of analysis of the extraction septum is given here including what data needs to be collected, the code used, and what should be done with the results to enable utilization and predictive power.
During operation of the accelerators at DESY, technical failures of individual subsystems or components, as well as operational issues, are repeatedly observed in addition to the failures of more complex technical systems.
Further improvement of the existing high availability must therefore be achieved by analyzing all types of failures and their causes, and also through a structured recording of the operational processes and their failure-free handling.
The in-house developed database application »qBase.« links together the three most important sources of information from the accelerators PETRA-III, FLASH and EuXFEL: expert knowledge from the operators' electronic machine logbooks, status information and measured values from associated databases and, not least, detailed root causes from written failure event reports.
»qBase.« combines the information from these sources and makes it accessible throughout DESY's Accelerator Division for advanced analytical investigations.
An accelerator complex relies on many key systems, of which the power converter complex at CERN is a large component. Modern systems rely on software and firmware which is periodically updated. Accelerator reliability requires good management of this deployment and the pre-deployment detection of regressions.
A system which takes modern software engineering techniques such as continuous integration and continuous development and applies these to power converter hardware is presented. This system uses a real time hardware in the loop device to emulate a wide range of power converters allowing new code to be tested with a high coverage of power converters, loads and scenarios. This ensures that there is no regression which would later be found in the accelerator resulting in down time. While these system testing methodologies are demonstrated in the context of power converters, they could be applied to other systems in an accelerator.
System/subsystem failures interrupt availability and operations. Consequences can be costly and may be safety critical. Patterns of failures are typically described by the well-known bathtub curve: first a period of decreasing failures (usually for new equipment), then a period of random failures followed by increasing failures due to wear. Traditionally, maintenance is carried out to reduce failures and ensure availability. Maintenance can be preventive (PM), or repair (RM) after equipment failure. The type of maintenance is based on the type of failures on the bathtub curve. However, recent studies reveal that complex systems/subsystems do not follow the bathtub curve. They are more likely to suffer from random failures. It can be shown that preventive maintenance does not improve reliability where failures are random; in fact, PM may increase failures. Also, random failures are not age related. Complex systems tend to have an initial phase of increasing failures for new equipment, followed by random failures with no wear-out. Such awareness has led some to abandon preventive maintenance (it may have its place for equipment/components in identified wear-out phases). The question then becomes what needs to be done to ensure availability. New techniques such as predictive or on-condition maintenance are needed as part of Reliability Centered Maintenance (RCM). This paper/presentation discusses the P-F Curve (P is the point where failure may be detected before occurrence) and failure modes and effects analysis (FMEA) in determining how equipment may fail and its consequences. RCM implementation for complex systems is discussed.
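The three regions of the bathtub curve, and the claim that age-based replacement does not help against random failures, can be illustrated with a Weibull model. This is only a sketch under assumed parameters: the Weibull distribution, the shape values (0.5, 1.0, 3.0) and the 1000-hour scale are illustrative choices and are not taken from the presentation.

```python
import math

def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate h(t) of a Weibull distribution (t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def weibull_reliability(t, shape, scale):
    """Probability of surviving beyond time t."""
    return math.exp(-((t / scale) ** shape))

def conditional_survival(age, mission, shape, scale):
    """Probability of surviving `mission` more hours, given survival to `age`."""
    return (weibull_reliability(age + mission, shape, scale)
            / weibull_reliability(age, shape, scale))

SCALE_H = 1000.0  # assumed characteristic life in hours

# shape < 1: decreasing hazard (early failures)
# shape = 1: constant hazard (random failures)
# shape > 1: increasing hazard (wear-out)
for shape in (0.5, 1.0, 3.0):
    rates = [weibull_hazard(t, shape, SCALE_H) for t in (100.0, 500.0, 2000.0)]
    print(shape, ["%.2e" % r for r in rates])

# With a constant hazard, a unit that has already run 5000 h is exactly as
# likely to survive the next 100 h as a brand-new one, so replacing it on
# age alone gains nothing.
print(conditional_survival(0.0, 100.0, 1.0, SCALE_H))
print(conditional_survival(5000.0, 100.0, 1.0, SCALE_H))
```

Only for shape values above 1 does the conditional survival probability fall with age, which is the regime where scheduled replacement can pay off.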
Accelerator complexes rely on the seamless operations of many hundreds of highly customised devices working together in a harsh environment. Intelligent robotic systems are becoming essential for inspection, maintenance, and repair tasks – both for the validation of systems before installation as well as during operations. Using robots can increase safety and machine availability by taking over repetitive and dangerous tasks that humans prefer to avoid or are unable to do because of hazards, size constraints or the extreme environment in which they take place. At CERN, robots are regularly used for inspection of devices and beam lines, validation of equipment, repair operations or regular maintenance as well as decommissioning and post-mortem analysis tasks. A fleet of robots has been developed with specific tooling to suit our needs, including procedures and best practices. This work presents the main technical developments and features implemented to optimise robots for use in accelerator environments and to support the design of remotely maintainable equipment. In the future, we imagine using robots in a more autonomous way, while still ensuring safe and robust interventions.
Temperature control and monitoring is a critical part of daily operations at large facilities, and is currently an urgent topic at NSLS-II. NSLS-II storage ring magnet cooling-water systems have developed blockages that reduce cooling, increase temperatures and endanger magnets, and so monitoring and preventative maintenance is critical. Machine learning algorithms provide a promising approach to giving Operators and system experts more intelligent feedback about temperature trends, to better guide preventative maintenance. In this paper, we construct two types of linear regression methods to predict temperature rise in QM Quadrupole magnets during daily operation: (1) piecewise linear regression and (2) overlapped piecewise linear regression. In (1), temperature data is collected, and piecewise regression is performed daily on 24 hours of data, to obtain a corresponding linear-fit slope. The slope parameter will be collected daily to generate a new rate-of-change parameter, using an accumulated regression model. In (2), the same regression techniques are used, except 24-hour dataset regression is performed every 12 hours (meaning 12 hrs of data overlap in neighboring datasets). Regression slopes are now found at twice the rate, giving a more accurate rate-of-change prediction. Different ratios of training part to predicting part are also studied to verify the quality of the methods.
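The two windowing schemes described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the hourly sampling and the synthetic temperature curve are assumptions, and the second-level fit over the window slopes is only one possible reading of the "accumulated regression model".

```python
def fit_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def window_slopes(times_h, temps, window_h=24.0, step_h=24.0):
    """Fit one line per window of `window_h` hours, advancing by `step_h`.

    step_h == window_h gives method (1), non-overlapping daily fits;
    step_h == window_h / 2 gives method (2), fits overlapping by half.
    Returns a list of (window_start_h, slope) pairs.
    """
    results = []
    start = times_h[0]
    while start + window_h <= times_h[-1]:
        idx = [i for i, t in enumerate(times_h) if start <= t < start + window_h]
        results.append((start, fit_slope([times_h[i] for i in idx],
                                         [temps[i] for i in idx])))
        start += step_h
    return results

# Synthetic, hypothetical data: hourly samples over four days with a
# slowly accelerating temperature rise (degrees vs. hours).
times = [float(h) for h in range(97)]
temps = [30.0 + 0.01 * t + 1e-4 * t * t for t in times]

daily = window_slopes(times, temps, window_h=24.0, step_h=24.0)
overlapped = window_slopes(times, temps, window_h=24.0, step_h=12.0)

# Second-level fit: how quickly the daily slope itself is changing.
trend = fit_slope([s for s, _ in overlapped], [m for _, m in overlapped])
print(len(daily), len(overlapped), trend)
```

With the same 24-hour window, the overlapped scheme yields nearly twice as many slope estimates over the same period, which gives the second-level fit more points to work with.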
ESS has a high availability requirement of 95% during annual operational periods. A key aspect of achieving this target will be following strict maintenance and configuration management procedures. As the facility is still in the early phase of commissioning, these procedures are also still under development. The key elements that are implemented or under consideration for the Machine Protection Systems at ESS will be shown, along with the key challenges faced in maintaining the related hardware and configuration of the machine protection systems.
The SOLEIL synchrotron is the French third generation synchrotron light source (2.75 GeV, 354 m circumference, 4 nm.rad natural emittance). In operation since 2007, it provides photon beams simultaneously to a total of 29 beamlines. The electron beam intensity can reach up to 500 mA depending on the filling pattern (5 filling patterns are available, all in Top-Up injection) with sub-micrometer stability.
After the first ten years spent improving the reliability of our equipment to reach or approach the goals of 99% beam availability and 100 hours mean time between failures, we continue our efforts to maintain these good results in spite of the substantial work necessary in parallel to prepare the forthcoming upgrade of our facility. We will present examples of tools and actions that we have developed or implemented to anticipate incidents or reduce their impact, whether they be monitoring programs, incident follow-up, equipment modifications or changes in machine optics settings.
At the Canadian Light Source (CLS), the Accelerator Operations and Development (AOD) department makes extensive use of a wiki as a form of documentation for crossover, work instructions, and other resources. AOD began using a wiki in 2015 and over time it has become an integral part of everyday operations. I will provide an overview of how AOD has implemented the wiki, including what kinds of information are stored there and what administrative practices are applied in maintaining it. Additionally, I will comment regarding the impact of the wiki on operations from the perspective of an Operator using the wiki on a day-to-day basis.
One of the difficulties that users frequently report when implementing equipment and activity management is the correct choice of equipment structuring and organisation. Terms such as items/parts, functional positions/slots and assets often confuse users. This presentation will demonstrate that we constantly apply the systems engineering concepts behind these equipment organisation terms in our everyday life without noticing, and that the principles are strictly the same in the technical domain.
Hosted By: The Accelerator Reliability Workshop 2022
Admission is Free
Join us for a full American Breakfast Buffet on Wednesday in the Rotunda at the Marriott.
James River Country Club
The largest number of particle accelerators are devoted to medical applications (diagnostics or therapy), but they are smaller in size and energy than those used for high energy physics research. These devices are also industrial products, as opposed to research machines. During the last two decades, an increasing number of “large scale” medical facilities, comparable to those for research, have appeared. These Proton and Carbon facilities are now slowly moving to the biomedical and industrial world.
The session of ARW2022 will consider:
• return on experience or methods used in the existing centres, which consider reliability and a financial performance indicator
• consideration of the medical accelerator and associated systems design; challenges and approaches of the industrial companies to achieve reliability of their products and processes.
The MedAustron Ion Therapy Centre saw its first patient treatment in December 2016. Starting as a single-room, proton-only machine employing only one horizontal beam line, the therapy accelerator has gradually evolved to a three-room p+ and C6+ facility encompassing two horizontal beam lines, one vertical beam line, and a Gantry, allowing us to routinely treat more than 40 patients per day.
This contribution focuses on the challenges related to providing particle beams with high quality and reliability within a clinical setting experiencing a constantly increasing number of patients. Apart from the overall accelerator performance, a key parameter is the accelerator uptime which is basically defined by the mean time between failures (MTBF) and mean time to repair (MTTR) of the accelerator hardware and software components. Based on selected examples, we report on our strategies and methods developed during the last six years for maximizing the MTBF and minimizing the MTTR.
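The relation between uptime, MTBF and MTTR mentioned above is commonly written as the steady-state availability A = MTBF / (MTBF + MTTR). The sketch below uses that textbook relation; the numerical values are hypothetical and are not MedAustron figures.

```python
def availability(mtbf_h, mttr_h):
    """Steady-state availability from mean time between failures and
    mean time to repair (both in hours)."""
    return mtbf_h / (mtbf_h + mttr_h)

# Hypothetical baseline: a failure every 50 h on average, 1 h to repair.
baseline = availability(50.0, 1.0)

# Two ways to improve: fail half as often, or repair twice as fast.
double_mtbf = availability(100.0, 1.0)
halve_mttr = availability(50.0, 0.5)

print(f"baseline:    {baseline:.4f}")
print(f"double MTBF: {double_mtbf:.4f}")
print(f"halve MTTR:  {halve_mttr:.4f}")
```

In this simple model, doubling the MTBF and halving the MTTR give the same availability, which is why both maximizing the MTBF and minimizing the MTTR are worth pursuing. In a clinical setting the number of interruptions matters as well, so the two are not equivalent in practice.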
The Heavy Ion Medical Accelerator in Chiba (HIMAC), a heavy ion radiotherapy device of the National Institutes for Quantum Science and Technology (QST), is the first dedicated machine for heavy ion cancer therapy in the world. Beam stops due to device failure add a burden and an excess radiation dose for the patient because the patient positioning has to be repeated. Although preventive maintenance is indispensable to avoid such situations, it is not easy because of the aging of HIMAC, budget constraints and so on.
Accelerator Engineering Corporation (AEC) is a company that supports the operation and maintenance of heavy ion medical accelerators in Japan including HIMAC, and we carry out various maintenance work to ensure the stability and reproducibility of the beam. QST and AEC organized a "Maintenance Program (MP)" team in 2011 to review the utilization of fault history data. The MP team analyzed those statistics in detail and prioritized the problems that should be solved in order to improve the reliability and availability of HIMAC.
Present status, availability and MP procedures will be presented.
The cyclotron at BC Cancer in Vancouver (model TR19 by Advanced Cyclotron Systems Inc.) supplies F-18 to manufacture [F-18]FDG for 3 provincial PET centers: Vancouver, Victoria, and Kelowna (total of four scanners with around 16,000 patients scanned per year). During planned maintenance and any downtime, a backup supply of F-18 is obtained from nearby TRIUMF’s TR-13 cyclotron. The activity supplied by TRIUMF is enough to sustain 80% of BC Cancer – Vancouver’s demand; no FDG is shipped to Victoria or Kelowna; therefore, the reliability of the BC Cancer cyclotron is of the utmost importance.
Striving for maximum reliability (zero downtime, minimum component failures) puts full focus on preventative maintenance of major components of the machine during maintenance days (done every 2 or 3 weeks) and during the annual shutdown periods (1 week every year).
Also, in order to mitigate supply-chain issues encountered in the recent period and reduce the downtime after component failures (e.g. power supplies, controllers, pumps, and gauges), on-site spare part inventory will be increased. Deciding on what components and systems need upgrading, while taking into account the department’s budget, is an ongoing challenge.
The Isotope Production Facility (IPF) at the Los Alamos Neutron Science Center (LANSCE) linear proton accelerator extracts and utilizes 100 MeV protons for the bulk production of isotopes for the DOE Isotope Program.
To facilitate the safe irradiation of isotope production targets at this facility, the target bombardment station is embedded in concrete and connected directly to a hot cell above through a 40 foot tall transfer tube filled with cooling water. A motorized chain drive system is used to move production targets between the hot cell and the target bombardment station.
We recently replaced the existing target transfer system, consisting of an old motorcycle chain, with a new push pull chain to facilitate better maintenance and reliability and to mitigate risks associated with the inability to repair the chain transfer system in the event of failure. We will present some of the installation and commissioning challenges faced during this process as well as lessons learned for future design improvements.
Components and systems have different failure modes and rates depending on where they are in their lifetime, which in turn depends on the operational cycle in which the components and systems are being used.
• What are effective means and tools for predicting failure?
• When should we use the tools?
• What is the raw data which is needed and how do we obtain it?
• How do we analyse and feed-back this information to drive improvement?
In this session, we encourage the presentation of use cases, scenarios, and tools, such as Root Cause Failure Analysis (RCFA), that apply logical, structured, and deductive techniques to identify the root causes of failures.
The Spallation Neutron Source (SNS) Superconducting Radio Frequency (SRF) Linac has been in production operation since 2006. Since that time much has been understood about causes for SRF downtime with a high-power proton beam. One of the important causes for downtime is related to repeated beam loss events which lead to the need to reduce cavity gradients to maximize reliability. Significant time has been spent trying first to reduce the frequency of beam loss events, and second to reduce the beam lost during each event. The need to reduce gradients has slowed significantly, but the need does remain. With the ongoing Proton Power Upgrade (PPU), which will double the beam power capability of the linac, the need to further prevent beam loss events remains a high priority. The source and frequency of beam loss events are difficult to predict and prevent, and the protection system turn-off time is hardware limited. This led to the idea of using machine learning to monitor beam pulses, predict an upcoming beam loss event and, when one is predicted, hold the beam off during the event to prevent beam loss from occurring. The focus is not to prevent the cause of the beam loss but just to prevent the beam loss from occurring in the SRF cavities. The timeline to reach the point of need for machine learning as well as the current implementation and future utilization will be discussed.
*ORNL is managed by UT-Battelle, LLC, under contract DE-AC05-00OR22725 for the U.S. Department of Energy.
MAX IV is a synchrotron radiation facility based on a 3 GeV linear accelerator, which powers a soft and hard x-ray storage ring, as well as a short pulse facility. At MAX IV, the prediction, prevention and handling of failure is a critical part of ensuring a reliable beam for user access. A robust set of tools has been developed for use in failure-related procedures, which will be presented along with several examples.
Before handover to beamlines, parameters are compared to past and nominal values and alterations are made if necessary. Furthermore, after longer maintenance, a system of checklists delegated to subsystem owners is used to verify the equipment. During user access, operations personnel monitor accelerator systems with several applications, and are notified by an alarm system when parameters go beyond specified limits. When failure does occur, the personnel use various applications to diagnose the failure and collect data. Repetitive start-up tasks are automated to allow for a quick recovery.
After a failure, details are logged in a home-built data-driven logging system. This allows for evaluation of the performance of the facility, presenting statistics such as the uptime percentage, mean time to failure and mean time to repair. Moreover, the logged details are discussed with relevant subsystem owners to identify causes and formulate potential plans to improve facility performance. After a plan is made, progress is tracked and archived on completion. Several past examples of the continuous improvements and their effects on performance will be presented.
The IFMIF-DONES facility is aimed at providing a database of materials exposed to irradiation conditions similar to those in DEMO. For this purpose, a neutron source facility will be built, consisting of superconducting linear Accelerator Systems (AS) generating a deuteron beam that impacts a target made of a liquid lithium jet provided by the Lithium Systems (LS).
An important aspect of the project design activities is to assess the system reliability at all phases of the facility life cycle to support a reliability growth during the ongoing design phase and to monitor the compliance with the stated availability targets.
In the DONES facility, AS and LS are functionally and physically entangled. In fact, AS operation has an interlock dependency on the LS, which can be “ready or not ready for beam”, since the liquid lithium jet must be ready to receive the 5 MW power of the accelerator. In this context, the present study focuses on the failures arising in LS that can lead to a beam interlock to the AS.
Following RAMI methodology, first a Failure Mode and Effect Analysis (FMEA) is done in order to point out all the relevant unavailability conditions in the LS subsystems (lithium loop, secondary and tertiary cooling loops and target system) that could lead to a shutdown event on the AS. Then, Reliability Block Diagrams (RBDs) are derived from the FMEA by implementing a reliability-wise representation of system component behavior and simulating the system performance under due operating conditions.
As an outcome of the RAMI analysis, information on the expected total downing events, total downtime and criticality for a given period can be derived. This offers valuable information about LS failures impacting AS operation, which can be presented as the number of events in the AS caused by the LS per period of time.
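To make the Reliability Block Diagram step more concrete, the sketch below combines block availabilities analytically for series and parallel arrangements. It is a simplification of the simulation-based approach described above, and all numbers are hypothetical placeholders, not DONES design values; the redundant-pump example is likewise an assumption added for illustration.

```python
import math

def series(*availabilities):
    """All blocks are required: the system is up only if every block is up."""
    return math.prod(availabilities)

def parallel(*availabilities):
    """Redundant blocks: the system is down only if every block is down."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)

# Hypothetical availabilities of the LS subsystems named in the abstract.
lithium_loop = 0.99
secondary_cooling = 0.995
tertiary_cooling = 0.995
target_system = 0.98

# If any LS subsystem is down, the LS is "not ready for beam".
ls_availability = series(lithium_loop, secondary_cooling,
                         tertiary_cooling, target_system)

# Assumed example of redundancy: two pumps, either of which is sufficient.
redundant_pumps = parallel(0.9, 0.9)

period_h = 8760.0  # one calendar year
expected_downtime_h = period_h * (1.0 - ls_availability)
print(f"LS availability: {ls_availability:.4f}")
print(f"Expected LS-induced downtime: {expected_downtime_h:.0f} h per year")
print(f"Redundant pump pair: {redundant_pumps:.2f}")
```

A series chain is always less available than its weakest block, whereas redundancy raises availability above that of either block, which is why the FMEA results determine where redundancy is most valuable.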
Innovation in automation and robotics is an area of focus for increasing efficiency and productivity in the accelerator community.
• How is the accelerator industry utilizing both "off the shelf" and "in house" technologies for handling repetitive tasks to reduce stress and increase safety for the workforce?
• How are we using Big Data analytics to discover useful, otherwise hidden, patterns to incorporate data-driven decisions for facilities and applying machine learning techniques to model, explore and implement data driven solutions?
• What types of sensors are available (pressure, vibration, thermal, etc.)? Use case scenarios will be greatly appreciated.
A Digital Twin is a dynamic, bespoke model of a physical system, providing real-time information on the asset's instantaneous state and allowing for data extrapolation and predictive modelling.
The application of Digital Twins can open a new chapter in the engineering of particle accelerators and their systems. With Digital Twin, data acquisition and asset modelling can be augmented in space and in time: plentiful additional insight may be provided with respect to ‘standard’ data acquisition of physical parameters, thus allowing for enhanced perspectives in design and engineering; asset prediction is bespoke and real-time, to the great enhancement of predictive and corrective maintenance activities.
This contribution will detail what makes a Digital Twin so unique and different from the historical methods applied in engineering design, data acquisition, monitoring and failure prediction. Applications for accelerator systems will be presented, together with the status of ongoing activities within the Mechanical and Materials Engineering Group at CERN.
The fast beam interlock system (FBIS) for the ESS accelerator was developed and built in-house by the safety critical systems (SKS) group at the Zurich University of Applied Sciences (ZHAW), in close collaboration with the ESS machine protection (MPS) group. The FBIS plays an essential role in ESS machine protection and is the logic solver element of most protection functions. In order to ensure high reliability of the FBIS, a reliability analysis is performed following the IEC 61508 functional safety standard for the assessment of hardware integrity.
The presentation shows the various steps needed to verify the hardware integrity.
This includes the calculation of the Probability of dangerous Failure per Hour (PFH) and the evaluation of the architectural constraints by calculating the Safe Failure Fraction (SFF) and the Hardware Fault Tolerance (HFT) of the system. These calculations are based on failure rate predictions using the Siemens SN 29500 standard and a detailed Failure Modes, Effects and Diagnostic Analysis (FMEDA).
The current results of the FBIS reliability analysis are presented and compared with the corresponding hardware integrity requirements. In addition, an example reliability analysis of a complete ESS machine protection function containing a sensor system and actuators is shown.
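To make the quantities named above concrete, the Python sketch below computes the Safe Failure Fraction from the failure-rate categories an FMEDA produces, and a single-channel PFH estimate. It assumes a 1oo1 architecture in which the PFH is approximated by the dangerous undetected failure rate, and it uses the high-demand PFH bands of IEC 61508. The failure rates are hypothetical and are not FBIS results; the class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

FIT = 1e-9  # 1 FIT = one failure per 10^9 hours


@dataclass
class FailureRates:
    """Failure rates per hour, as summed over an FMEDA."""
    safe: float                  # lambda_S
    dangerous_detected: float    # lambda_DD
    dangerous_undetected: float  # lambda_DU


def safe_failure_fraction(r: FailureRates) -> float:
    """SFF = (lambda_S + lambda_DD) / (lambda_S + lambda_DD + lambda_DU)."""
    total = r.safe + r.dangerous_detected + r.dangerous_undetected
    return (r.safe + r.dangerous_detected) / total


def pfh_single_channel(r: FailureRates) -> float:
    """Assumed 1oo1 approximation: only undetected dangerous failures count."""
    return r.dangerous_undetected


def sil_band_from_pfh(pfh: float) -> Optional[int]:
    """Map a PFH value to its SIL band; None if outside the tabulated bands."""
    bands = [(1e-9, 1e-8, 4), (1e-8, 1e-7, 3), (1e-7, 1e-6, 2), (1e-6, 1e-5, 1)]
    for low, high, sil in bands:
        if low <= pfh < high:
            return sil
    return None


# Hypothetical FMEDA totals, for illustration only.
rates = FailureRates(safe=450 * FIT,
                     dangerous_detected=500 * FIT,
                     dangerous_undetected=50 * FIT)

print(f"SFF = {safe_failure_fraction(rates):.1%}")
print(f"PFH = {pfh_single_channel(rates):.1e} per hour")
print(f"PFH band corresponds to SIL {sil_band_from_pfh(pfh_single_channel(rates))}")
```

The PFH band alone does not settle the achievable SIL: the architectural constraints, evaluated from the SFF together with the HFT, must be satisfied as well.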
An alarm management system is an essential tool for control room operators to facilitate a fast and effective response to abnormal operating conditions. A major challenge for operators is alarm fatigue resulting from too many concurrent alarms, frequent alarm state changes (nuisance alarms), or recurrence of known bad alarm states that are already handled. We have sought to create a control-system-agnostic alarm management system that is fast, scalable, distributed, fault tolerant, and allows for on-the-fly event processing providing additional custom functionality beyond simply logging and acknowledging alarm state changes. Expected benefits include assisting operator decision making, reducing nuisance alarms, seamless integration with different and future control systems, and the ability to tailor the system to different end users and GUIs.
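One way to picture on-the-fly event processing against nuisance alarms is a small filter placed between the alarm source and the operator display. The Python sketch below is not the system described in the abstract; the class name, thresholds and alarm names are assumptions chosen for illustration. It drops repeated reports of an unchanged state, hides alarms that an operator has shelved as known and already handled, and suppresses alarms that change state too often within a time window.

```python
from collections import defaultdict, deque


class ChatterFilter:
    """Decide whether an alarm state change should reach the operator."""

    def __init__(self, max_changes: int = 3, window_s: float = 60.0):
        self.max_changes = max_changes       # state changes tolerated per window
        self.window_s = window_s             # sliding window length in seconds
        self.shelved = set()                 # alarms marked as known and handled
        self._last_state = {}                # alarm name -> last reported state
        self._history = defaultdict(deque)   # alarm name -> recent change times

    def process(self, name: str, active: bool, timestamp: float) -> bool:
        """Return True if this event should be shown, False if suppressed."""
        if self._last_state.get(name) == active:
            return False                     # no state change, nothing new
        self._last_state[name] = active
        if name in self.shelved:
            return False                     # known bad state, already handled
        history = self._history[name]
        history.append(timestamp)
        # Discard changes that have fallen out of the sliding window.
        while history and timestamp - history[0] > self.window_s:
            history.popleft()
        return len(history) <= self.max_changes


# Hypothetical usage: a chattering vacuum alarm.
alarm_filter = ChatterFilter(max_changes=2, window_s=10.0)
events = [("vacuum", True, 0.0), ("vacuum", False, 1.0),
          ("vacuum", True, 2.0), ("vacuum", False, 20.0)]
shown = [alarm_filter.process(n, a, t) for n, a, t in events]
print(shown)
```

Keeping the filter independent of any particular control system, so that it only sees name, state and timestamp, is in the spirit of the control-system-agnostic design described above.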
Anomalies in the High Voltage Converter Modulators (HVCMs) remain a major source of downtime for the Spallation Neutron Source (SNS) facility. To improve the reliability of the HVCMs, several earlier studies applied machine learning techniques to predict faults ahead of time in the SNS accelerator using a single modulator. In this study, we present a multi-module framework based on a Conditional Variational Autoencoder (CVAE) to detect anomalies in the power signals coming from multiple HVCMs that vary in design specifications and operating conditions. By conditioning the VAE on the given modulator system, the model can capture different representations of the normal waveforms for multiple systems. Our experiments with SNS experimental data show that the trained model generalizes well to detecting several fault types for several systems, which can be valuable for improving the reliability of the HVCMs and, as a result, of the SNS.
Limited to IOC Committee Members
Faced with the combination of increasing beam power, high operational availability requirements, and the reliance on often custom-made, specialized and expensive equipment, machine protection in accelerator facilities is critical for protecting against long shutdown periods and the associated financial losses incurred through damage to equipment. There are no dedicated standards for the implementation of machine protection systems, so defining a machine protection concept and method can be one of the key challenges facilities face in mitigating machine protection risks in a systematic way. Doing so requires a broad understanding of accelerator physics, engineering design and functional safety principles, together with accumulated facility experience. This session will discuss some of the challenges and opportunities related to machine protection for accelerator facilities.
The SCLinac at TRIUMF’s Isotope Separator and Accelerator (ISAC) Facility can provide energies above the Coulomb barrier to create nuclear reactions at our third/final set of experimental facilities. High vacuum is required to support the operation of RF & Cryogenics. This past June, two forms of vacuum protection failed in tandem, with adverse results. A minimum of two years is needed to regain the accelerating capacity, & there are implications for these experiments in the meantime. This talk will explore what information indicative of such potential vacuum failures we had in advance, what was immediately implemented as temporary additional protection, & opportunities being explored to increase reliability for the future as we upgrade existing & design new accelerators & beamlines.
ALBA is the Spanish 3rd generation synchrotron light source. In operation since 2012, we are continuously aiming to maximize the beam availability for users. As at other particle accelerators, an Equipment Protection System (EPS) is in place to prevent situations that could damage the accelerators.
Based on sensors (that gather information), PLCs (that process it) and switches (that act on hardware), the EPS takes many different hazards into account: magnet thermal switches interlock power supplies, vacuum gauges and pump pressures close valves, surface thermocouples inhibit RF, etc. However, these situations rarely occur.
Here we discuss which checks we perform on the EPS to ensure it is working properly, focusing not only on what we check and how often, but also on how we do it; one can easily simulate a thermocouple error signal in a PLC, but it is not that easy to heat the thermocouple up physically.
The beam abort system for the current Swiss Light Source (SLS) is based on inverting the RF phase to decelerate the stored beam. The loss process was assumed to spread the stored beam evenly around the ring. However, it is actually localised at longitudinal positions where the dispersive orbit meets the machine aperture. For the SLS, these losses mainly occur at the septum and triple-bend-achromat arc sections. For the SLS 2.0 with its seven-bend-achromat lattice and thus much lower dispersion in the arcs, tracking simulations show that these losses are localised at a superconducting super bend and an in-vacuum insertion device. Due to this unfortunate dispersive loss distribution, the small beam size, the fragile vacuum chamber and the stored beam energy of 1 kJ, a more controlled beam abort is desired. In case of an RF failure, the beam abort system must dump the beam safely before the critical dispersive orbit is reached. A fast beam dump controller with dedicated inputs for fast systems such as the low-level RF and fast feedback systems is foreseen for triggering the required emergency beam dump procedure. The majority of the well over 6000 machine interlock signals will pass through the slow, programmable-logic-controller-based machine interlock system. Here the sheer number of signals will pose a challenge.
Reliability and availability requirements should be taken into consideration before the design of systems and components is carried out. This is common practice in many industrial applications and is becoming more common for projects in the accelerator domain. A systematic analysis of machine parameters and scenarios should serve as a basis for the predictions and calculations needed to correctly dimension accelerator sub-systems and components. Knowing how to calculate reliability is important, but knowing how to achieve reliability is equally, if not more, important. Reliability practices must begin early in the design process and must be integrated into the overall product development cycle. This session will try to answer how to fold in reliability considerations before and during a machine upgrade.
An overview of the power supply system installed during the EBS upgrade will be presented. The focus will be on the operation of the storage ring main magnet power supplies. A comparison between the old and the new power supply structure will be given. The current topology of the system and the main issues since the beginning of the EBS operation will be presented in detail. The description of a power supply hot swap system to enhance the global system availability and reliability will also be provided. Finally, the perspectives for the future will be discussed.
With the Facility for Antiproton and Ion Research (FAIR), an extension of the existing accelerator facility is being built at the GSI Helmholtz Centre in Darmstadt, Germany, which will increase the area of the research center by a factor of four. Within the framework of the project, a superconducting 100 Tm synchrotron, a superconducting fragment separator and two storage rings, as well as several kilometers of beamline and target stations, are being built. With the increased complexity and number of devices, the possible sources of failure have also multiplied. It was therefore necessary to keep reliability in mind from the very beginning and to develop a corresponding concept for the operation and maintenance of the facility. In this talk, the concepts are presented and a short update on the state of the project is given.
Future proton superconducting RF (SRF) linacs used as accelerator driven systems (ADS) must achieve high reliability and availability to meet the challenging parameters for applications in medical treatment, nuclear waste reduction, and nuclear power generation. What SRF innovations and advanced concepts are needed? To answer this question, a case study of the past, current, and possible future downtime sources is carried out for the Spallation Neutron Source (SNS) SRF linac systems. SNS is an accelerator-driven neutron source facility routinely operated at a beam power of 1.4 MW with 99% availability in its SRF systems, and is currently undergoing an upgrade to a beam power capability of 2.8 MW. The preliminary outcome of this study will be presented. We will discuss its implications for the development of the next generation of SRF and related systems towards the 10-20 MW proton SRF linacs required for future ADS facilities.
Discussion summary, workshop highlights, closing remarks from LOC and IOC and future workshop strategy and planning