Summary: Tape storage and distributed computing are at the core of the latest discoveries in the field of physics.
At the press of a button, the largest machine ever made by humans will come alive. Buried around 100m underground, near Geneva, Switzerland, the Large Hadron Collider is about to make history once again when it restarts in a few weeks' time. Thousands of researchers are at work in the center to discover how the world was made and what happened right after the Big Bang. Most of the time, it's the physicists who get all the credit for big breakthroughs, but they couldn't have done it without the IT guys.
"When I arrived at CERN [European Organization for Nuclear Research], I immediately felt that it was science heaven," Alberto Pace, IT data and storage services group leader, told ZDNet. The challenges he faces here, the home of the Large Hadron Collider (LHC), are "unmatched in many areas", he says.
Pace and his team get to be part of the most exciting scientific experiment in history, the one that discovered the Higgs boson. This particle appeared in calculations in the 1960s, but researchers were able to confirm its existence through experiments conducted in 2012.
What exactly is the Higgs boson, and what does it do? Physicists know that it helps other particles acquire mass, but many of its properties remain hidden for the moment. The upgrade of the world's largest and most powerful particle collider should tell them more about this mysterious particle.
All the experiments happen in a circular underground tunnel which is 27km long and has a depth which ranges from 50m to 175m below ground. Here, beams of particles are fired at almost the speed of light from two opposite directions. They smash into each other forming new particles, which are tracked down by LHC's detectors. The temperature reaches -270C, and the machine is powered by a current of 11,000 amps.
The upgrade allows particle collisions at a total energy of 13 trillion electron volts, compared to eight trillion electron volts three years ago. To understand this number, that is roughly the force of an apple hitting the moon and creating a crater 9.5km across.
Besides the properties of the Higgs boson, the new and improved LHC aims to find out more about dark matter and dark energy, extra dimensions, antimatter, supersymmetry, and quark-gluon plasma, a hot soup that appeared after the Big Bang.
How data travels
Hundreds of millions of particle collisions take place every second at the heart of the LHC's detectors. The sensors generate about one petabyte of data every second, an amount no computing system in the world would be able to store if it were generated for any prolonged period.
Most of the data is discarded quickly, as sophisticated systems select what could be of interest for the scientists and filter out the clutter. Then, tens of thousands of processor cores go even further and choose just one percent of the remaining events - information which then gets stored and is later analyzed by physicists.
The datacenter can save 6GB of data per second at the peak rate of the LHC. However, this gigantic machine doesn't run 24/7. "We're expecting about 30 petabytes per year of LHC run two - that would represent something like 250 years of high-definition video," Frédéric Hemmer, IT department head, told ZDNet.
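The two figures above, 6GB per second at peak and about 30 petabytes per year, can be related with some quick arithmetic. The sketch below is only a back-of-the-envelope check, assuming decimal units (1GB = 10^9 bytes, 1PB = 10^15 bytes); the function name is made up for illustration.

```python
# Assumption: decimal (SI) units, 1 GB = 10**9 bytes and 1 PB = 10**15 bytes.
PEAK_RATE_BYTES_PER_S = 6 * 10**9    # 6GB per second, figure quoted in the article
YEARLY_VOLUME_BYTES = 30 * 10**15    # about 30 petabytes per year, quoted in the article
SECONDS_PER_DAY = 86_400


def days_at_peak(volume_bytes: int, rate_bytes_per_s: int) -> float:
    """Days of continuous recording at the given rate needed to reach the volume."""
    return volume_bytes / rate_bytes_per_s / SECONDS_PER_DAY


days = days_at_peak(YEARLY_VOLUME_BYTES, PEAK_RATE_BYTES_PER_S)
print(f"{days:.1f} days of recording at peak rate")  # prints "57.9 days of recording at peak rate"
```

Under these assumptions, the yearly volume corresponds to roughly two months of non-stop recording at the peak rate, which fits the remark that the machine doesn't run 24/7.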
From the underground tunnel, the information travels to the surface to CERN's datacentre, where it's stored on magnetic tapes. The same data also goes to over 170 locations scattered around the globe to the Worldwide LHC Computing Grid, a distributed computing platform. It's like a virtual supercomputer with parts located in almost 40 countries, organised in tiers.
Every day, the Grid processes more than two million jobs, the equivalent of a single computer running for roughly 1,300 years, according to CERN.
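CERN's comparison also implies an average job length. The sketch below is a rough estimate, assuming exactly two million jobs per day (the article says "more than") and a year of 365.25 days; the function name is illustrative.

```python
JOBS_PER_DAY = 2_000_000          # "more than two million jobs", so treat this as a lower bound
SINGLE_COMPUTER_YEARS = 1_300     # "roughly 1,300 years", quoted in the article
HOURS_PER_YEAR = 365.25 * 24      # assumption: a year of 365.25 days


def average_job_hours(jobs: int, computer_years: float) -> float:
    """Average single-computer hours per job implied by the two figures."""
    return computer_years * HOURS_PER_YEAR / jobs


avg = average_job_hours(JOBS_PER_DAY, SINGLE_COMPUTER_YEARS)
print(f"about {avg:.1f} hours of computing per job")  # prints "about 5.7 hours of computing per job"
```

Because the real job count is higher than two million, the true average would be somewhat shorter than this estimate.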
"We believe the infrastructure that we have here is a very good example for scientific analysis in general that requires intensive data processing as well as storage," says Pace.
Computing power
Most of the magic happens in Geneva. The datacentre here has four rooms: "One is about 1400 square metres, a second one is 1200 square metres, and then we have another two that are a bit more than 200 square metres," Wayne Salter, IT computing facilities group leader, told ZDNet.
It wasn't enough. They needed an additional data room, and the most cost-effective proposal came from Budapest, Hungary. Wigner Data Center was built in 2012. It's not a backup site, but an extension of the one in Geneva, acting like a new data room, and occupying an area of more than 800 square metres.
In total, the two datacentres have 150,000 cores, according to Salter. Two dedicated and redundant 100Gbps lines connect the sites.
Testing future commercial technologies
Some of the technology that IT professionals use at CERN hasn't been released publicly. The organisation has several agreements with companies to test their products through a public-private partnership called CERN openlab.
The collaboration helps CERN gain access to early technology years before it reaches the market, while companies get to see how their products work in a highly demanding environment. "With Intel, for example, we've been exposed to early [CPU] technology: Nehalem, Westmere, Sandy Bridge, Ivy Bridge, and Haswell," says Hemmer.
Other partners in CERN openlab are Huawei, Oracle, and Siemens, while Rackspace and Seagate are contributors and Yandex is an associate. "The ongoing collaboration with Huawei is in the area of storage," says Hemmer. "We have been evaluating the latest solutions for cloud storage, and this is a good example of something that has become a product now available on the market."
Why store data on tapes?
In contrast, there's one old-school tech that CERN favours. It stores raw data from the experiments on magnetic tapes, a medium first used to record computer information in 1951, on a UNIVAC device.
The ongoing battle between disk and tape has a clear winner at CERN. There are several reasons for that, says Pace. One of those is money, as tapes are slightly cheaper to use than hard disks. "Magnetic tapes do not consume electricity, so this allows us to keep a very large storage site with nearly no running costs for data preservation," he says.
Another argument is reliability. "When the disk fails, we lose the entire content. We're talking of terabytes of data loss every time a disk fails. With tapes, it's only a limited amount of data that is lost in the area of the tape where the error occurs, so it's rarely more than a few gigabytes." Plus, tapes can still be read 30 years from now, while data stored on disks may be fully accessible for as little as five years.
Magnetic tapes are a safer bet for security reasons, as well, according to Pace. "It would take several years to delete the amount of data we have stored on tapes, whereas deleting data from hard disks is often only a matter of a few seconds."
Contrary to the common belief, tapes aren't that slow, he argues. They have high latency, meaning that every time you need to write to a tape, it takes a minute for it to be mounted from the library into the tape drive. However, when the preparations are made, it allows you to write data at a high rate. "And it's also possible to verify data on the fly, synchronously, when the data is written."
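The trade-off Pace describes, a fixed mount delay followed by fast streaming, can be illustrated with a simple model. This is only a sketch: the one-minute mount time comes from the article, but the 250MB/s drive rate is a hypothetical figure chosen for illustration, not one CERN reported.

```python
MOUNT_SECONDS = 60.0           # "it takes a minute" to mount a tape, per the article
DRIVE_RATE_MB_PER_S = 250.0    # hypothetical streaming rate, NOT from the article


def effective_rate(size_mb: float,
                   mount_s: float = MOUNT_SECONDS,
                   rate_mb_per_s: float = DRIVE_RATE_MB_PER_S) -> float:
    """Average MB/s for one write, counting the mount delay plus streaming time."""
    return size_mb / (mount_s + size_mb / rate_mb_per_s)


for size_mb in (100, 10_000, 1_000_000):
    print(f"{size_mb:>9} MB -> {effective_rate(size_mb):6.1f} MB/s")
```

In this model a small write is dominated by the mount delay, while a large one approaches the drive's streaming rate, which is why high latency does not have to mean low throughput for bulk archival writes.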
CERN demonstrates that tape, a storage medium more than 60 years old, is far from obsolete. For run two of the Large Hadron Collider, IT experts have improved the tape infrastructure to support cartridges with over 8TB of capacity each, a measure that will increase the datacentre's archival capacity.
Pace's team might directly benefit from research currently carried out in this field. In May, Sony announced a 185TB magnetic tape, while IBM and Fujifilm presented a prototype with 154TB capacity.
Apart from this, IT professionals here use several other storage technologies. "We have a small number of NetApp, EMC, and Hitachi solutions, but these are not part of the general storage infrastructure," he says. "We use them in some special cases where we need to meet certified hardware platforms that are necessary to obtain support for some proprietary software applications that are required in specific areas."
A mix of old and new
One might assume that physicists are the ones making the major discoveries at CERN. An example that contradicts this assumption is Sir Tim Berners-Lee, who came up with the idea of the World Wide Web while working there in 1989. He wrote a memo, passed it on to his supervisor, and after some time he was told he could do this project in his spare time. His supervisor wrote on that piece of paper: "Vague, but exciting." This quote has been printed on some of the T-shirts visitors can buy at CERN's gift shop.
Many of the people who work on the particle accelerator say they were drawn here by the state-of-the-art science and technology. Still, when they arrived, they were struck by the simplicity of the place. The buildings are white and modest, with traditional offices decorated with old furniture. This is because the funding goes into science, rather than redecoration.
The Hollywood crew that came here to research the 2009 film Angels & Demons took some pictures to create a 3D model of the ATLAS detector of the Large Hadron Collider, and chose other shooting locations. Legend has it that the buildings on the surface seemed too outdated to them.