
Big Data Technology Used at CERN for Data Analysis

CERN is one of the world’s largest particle physics laboratories. Founded in 1954, it has brought together thousands of scientists from all over the world to advance our knowledge of matter, its constituents, and the forces that link them. By focusing on fundamental research and training numerous scientists, CERN keeps pushing technological boundaries.

What Is Big Data

To put it simply, big data means larger, more complex data sets, usually coming from new data sources. As a definition attributed to Gartner and dating back to around 2001 puts it, “Big data is data that contains greater variety arriving in increasing volumes and with ever-higher velocity.” Data sets of this size cannot be handled by traditional data processing software. They are analyzed to improve different aspects of a business and to solve problems that could not be tackled otherwise. Big data is usually described in terms of three main factors: volume, velocity, and variety.

Volume refers to the sheer amount of data. Big data typically arrives in high volumes of low-density, largely unstructured data whose value is not known in advance. Anything from Twitter data feeds to readings from sensor-enabled equipment can count as big data as long as it comes in large volumes.

Velocity is the rate at which data is received and acted on. The highest-velocity data usually streams directly into memory rather than being written to disk. Some Internet-enabled products operate in real time and need to be evaluated and managed just as quickly.

Variety refers to the many types of data now available. Traditional data types were structured and fit neatly into a relational database, but big data changed the game. It comes in new semistructured and unstructured forms, including text, audio, and video, which need extra preprocessing to derive meaning and support metadata.
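The velocity point above, about acting on data while it is still in memory, can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function name `rolling_mean`, the window size, and the sample readings are all hypothetical and are not taken from any CERN system.

```python
from collections import deque


def rolling_mean(stream, window_size):
    """Yield the mean of the most recent `window_size` values of a stream.

    Only the current window is held in memory; older values are discarded
    automatically by the bounded deque, so nothing is written to disk.
    """
    window = deque(maxlen=window_size)
    for value in stream:
        window.append(value)
        yield sum(window) / len(window)


# Hypothetical sensor readings arriving one at a time.
readings = [10, 20, 30, 40]
means = list(rolling_mean(readings, window_size=2))
print(means)  # [10.0, 15.0, 25.0, 35.0]
```

Because `rolling_mean` is a generator, it can consume an endless feed one value at a time instead of loading the whole data set first.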

Two other important aspects of big data are its value and veracity, which is why people often speak of the five V’s of big data: volume, velocity, variety, value, and veracity. Data has intrinsic value, but that value is of no use until it is discovered. Veracity is the question of how accurate, truthful, and reliable your data is. Nowadays, large companies rely heavily on huge amounts of data, analyzing it to develop new products and work more efficiently. Storing data has become less expensive thanks to recent technological breakthroughs, and business decisions have become more accurate because large volumes of data are now more accessible and affordable.

On the other hand, big data comes with its own challenges. Its sheer size makes it hard to handle despite all the leaps in technology. Big data keeps growing every year, sometimes exponentially, and organizations still struggle to keep pace. Structuring, sorting, and curating it also takes a lot of time: by some estimates, up to 80% of the time spent managing big data goes into organizing it. Big data technology is constantly changing and evolving as well, which makes it hard to keep up with. Still, some organizations, CERN among them, have figured out how to make it part of their own processes.

Big Data Used By CERN

CERN’s Large Hadron Collider (LHC) records hundreds of millions of collisions between particles. The particles are accelerated around the collider until they reach about 99.9% of the speed of light. The collisions generate an enormous amount of data: the LHC alone produces around 30 petabytes of information a year. This data is then analyzed by special algorithms programmed to detect the energy signatures left behind by the appearance and disappearance of the elusive particles that CERN is looking for. The algorithms compare the results with theoretical predictions of how these particles are expected to behave. A match indicates that the sensors have found the target particles. On 4 July 2012, CERN announced that its scientists had observed a particle consistent with the Higgs boson, and in 2013 further analysis confirmed the identification. This was a breakthrough in science. The existence of this particle had been theorized for decades, but it could not be proved until then because the technology was simply not advanced enough. With this discovery, scientists gained new insight into the structure of the universe and the complex relationships between particles.
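The comparison step described above, checking measurements against a theoretical prediction, can be sketched in a few lines. This is a deliberately simplified illustration and not CERN’s actual analysis code: the `Event` class, the field `reconstructed_mass_gev`, the tolerance, and the sample values are all hypothetical. The predicted value of 125 GeV is used only because it is roughly the measured mass of the Higgs boson.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: int
    reconstructed_mass_gev: float  # hypothetical per-event measurement


def matches_prediction(event, predicted_mass_gev, tolerance_gev):
    """Return True if the measurement lies within the tolerance of the prediction."""
    return abs(event.reconstructed_mass_gev - predicted_mass_gev) <= tolerance_gev


def find_candidates(events, predicted_mass_gev, tolerance_gev):
    """Keep only the events whose measurement is compatible with the prediction."""
    return [e for e in events
            if matches_prediction(e, predicted_mass_gev, tolerance_gev)]


# Made-up events; real analyses work with far richer data and statistics.
events = [Event(1, 91.2), Event(2, 124.8), Event(3, 125.6), Event(4, 140.0)]
candidates = find_candidates(events, predicted_mass_gev=125.0, tolerance_gev=1.0)
print([e.event_id for e in candidates])  # [2, 3]
```

A real analysis would not treat a single match as a discovery. It would look for a statistically significant excess of such candidates over the expected background.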

Technical Details of Big Data Analysis

CERN’s LHC uses light sensors to record the collisions of protons accelerated around the collider and the fallout from them. The logic is simple. Sensors located inside the collider pick up the light energy emitted at the moment of collision as well as from the decay of the resulting particles. This is then converted into data that computer algorithms can analyze afterward. Most of the data comes in the form of photographs and is essentially unstructured. To process it, CERN relies on the Worldwide LHC Computing Grid, the world’s largest distributed computing network, which spans 170 computing centers in around 35 countries. Seven CERN sensors provide 300 gigabytes of data per second, which is filtered down to 300 megabytes of “useful” data per second. That data is then streamed in real time to the academic institutions that are CERN’s partners.
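To put the filtering step in perspective, the arithmetic behind the two rates quoted above can be worked through in a short sketch. The rates come from the text. The decimal conversion of 1,000 megabytes per gigabyte is an assumption, and the threshold filter `keep_interesting` is a hypothetical toy that stands in for the far more sophisticated selection CERN actually performs.

```python
RAW_RATE_GB_PER_S = 300    # raw sensor output quoted in the article
KEPT_RATE_MB_PER_S = 300   # "useful" data rate quoted in the article
MB_PER_GB = 1000           # assumption: decimal units

# How many times smaller the kept stream is than the raw stream.
reduction_factor = RAW_RATE_GB_PER_S * MB_PER_GB / KEPT_RATE_MB_PER_S
kept_fraction = 1 / reduction_factor

print(reduction_factor)  # 1000.0
print(kept_fraction)     # 0.001


def keep_interesting(readings, threshold):
    """Toy filter: keep only readings at or above an energy threshold."""
    return [r for r in readings if r >= threshold]


# Hypothetical readings; only the two largest pass the filter.
print(keep_interesting([3, 87, 12, 95, 40], threshold=80))  # [87, 95]
```

In other words, only about one part in a thousand of the raw data is kept, which is what makes streaming it to partner institutions practical.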

Ideas and Concepts behind Big Data

Many of the ideas behind the way CERN processes big data can be applied elsewhere. CERN’s computing grid, for instance, shows how distributed computing lets us carry out tasks far more complicated than any one organization could complete on its own. Without it, such workloads would simply be beyond our capabilities. With distributed systems, data can be stored across many different locations and still be found and accessed almost immediately. This, in turn, has changed how companies approach working with data, and especially how much data they expect to be able to work with. Applying the same techniques to your own big data analysis can improve the way you manage your data.
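The core idea of distributed computing described above is to split a large job into pieces, process the pieces independently, and combine the partial results. It can be sketched on a single machine as follows. This is a minimal local stand-in and not how the Worldwide LHC Computing Grid is implemented: worker threads play the role of separate computing centers, and the function names, the threshold, and the sample data are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most `n_chunks` roughly equal pieces."""
    if not items:
        return []
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_above_threshold(chunk, threshold=50):
    """Work done independently on each piece: count values above a threshold."""
    return sum(1 for value in chunk if value > threshold)


def distributed_count(items, n_workers=4):
    """Process the pieces in parallel, then combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_above_threshold, chunks))
    return sum(partial_counts)


readings = list(range(100))  # hypothetical data set: the values 0 to 99
total = distributed_count(readings)
print(total)  # 49
```

The design choice that matters is that each piece can be processed without knowing anything about the others, so adding more workers, or more computing centers, increases capacity without changing the logic.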

Using Big Data Analysis

So, how can you use big data analysis in your own strategy?

Firstly, consider combining it with translation as a way to reach international markets. You can use online translation agencies such as PickWriters to translate or localize your data and documents.

Secondly, consider affiliate marketing as a way to collect even more relevant data for big data analysis. An affiliate network or a similar service can get you started in this field. Affiliate marketing lets you sell almost anything while collecting the data you need. For example, if you run a café or a coffee-related business, a coffee maker could be the first item you promote. Get creative with big data analysis and look for new ways to use the data you collect every day.


Written by:

Stuti Dhruv

Stuti Dhruv is a Senior Consultant at Aalpha Information Systems, specializing in pre-sales and advising clients on the latest technology trends. With years of experience in the IT industry, she helps businesses harness the power of technology for growth and success.
