DIGILIT

Empowering Digital Literacy


big_data

Differences

This shows you the differences between two versions of the page.

big_data [2020/08/23 19:23]
Sophia Bickhardt
— (current)
Line 1: Line 1:
-======Big Data====== 
- 
-=====In short===== 
- 
-Big data refers to large amounts of data that are collected, stored, processed and analysed using specific procedures. It is impossible to imagine the economy and society without data: they are obtained from people's activities on the Internet and on mobile phones, form the basis for activities on the financial markets, and are relevant in the energy industry, health care and transport. They are generated by the use of credit and customer cards, surveillance cameras, airplanes and vehicles, chatting via WhatsApp, postings on Facebook (or other providers), the use of assistants such as Alexa, Cortana or Siri, the use of fitness wristwatches and soon the use of "intelligent refrigerators", smart table lamps, face recognition at train stations, body scanning at airports, "social scoring", and so on. It is estimated that the amount of data collected doubles every two years. 
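
To get a feeling for what "doubles every two years" means, the estimate can be turned into a small calculation. The following is only a sketch under the assumption that the doubling continues at a constant rate; the function name `growth_factor` is made up for this illustration and the result is not a forecast.

```python
def growth_factor(years: float, doubling_period: float = 2.0) -> float:
    """Factor by which a quantity grows if it doubles every `doubling_period` years.

    Assumption for illustration: the doubling rate stays constant over time.
    """
    return 2 ** (years / doubling_period)


if __name__ == "__main__":
    # Print how much larger the data volume would be after a number of years.
    for years in (2, 4, 10, 20):
        print(f"after {years:>2} years: {growth_factor(years):,.0f} times the starting volume")
```

Under this assumption the volume would be 32 times as large after ten years and 1,024 times as large after twenty years.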
- 
-Data in large quantities, "big data", are the **"oil"** of the digital age. They represent a significant **economic factor** and growth engine: first, because they serve to optimise production processes; second, because new products – such as the "intelligent" watch – are created on the basis of technologies that generate data. 
- 
-The development and use of data on a large scale raises questions about **political control**: Who has access to the data? How can the misuse of data be prevented, and how can transparency be ensured? How can data be prevented from being removed from social, democratic control, for example by being handed over to secret services? Data protection regulations are attempts to regulate the development and use of data in order to protect personal rights. 
- 
-=====Facets of a Term===== 
- 
-Big data refers to large amounts of data that are collected, stored, processed and analysed using specific procedures. This is associated with great hopes. At the same time, challenges and risks are discussed. Above all, big data is political. Why? Read on and see for yourself! 
- 
-First of all, a **brief introduction** to some aspects, taken from one of the many presentations published on the Internet: 
- 
-[[https://www.youtube.com/watch?v=bAyrObl7TYE|Video: Big Data in 5 Minutes]] 
- 
-**Questions about the film**: 
- 
-What is big data according to this presentation? 
- 
-What can you learn about data storage? 
- 
-Which aspects of big data that you know about were not mentioned? 
- 
-=====Why Big and not Small?===== 
- 
-Big data exceeds the usual possibilities of data transfer and data storage. Sending a 150 MB attachment by e-mail, for example, usually does not work: it is "too big". It is similar when several aircraft simultaneously exchange data with the air traffic control of an airport that monitors their flights. The amount of data is too large to be stored on conventional media. 
- 
-Big data, however, does not only mean large amounts of data. The term refers to several dimensions: 
- 
-  *volume – the amount of data, 
- 
-  *velocity – speed with which the data volumes are generated and transferred 
- 
-  *variety – range of data types and sources, 
- 
-  *veracity – authenticity of data. 
- 
-These technological dimensions are extended by the aspects of 
- 
-  *value – the added value for companies (hoped-for profit growth) and 
- 
-  *validity – ensuring data quality. 
- 
-If big data cannot be stored and processed with, for instance, a simple PC, how is it done? In the video, this is explained using Apache Hadoop as an example, a software framework ("programming framework") with which data are processed in parallel: they are divided up and stored on different computers and are thus also backed up. 
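
The principle described above (divide the data, store copies on several computers, process the parts in parallel and merge the results) can be illustrated with a few lines of Python. This is a toy sketch and not Hadoop's actual programming interface: the function names, the node names and the word-count task are assumptions chosen for the illustration, and worker threads on a single machine merely stand in for separate computers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(lines, block_size):
    """Divide the data set into blocks of at most `block_size` lines."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def place_replicas(num_blocks, nodes, replicas=2):
    """Assign every block to `replicas` different nodes (round-robin).

    Because each block exists on more than one node, losing a single node
    does not lose the block. Requires replicas <= len(nodes).
    """
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replicas)]
        for block in range(num_blocks)
    }


def count_words(block):
    """'Map' step: count the words of one block, independently of all others."""
    return Counter(word.lower() for line in block for word in line.split())


def word_count(lines, block_size=2, workers=3):
    """Count words by processing the blocks with several workers at once."""
    blocks = split_into_blocks(lines, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_words, blocks))
    # 'Reduce' step: merge the partial results into one total.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


if __name__ == "__main__":
    lines = ["big data is big", "data is collected", "data is analysed"]
    print(place_replicas(2, ["node-a", "node-b", "node-c"]))
    print(word_count(lines).most_common(3))
```

The point of the design is that no worker needs to see the whole data set: each one handles only its own block, and only the small partial results are combined at the end.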
- 
-Big data is more than that. It also means actively handling and using these data. The stored information is **analysed (data analytics)**. It is expected that this will provide insights for the improvement of products, the development of new products, the most precisely targeted advertising of goods and services, for science and research, justice and administration, but also for the military and secret services. There is no lack of "raw material" for data analysis: it is estimated that the volume of data has doubled every two years since 2011. What may seem abstract becomes more comprehensible if one imagines the activities that lead to the acquisition of data: searching the Internet via search engines such as Ecosia, Startpage or Google, posting on Facebook, Instagram, Twitter or similar, chatting (WhatsApp, Telegram, Signal, Threema etc.), cashless payment by credit card, booking a train, bus or flight, body scanning at airports, visiting the doctor, online shopping, using a fitness watch, using assistants such as Alexa, Cortana or Siri, using satellite navigation (e.g. GPS) and much more. Data are collected by companies, in many different ways by authorities, by surveillance cameras in public places, in train stations or private facilities, by face recognition, networked technology in houses (smart homes), the "intelligent" car, when making phone calls, writing e-mails and much more. Often so-called metadata are obtained, i.e. data that describe an object and are grouped (indexed) into categories. One can imagine this as similar to a catalogue in a library. 
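
The library-catalogue comparison for metadata can be made concrete with a short Python sketch. The records and field names below are invented for this illustration; the sketch only shows the idea that metadata describe an object (who, when, what kind) without containing its content, and that indexing groups the objects into categories.

```python
from collections import defaultdict

# Hypothetical metadata records: they describe messages and calls,
# but do not contain what was written or said.
records = [
    {"id": 1, "type": "e-mail", "sender": "alice", "date": "2020-08-01"},
    {"id": 2, "type": "phone call", "sender": "bob", "date": "2020-08-01"},
    {"id": 3, "type": "e-mail", "sender": "alice", "date": "2020-08-02"},
]


def build_index(records, field):
    """Group record ids by the value of one metadata field, like a catalogue."""
    index = defaultdict(list)
    for record in records:
        index[record[field]].append(record["id"])
    return dict(index)


if __name__ == "__main__":
    print(build_index(records, "type"))
    print(build_index(records, "sender"))
```

Even without any content, such an index already answers questions like "who communicated on which day, and how often", which is why metadata are considered sensitive.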
- 
-{{:big_data.png}} 
- 
-In addition to the areas already mentioned, the analysis of large amounts of data is used in many other fields: crime prevention, analysis of web statistics, investigation of weather data, risk assessment and the setting of insurance premiums (health, car and other insurance), medicine, fraud detection, precision agriculture, research into the development of earthquakes and epidemics, population migration, traffic congestion, marketing and influencing purchasing behaviour, the evaluation of movement profiles and much more. Data analyses have also been used in **politics** and the **steering of political opinions**. The company Cambridge Analytica became well known for this. It was reputed to have created several million personality profiles of Facebook users, which provided information for targeted election advertising (in the US presidential election campaign in 2016 and in the Brexit referendum in 2016). However, the accuracy of the analyses was highly doubtful. [1] 
- 
-=====Big Data – Big Market===== 
- 
-Data are considered the fuel of the 21st century – the central raw material for economic growth. Companies are focusing on using big-data technologies such as in-memory data management, analytics, artificial intelligence and machine learning to optimise business processes, gain competitive advantages over others, and create new business models and new markets, for example with a view to combating climate change: "Climate needs data and lots of it." [2] As it is not easy to imagine the data volume, people speak not only of big data but also of vast data or data lakes. [3] 
- 
-From a company's point of view, the challenges described are "data chaos", the pressure to gain speed and time advantages, and the error-proneness of data analyses. 
- 
-From the perspective of employees and consumers, the monitoring of employees and the assessment of them based on it, often referred to as "people analytics", draws criticism. The aim is to bring together the data traces left behind by employees. Among other things, algorithms are used to determine the mood in the company, to gain information about who has influence and who is unlikely to have it, or to predict future behaviour, e.g. whether someone is inclined to resign. People analytics is becoming more and more widespread, and not only in individual divisions of a company. In Germany this practice is subject to co-determination, as personal data are used. However, those who already wear fitness bracelets, smart watches and the like practise a kind of **people analytics** on themselves – and may not be sure whether the data are accessible only to themselves. 
- 
-Bracelets based on radio and ultrasound technology are also used by the online company Amazon. They record the hand movements of employees precisely. For example, the bracelet vibrates when a warehouse worker misplaces a package. It can also be used to check whether an employee is working, taking a break or visiting the restroom. 
  
big_data.1598210637.txt.gz · Last modified: 2020/08/23 19:23 by Sophia Bickhardt

Redistribution of this work and its contents as OER permitted.
Please cite as follows: "Empowering Digital Literacy" by DIGILIT project team, CC BY-SA 4.0