Big data refers to large amounts of data that are collected, stored, processed and analysed using specific procedures. Economy and society can no longer be imagined without data: they are obtained from people's activities on the Internet and on mobile phones, they are the basis for activity on the financial markets, and they are relevant in the energy industry, health care and transport. They are generated by the use of credit and customer cards, surveillance cameras, airplanes and vehicles, chatting via WhatsApp, posting on Facebook (or other providers), the use of assistants such as Alexa, Cortana or Siri, the use of fitness watches and, soon, “intelligent refrigerators” and smart table lamps, face recognition at train stations, body scanning at airports, “social scoring”, and so on. It is estimated that the amount of data collected doubles every two years.
Data in large quantities, “big data”, are the “oil” of the digital age. They represent a significant economic factor and growth engine: first, because they serve to optimise production processes; second, because new products, such as the “intelligent” watch, are created on the basis of technologies that generate data.
The development and use of data on a large scale raise questions about political control: Who has access to the data? How can the misuse of data be prevented, and how can transparency be ensured? How can data be prevented from being removed from social, democratic control, for example by being handed over to secret services? Attempts to regulate the development and use of data in order to protect personal rights are set out in data protection regulations.
Big data refers to large amounts of data that are collected, stored, processed and analysed using specific procedures. Great hopes are associated with this. At the same time, challenges and risks are being discussed. Above all, big data is political. Why? Read for yourself!
First of all, a brief introduction to some aspects, taken from one of the many presentations published on the Internet:
Questions about the film:
What is big data according to this presentation?
What can you learn about data storage?
Which aspects of big data that you know about were not mentioned?
Big data exceeds the usual possibilities of data transfer and data storage. Sending a 150 MB attachment by e-mail, for example, does not work: it is “too big”. The same applies when several aircraft simultaneously exchange data with the air traffic control of an airport that monitors their flights. The amount of data is too large to be stored on conventional media.
Big data, however, does not only mean large amounts of data. The term refers to several dimensions:
These technological dimensions are extended by the aspects of
If big data is characterised by the fact that it cannot be stored and processed on, for instance, a simple PC, how is it done? In the video, this is explained using Apache Hadoop as an example: a framework (“programming framework”) with which data are processed in parallel by dividing them up and storing them on different computers, which also backs them up.
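The idea of dividing data, storing copies on several computers and processing the parts in parallel can be illustrated with a small sketch. This is not Hadoop's actual API: the function names (`split_into_blocks`, `place_replicas`, `map_block`, `word_count`), the node names and the block size are all assumptions made for illustration, and worker threads on one machine merely stand in for separate computers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_blocks(records, block_size):
    """Divide the input into blocks of at most `block_size` records."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def place_replicas(num_blocks, nodes, copies=2):
    """Assign every block to `copies` nodes in round-robin order.

    Assumes copies <= len(nodes), so the copies of one block land on
    different nodes and the block survives the failure of a single node.
    """
    return {
        block: [nodes[(block + k) % len(nodes)] for k in range(copies)]
        for block in range(num_blocks)
    }


def map_block(block):
    """'Map' step: count the words inside one block, independently of the others."""
    return Counter(word.lower() for line in block for word in line.split())


def word_count(records, block_size=2, workers=3):
    """Count words by processing blocks concurrently, then merging the results."""
    blocks = split_into_blocks(records, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(map_block, blocks))
    # 'Reduce' step: merge the partial counts into one overall result.
    return reduce(lambda a, b: a + b, partial_counts, Counter())
```

Calling `word_count(["big data is big", "data are the oil"], block_size=1)` counts each line in its own block and merges the partial results; `place_replicas(3, ["node-a", "node-b", "node-c"])` shows on which two hypothetical nodes each block would be kept.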
Big data is more than that. It also means actively handling and using these data. The stored information is analysed (data analytics). It is expected that this will provide insights for the improvement of products, the development of new products, the most precisely targeted advertising of goods and services, for science and research, justice and administration, but also for the military and secret services. There is no lack of “raw materials” for data analysis: it is estimated that the volume of data has doubled every two years since 2011.
What may seem abstract becomes more comprehensible if one imagines the activities that lead to the acquisition of data: research on the Internet via search engines such as Ecosia, Startpage or Google, posting on Facebook, Instagram, Twitter or similar, chatting (WhatsApp, Telegram, Signal, Threema etc.), cashless payment by credit card, booking a train, bus or flight, body scanning at airports, visiting the doctor, online shopping, using a fitness watch, using assistants such as Alexa, Cortana or Siri, using navigation systems such as GPS, and much more. Data are collected by companies, in many different ways by authorities, by surveillance cameras in public places, in train stations or private facilities, by face recognition, by networked technology in houses (smart homes), by the “intelligent” car, when making phone calls, writing e-mails, and much more.
Often so-called metadata are obtained, i.e. data that describe an object and are combined (indexed) into categories. One can imagine this as similar to a catalogue in a library.
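The library-catalogue comparison can be made concrete with a small sketch. The records, field names and the helper `build_index` below are invented for illustration and do not come from any real system; the point is only that metadata describe an object (here, a photo) without containing it, and that indexing them by category makes objects easy to find.

```python
from collections import defaultdict

# Invented sample metadata: each entry describes a photo, not the photo itself.
PHOTOS = {
    "img-001": {"device": "smartphone", "place": "train station", "year": 2019},
    "img-002": {"device": "surveillance camera", "place": "train station", "year": 2020},
    "img-003": {"device": "smartphone", "place": "airport", "year": 2020},
}


def build_index(records, fields):
    """Group record ids by (field, value), like cards in a library catalogue."""
    index = defaultdict(set)
    for record_id, metadata in records.items():
        for field in fields:
            if field in metadata:
                index[(field, metadata[field])].add(record_id)
    return index


catalogue = build_index(PHOTOS, ["device", "place", "year"])
# Everything recorded at a train station, found without opening a single photo:
at_station = sorted(catalogue[("place", "train station")])
```

Combining two categories, for example `catalogue[("place", "train station")] & catalogue[("year", 2020)]`, narrows the result further, which is how seemingly harmless descriptive data can become very specific.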