Introduction: As the Internet enters its second half, the era of the industry Internet or industrial Internet, everything can be connected and everything is being connected. At the same time, many factors are racing onto an exponential growth track, creating a weight that the Internet cannot bear and yet must bear.
Time is fair and time is unfair.
It is fair because time passes at the same rate for everyone, without bias; it is unfair because outcomes can diverge enormously as time passes. Moore's Law tells us that once things get on track, their development accelerates exponentially. This is true of chips, of networks, and of data.
So, as the Internet enters its second half, the era of the industry Internet or industrial Internet, everything can be connected and everything is being connected. At the same time, many factors are racing onto an exponential growth track, creating a weight that the Internet cannot bear and yet must bear.
If life were only as it was at first sight, why would the autumn wind grieve over the painted fan?
A defining feature of the industrial Internet field, in which Gegen Dongzhi operates, is the convergence of large volumes of industrial data, and a defining feature of industrial data is time.
In general, typical characteristics of industrial data include:
Fast generation frequency: industrial data is generally acquired at second-level intervals, and some high-frequency acquisition runs at the millisecond or microsecond level, so each collection point can generate multiple data points per second.
Heavy dependence on acquisition time: every data point must carry its own unique timestamp.
Large volume of information with a relatively simple data structure: a conventional real-time monitoring system has thousands of monitoring points, each generating data every second, which adds up to tens of GB of data per day.
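The "tens of GB per day" figure can be checked with a rough calculation. The sketch below is only an estimate under stated assumptions: the point count, sampling rate, and especially the 50 bytes per raw record are illustrative values, not figures from the article or from any real plant.

```python
# Back-of-the-envelope estimate of daily data volume for a monitoring system.
# All three inputs are illustrative assumptions, not measured figures.
POINTS = 5_000            # monitoring points ("thousands")
SAMPLES_PER_SECOND = 1    # second-level acquisition
BYTES_PER_SAMPLE = 50     # assumed raw record: point id + timestamp + value + overhead

SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_gb(points, samples_per_second, bytes_per_sample):
    """Return (samples per day, uncompressed size in GB, with 1 GB = 10**9 bytes)."""
    samples = points * samples_per_second * SECONDS_PER_DAY
    return samples, samples * bytes_per_sample / 1e9


samples, gigabytes = daily_volume_gb(POINTS, SAMPLES_PER_SECOND, BYTES_PER_SAMPLE)
print(f"{samples:,} samples/day, about {gigabytes:.1f} GB/day")
# 432,000,000 samples/day, about 21.6 GB/day
```

Under these assumptions a single system already produces over 400 million records a day; moving to millisecond-level acquisition would multiply that by a further factor of 1,000.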
For the IT world, industrial data is a new problem, "just as at first sight"; in industry, however, it has long been a familiar one.
In traditional industrial data acquisition and industrial monitoring (SCADA), networked devices must be monitored and the data sampled from them must be persisted. The industrial field has long had a dedicated database for this task.
This specialized database is called the real-time database (applause here, please). The main functions of a real-time database in the industrial field include data acquisition, real-time data buffering, data write-back (sending instructions to devices), and archiving of sampled data. At present, real-time databases in the industrial field are largely monopolized by foreign vendors and are expensive. Take the well-known PI database as an example: the base version (only 5,000 points) costs about $100,000, and each data collection interface costs $6,000. Who knows how many industrial IoT projects have been strangled in the cradle as a result, leaving "the autumn wind to grieve over the painted fan" ...
When God closes a door, He always opens a window.
Fortunately, riding the east wind of the Internet of Things, the time series database (TSDB) emerged at just the right time.
First look at the explanation on Wikipedia:
Loosely translated: "A time series database is software that stores time series data and indexes it by time (a point in time or a time interval)."
A time series database is commonly abbreviated as TSDB. It is mainly used to process time-stamped data (data ordered by time, that is, serialized in time); time-stamped data is also called time series data.
Formally, time series data (TSD) can be expressed as a function of two variables:
TSD = Metric(Timestamp, Measurement), where:
Metric represents a data series that can be uniquely identified;
Timestamp represents a timestamp;
Measurement represents a measured value.
Put simply, such data describes the measured value of a measured subject at each point in time within a time range. It is ubiquitous in power, chemicals, and other industries, as well as in IT infrastructure, operations and maintenance monitoring systems, the Internet of Things, and other kinds of real-time monitoring. A database used to store, manage, query, and process data of this form can be called a time series database. The time series database mainly solves the following problems:
Writing time series data: how to sustain writes of tens of millions of data points per second.
Querying time series data: how to support grouped aggregation over hundreds of millions of data points within seconds.
Storing time series data: how to address the cost sensitivity caused by storing massive amounts of data.
Lifecycle management of time series data: the value of industrial data lies mainly in its timeliness, so managing the life cycle of industrial data is a core mission of the time series database.
Dear readers, please do a quick search for the keyword "Internet monitoring system". You will find that Internet giants such as Xiaomi and Ele.me also use time series databases to build their enterprise-level Internet monitoring systems. Not to mention that almost all mainstream industrial Internet platforms, at home and abroad, use time series databases to absorb the flood of industrial data.
At this point, many skeptics are probably itching to object: can't powerful traditional relational databases such as Oracle and PostgreSQL handle time series data? Why not use advanced distributed databases such as HBase, MongoDB, or Cassandra to solve the industrial data problem?
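To make the model TSD = Metric(Timestamp, Measurement) more concrete, here is a minimal in-memory sketch in Python. It is a toy, not how any real time series database is implemented: the class name `TinySeriesStore`, its methods, and the metric name `boiler1.temperature` are all hypothetical, and it ignores the write throughput, compression, and storage cost concerns listed above. It only illustrates three ideas: writing time-stamped points per metric, grouped aggregation over time buckets, and lifecycle management by expiring old points.

```python
from collections import defaultdict
from statistics import mean


class TinySeriesStore:
    """Toy in-memory store (hypothetical): metric -> list of (timestamp, measurement)."""

    def __init__(self):
        self._series = defaultdict(list)

    def write(self, metric, timestamp, measurement):
        # Append one time-stamped point to the series identified by `metric`.
        self._series[metric].append((timestamp, measurement))

    def aggregate(self, metric, start, end, bucket_seconds):
        """Average the measurements in [start, end), grouped into fixed-width time buckets."""
        buckets = defaultdict(list)
        for ts, value in self._series.get(metric, []):
            if start <= ts < end:
                # Each point falls into the bucket that starts at a multiple of bucket_seconds.
                buckets[ts - ts % bucket_seconds].append(value)
        return {bucket: mean(values) for bucket, values in sorted(buckets.items())}

    def expire(self, cutoff):
        """Lifecycle management: drop every point with a timestamp older than `cutoff`."""
        for metric in list(self._series):
            self._series[metric] = [
                (ts, value) for ts, value in self._series[metric] if ts >= cutoff
            ]


store = TinySeriesStore()
# One reading per second for six seconds (timestamps are plain integers here).
for second, temperature in enumerate([20.0, 21.0, 22.0, 23.0, 24.0, 25.0]):
    store.write("boiler1.temperature", second, temperature)

print(store.aggregate("boiler1.temperature", 0, 6, bucket_seconds=3))
# {0: 21.0, 3: 24.0}

store.expire(cutoff=3)  # keep only points from second 3 onward
print(store.aggregate("boiler1.temperature", 0, 6, bucket_seconds=3))
# {3: 24.0}
```

The per-metric, time-ordered layout is what makes both the bucketed aggregation and the expiry step simple range operations, which is the intuition behind indexing by time.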
Is there some deep technical reason to adopt the time series database, a technology that only became popular in 2017? Stay tuned for the follow-up articles from Gehuihui!
Author: Dr. Wang Jin, chief architect of East-record grid (please credit the author and source when reproducing)
This article was written by an author registered with Cutting-edge technology. The views expressed are the author's own and do not represent the position of OFweek. If you have any infringement or other concerns, please contact us.