I recently read through the Redis 6.2 source code and added some annotations along the way; take a look if you are interested:
https://github.com/neohope/NeoRedisSrc
TiDB Environment Setup
This post builds a single-node TiDB test environment.
Everything is deployed on a cloud host running CentOS 7.6, as the root user.
1. Adjust the SSH configuration
# Raise the concurrent-session limit
vi /etc/ssh/sshd_config
MaxSessions 20

# Restart sshd
service sshd restart
2. Install TiDB
# Update the system
yum -y update

# Install TiUP from the PingCAP mirror
curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh

# Install the tiup cluster component
source .bash_profile
tiup cluster
3. Create the cluster topology file
# Create the topology file
vi mytidb.yaml

# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
 user: "tidb"
 ssh_port: 22
 deploy_dir: "/tidb-deploy"
 data_dir: "/tidb-data"

# # Monitored variables are applied to all the machines.
monitored:
 node_exporter_port: 9100
 blackbox_exporter_port: 9115

server_configs:
 tidb:
   log.slow-threshold: 300
 tikv:
   readpool.storage.use-unified-pool: false
   readpool.coprocessor.use-unified-pool: true
 pd:
   replication.enable-placement-rules: true
 tiflash:
   logger.level: "info"

pd_servers:
 - host: 192.168.1.111

tidb_servers:
 - host: 192.168.1.111

tikv_servers:
 - host: 192.168.1.111
   port: 20160
   status_port: 20180
 - host: 192.168.1.111
   port: 20161
   status_port: 20181
 - host: 192.168.1.111
   port: 20162
   status_port: 20182

tiflash_servers:
 - host: 192.168.1.111

monitoring_servers:
 - host: 192.168.1.111

grafana_servers:
 - host: 192.168.1.111
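Before deploying, it can be worth letting TiUP sanity-check the host against the topology. A minimal sketch, assuming a tiup-cluster version that ships the check subcommand:

# Audit the target host for port clashes, sysctl settings, and other common issues
tiup cluster check ./mytidb.yaml --user root -i hwk8s.pem

# Optionally let TiUP apply the fixes it knows how to make
tiup cluster check ./mytidb.yaml --apply --user root -i hwk8s.pem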
4. Deploy and start the cluster
# Deploy from the topology file
tiup cluster deploy mytidb v4.0.0 ./mytidb.yaml --user root -i hwk8s.pem
Starting component `cluster`: /root/.tiup/components/cluster/v1.0.7/tiup-cluster deploy mytidb v4.0.0 ./mytidb.yaml --user root -i hwk8s.pem
Please confirm your topology:
TiDB Cluster: mytidb
TiDB Version: v4.0.0
Type        Host           Ports                            OS/Arch       Directories
----        ----           -----                            -------       -----------
pd          192.168.1.111  2379/2380                        linux/x86_64  /tidb-deploy/pd-2379,/tidb-data/pd-2379
tikv        192.168.1.111  20160/20180                      linux/x86_64  /tidb-deploy/tikv-20160,/tidb-data/tikv-20160
tikv        192.168.1.111  20161/20181                      linux/x86_64  /tidb-deploy/tikv-20161,/tidb-data/tikv-20161
tikv        192.168.1.111  20162/20182                      linux/x86_64  /tidb-deploy/tikv-20162,/tidb-data/tikv-20162
tidb        192.168.1.111  4000/10080                       linux/x86_64  /tidb-deploy/tidb-4000
tiflash     192.168.1.111  9000/8123/3930/20170/20292/8234  linux/x86_64  /tidb-deploy/tiflash-9000,/tidb-data/tiflash-9000
prometheus  192.168.1.111  9090                             linux/x86_64  /tidb-deploy/prometheus-9090,/tidb-data/prometheus-9090
grafana     192.168.1.111  3000                             linux/x86_64  /tidb-deploy/grafana-3000
Attention:
    1. If the topology is not what you expected, check your yaml file.
    2. Please confirm there is no port/directory conflicts in same host.
Do you want to continue? [y/N]: y
+ Generate SSH keys ... Done
+ Download TiDB components
  - Download pd:v4.0.0 (linux/amd64) ... Done
  - Download tikv:v4.0.0 (linux/amd64) ... Done
  - Download tidb:v4.0.0 (linux/amd64) ... Done
  - Download tiflash:v4.0.0 (linux/amd64) ... Done
  - Download prometheus:v4.0.0 (linux/amd64) ... Done
  - Download grafana:v4.0.0 (linux/amd64) ... Done
  - Download node_exporter:v0.17.0 (linux/amd64) ... Done
  - Download blackbox_exporter:v0.12.0 (linux/amd64) ... Done
+ Initialize target host environments
  - Prepare 192.168.1.111:22 ... Done
+ Copy files
  - Copy pd -> 192.168.1.111 ... Done
  - Copy tikv -> 192.168.1.111 ... Done
  - Copy tikv -> 192.168.1.111 ... Done
  - Copy tikv -> 192.168.1.111 ... Done
  - Copy tidb -> 192.168.1.111 ... Done
  - Copy tiflash -> 192.168.1.111 ... Done
  - Copy prometheus -> 192.168.1.111 ... Done
  - Copy grafana -> 192.168.1.111 ... Done
  - Copy node_exporter -> 192.168.1.111 ... Done
  - Copy blackbox_exporter -> 192.168.1.111 ... Done
+ Check status
Deployed cluster `mytidb` successfully, you can start the cluster via `tiup cluster start mytidb`

# Start the cluster
tiup cluster start mytidb
Starting component `cluster`: /root/.tiup/components/cluster/v1.0.7/tiup-cluster start mytidb
Starting cluster mytidb...
+ [ Serial ] - SSHKeySet: privateKey=/root/.tiup/storage/cluster/clusters/mytidb/ssh/id_rsa, publicKey=/root/.tiup/storage/cluster/clusters/mytidb/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [Parallel] - UserSSH: user=tidb, host=192.168.1.111
+ [ Serial ] - ClusterOperate: operation=StartOperation, options={Roles:[] Nodes:[] Force:false SSHTimeout:5 OptTimeout:60 APITimeout:300 IgnoreConfigCheck:false RetainDataRoles:[] RetainDataNodes:[]}
Starting component pd
        Starting instance pd 192.168.1.111:2379
        Start pd 192.168.1.111:2379 success
Starting component node_exporter
        Starting instance 192.168.1.111
        Start 192.168.1.111 success
Starting component blackbox_exporter
        Starting instance 192.168.1.111
        Start 192.168.1.111 success
Starting component tikv
        Starting instance tikv 192.168.1.111:20162
        Starting instance tikv 192.168.1.111:20161
        Starting instance tikv 192.168.1.111:20160
        Start tikv 192.168.1.111:20162 success
        Start tikv 192.168.1.111:20161 success
        Start tikv 192.168.1.111:20160 success
Starting component tidb
        Starting instance tidb 192.168.1.111:4000
        Start tidb 192.168.1.111:4000 success
Starting component tiflash
        Starting instance tiflash 192.168.1.111:9000
        Start tiflash 192.168.1.111:9000 success
Starting component prometheus
        Starting instance prometheus 192.168.1.111:9090
        Start prometheus 192.168.1.111:9090 success
Starting component grafana
        Starting instance grafana 192.168.1.111:3000
        Start grafana 192.168.1.111:3000 success
Checking service state of pd
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:37 CST; 13s ago
Checking service state of tikv
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:38 CST; 12s ago
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:38 CST; 12s ago
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:38 CST; 12s ago
Checking service state of tidb
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:42 CST; 9s ago
Checking service state of tiflash
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:45 CST; 5s ago
Checking service state of prometheus
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:47 CST; 4s ago
Checking service state of grafana
        192.168.1.111   Active: active (running) since Thu 2020-07-02 11:38:47 CST; 4s ago
+ [ Serial ] - UpdateTopology: cluster=mytidb
Started cluster `mytidb` successfully
5. Check the cluster status
# List clusters
tiup cluster list
Starting component `cluster`: /root/.tiup/components/cluster/v1.0.7/tiup-cluster list
Name    User  Version  Path                                         PrivateKey
----    ----  -------  ----                                         ----------
mytidb  tidb  v4.0.0   /root/.tiup/storage/cluster/clusters/mytidb  /root/.tiup/storage/cluster/clusters/mytidb/ssh/id_rsa

# Show cluster details
tiup cluster display mytidb
Starting component `cluster`: /root/.tiup/components/cluster/v1.0.7/tiup-cluster display mytidb
TiDB Cluster: mytidb
TiDB Version: v4.0.0
ID                   Role        Host           Ports                            OS/Arch       Status   Data Dir                    Deploy Dir
--                   ----        ----           -----                            -------       ------   --------                    ----------
192.168.1.111:3000   grafana     192.168.1.111  3000                             linux/x86_64  Up       -                           /tidb-deploy/grafana-3000
192.168.1.111:2379   pd          192.168.1.111  2379/2380                        linux/x86_64  Up|L|UI  /tidb-data/pd-2379          /tidb-deploy/pd-2379
192.168.1.111:9090   prometheus  192.168.1.111  9090                             linux/x86_64  Up       /tidb-data/prometheus-9090  /tidb-deploy/prometheus-9090
192.168.1.111:4000   tidb        192.168.1.111  4000/10080                       linux/x86_64  Up       -                           /tidb-deploy/tidb-4000
192.168.1.111:9000   tiflash     192.168.1.111  9000/8123/3930/20170/20292/8234  linux/x86_64  Up       /tidb-data/tiflash-9000     /tidb-deploy/tiflash-9000
192.168.1.111:20160  tikv        192.168.1.111  20160/20180                      linux/x86_64  Up       /tidb-data/tikv-20160       /tidb-deploy/tikv-20160
192.168.1.111:20161  tikv        192.168.1.111  20161/20181                      linux/x86_64  Up       /tidb-data/tikv-20161       /tidb-deploy/tikv-20161
192.168.1.111:20162  tikv        192.168.1.111  20162/20182                      linux/x86_64  Up       /tidb-data/tikv-20162       /tidb-deploy/tikv-20162
6. Access TiDB with the MySQL client
# Install the MySQL yum repository
wget https://repo.mysql.com//mysql80-community-release-el7-3.noarch.rpm
rpm -Uvh mysql80-community-release-el7-3.noarch.rpm

# Install the MySQL client
yum install mysql-community-client.x86_64

# Log in to TiDB
mysql -h 192.168.1.111 -P 4000 -u root

# Day-to-day usage differs very little from plain MySQL
7. TiDB web UIs
# Performance monitoring (Grafana)
http://192.168.1.111:3000  admin/admin

# Management dashboard
http://192.168.1.111:2379/dashboard  root / (empty password)
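Once you are done experimenting, the same tiup-cluster component can shut the test cluster down or remove it entirely; a quick sketch:

# Stop all services but keep the data
tiup cluster stop mytidb

# Tear everything down: stops services and deletes the data and deploy directories
tiup cluster destroy mytidb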
InfluxDB Environment Setup 06
This post reads and writes InfluxDB data over HTTP.
1. InfluxDB API endpoints
Endpoint | Description
/debug/pprof | Generate profiles for troubleshooting
/debug/requests | Track HTTP client requests to the /write and /query endpoints
/debug/vars | Collect internal InfluxDB statistics
/ping | Check the status of your InfluxDB instance and your version of InfluxDB
/query | Query data using InfluxQL, manage databases, retention policies, and users
/write | Write data to a database
2. Ping the service
curl -i 'http://localhost:8086/ping'
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: ff6febe5-bb85-11ea-8060-fa163e4dc996
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.0
X-Request-Id: ff6febe5-bb85-11ea-8060-fa163e4dc996
Date: Wed, 01 Jul 2020 10:31:17 GMT
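Because /ping answers 204 No Content on a healthy instance, it wraps nicely into a liveness probe; a minimal bash sketch:

#!/bin/bash
# Probe InfluxDB liveness: a healthy instance answers /ping with HTTP 204
status=$(curl -s -o /dev/null -w '%{http_code}' 'http://localhost:8086/ping')
if [ "$status" = "204" ]; then
    echo "InfluxDB is up"
else
    echo "InfluxDB is down (HTTP $status)" >&2
    exit 1
fi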
3. List and create databases
curl -i -XPOST http://localhost:8086/query --data-urlencode "q=show databases"
HTTP/1.1 200 OK
Content-Type: application/json
Request-Id: 4eddc157-bb86-11ea-8061-fa163e4dc996
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.0
X-Request-Id: 4eddc157-bb86-11ea-8061-fa163e4dc996
Date: Wed, 01 Jul 2020 10:33:31 GMT
Transfer-Encoding: chunked

{"results":[{"statement_id":0,"series":[{"name":"databases","columns":["name"],"values":[["_internal"],["NOAA_water_database"]]}]}]}

curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE mydb"
HTTP/1.1 200 OK
Content-Type: application/json
Request-Id: 05a455eb-bb89-11ea-8062-fa163e4dc996
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.0
X-Request-Id: 05a455eb-bb89-11ea-8062-fa163e4dc996
Date: Wed, 01 Jul 2020 10:52:56 GMT
Transfer-Encoding: chunked

{"results":[{"statement_id":0}]}
4. Write data
curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1422568543700000000'
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: 171405ad-bb8a-11ea-8063-fa163e4dc996
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.0
X-Request-Id: 171405ad-bb8a-11ea-8063-fa163e4dc996
Date: Wed, 01 Jul 2020 11:00:35 GMT

curl -i -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'cpu_load_short,host=server02,region=asia-east value=0.67 1422568543700000000
> cpu_load_short,host=server02,region=us-west value=0.55 1422568543900000000
> cpu_load_short,host=server01,region=asia-east value=2.0 1422568543900000000'
HTTP/1.1 204 No Content
Content-Type: application/json
Request-Id: 1ad799eb-bb8a-11ea-8064-fa163e4dc996
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.0
X-Request-Id: 1ad799eb-bb8a-11ea-8064-fa163e4dc996
Date: Wed, 01 Jul 2020 11:00:41 GMT
5. Query data
# Inside double quotes the shell processes escapes and expands variables;
# inside single quotes it does not
curl -i -XPOST 'http://localhost:8086/query?pretty=true&db=mydb' --data-binary "q=select * from cpu_load_short where \"region\"='us-west'"
HTTP/1.1 200 OK
Content-Type: application/json
Request-Id: d976bd19-bb8c-11ea-8076-fa163e4dc996
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.8.0
X-Request-Id: d976bd19-bb8c-11ea-8076-fa163e4dc996
Date: Wed, 01 Jul 2020 11:20:20 GMT
Transfer-Encoding: chunked

{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "cpu_load_short",
                    "columns": [
                        "time",
                        "host",
                        "region",
                        "value"
                    ],
                    "values": [
                        [
                            "2015-01-29T21:55:43.7Z",
                            "server01",
                            "us-west",
                            0.64
                        ],
                        [
                            "2015-01-29T21:55:43.9Z",
                            "server02",
                            "us-west",
                            0.55
                        ]
                    ]
                }
            ]
        }
    ]
}
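The /query endpoint also accepts GET requests, which sidesteps the quoting headache above because each parameter can be URL-encoded separately; a sketch of the same query:

# curl -G turns --data-urlencode parameters into a GET query string
curl -G 'http://localhost:8086/query?pretty=true' \
    --data-urlencode "db=mydb" \
    --data-urlencode "q=SELECT * FROM cpu_load_short WHERE \"region\"='us-west'"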
InfluxDB Environment Setup 05
This post lists the InfluxQL functions.
1. InfluxQL function list
Category | Function | Description
Aggregations | COUNT() | Count of values
Aggregations | DISTINCT() | Deduplicated values
Aggregations | INTEGRAL() | Area under the curve of the values
Aggregations | MEAN() | Arithmetic mean
Aggregations | MEDIAN() | Median
Aggregations | MODE() | Most frequent value
Aggregations | SPREAD() | Difference between the maximum and minimum values
Aggregations | STDDEV() | Standard deviation
Aggregations | SUM() | Sum
Selectors | BOTTOM() | Smallest values
Selectors | FIRST() | Value with the oldest timestamp
Selectors | LAST() | Value with the newest timestamp
Selectors | MAX() | Maximum value
Selectors | MIN() | Minimum value
Selectors | PERCENTILE() | Value at the given percentile
Selectors | SAMPLE() | Random sample
Selectors | TOP() | Greatest values
Transformations | ABS() | Absolute value
Transformations | ACOS() | Arccosine
Transformations | ASIN() | Arcsine
Transformations | ATAN() | Arctangent (quadrants I and IV)
Transformations | ATAN2() | Arctangent across all four quadrants
Transformations | CEIL() | Round up
Transformations | COS() | Cosine
Transformations | CUMULATIVE_SUM() | Running sum from the first value of the series
Transformations | DERIVATIVE() | Difference between adjacent values divided by the time difference
Transformations | DIFFERENCE() | Difference between adjacent values
Transformations | ELAPSED() | Difference between adjacent timestamps
Transformations | EXP() | Exponential
Transformations | FLOOR() | Round down
Transformations | HISTOGRAM() | Provided by Flux; approximates the series values into a specified histogram distribution
Transformations | LN() | Natural logarithm
Transformations | LOG() | Logarithm with a given base
Transformations | LOG2() | Base-2 logarithm
Transformations | LOG10() | Base-10 logarithm
Transformations | MOVING_AVERAGE() | Rolling average over a window of values
Transformations | NON_NEGATIVE_DERIVATIVE() | Like DERIVATIVE(), but returns only non-negative results
Transformations | NON_NEGATIVE_DIFFERENCE() | Like DIFFERENCE(), but returns only non-negative results
Transformations | POW() | Power
Transformations | ROUND() | Round to the nearest integer
Transformations | SIN() | Sine
Transformations | SQRT() | Square root
Transformations | TAN() | Tangent
Predictors | HOLT_WINTERS() | Forecasting
Technical Analysis | CHANDE_MOMENTUM_OSCILLATOR() | Chande Momentum Oscillator (CMO): computed by taking the difference between the sum of all recent higher data points and the sum of all recent lower data points, then dividing the result by the total data movement over the given period; multiplying by 100 yields a range of -100 to +100.
Technical Analysis | EXPONENTIAL_MOVING_AVERAGE() | Exponential moving average: like a simple moving average, but weights recent data more heavily, so it reacts faster to recent changes.
Technical Analysis | DOUBLE_EXPONENTIAL_MOVING_AVERAGE() | Double exponential moving average: doubles the moving average value and, to stay aligned with the actual data and remove lag, subtracts the "moving average of the moving average" from that doubled value.
Technical Analysis | KAUFMANS_EFFICIENCY_RATIO() | Kaufman's Efficiency Ratio: the change in data over a period divided by the absolute sum of the data movements needed to achieve that change; the ratio falls between 0 and 1, with higher values indicating a more efficient, trending market.
Technical Analysis | KAUFMANS_ADAPTIVE_MOVING_AVERAGE() | Kaufman's Adaptive Moving Average: designed to account for sample noise and volatility; KAMA tracks data points closely when swings are small and noise is low, and adjusts to follow from a greater distance when swings widen. This trend-following indicator can be used to identify the overall trend, time turning points, and filter data movements.
Technical Analysis | TRIPLE_EXPONENTIAL_MOVING_AVERAGE() | Triple exponential moving average: filters out the volatility of conventional moving averages; effectively a composite of a single, a double, and a triple exponential moving average.
Technical Analysis | TRIPLE_EXPONENTIAL_DERIVATIVE() | Triple exponential derivative (TRIX): an oscillator used to identify oversold and overbought markets, also usable as a momentum indicator. TRIX takes the triple exponential moving average of the log of the input over a period and subtracts the previous value, which keeps the indicator from reacting to cycles shorter than the stated period.
Technical Analysis | RELATIVE_STRENGTH_INDEX() | Relative Strength Index: a momentum indicator that compares the magnitude of recent rises and falls over a given period to measure the speed and change of data movements.
2. Duration units for time
Unit | Meaning
ns | nanoseconds
u or µ | microseconds
ms | milliseconds
s | seconds
m | minutes
h | hours
d | days
w | weeks
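These units combine into the duration literals used in WHERE time filters and GROUP BY time() intervals; a quick sketch through the influx CLI:

# Mean water level over the last 90 minutes, bucketed into 15-minute intervals
influx -database 'NOAA_water_database' -execute \
  'SELECT MEAN("water_level") FROM "h2o_feet" WHERE time > now() - 90m GROUP BY time(15m)'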
InfluxDB Environment Setup 04
This post covers some of InfluxDB's more advanced features.
Time series workloads generate enormous volumes of data, and much of it matters most within a limited window; server performance logs are a typical example.
As a rule, then, recent data is kept at the full sampling interval, and the older the data, the coarser the retained sampling interval can be.
Take server performance logs:
for the last week, keep one sample every 10s
for the last month, every 5m
for the last three months, every 1h
for the last six months, every 1d
for the last year, every 1w
anything older can be deleted
This kind of downsampling is needed so often with time series data that virtually every time series database supports it.
InfluxDB tackles the problem with two mechanisms:
Continuous Queries periodically downsample the data
Retention Policies periodically delete the high-frequency samples
1. Continuous Query
> SHOW CONTINUOUS QUERIES

> CREATE CONTINUOUS QUERY "my_cq" ON "NOAA_water_database"
  RESAMPLE EVERY 1w
  BEGIN
    SELECT MEAN("water_level") INTO "water_level_averages_coyote_creek_4w" FROM "h2o_feet" WHERE "location" = 'coyote_creek' GROUP BY time(4w)
  END

> CREATE CONTINUOUS QUERY "my_cq" ON "NOAA_water_database"
  BEGIN
    SELECT MEAN("water_level") INTO "transportation"."weeks24"."water_level_averages_coyote_creek_4w" FROM "h2o_feet" WHERE "location" = 'coyote_creek' GROUP BY time(4w)
  END

> DROP CONTINUOUS QUERY "my_cq" ON "NOAA_water_database"
2. Retention Policy
> CREATE RETENTION POLICY "my_rp" ON "my_database" DURATION 3d REPLICATION 1 SHARD DURATION 1h

> CREATE RETENTION POLICY "my_rp" ON "my_database" DURATION 3d REPLICATION 1 SHARD DURATION 1h DEFAULT

> DROP RETENTION POLICY "my_rp" ON "my_database"
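Putting the two together yields the standard downsampling pipeline: a short default RP ages out the raw points while a CQ keeps writing aggregates into a longer-lived RP. A minimal sketch via the influx CLI; the policy and target measurement names here are illustrative:

# Raw points live 7 days; aggregates live a year
influx -execute 'CREATE RETENTION POLICY "raw_7d" ON "NOAA_water_database" DURATION 7d REPLICATION 1 DEFAULT'
influx -execute 'CREATE RETENTION POLICY "agg_52w" ON "NOAA_water_database" DURATION 52w REPLICATION 1'

# Fold the raw points into 4-week means stored under the long-lived RP
influx -execute 'CREATE CONTINUOUS QUERY "cq_downsample" ON "NOAA_water_database" BEGIN SELECT MEAN("water_level") INTO "NOAA_water_database"."agg_52w"."h2o_feet_4w" FROM "h2o_feet" GROUP BY time(4w) END'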
3. Create a database
# The default RP is autogen
> CREATE DATABASE "my_database"

# Create a database whose default RP is my_rp
# my_rp: keep data for 3 days, 1 replica per shard, shard groups spanning 1 hour
> CREATE DATABASE "my_database" WITH DURATION 3d REPLICATION 1 SHARD DURATION 1h NAME "my_rp"
4. Drop a database
> drop database NOAA_water_database
5. Drop series
> DROP SERIES FROM "h2o_feet"
> DROP SERIES FROM "h2o_feet" WHERE "location" = 'santa_monica'
> DROP SERIES WHERE "location" = 'santa_monica'
6. Delete data within a series
> DELETE FROM "h2o_feet"
> DELETE FROM "h2o_quality" WHERE "randtag" = '3'
> DELETE WHERE time < '2016-01-01'
7. Drop a measurement
> DROP MEASUREMENT "h2o_feet"
8. Drop a shard
> show shards
name: _internal
id database  retention_policy shard_group start_time           end_time             expiry_time          owners
-- --------  ---------------- ----------- ----------           --------             -----------          ------
18 _internal monitor          18          2020-07-01T00:00:00Z 2020-07-02T00:00:00Z 2020-07-09T00:00:00Z

name: NOAA_water_database
id database            retention_policy shard_group start_time           end_time             expiry_time          owners
-- --------            ---------------- ----------- ----------           --------             -----------          ------
19 NOAA_water_database autogen          19          2019-08-05T00:00:00Z 2019-08-12T00:00:00Z 2019-08-12T00:00:00Z
10 NOAA_water_database autogen          10          2019-08-12T00:00:00Z 2019-08-19T00:00:00Z 2019-08-19T00:00:00Z
11 NOAA_water_database autogen          11          2019-08-19T00:00:00Z 2019-08-26T00:00:00Z 2019-08-26T00:00:00Z
12 NOAA_water_database autogen          12          2019-08-26T00:00:00Z 2019-09-02T00:00:00Z 2019-09-02T00:00:00Z
13 NOAA_water_database autogen          13          2019-09-02T00:00:00Z 2019-09-09T00:00:00Z 2019-09-09T00:00:00Z
14 NOAA_water_database autogen          14          2019-09-09T00:00:00Z 2019-09-16T00:00:00Z 2019-09-16T00:00:00Z
15 NOAA_water_database autogen          15          2019-09-16T00:00:00Z 2019-09-23T00:00:00Z 2019-09-23T00:00:00Z

> drop shard 10
9. Kill a slow query
> show queries
qid query        database            duration status
--- -----        --------            -------- ------
78  SHOW QUERIES NOAA_water_database 58µs     running

> kill query 78
InfluxDB Environment Setup 03
This post covers querying in InfluxDB.
1. Basic queries
influx -database NOAA_water_database
Connected to http://localhost:8086 version 1.8.0
InfluxDB shell version: 1.8.0

> SELECT * FROM "h2o_feet" WHERE time>1568750040000000000 ORDER BY time DESC
name: h2o_feet
time                level description    location     water_level
----                -----------------    --------     -----------
1568756520000000000 between 3 and 6 feet santa_monica 4.938
1568756160000000000 between 3 and 6 feet santa_monica 5.066
1568755800000000000 between 3 and 6 feet santa_monica 5.01
1568755440000000000 between 3 and 6 feet santa_monica 5.013
1568755080000000000 between 3 and 6 feet santa_monica 5.072
1568754720000000000 between 3 and 6 feet santa_monica 5.213
1568754360000000000 between 3 and 6 feet santa_monica 5.341
1568754000000000000 between 3 and 6 feet santa_monica 5.338
1568753640000000000 between 3 and 6 feet santa_monica 5.322
1568753280000000000 between 3 and 6 feet santa_monica 5.24
1568752920000000000 between 3 and 6 feet santa_monica 5.302
1568752560000000000 between 3 and 6 feet santa_monica 5.62
1568752200000000000 between 3 and 6 feet santa_monica 5.604
1568751840000000000 between 3 and 6 feet santa_monica 5.502
1568751480000000000 between 3 and 6 feet santa_monica 5.551
1568751120000000000 between 3 and 6 feet santa_monica 5.459
1568750760000000000 between 3 and 6 feet santa_monica 5.62
1568750400000000000 between 3 and 6 feet santa_monica 5.627

> SELECT "location","water_level" FROM "h2o_feet" WHERE time>1568750040000000000
name: h2o_feet
time                location     water_level
----                --------     -----------
1568750400000000000 santa_monica 5.627
1568750760000000000 santa_monica 5.62
1568751120000000000 santa_monica 5.459
1568751480000000000 santa_monica 5.551
1568751840000000000 santa_monica 5.502
1568752200000000000 santa_monica 5.604
1568752560000000000 santa_monica 5.62
1568752920000000000 santa_monica 5.302
1568753280000000000 santa_monica 5.24
1568753640000000000 santa_monica 5.322
1568754000000000000 santa_monica 5.338
1568754360000000000 santa_monica 5.341
1568754720000000000 santa_monica 5.213
1568755080000000000 santa_monica 5.072
1568755440000000000 santa_monica 5.013
1568755800000000000 santa_monica 5.01
1568756160000000000 santa_monica 5.066
1568756520000000000 santa_monica 4.938

> SELECT *::field FROM "h2o_feet" WHERE time>1568750040000000000
name: h2o_feet
time                level description    water_level
----                -----------------    -----------
1568750400000000000 between 3 and 6 feet 5.627
1568750760000000000 between 3 and 6 feet 5.62
1568751120000000000 between 3 and 6 feet 5.459
1568751480000000000 between 3 and 6 feet 5.551
1568751840000000000 between 3 and 6 feet 5.502
1568752200000000000 between 3 and 6 feet 5.604
1568752560000000000 between 3 and 6 feet 5.62
1568752920000000000 between 3 and 6 feet 5.302
1568753280000000000 between 3 and 6 feet 5.24
1568753640000000000 between 3 and 6 feet 5.322
1568754000000000000 between 3 and 6 feet 5.338
1568754360000000000 between 3 and 6 feet 5.341
1568754720000000000 between 3 and 6 feet 5.213
1568755080000000000 between 3 and 6 feet 5.072
1568755440000000000 between 3 and 6 feet 5.013
1568755800000000000 between 3 and 6 feet 5.01
1568756160000000000 between 3 and 6 feet 5.066
1568756520000000000 between 3 and 6 feet 4.938

> SELECT "water_level"-3 FROM "h2o_feet" WHERE time>1568750040000000000
name: h2o_feet
time                water_level
----                -----------
1568750400000000000 2.627
1568750760000000000 2.62
1568751120000000000 2.4589999999999996
1568751480000000000 2.551
1568751840000000000 2.502
1568752200000000000 2.604
1568752560000000000 2.62
1568752920000000000 2.3019999999999996
1568753280000000000 2.24
1568753640000000000 2.322
1568754000000000000 2.338
1568754360000000000 2.341
1568754720000000000 2.213
1568755080000000000 2.072
1568755440000000000 2.013
1568755800000000000 2.01
1568756160000000000 2.066
1568756520000000000 1.9379999999999997

> SELECT * FROM "NOAA_water_database"."autogen"."h2o_feet" WHERE time>1568750040000000000
name: h2o_feet
time                level description    location     water_level
----                -----------------    --------     -----------
1568750400000000000 between 3 and 6 feet santa_monica 5.627
1568750760000000000 between 3 and 6 feet santa_monica 5.62
1568751120000000000 between 3 and 6 feet santa_monica 5.459
1568751480000000000 between 3 and 6 feet santa_monica 5.551
1568751840000000000 between 3 and 6 feet santa_monica 5.502
1568752200000000000 between 3 and 6 feet santa_monica 5.604
1568752560000000000 between 3 and 6 feet santa_monica 5.62
1568752920000000000 between 3 and 6 feet santa_monica 5.302
1568753280000000000 between 3 and 6 feet santa_monica 5.24
1568753640000000000 between 3 and 6 feet santa_monica 5.322
1568754000000000000 between 3 and 6 feet santa_monica 5.338
1568754360000000000 between 3 and 6 feet santa_monica 5.341
1568754720000000000 between 3 and 6 feet santa_monica 5.213
1568755080000000000 between 3 and 6 feet santa_monica 5.072
1568755440000000000 between 3 and 6 feet santa_monica 5.013
1568755800000000000 between 3 and 6 feet santa_monica 5.01
1568756160000000000 between 3 and 6 feet santa_monica 5.066
1568756520000000000 between 3 and 6 feet santa_monica 4.938

> SELECT * FROM "h2o_feet" WHERE "water_level" > 9.9
name: h2o_feet
time                level description         location     water_level
----                -----------------         --------     -----------
1566975960000000000 at or greater than 9 feet coyote_creek 9.902
1566976320000000000 at or greater than 9 feet coyote_creek 9.938
1566976680000000000 at or greater than 9 feet coyote_creek 9.957
1566977040000000000 at or greater than 9 feet coyote_creek 9.964
1566977400000000000 at or greater than 9 feet coyote_creek 9.954
1566977760000000000 at or greater than 9 feet coyote_creek 9.941
1566978120000000000 at or greater than 9 feet coyote_creek 9.925
1566978480000000000 at or greater than 9 feet coyote_creek 9.902
1567380600000000000 at or greater than 9 feet coyote_creek 9.902

> SELECT * FROM "h2o_feet" WHERE "level description" = 'below 3 feet' and "water_level" >= 3

> SELECT "water_level" FROM "h2o_feet" WHERE "location" = 'santa_monica' and "water_level" < -0.2
name: h2o_feet
time                water_level
----                -----------
1566988560000000000 -0.243
1567077840000000000 -0.21
2. GROUP BY queries
> SELECT MEAN("water_level") FROM "h2o_feet" GROUP BY "location"
name: h2o_feet
tags: location=coyote_creek
time mean
---- ----
0    5.3591424203039155

name: h2o_feet
tags: location=santa_monica
time mean
---- ----
0    3.5307120942458807

> SELECT COUNT("water_level") FROM "h2o_feet" WHERE "location"='coyote_creek' GROUP BY time(4w)
name: h2o_feet
time                count
----                -----
1565222400000000000 4559
1567641600000000000 3045
1570060800000000000 0
1572480000000000000 0
1574899200000000000 0
1577318400000000000 0
1579737600000000000 0
1582156800000000000 0
1584576000000000000 0
1586995200000000000 0
1589414400000000000 0
1591833600000000000 0

> SELECT * FROM "h2o_feet" WHERE time > now() - 7d

> SELECT * FROM "h2o_feet" WHERE time = '2020-07-01T00:00:00Z'
3. Pagination
> SELECT "water_level","location" FROM "h2o_feet" LIMIT 3
name: h2o_feet
time                water_level location
----                ----------- --------
1566000000000000000 8.12        coyote_creek
1566000000000000000 2.064       santa_monica
1566000360000000000 8.005       coyote_creek

> SELECT "water_level","location" FROM "h2o_feet" LIMIT 3 OFFSET 3
name: h2o_feet
time                water_level location
----                ----------- --------
1566000360000000000 2.116       santa_monica
1566000720000000000 7.887       coyote_creek
1566000720000000000 2.028       santa_monica

> SELECT "water_level" FROM "h2o_feet" GROUP BY * LIMIT 3 SLIMIT 1
name: h2o_feet
tags: location=coyote_creek
time                water_level
----                -----------
1566000000000000000 8.12
1566000360000000000 8.005
1566000720000000000 7.887

> SELECT "water_level","location" FROM "h2o_feet" LIMIT 3 SLIMIT 1 SOFFSET 1
4. Queries with regular expressions
> SELECT /l/ FROM "h2o_feet" LIMIT 1
name: h2o_feet
time                level description location     water_level
----                ----------------- --------     -----------
1566000000000000000 below 3 feet      santa_monica 2.064

> SELECT MEAN("degrees") FROM /temperature/
name: average_temperature
time mean
---- ----
0    79.98472932232272

name: h2o_temperature
time mean
---- ----
0    64.98872722506226

> SELECT MEAN(water_level) FROM "h2o_feet" WHERE "location" =~ /[m]/ AND "water_level" > 3
name: h2o_feet
time mean
---- ----
0    4.471366691627881

> SELECT FIRST("index") FROM "h2o_quality" GROUP BY /l/
name: h2o_quality
tags: location=coyote_creek
time                first
----                -----
1566000000000000000 41

name: h2o_quality
tags: location=santa_monica
time                first
----                -----
1566000000000000000 99

> SELECT MEAN("water_level") FROM "h2o_feet" WHERE "location" = 'santa_monica' AND "level description" =~ /between/
name: h2o_feet
time mean
---- ----
0    4.471366691627881
5. Subqueries
> SELECT SUM("max") FROM (SELECT MAX("water_level") AS "max" FROM "h2o_feet" GROUP BY "location")
name: h2o_feet
time sum
---- ---
0    17.169
6. SELECT INTO
> SELECT "water_level" INTO "h2o_feet_coyote_creek" FROM "h2o_feet" WHERE "location" = 'coyote_creek'
name: result
time written
---- -------
0    7604

> SELECT MEAN("water_level") INTO "water_level_averages_4w" FROM "h2o_feet" WHERE "location" = 'coyote_creek' GROUP BY time(4w)
name: result
time written
---- -------
0    2
7. Copy an entire database
> CREATE DATABASE "NOAA_water_database_copy"
> SELECT * INTO "NOAA_water_database_copy"."autogen".:MEASUREMENT FROM "NOAA_water_database"."autogen"./.*/ GROUP BY *
InfluxDB Environment Setup 02
This post covers installing InfluxDB and importing the test dataset.
1. Install InfluxDB
# Configure the apt repository
wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list

# Install influxdb
sudo apt-get update
sudo apt-get install influxdb

# Start the service
sudo service influxdb start
# or
sudo systemctl start influxdb

# Run in the foreground
influxd -config /etc/influxdb/influxdb.conf

# View the logs
journalctl -u influxdb

# Dump the logs to a file
journalctl -u influxdb > influxd.log

# Ports open by default
# TCP 8086  client API
# TCP 8088  backup and restore
# Ports closed by default
# TCP 2003  Graphite service
# TCP 4242  OpenTSDB service
# UDP 8089  UDP service
# TCP 25826 Collectd service
2. Import the test data
# Fetch the dataset
wget https://s3.amazonaws.com/noaa.water-database/NOAA_data.txt

# Create the database
influx
Connected to http://localhost:8086 version 1.8.0
InfluxDB shell version: 1.8.0
> CREATE DATABASE noaa_water_db
> exit

# Import the data
influx -import -path=NOAA_data.txt -precision=s -database=NOAA_water_database
2020/06/15 16:34:37 Processed 1 commands
2020/06/15 16:34:37 Processed 76290 inserts
2020/06/15 16:34:37 Failed 0 inserts
3. Inspect the database
# Look around the databases
influx
Connected to http://localhost:8086 version 1.8.0
InfluxDB shell version: 1.8.0
> show databases
name: databases
name
----
_internal
NOAA_water_database

> use NOAA_water_database
Using database NOAA_water_database

> show measurements
name: measurements
name
----
average_temperature
h2o_feet
h2o_pH
h2o_quality
h2o_temperature

> show retention policies
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true

> show series
key
---
average_temperature,location=coyote_creek
average_temperature,location=santa_monica
h2o_feet,location=coyote_creek
h2o_feet,location=santa_monica
h2o_pH,location=coyote_creek
h2o_pH,location=santa_monica
h2o_quality,location=coyote_creek,randtag=1
h2o_quality,location=coyote_creek,randtag=2
h2o_quality,location=coyote_creek,randtag=3
h2o_quality,location=santa_monica,randtag=1
h2o_quality,location=santa_monica,randtag=2
h2o_quality,location=santa_monica,randtag=3
h2o_temperature,location=coyote_creek
h2o_temperature,location=santa_monica

> show tag keys
name: average_temperature
tagKey
------
location

name: h2o_feet
tagKey
------
location

name: h2o_pH
tagKey
------
location

name: h2o_quality
tagKey
------
location
randtag

name: h2o_temperature
tagKey
------
location

> show tag values with key="location"
name: average_temperature
key      value
---      -----
location coyote_creek
location santa_monica

name: h2o_feet
key      value
---      -----
location coyote_creek
location santa_monica

name: h2o_pH
key      value
---      -----
location coyote_creek
location santa_monica

name: h2o_quality
key      value
---      -----
location coyote_creek
location santa_monica

name: h2o_temperature
key      value
---      -----
location coyote_creek
location santa_monica

> show field keys
name: average_temperature
fieldKey fieldType
-------- ---------
degrees  float

name: h2o_feet
fieldKey          fieldType
--------          ---------
level description string
water_level       float

name: h2o_pH
fieldKey fieldType
-------- ---------
pH       float

name: h2o_quality
fieldKey fieldType
-------- ---------
index    float

name: h2o_temperature
fieldKey fieldType
-------- ---------
degrees  float
InfluxDB Environment Setup 01
I. Design principles
InfluxDB is a time series database whose performance work targets inserting and querying large volumes of time series data efficiently. Some of its design deliberately trades functionality for performance in certain scenarios; the main assumptions are:
1. Data is stored in timestamp order, and duplicate points are not allowed.
2. Deleting data is a rare operation, bulk expiry aside;
3. Updating data is a rare operation that in practice almost never happens; continuously updating a single point should not happen; the vast majority of data is newly produced and never changes;
4. Writes carry timestamps close to the current time, and write order roughly follows timestamp order;
5. Scale is critical; performance optimization targets large-scale writes and reads rather than updates and deletes;
6. Availability matters more than strong consistency;
7. Many time series are ephemeral rather than producing data forever;
8. No single point matters much on its own; analyzing the overall trend is what counts.
II. Core concepts
Time series databases come with some concepts of their own. Once the relationships among measurement, tag, series, and the storage files are understood correctly, the rest of the system falls into place fairly easily.
Concept | Explanation | SQL analogue
timestamp | All times in InfluxDB are UTC. A timestamp uniquely identifies all of a series' data at one UTC instant, so a timestamp is enough to fetch a series' point at that time; it also means timestamps cannot repeat within a series. Writing two points with the same timestamp into the same series triggers an update and merge. | Similar to an auto-increment primary key, but of timestamp type
point | The collection of field data in one series that shares one timestamp. Within a series, a point appears at most once per timestamp (think of points as stored with the timestamp as primary key). A point may carry one or more of a measurement's fields, and new fields can be added at any time, but the data type of already stored data cannot be changed. | Similar to a row
database | A logical container for schema, time series data, retention policies, continuous queries, users, and permissions. | Similar to a database
schema | Covers databases, retention policies, series, measurements, tag keys, tag values, and field keys; determines how data is organized inside InfluxDB. | Corresponds to a database schema plus table schemas
measurement | The structure describing how data is stored: tags, fields, and the time series data. A measurement name is a string, and one measurement can have multiple retention policies. | Corresponds to a table
series | The logical, and also the physical on-disk, grouping of data within a measurement, defined by the shared measurement, tag set, and field key (field values excluded). Put differently, InfluxDB stores data per series, not per measurement. | Similar to splitting a table by the values of one column
series cardinality | The number of distinct database, measurement, tag set, and field key combinations in an instance. For example, a measurement describing people with two tags, sex (M or F) and employed (Y or N), yields 2 × 2 = 4 series cardinality, and stored data is indeed handled per series. | Similar to the number of sub-tables a table would split into by one column's values
field | The key-value pairs carrying the data (and its metadata) in a point. At least one field is required; fields cannot be indexed. | Corresponds to a non-indexed column
field set | The collection of field keys and field values on a point. | Corresponds to the set of non-indexed column names and values
field key | The key part of a field's key-value pair; a string, kept in the metadata. | Corresponds to a non-indexed column's name
field value | The value part of a field's key-value pair, i.e. the actual data: a string, float, integer, or boolean. A field value is always associated with a timestamp. Field values are not indexed, so filtering on them means scanning every point in the selected time range, far less efficient than filtering on tags. | Corresponds to a non-indexed column's value
tag | Key-value pairs holding metadata. Tags store commonly queried metadata and are optional. Tags are indexed, so queries on tags are fast. | Corresponds to an indexed column
tag set | The collection of tag keys and tag values on a point. | Corresponds to the set of indexed column names and values
tag key | The key part of a tag's key-value pair; a string, kept in the metadata. | Corresponds to an indexed column's name
tag value | The value part of a tag's key-value pair; a string, kept in the metadata. | Corresponds to an indexed column's value
shard | In a distributed deployment, shards determine how many copies of the data are kept and where they live. A shard contains the actual encoded, compressed data and is represented by a TSM file on disk. Every shard belongs to exactly one shard group; a shard group holds one or more shards; each shard holds a specific set of series, and all points of one series within a shard group land in the same shard (TSM file) on disk. | Not quite like SQL; somewhat like Kafka's partition mechanism for distributed storage
shard duration | How much time each shard group spans, set by SHARD DURATION in the retention policy. With SHARD DURATION set to 1w, for instance, each shard group spans one week and holds all points timestamped within that week; data older than the policy allows is dropped. | Not quite like SQL; somewhat like Kafka's log.retention.hours
shard group | A logical grouping of shards, organized by time and retention policy. Every retention policy that contains data has at least one associated shard group, and that shard group holds all the shards for the interval it covers; the interval each shard group spans is the shard duration. | Quite unlike anything in Kafka's grouping
retention policy | Describes how long InfluxDB keeps data (duration), how many copies live in the cluster (replication factor), and the time range of each shard group (shard group duration). When a database is created, InfluxDB automatically creates a retention policy named autogen with an infinite duration, a replication factor of 1, and a 7-day shard group duration. Retention policies are unique per database and are an integral part of a series. | -
TSM (Time Structured Merge tree) | InfluxDB's purpose-built storage format, offering better compression and higher write and read throughput than existing B+ tree or LSM tree implementations. | The storage structure; plays the role of the B+ tree in SQL databases or the LSM tree in KV stores
values per second | A measure of how fast InfluxDB persists data: points written per second multiplied by the number of values stored per point. For example, with four fields per point and 10 batches of 5,000 points processed per second, that is 10 batches/s × 5,000 points/batch × 4 fields/point = 200,000 values per second. | -
WAL (Write Ahead Log) | To reduce access pressure on the TSM data files, InfluxDB caches the newest points in the WAL until their total size or age crosses a threshold, then flushes them to TSM files in batches. Points in the WAL are queryable, and since the WAL is persisted they survive restarts; on startup, the influxd process writes all WAL points into TSM before accepting new writes. | The WAL-to-TSM relationship resembles the relationship between a SQL database's log and its B+ tree
Continuous Query (CQ) | An InfluxQL query that the database runs automatically and periodically. A CQ must use a function in its SELECT clause and must include a GROUP BY time() clause. | Somewhat like a stored procedure plus a scheduled job
bucket | In InfluxDB 2.0, a bucket is a named location where time series data is stored. In InfluxDB 1.8+, each combination of a database and a retention policy maps to a bucket. | New concept in 2.0
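A running instance will report these cardinality numbers directly; a quick sketch using the influx CLI (the SHOW ... CARDINALITY statements exist from InfluxDB 1.4 on):

# Count distinct series and measurements in a database
influx -database 'NOAA_water_database' -execute 'SHOW SERIES CARDINALITY'
influx -database 'NOAA_water_database' -execute 'SHOW MEASUREMENT CARDINALITY'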
III. Data layout compared with SQL
Take the foodships table from the official documentation as an example:
It has three indexed columns, park_id, planet, and time, and one non-indexed column, #_foodships.
In SQL, the foodships table looks like this:
+---------+---------+---------------------+--------------+
| park_id | planet  | time                | #_foodships  |
+---------+---------+---------------------+--------------+
| 1       | Earth   | 1429185600000000000 | 0            |
| 1       | Earth   | 1429185601000000000 | 3            |
| 1       | Earth   | 1429185602000000000 | 15           |
| 1       | Earth   | 1429185603000000000 | 15           |
| 2       | Saturn  | 1429185600000000000 | 5            |
| 2       | Saturn  | 1429185601000000000 | 9            |
| 2       | Saturn  | 1429185602000000000 | 10           |
| 2       | Saturn  | 1429185603000000000 | 14           |
| 3       | Jupiter | 1429185600000000000 | 20           |
| 3       | Jupiter | 1429185601000000000 | 21           |
| 3       | Jupiter | 1429185602000000000 | 21           |
| 3       | Jupiter | 1429185603000000000 | 20           |
| 4       | Saturn  | 1429185600000000000 | 5            |
| 4       | Saturn  | 1429185601000000000 | 5            |
| 4       | Saturn  | 1429185602000000000 | 6            |
| 4       | Saturn  | 1429185603000000000 | 5            |
+---------+---------+---------------------+--------------+
In InfluxDB, however, the data is split into series by the indexed columns, i.e. the tags, and each series is stored in time order, so it looks like this:
name: foodships
tags: park_id=1, planet=Earth
time                  #_foodships
----                  ------------
2015-04-16T12:00:00Z  0
2015-04-16T12:00:01Z  3
2015-04-16T12:00:02Z  15
2015-04-16T12:00:03Z  15

name: foodships
tags: park_id=2, planet=Saturn
time                  #_foodships
----                  ------------
2015-04-16T12:00:00Z  5
2015-04-16T12:00:01Z  9
2015-04-16T12:00:02Z  10
2015-04-16T12:00:03Z  14

name: foodships
tags: park_id=3, planet=Jupiter
time                  #_foodships
----                  ------------
2015-04-16T12:00:00Z  20
2015-04-16T12:00:01Z  21
2015-04-16T12:00:02Z  21
2015-04-16T12:00:03Z  20

name: foodships
tags: park_id=4, planet=Saturn
time                  #_foodships
----                  ------------
2015-04-16T12:00:00Z  5
2015-04-16T12:00:01Z  5
2015-04-16T12:00:02Z  6
2015-04-16T12:00:03Z  5
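In line protocol terms, each row above is one point whose tags pick its series; the first Earth row, for example, would be written roughly like this (the database name is illustrative):

# Format: measurement,tag_set field_set timestamp
curl -XPOST 'http://localhost:8086/write?db=mydb' --data-binary \
  'foodships,park_id=1,planet=Earth #_foodships=0 1429185600000000000'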
IV. Schema design guidelines
1. Encode metadata as tags
2. Don't use InfluxQL keywords as tag or field names
3. Avoid creating too many series
4. Don't give a tag and a field the same name
5. Don't use measurement names to distinguish data
6. Don't cram several categories of information into a single tag (see the sketch after this list)
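A minimal line-protocol sketch of guideline 6, with illustrative measurement and tag names; splitting a combined tag into separate tags keeps each one cheap to filter and group on:

# Bad: one tag fusing two attributes; filtering on region alone needs a regex
weather_sensor,crop=blueberries,location=plot-1.region-north temp=50.1 1472515200000000000

# Good: one attribute per tag; each can be filtered and grouped independently
weather_sensor,crop=blueberries,plot=1,region=north temp=50.1 1472515200000000000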
V. Summary
With these principles and concepts in hand, we have a first working picture of a time series database:
1. InfluxDB is schemaless: measurements, tags, and fields can be added at any time, but the data type of already stored data cannot be changed
2. Compared with a SQL database, InfluxDB is really closer to a KV store, and it handles column growth remarkably well
3. Unlike SQL, there is no JOIN across measurements
4. A measurement is split into series by its tags, which is quite unlike SQL
5. InfluxDB data comes in two kinds, tags and fields: tags partition the series, must be strings with a bounded set of values, and are indexed; fields hold the actual data
6. Since insert and query performance take priority over updates and deletes, InfluxDB is not a full CRUD database; it is more of a CR-ud database:
To update a point, just insert again under the same timestamp;
Points cannot be deleted by field value; in that case, query first and then delete by timestamp;
Tags cannot be updated or renamed; instead, create a new tag, migrate the data, then drop the old tag's series;
A tag cannot be removed by tag key, but the whole series can be dropped;
7. Keep single-point updates and deletes rare; they are inefficient
8. Timestamps are paramount; data is organized by timestamp
9. Sharding, retention duration, and replica count are all configurable
10. It is an excellent fit for systems that sample, produce, and update data along the time axis
SQL Server: Connection Is Busy
When connecting to SQL Server through ADO or ODBC, poor connection handling easily produces one of the following errors:
[Microsoft][ODBC SQL Server Driver]Connection is busy with results for another hstmt.
[Microsoft][ODBC SQL Server Driver]Connection is busy with results for another command.
The cause is that multiple result sets were being operated on simultaneously over the same connection.
There are two ways around it:
1. Don't keep multiple result sets open on one connection at the same time.
2. On SQL Server 2005 and later, use MARS (Multiple Active Result Sets): adding "Mars_Connection=yes;" to the connection string enables multiple active result sets; an illustrative connection string follows.
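For example, an ODBC connection string with MARS enabled might look like the line below (server and database names are placeholders; note that MARS requires the SQL Server Native Client driver rather than the legacy "SQL Server" driver):

Driver={SQL Server Native Client 10.0};Server=myserver;Database=mydb;Trusted_Connection=yes;Mars_Connection=yes;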
For more on MARS, see:
http://msdn2.microsoft.com/en-us/library/ms345109.aspx
Oracle Data Pump Export and Import
First, note that Data Pump runs only on the server side, and it relies on a DIRECTORY object.
I. Create a DIRECTORY and grant access
# Create the DIRECTORY
CREATE DIRECTORY dump_dir AS 'DIRECTORY_FULL_PATH';

# Grant access
GRANT READ, WRITE ON DIRECTORY dump_dir TO user_id;
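To confirm the directory was registered, you can query the data dictionary as a privileged user; a quick sketch:

# List registered directory objects and their paths (run as a DBA)
sqlplus -s / as sysdba <<'EOF'
SELECT directory_name, directory_path FROM dba_directories;
EOF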
II. Export with Data Pump
# Export a table
expdp user_id/user_pwd directory=dump_dir dumpfile=table01.dmp tables=table01 logfile=table01.log

# Export a schema
expdp user_id/user_pwd directory=dump_dir dumpfile=schema01.dmp schemas=schema01 logfile=schema01.log

# Export a tablespace
expdp user_id/user_pwd directory=dump_dir dumpfile=tbs01.dmp tablespaces=tbs01 logfile=tbs01.log

# Export the full database
expdp user_id/user_pwd directory=dump_dir dumpfile=db01.dmp full=y logfile=db01.log
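expdp also takes tuning and filtering parameters; a hedged sketch below: PARALLEL and QUERY are standard expdp parameters, while the file name and the created_at column are illustrative:

# Parallel schema export; %U in the dumpfile name expands to a per-worker number
expdp user_id/user_pwd directory=dump_dir dumpfile=schema01_%U.dmp parallel=4 schemas=schema01 logfile=schema01.log

# Export only rows matching a predicate (created_at is a hypothetical column)
expdp user_id/user_pwd directory=dump_dir dumpfile=table01.dmp tables=table01 query='table01:"WHERE created_at > SYSDATE-30"' logfile=table01.log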
III. Import with Data Pump
# Import a table
impdp user_id/user_pwd directory=dump_dir dumpfile=table01.dmp tables=table01

# Import a schema
impdp user_id/user_pwd directory=dump_dir dumpfile=schema01.dmp schemas=schema01

# Import a tablespace
impdp user_id/user_pwd directory=dump_dir dumpfile=tbs01.dmp tablespaces=tbs01

# Import the full database
impdp user_id/user_pwd directory=dump_dir dumpfile=db01.dmp full=y
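When the data has to land under a different user or tablespace, impdp can remap on the fly; a sketch using the standard remap parameters (schema02 and tbs02 are illustrative targets):

# Import schema01's objects as schema02, moving segments from tbs01 to tbs02
impdp user_id/user_pwd directory=dump_dir dumpfile=schema01.dmp remap_schema=schema01:schema02 remap_tablespace=tbs01:tbs02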