Elastic Stack 5.0 GA Release: the Rollover API and Shrink API, two exciting new APIs that are friendlier to time-based indices

[Postscript: Elastic Stack 5.0 has since reached GA. This post was written on 2016.07.01, in the excitement over the Elastic Stack 5.0.0-alpha4 release.]

The Elastic Stack 5.0.0-alpha4 release brings exciting news: two new APIs that are friendlier to time-based indices.

The new Rollover API and Shrink API make managing indices for time-based event data much easier.
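As a rough illustration of what calls to these two APIs look like, here is a minimal sketch that only builds the HTTP method, path, and JSON body for each request and sends nothing. The helper names, the alias `logs_write`, and the index names are hypothetical; the sketch assumes the 5.x request shapes (`POST /<alias>/_rollover` with a `conditions` object, and `POST /<source>/_shrink/<target>` with target `settings`).

```python
import json


def rollover_request(alias, max_age=None, max_docs=None):
    """Build (method, path, body) for a rollover call on a write alias.

    The alias rolls over to a new index once any given condition is met.
    """
    conditions = {}
    if max_age is not None:
        conditions["max_age"] = max_age
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    return "POST", f"/{alias}/_rollover", {"conditions": conditions}


def shrink_request(source_index, target_index, number_of_shards=1):
    """Build (method, path, body) for shrinking an index into fewer shards.

    Assumption: the source index has already been made read-only and has a
    copy of every shard on one node, which the Shrink API requires.
    """
    body = {"settings": {"index.number_of_shards": number_of_shards}}
    return "POST", f"/{source_index}/_shrink/{target_index}", body


if __name__ == "__main__":
    requests_to_show = (
        rollover_request("logs_write", max_age="7d", max_docs=1000000),
        shrink_request("logs-000001", "logs-000001-shrunk"),
    )
    for method, path, body in requests_to_show:
        print(method, path)
        print(json.dumps(body))
```

The tuples can be handed to whatever HTTP client you already use. Keeping request construction separate from sending makes the conditions easy to inspect before they reach a cluster.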

[elasticsearch] Elasticsearch benchmarks: official baseline performance

Elasticsearch officially publishes nightly indexing performance results that can serve as a performance baseline.

https://benchmarks.elastic.co/index.html

Nightly indexing performance on master

2015/10/27 20:24:09:

Defaults: 16.21
Defaults (4G heap): 18.01
Fast: 17.29
FastUpdate: 14.97
Defaults (2 nodes): 6.96
EC2 i2.2xlarge Defaults 4G: 7.35
Unit: K docs/sec

The configurations and metrics they use are well worth consulting:

This test indexes 6.9M short documents (log lines, total 14 GB json) using 8 client threads and 500 docs per _bulk request against a single node running on a dual Xeon X5680 (12 real cores, 24 with hyperthreading) and 48 GB RAM.
Defaults, 2 nodes is append-only, using all default settings, but runs 2 nodes on 1 box (5 shards, 1 replica).
Defaults is append-only, using all default settings.
Defaults (4G heap) is the same as Defaults except using a 4 GB heap (ES_HEAP_SIZE), because the ES default (-Xmx1g) sometimes hits OOMEs.
Fast is append-only, using 4 GB heap, and these settings:

  refresh_interval: 30s
  index.store.throttle.type: none
  indices.store.throttle.type: none

  index.number_of_shards: 6
  index.number_of_replicas: 0

  index.translog.flush_threshold_size: 4g
  index.translog.flush_threshold_ops: 500000

FastUpdate is the same as Fast, except we pass in an ID (worst case random UUID) for each document and 25% of the time the ID already exists in the index.
Defaults (doc values) is the same as Defaults, but also indexes doc values for most fields.
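To make the "500 docs per _bulk request" part of that setup concrete, here is a minimal sketch of batching documents into newline-delimited `_bulk` request bodies. The function name `bulk_bodies` and the sample documents are hypothetical, and the sketch only builds the request bodies; it is not the benchmark's actual client code.

```python
import json


def bulk_bodies(docs, batch_size=500):
    """Yield newline-delimited _bulk bodies holding batch_size docs each.

    Each document becomes two lines: an action line and the source line.
    The empty {"index": {}} action assumes the index (and type) are given
    in the request path the body is posted to.
    """
    lines = []
    count = 0
    for doc in docs:
        lines.append(json.dumps({"index": {}}))
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            # A _bulk body must end with a trailing newline.
            yield "\n".join(lines) + "\n"
            lines, count = [], 0
    if lines:
        # Flush the final, possibly smaller, batch.
        yield "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = ({"message": f"log line {i}"} for i in range(1200))
    for number, body in enumerate(bulk_bodies(sample), start=1):
        print(f"request {number}: {body.count(chr(10)) // 2} docs")
```

Because the function is a generator, it never holds more than one batch in memory, which matters when the input is millions of log lines.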


[elasticsearch] How to automatically add a timestamp to records at index time


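How do you add a timestamp to every record as it is indexed?

There are two typical approaches:

  • Add a timestamp field before you post the record to Elasticsearch, so the timestamp is the time at which your own application processed the record. If you use logstash to index logs, logstash adds the @timestamp field for you.
  • Elasticsearch mappings have a _timestamp field. Enable it when you put the mapping, and the indexing time is added to each record automatically.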
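The two approaches above can be sketched as follows. The helper names and the type name `log` are hypothetical. The first function stamps a record on the application side; the second only builds the body for a put-mapping request that enables `_timestamp`. Note that `_timestamp` was deprecated in Elasticsearch 2.0 and removed in 5.0, so the second approach applies only to older versions.

```python
from datetime import datetime, timezone


def with_timestamp(record, now=None):
    """Return a copy of record with an ISO 8601 UTC "@timestamp" field.

    This is the application-side approach: the value is the time your own
    service handled the record, not the time Elasticsearch indexed it.
    """
    now = now or datetime.now(timezone.utc)
    stamped = dict(record)  # Leave the caller's dict untouched.
    stamped["@timestamp"] = now.isoformat()
    return stamped


def timestamp_mapping(doc_type):
    """Build a put-mapping body that enables the _timestamp meta field.

    Assumption: the target cluster is an older version that still
    supports _timestamp.
    """
    return {doc_type: {"_timestamp": {"enabled": True}}}


if __name__ == "__main__":
    print(with_timestamp({"message": "user logged in"}))
    print(timestamp_mapping("log"))
```

The application-side field travels with the document and survives reindexing, while the mapping-side field records when Elasticsearch itself received the document.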
