
Elassandra
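Elassandra is a fork of Elasticsearch modified to run as a plugin for Apache Cassandra, in a scalable and resilient peer-to-peer architecture. The Elasticsearch code is embedded in Cassandra nodes and provides advanced search features on Cassandra tables, while Cassandra serves as the data and configuration store for Elasticsearch.
Elassandra supports Cassandra vnodes and scales horizontally by adding more nodes.
Project documentation is available at doc.elassandra.io.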
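Benefits of Elassandra
For Cassandra users, Elassandra provides Elasticsearch features:
- Cassandra updates are indexed in Elasticsearch.
- Full-text and spatial search on your Cassandra data.
- Real-time aggregation (no Spark or Hadoop needed for GROUP BY).
- Search across multiple keyspaces and tables in one query.
- Automatic schema creation, and support for nested documents using User Defined Types.
- Read/write access to Cassandra data through a JSON REST API.
- Numerous Elasticsearch plugins and products such as Kibana.
- Management of concurrent Elasticsearch mapping changes, applied as batched atomic CQL schema changes.
- Support for Elasticsearch ingest processors, which allow input data to be transformed.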
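For Elasticsearch users, Elassandra provides useful features:
- Elassandra is masterless. The cluster state is managed through Cassandra lightweight transactions.
- Elassandra is a sharded multi-master database, whereas Elasticsearch is sharded master-slave. Elassandra therefore has no single point of write, which helps to achieve high availability.
- Elassandra inherits the Cassandra data repair mechanisms (hinted handoff, read repair and nodetool repair), which provide support for cross-datacenter replication.
- When a node is added to an Elassandra cluster, only the data pulled from existing nodes is re-indexed in Elasticsearch.
- Cassandra can be your single datastore for both indexed and non-indexed data, which is easier to manage and secure. Source documents are stored in Cassandra, which reduces disk usage if you need both a NoSQL database and Elasticsearch.
- Write operations are not restricted to one primary shard but are distributed across all Cassandra nodes in a virtual datacenter. The number of shards does not limit your write throughput. Adding Elassandra nodes increases both read and write throughput.
- Elasticsearch indices can be replicated among many Cassandra datacenters, so you can write to the closest datacenter and search globally.
- The Cassandra driver is datacenter-aware and token-aware, providing automatic load balancing and failover.
- Elassandra efficiently stores Elasticsearch documents in binary SSTables without any JSON overhead.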
Quick start
Start a single-node Elassandra cluster using Docker:
# Pull the Docker image
$ docker pull docker.io/strapdata/elassandra:latest
# Start the container
$ docker run -d --rm \
--name elassandra \
-p 9042:9042 \
-p 9200:9200 \
-e JVM_OPTS="-Dcassandra.custom_query_handler_class=org.elassandra.index.ElasticQueryHandler" \
docker.io/strapdata/elassandra:latest
Check the Elassandra cluster status:
$ docker exec -i elassandra nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.17.0.3 80.97 KiB 8 100.0% 81a9e4e0-efe4-458d-861e-8d527835372d r1
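A freshly started container typically needs some time before the node reports UN and the HTTP port answers, so scripts that chain the following steps may want to wait first. The helper below is a minimal sketch of such a wait loop. The name retry_until and its arguments are hypothetical and not part of Elassandra or Docker.

```shell
#!/bin/sh
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or ATTEMPTS runs
# have failed. Returns 0 on the first success, 1 otherwise.
retry_until() {
    ru_attempts=$1
    ru_delay=$2
    shift 2
    ru_i=1
    while [ "$ru_i" -le "$ru_attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only between attempts, not after the last one.
        if [ "$ru_i" -lt "$ru_attempts" ]; then
            sleep "$ru_delay"
        fi
        ru_i=$((ru_i + 1))
    done
    return 1
}

# Demonstration with commands that need no running cluster.
retry_until 3 0 true && echo "ready"
retry_until 2 0 false || echo "gave up"
```

Against the container started above, a call such as `retry_until 30 2 curl -sf http://localhost:9200/` would poll the REST port every two seconds for up to 30 attempts. The attempt count and delay are arbitrary choices.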
Create an Elasticsearch index from a Cassandra table
Use the Cassandra cqlsh to create a Cassandra keyspace, a User Defined Type and a table, then add two rows:
$ docker exec -i elassandra cqlsh <<EOF
CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 1};
CREATE TYPE IF NOT EXISTS test.user_type (first text, last text);
CREATE TABLE IF NOT EXISTS test.docs (uid int, username frozen<user_type>, login text, PRIMARY KEY (uid));
INSERT INTO test.docs (uid, username, login) VALUES (1, {first:'vince',last:'royer'}, 'vroyer');
INSERT INTO test.docs (uid, username, login) VALUES (2, {first:'barthelemy',last:'delemotte'}, 'barth');
EOF
Create an Elasticsearch index from the Cassandra table schema by discovering the CQL schema:
$ curl -XPUT -H 'Content-Type: application/json' http://localhost:9200/test -d'{"mappings":{"docs":{"discover":".*"}}}'
{"acknowledged":true,"shards_acknowledged":true,"index":"test"}
This command discovers all columns matching the provided regular expression and creates the Elasticsearch index.
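Because the discover value is a regular expression, the request body can be generated for different column selections. The following is a minimal sketch. The function name discover_mapping is hypothetical, and the second pattern is only an illustration of a narrower regular expression, not something taken from the project documentation.

```shell
#!/bin/sh
# discover_mapping TYPE REGEX
# Hypothetical helper: prints the JSON body used above for a discover
# request. REGEX is inserted as-is, so a pattern containing double
# quotes or backslashes would need JSON escaping first.
discover_mapping() {
    printf '{"mappings":{"%s":{"discover":"%s"}}}\n' "$1" "$2"
}

# Same body as the curl command above: every column of the table.
discover_mapping docs '.*'

# Illustrative narrower pattern: only the login and username columns.
discover_mapping docs '(login|username)'
```

The output can be passed to curl with `-d "$(discover_mapping docs '.*')"` when the target index does not exist yet.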
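Create an Elasticsearch index from scratch
Elassandra automatically generates the underlying CQL schema when you create an index or update the mapping with a new field.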
$ curl -XPUT -H 'Content-Type: application/json' http://localhost:9200/test2 -d'{
  "mappings":{
    "docs":{
      "properties": {
        "first": {
          "type":"text"
        },
        "last": {
          "type":"text",
          "cql_collection":"singleton"
        }
      }
    }
  }
}'
{"acknowledged":true,"shards_acknowledged":true,"index":"test2"}
The generated CQL schema:
$ docker exec -it elassandra cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.4.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> desc KEYSPACE test2;
CREATE KEYSPACE test2 WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '1'} AND durable_writes = true;
CREATE TABLE test2.docs (
"_id" text PRIMARY KEY,
first list<text>,
last text
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE CUSTOM INDEX elastic_docs_idx ON test2.docs () USING 'org.elassandra.index.ExtendedElasticSecondaryIndex';
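In the generated table, the field first (no cql_collection) became list<text>, while last (cql_collection set to singleton) became a plain text column. The sketch below only restates that observed pattern as a lookup. The function name cql_column_type is hypothetical, the keyword and integer rows are inferred from the discovered mapping of test.docs shown in the cluster state, and the real Elassandra type mapping covers more cases than this.

```shell
#!/bin/sh
# cql_column_type ES_TYPE [CQL_COLLECTION]
# Hypothetical lookup mirroring the examples in this README only.
# CQL_COLLECTION defaults to "list", as observed for the field "first".
cql_column_type() {
    ct_es_type=$1
    ct_collection=${2:-list}
    case "$ct_es_type" in
        text|keyword) ct_cql=text ;;   # text/keyword <-> CQL text
        integer)      ct_cql=int ;;    # integer <-> CQL int (assumed)
        *) echo "unhandled Elasticsearch type: $ct_es_type" >&2; return 1 ;;
    esac
    case "$ct_collection" in
        singleton) echo "$ct_cql" ;;
        list)      echo "list<$ct_cql>" ;;
        *) echo "unhandled cql_collection: $ct_collection" >&2; return 1 ;;
    esac
}

cql_column_type text            # field "first" in test2.docs
cql_column_type text singleton  # field "last" in test2.docs
```

Setting cql_collection to singleton is therefore the choice to make when a field holds a single value and a plain CQL column is wanted.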
Search for a document
Search for documents through the Elasticsearch API:
$ curl "http://localhost:9200/test/_search?pretty"
{
"took" : 53,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 1.0,
"hits" : [
{
"_index" : "test",
"_type" : "docs",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"uid" : 1,
"login" : "vroyer",
"username" : {
"last" : "royer",
"first" : "vince"
}
}
},
{
"_index" : "test",
"_type" : "docs",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"uid" : 2,
"login" : "barth",
"username" : {
"last" : "delemotte",
"first" : "barthelemy"
}
}
}
]
}
}
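The request above returns every document. As a sketch of more targeted requests, the helpers below build a term query and a terms aggregation body for the standard Elasticsearch _search API. The function names and the aggregation name by_login are hypothetical. Both rely on login being mapped as a keyword field in the test index.

```shell
#!/bin/sh
# term_query FIELD VALUE
# Hypothetical helper: exact-match query on a keyword field.
term_query() {
    printf '{"query":{"term":{"%s":"%s"}}}\n' "$1" "$2"
}

# terms_aggregation NAME FIELD
# Hypothetical helper: counts documents per distinct FIELD value,
# returning no hits (size 0), similar in spirit to a GROUP BY.
terms_aggregation() {
    printf '{"size":0,"aggs":{"%s":{"terms":{"field":"%s"}}}}\n' "$1" "$2"
}

term_query login vroyer
terms_aggregation by_login login
```

Either body can be sent with `curl -XPOST -H 'Content-Type: application/json' "http://localhost:9200/test/_search?pretty" -d "$(term_query login vroyer)"`. With the two rows inserted earlier, the term query should match only the row whose login is vroyer.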
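To search for a document through a CQL driver, add the following two dummy columns to the table schema, then execute an Elasticsearch nested query. The dummy columns allow you to specify the target index when the index name does not match the keyspace name.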
$ docker exec -it elassandra cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.4.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> ALTER TABLE test.docs ADD es_query text;
cqlsh> ALTER TABLE test.docs ADD es_options text;
cqlsh> SELECT uid, login, username FROM test.docs WHERE es_query='{ "query":{"nested":{"path":"username","query":{"term":{"username.first":"barthelemy"}}}}}' AND es_options='indices=test' ALLOW FILTERING;
uid | login | username
-----+-------+------------------------------------------
2 | barth | {first: 'barthelemy', last: 'delemotte'}
(1 rows)
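When such statements are generated from a script, the JSON query has to be embedded in a CQL string literal, where a single quote is escaped by doubling it. The following sketch assembles the same SELECT as above. The function name es_cql_select is hypothetical, and it performs no validation of the JSON or of the identifiers.

```shell
#!/bin/sh
# es_cql_select TABLE COLUMNS ES_QUERY_JSON INDEX
# Hypothetical helper: prints a CQL SELECT that uses the es_query and
# es_options dummy columns described above.
es_cql_select() {
    # Double any single quote so the JSON stays a valid CQL string literal.
    ecs_escaped=$(printf '%s' "$3" | sed "s/'/''/g")
    printf "SELECT %s FROM %s WHERE es_query='%s' AND es_options='indices=%s' ALLOW FILTERING;\n" \
        "$2" "$1" "$ecs_escaped" "$4"
}

query='{"query":{"nested":{"path":"username","query":{"term":{"username.first":"barthelemy"}}}}}'
es_cql_select test.docs 'uid, login, username' "$query" test
```

The printed statement can be piped to `docker exec -i elassandra cqlsh`, in the same way as the heredoc used earlier to create the table.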
Manage Elasticsearch indices
Get the Elasticsearch cluster state:
$ curl "http://localhost:9200/_cluster/state?pretty"
{
"cluster_name" : "Test Cluster",
"compressed_size_in_bytes" : 730,
"version" : 14,
"state_uuid" : "tMqNt8PpS5ySuaRxHUGc1w",
"master_node" : "81a9e4e0-efe4-458d-861e-8d527835372d",
"blocks" : { },
"nodes" : {
"81a9e4e0-efe4-458d-861e-8d527835372d" : {
"name" : "172.17.0.3",
"status" : "ALIVE",
"ephemeral_id" : "81a9e4e0-efe4-458d-861e-8d527835372d",
"transport_address" : "172.17.0.3:9300",
"attributes" : {
"rack" : "r1",
"dc" : "DC1"
}
}
},
"metadata" : {
"version" : 4,
"cluster_uuid" : "81a9e4e0-efe4-458d-861e-8d527835372d",
"templates" : { },
"indices" : {
"test" : {
"state" : "open",
"settings" : {
"index" : {
"creation_date" : "1561883488853",
"number_of_shards" : "1",
"number_of_replicas" : "0",
"uuid" : "h34je53mSVqojho2gIH91A",
"version" : {
"created" : "6020399"
},
"provided_name" : "test"
}
},
"mappings" : {
"docs" : {
"properties" : {
"uid" : {
"cql_partition_key" : true,
"cql_primary_key_order" : 0,
"type" : "integer",
"cql_collection" : "singleton"
},
"login" : {
"type" : "keyword",
"cql_collection" : "singleton"
},
"username" : {
"cql_udt_name" : "user_type",
"type" : "nested",
"properties" : {
"last" : {
"type" : "keyword",
"cql_collection" : "singleton"
},
"first" : {
"type" : "keyword",
"cql_collection" : "singleton"
}
},
"cql_collection" : "singleton"
}
}
}
},
"aliases" : [ ],
"primary_terms" : {
"0" : 0
},
"in_sync_allocations" : {
"0" : [ ]
}
},
"test2" : {
"state" : "open",
"settings" : {
"index" : {
"creation_date" : "1561883720880",
"number_of_shards" : "1",
"number_of_replicas" : "0",
"uuid" : "u7BgODwuT_idHbcuenvi6g",
"version" : {
"created" : "6020399"
},
"provided_name" : "test2"
}
},
"mappings" : {
"docs" : {
"properties" : {
"last" : {
"type" : "text",
"cql_collection" : "singleton"
},
"first" : {
"type" : "text"
}
}
}
},
"aliases" : [ ],
"primary_terms" : {
"0" : 0
},
"in_sync_allocations" : {
"0" : [ ]
}
}
},
"index-graveyard" : {
"tombstones" : [ ]
}
},
"routing_table" : {
"indices" : {
"test" : {
"shards" : {
"0" : [
{
"state" : "STARTED",
"primary" : true,
"node" : "81a9e4e0-efe4-458d-861e-8d527835372d",
"relocating_node" : null,
"shard" : 0,
"index" : "test",
"token_ranges" : [
"(-9223372036854775808,9223372036854775807]"
],
"allocation_id" : {
"id" : "dummy_alloc_id"
}
}
]
}
},
"test2" : {
"shards" : {
"0" : [
{
"state" : "STARTED",
"primary" : true,
"node" : "81a9e4e0-efe4-458d-861e-8d527835372d",
"relocating_node" : null,
"shard" : 0,
"index" : "test2",
"token_ranges" : [
"(-9223372036854775808,9223372036854775807]"
],
"allocation_id" : {
"id" : "dummy_alloc_id"
}
}
]
}
}
}
},
"routing_nodes" : {
"unassigned" : [ ],
"nodes" : {
"81a9e4e0-efe4-458d-861e-8d527835372d" : [
{
"state" : "STARTED",
"primary" : true,
"node" : "81a9e4e0-efe4-458d-861e-8d527835372d",
"relocating_node" : null,
"shard" : 0,
"index" : "test",
"token_ranges" : [
"(-9223372036854775808,9223372036854775807]"
],
"allocation_id" : {
"id" : "dummy_alloc_id"
}
},
{
"state" : "STARTED",
"primary" : true,
"node" : "81a9e4e0-efe4-458d-861e-8d527835372d",
"relocating_node" : null,
"shard" : 0,
"index" : "test2",
"token_ranges" : [
"(-9223372036854775808,9223372036854775807]"
],
"allocation_id" : {
"id" : "dummy_alloc_id"
}
}
]
}
},
"snapshots" : {
"snapshots" : [ ]
},
"restore" : {
"snapshots" : [ ]
},
"snapshot_deletions" : {
"snapshot_deletions" : [ ]
}
}
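The full cluster state is long even for two small indices. The standard Elasticsearch cluster state API accepts an optional metric and an optional index list in the URL path to narrow the response. The sketch below only builds such a URL. The function name cluster_state_url is hypothetical, and it is an assumption that Elassandra serves these filtered paths in the same way as upstream Elasticsearch.

```shell
#!/bin/sh
# cluster_state_url BASE_URL [METRICS] [INDICES]
# Hypothetical helper: prints a /_cluster/state URL, optionally limited
# to some metrics (e.g. metadata, routing_table) and some indices.
cluster_state_url() {
    csu_base=${1%/}          # drop one trailing slash, if any
    csu_metrics=${2:-}
    csu_indices=${3:-}
    # An index filter is only valid after a metric, so fall back to _all.
    if [ -n "$csu_indices" ] && [ -z "$csu_metrics" ]; then
        csu_metrics=_all
    fi
    csu_url="$csu_base/_cluster/state"
    if [ -n "$csu_metrics" ]; then csu_url="$csu_url/$csu_metrics"; fi
    if [ -n "$csu_indices" ]; then csu_url="$csu_url/$csu_indices"; fi
    echo "$csu_url"
}

cluster_state_url http://localhost:9200
cluster_state_url http://localhost:9200 metadata test
```

For example, `curl "$(cluster_state_url http://localhost:9200 metadata test)?pretty"` would request only the metadata section for the test index.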
Get Elasticsearch index information:
$ curl "http://localhost:9200/_cat/indices?v"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open test2 u7BgODwuT_idHbcuenvi6g 1 0 0 0 208b 208b
green open test h34je53mSVqojho2gIH91A 1 0 4 0 4kb 4kb
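The tabular output is convenient for scripts. As a small sketch, the filter below extracts the index name and document count by looking up the column positions in the header line. The function name index_doc_counts is hypothetical, and the sample input is the output shown above.

```shell
#!/bin/sh
# index_doc_counts
# Hypothetical filter: reads `_cat/indices?v` output on stdin and prints
# "<index> <docs.count>" per index. Columns are located by header name,
# so the filter does not depend on their order.
index_doc_counts() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "index")      name_col = i
                if ($i == "docs.count") count_col = i
            }
            next
        }
        { print $name_col, $count_col }
    '
}

index_doc_counts <<'EOF'
health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   test2 u7BgODwuT_idHbcuenvi6g 1   0   0          0            208b       208b
green  open   test  h34je53mSVqojho2gIH91A 1   0   4          0            4kb        4kb
EOF
```

Against a live node, the same filter can be fed with `curl -s "http://localhost:9200/_cat/indices?v" | index_doc_counts`.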
Delete an Elasticsearch index (by default, this does not delete the underlying Cassandra table):
$ curl -XDELETE http://localhost:9200/test
{"acknowledged":true}