MariaDB replication not working
There are two DBs in the cluster. Master (db3.example.com):
[mysqld]
wsrep_cluster_address = 'gcomm://'
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_retry_autocommit = 0
wsrep_sst_method = rsync
wsrep_node_address = 12.79.29.209
And the slave (db4.example.com):
[mysqld]
wsrep_cluster_address='gcomm://db3.example.com'
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_sst_method=rsync
wsrep_sst_receive_address=12.79.29.206
wsrep_node_address = 12.79.29.206
The slave fails to connect to the master. The log is below.
130821 20:12:04 [Warning] option 'table_cache': unsigned value 2097152 adjusted to 524288
130821 20:12:04 [Warning] The syntax '--log-slow-queries' is deprecated and will be removed in a future release. Please use '--slow-query-log'/'--slow-query-log-file' instead.
130821 20:12:04 [Note] WSREP: Read nil XID from storage engines, skipping position init
130821 20:12:04 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
130821 20:12:04 [Note] WSREP: wsrep_load(): Galera 23.2.4(r147) by Codership Oy <info@codership.com> loaded succesfully.
130821 20:12:04 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
130821 20:12:04 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
130821 20:12:04 [Note] WSREP: Passing config to GCS: base_host = 12.79.29.206; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
130821 20:12:04 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
130821 20:12:04 [Note] WSREP: wsrep_sst_grab()
130821 20:12:04 [Note] WSREP: Start replication
130821 20:12:04 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
130821 20:12:04 [Note] WSREP: protonet asio version 0
130821 20:12:04 [Note] WSREP: backend: asio
130821 20:12:04 [Note] WSREP: GMCast version 0
130821 20:12:04 [Note] WSREP: (2ecb4dfa-0a8d-11e3-0800-d1d310d8d2ff, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
130821 20:12:04 [Note] WSREP: (2ecb4dfa-0a8d-11e3-0800-d1d310d8d2ff, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
130821 20:12:04 [Note] WSREP: EVS version 0
130821 20:12:04 [Note] WSREP: PC version 0
130821 20:12:04 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer 'db3.example.com:'
130821 20:12:04 [Note] WSREP: declaring 1694931b-0a8b-11e3-0800-11c89939f034 stable
130821 20:12:04 [Note] WSREP: Node 1694931b-0a8b-11e3-0800-11c89939f034 state prim
130821 20:12:04 [Note] WSREP: view(view_id(PRIM,1694931b-0a8b-11e3-0800-11c89939f034,4) memb {
1694931b-0a8b-11e3-0800-11c89939f034,
2ecb4dfa-0a8d-11e3-0800-d1d310d8d2ff,
} joined {
} left {
} partitioned {
})
130821 20:12:05 [Note] WSREP: gcomm: connected
130821 20:12:05 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
130821 20:12:05 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
130821 20:12:05 [Note] WSREP: Opened channel 'my_wsrep_cluster'
130821 20:12:05 [Note] WSREP: Waiting for SST to complete.
130821 20:12:05 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
130821 20:12:05 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
130821 20:12:05 [Note] WSREP: STATE EXCHANGE: sent state msg: 2f15888a-0a8d-11e3-0800-6af8c4db9fd1
130821 20:12:05 [Note] WSREP: STATE EXCHANGE: got state msg: 2f15888a-0a8d-11e3-0800-6af8c4db9fd1 from 0 (db3.example.com)
130821 20:12:05 [Note] WSREP: STATE EXCHANGE: got state msg: 2f15888a-0a8d-11e3-0800-6af8c4db9fd1 from 1 (db4.example.com)
130821 20:12:05 [Note] WSREP: Quorum results:
version = 2,
component = PRIMARY,
conf_id = 3,
members = 1/2 (joined/total),
act_id = 44,
last_appl. = -1,
protocols = 0/4/2 (gcs/repl/appl),
group UUID = 16a07f81-0a8b-11e3-0800-cbc8be9154ae
130821 20:12:05 [Note] WSREP: Flow-control interval: [23, 23]
130821 20:12:05 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 44)
130821 20:12:05 [Note] WSREP: State transfer required:
Group state: 16a07f81-0a8b-11e3-0800-cbc8be9154ae:44
Local state: 00000000-0000-0000-0000-000000000000:-1
130821 20:12:05 [Note] WSREP: New cluster view: global state: 16a07f81-0a8b-11e3-0800-cbc8be9154ae:44, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 2
130821 20:12:05 [Warning] WSREP: Gap in state sequence. Need state transfer.
130821 20:12:07 [Note] WSREP: Running: 'wsrep_sst_rsync --role 'joiner' --address '12.79.29.206' --auth '' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' --parent '4643''
cat: /var/lib/mysql//rsync_sst.pid: No such file or directory
130821 20:12:07 [Note] WSREP: Prepared SST request: rsync|12.79.29.206:4444/rsync_sst
130821 20:12:07 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
130821 20:12:07 [Note] WSREP: Assign initial position for certification: 44, protocol version: 2
130821 20:12:07 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (16a07f81-0a8b-11e3-0800-cbc8be9154ae): 1 (Operation not permitted)
at galera/src/replicator_str.cpp:prepare_for_IST():442. IST will be unavailable.
130821 20:12:07 [Note] WSREP: Node 1 (db4.example.com) requested state transfer from '*any*'. Selected 0 (db3.example.com)(SYNCED) as donor.
130821 20:12:07 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 44)
130821 20:12:07 [Note] WSREP: Requesting state transfer: success, donor: 0
130821 20:13:10 [Warning] WSREP: 0 (db3.example.com): State transfer to 1 (db4.example.com) failed: -1 (Operation not permitted)
130821 20:13:10 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never receive state. Need to abort.
130821 20:13:10 [Note] WSREP: gcomm: terminating thread
130821 20:13:10 [Note] WSREP: gcomm: joining thread
130821 20:13:10 [Note] WSREP: gcomm: closing backend
130821 20:13:11 [Note] WSREP: view(view_id(NON_PRIM,1694931b-0a8b-11e3-0800-11c89939f034,4) memb {
2ecb4dfa-0a8d-11e3-0800-d1d310d8d2ff,
} joined {
} left {
} partitioned {
1694931b-0a8b-11e3-0800-11c89939f034,
})
130821 20:13:11 [Note] WSREP: view((empty))
130821 20:13:11 [Note] WSREP: gcomm: closed
130821 20:13:11 [Note] WSREP: mysqld: Terminated.
Aborted
root@db4:~# Parent mysqld process (PID:4643) terminated unexpectedly.
/usr//bin/wsrep_sst_common: line 94: /dev/stderr: Permission denied
Source: Tomáš Fejfar | 2013-08-22
After endless googling (with no result), we found out by accident that the problem was port 4444, which the rsync state transfer uses: it was closed between the servers. After opening the port on the firewall (each server now accepts everything from the other), everything works fine.
The error message was not very descriptive. So if you end up with the "State transfer to 1 (db4.example.com) failed: -1 (Operation not permitted)" error shown in the log above, it might mean that your replication settings are fine but the connection itself is not working.
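To separate "settings are wrong" from "the connection is blocked", a quick TCP probe from one node to the other can help. The following is a minimal sketch, not part of the original answer: the function name `probe_port` and the timeout value are my own choices, and the ports are the ones that appear in the log above (4567 as `base_port`, 4444 in the prepared SST request).

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Try a TCP connect and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <reason>'.
    probe_port is a hypothetical helper name, not a MariaDB/Galera tool.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port
        # (or a firewall actively rejects the connection).
        return "refused"
    except socket.timeout:
        # No answer at all; typical when a firewall silently drops packets.
        return "timeout"
    except OSError as exc:
        # DNS failure, no route to host, and similar problems.
        return f"error: {exc}"
```

Run from db3, `probe_port("12.79.29.206", 4567)` and `probe_port("12.79.29.206", 4444)` would cover both ports. As far as I understand, the joiner only listens on 4444 while a state transfer is in progress, so "refused" outside of an SST is not alarming by itself. A "timeout" is the result that points toward a firewall.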
EDIT
It could also be that the rsync daemon did not stop properly and MariaDB does not detect that it is still running. It has to be killed manually to free port 4444.
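Whether something is still holding port 4444 on the joiner can be checked before restarting MariaDB. This is only a sketch under my own assumptions: `port_is_free` is a hypothetical helper, and it tests the port by trying to bind it, which says nothing about which process owns it (a tool such as `ss` or `lsof` is needed for that).

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port right now.

    port_is_free is a hypothetical helper name. SO_REUSEADDR is deliberately
    not set, so an existing listener makes the bind fail.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Something (for example a leftover rsync daemon) holds the port.
                return False
            raise  # Permission problems etc. should not be hidden.
    # The socket is closed on leaving the with-block, releasing the port again.
    return True
```

If `port_is_free(4444)` returns False while no state transfer is running, a stale rsync process is a likely suspect. Note that connections lingering in TIME_WAIT may also make the port look busy for a short while.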
Three cheers for descriptive error messages! yay! 😀
Not sure whether it is related, but listing all server addresses in every config, e.g. wsrep_cluster_address='gcomm://10.0.0.1,10.0.0.2,10.0.0.3', fixed the problem for me.
Not sure whether doing that switches you to master-master replication. What I had was effectively master-slave replication, because the master was not aware of the rest of the cluster. That way I made sure that unintended writes to the slave would never propagate to the master.
Source: Tomáš Fejfar
Check whether port 4444 is open between the servers. If it is blocked by a firewall, you will get the same error.
Source: Ishtiyaque