Upgrade from 3.6.5 to 3.7.11 fails. #288

Open
@BaconFries

I have a few clusters configured the same way, and two of them upgraded successfully. On a third, I get the following error when running the arangodb upgrade command.

#2021-05-17T17:35:45Z |FATA| Failed to start database automatic upgrade component=arangodb error="Get http://10.0.18.21:8538/version: dial tcp 10.0.18.21:8538: connect: connection refused"

I'm not sure where it is getting port 8538 from. No service is listening on that port on any host. I have also tried a manual upgrade, without success.
What can I do to get this working?
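
A minimal sketch for narrowing down the port question, assuming bash and curl are available on the node. It pulls the host:port out of the error line above and reports whether anything answers HTTP there. The function names `extract_endpoint` and `probe` are made up for illustration; the `/version` path is simply the one that appears in the error message.

```shell
#!/usr/bin/env bash
# Extract host:port from an error line containing "Get http://host:port/...".
# Prints nothing if the line has no such URL.
extract_endpoint() {
  sed -n 's|.*Get http://\([^/]*\)/.*|\1|p' <<<"$1"
}

# Report whether anything answers HTTP on host:port.
# Any HTTP status counts as a response; curl prints 000 when it cannot connect.
probe() {
  local code
  code=$(curl -s -o /dev/null -m 3 -w '%{http_code}' "http://$1/version" || true)
  if [ -n "$code" ] && [ "$code" != "000" ]; then
    echo "$1 responds (HTTP $code)"
  else
    echo "$1 no response"
  fi
}

# Example: check both the expected (8528) and the unexpected (8538) port
# on the three hosts from this report.
# for h in 10.0.18.20 10.0.18.21 10.0.18.22; do
#   for p in 8528 8538; do probe "$h:$p"; done
# done
```

If 8528 responds on every host and 8538 on none, the starter that handles the upgrade is presumably holding a stale or different peer address for 10.0.18.21 than the one actually in use.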

Steps used to upgrade:

#Add 'KillMode=process' to systemd in [Service] section on all nodes
vim /etc/systemd/system/arangodb.service
systemctl daemon-reload

#enable maintenance mode on one node
curl http://localhost:8529/_admin/cluster/maintenance -XPUT -d'"on"'

#restart nodes one at a time
service arangodb restart
service arangodb status

#install RPM on all nodes
yum -y install arangodb3-3.7.11-1.0.x86_64

#kill starter on all nodes, then wait for them to come back up
ps -C arangodb -fww
kill -9 <pid-of-starter>

#upgrade schema on one node
arangodb upgrade --starter.endpoint=http://10.0.18.20:8528

#remove 'KillMode=process' from systemd in [Service] section on all nodes
vim /etc/systemd/system/arangodb.service
systemctl daemon-reload

#restart nodes one at a time
systemctl restart arangodb

#disable maintenance mode on one node
curl http://localhost:8529/_admin/cluster/maintenance -XPUT -d'"off"'
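
For the "restart nodes one at a time" steps, a small helper can make the wait explicit. This is only a sketch, assuming bash and curl; `wait_until` and `coordinator_up` are hypothetical names, and port 8529 is the coordinator port shown in the process listings below. Because authentication is turned on, a 401 reply is treated as "up".

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds: at most $1 attempts, sleeping $2 seconds
# after each failed attempt. Returns 0 on success, 1 if every attempt failed.
wait_until() {
  local tries=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= tries; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# True if the local coordinator answers HTTP at all.
# curl prints 000 when the connection fails, so any other code means "listening".
coordinator_up() {
  local code
  code=$(curl -s -o /dev/null -m 3 -w '%{http_code}' \
    http://localhost:8529/_api/version || true)
  [ -n "$code" ] && [ "$code" != "000" ]
}

# Example: restart this node, then wait up to 30 x 10 seconds before
# moving on to the next one.
# systemctl restart arangodb && wait_until 30 10 coordinator_up
```

This only confirms that the coordinator port is reachable again; it does not check that the agent and dbserver on the node are healthy.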

Additional Info

#db1
#running processes
root     10421  0.0  0.0 114736 14856 ?        Sl   May17   1:36 /usr/bin/arangodb --starter.data-dir=/var/lib/arangodb34-cluster
root     10437  0.4  1.5 719312 252744 ?       Sl   May17  16:27  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/agent8531/arangod.conf --database.directory /var/lib/arangodb34-cluster/agent8531/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/agent8531/apps --log.file /var/lib/arangodb34-cluster/agent8531/arangod.log --log.force-direct false --javascript.copy-installation true --agency.activate true --agency.my-address tcp://10.0.18.20:8531 --agency.size 3 --agency.supervision true --foxx.queues false --server.statistics false --agency.endpoint tcp://10.0.18.21:8531 --agency.endpoint tcp://10.0.18.22:8531
root     10576  0.4 10.5 3740700 1682128 ?     Sl   May17  19:27  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/dbserver8530/arangod.conf --database.directory /var/lib/arangodb34-cluster/dbserver8530/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/dbserver8530/apps --log.file /var/lib/arangodb34-cluster/dbserver8530/arangod.log --log.force-direct false --javascript.copy-installation true --cluster.my-address tcp://10.0.18.20:8530 --cluster.my-role PRIMARY --foxx.queues false --server.statistics true --cluster.agency-endpoint tcp://10.0.18.20:8531 --cluster.agency-endpoint tcp://10.0.18.21:8531 --cluster.agency-endpoint tcp://10.0.18.22:8531
root     10657  0.4  0.7 805796 116588 ?       Sl   May17  20:24  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/coordinator8529/arangod.conf --database.directory /var/lib/arangodb34-cluster/coordinator8529/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/coordinator8529/apps --log.file /var/lib/arangodb34-cluster/coordinator8529/arangod.log --log.force-direct false --javascript.copy-installation true --cluster.my-address tcp://10.0.18.20:8529 --cluster.my-role COORDINATOR --foxx.queues true --server.statistics true --cluster.agency-endpoint tcp://10.0.18.20:8531 --cluster.agency-endpoint tcp://10.0.18.21:8531 --cluster.agency-endpoint tcp://10.0.18.22:8531

#systemd service script
[Unit]
  Description=Run the ArangoDB Starter
  After=sysinit.target sockets.target timers.target paths.target slices.target network.target syslog.target
[Service]
  LimitNOFILE=1048576
  Type=forking
  User=root
  Group=root
  TimeoutSec=5min
  Restart=always
  RestartSec=20
  ExecStart=/usr/bin/arangodb start \
     --starter.data-dir=/var/lib/arangodb34-cluster/  \
     --starter.join=10.0.18.20

  ExecStop=/usr/bin/arangodb stop
[Install]
  WantedBy=multi-user.target
  
#log
2021-05-17T17:15:52Z [7566] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T17:15:52Z [7566] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T17:15:52Z [7566] WARNING [ef6ca] Database version check failed for '_system': downgrade needed
2021-05-17T17:15:52Z [7566] FATAL [290c2] Database version check failed: downgrade needed
2021-05-17T17:46:57Z [9830] INFO [43396] {authentication} Jwt secret not specified, generating...
2021-05-17T17:46:57Z [9830] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T17:46:57Z [9830] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T17:46:57Z [9830] INFO [3844e] {authentication} Authentication is turned on (system only), authentication for unix sockets is turned on
2021-05-17T17:46:57Z [9830] WARNING [ef6ca] Database version check failed for '_system': downgrade needed
2021-05-17T17:46:57Z [9830] FATAL [290c2] Database version check failed: downgrade needed


#db2
#running processes
root     20328  0.0  0.0 114736 13820 ?        Sl   May17   0:56 /usr/bin/arangodb --starter.data-dir=/var/lib/arangodb34-cluster --starter.join=10.0.18.20:8528
root     20342  3.7  1.8 890320 289776 ?       Sl   May17 154:26  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/agent8531/arangod.conf --database.directory /var/lib/arangodb34-cluster/agent8531/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/agent8531/apps --log.file /var/lib/arangodb34-cluster/agent8531/arangod.log --log.force-direct false --javascript.copy-installation true --agency.activate true --agency.my-address tcp://10.0.18.21:8531 --agency.size 3 --agency.supervision true --foxx.queues false --server.statistics false --agency.endpoint tcp://10.0.18.20:8531 --agency.endpoint tcp://10.0.18.22:8531
root     20483  0.6 15.2 3907612 2433596 ?     Sl   May17  25:28  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/dbserver8530/arangod.conf --database.directory /var/lib/arangodb34-cluster/dbserver8530/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/dbserver8530/apps --log.file /var/lib/arangodb34-cluster/dbserver8530/arangod.log --log.force-direct false --javascript.copy-installation true --cluster.my-address tcp://10.0.18.21:8530 --cluster.my-role PRIMARY --foxx.queues false --server.statistics true --cluster.agency-endpoint tcp://10.0.18.20:8531 --cluster.agency-endpoint tcp://10.0.18.21:8531 --cluster.agency-endpoint tcp://10.0.18.22:8531
root     20622  0.4  0.9 781948 149896 ?       Sl   May17  18:39  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/coordinator8529/arangod.conf --database.directory /var/lib/arangodb34-cluster/coordinator8529/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/coordinator8529/apps --log.file /var/lib/arangodb34-cluster/coordinator8529/arangod.log --log.force-direct false --javascript.copy-installation true --cluster.my-address tcp://10.0.18.21:8529 --cluster.my-role COORDINATOR --foxx.queues true --server.statistics true --cluster.agency-endpoint tcp://10.0.18.20:8531 --cluster.agency-endpoint tcp://10.0.18.21:8531 --cluster.agency-endpoint tcp://10.0.18.22:8531

#systemd service script
[Unit]
  Description=Run the ArangoDB Starter
  After=sysinit.target sockets.target timers.target paths.target slices.target network.target syslog.target
[Service]
  LimitNOFILE=1048576
  Type=forking
  User=root
  Group=root
  TimeoutSec=5min
  Restart=always
  RestartSec=20
  ExecStart=/usr/bin/arangodb start \
     --starter.data-dir=/var/lib/arangodb34-cluster/  \
     --starter.join=10.0.18.20

  ExecStop=/usr/bin/arangodb stop
[Install]
  WantedBy=multi-user.target
  
#log
2021-05-17T17:15:52Z [19478] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T17:15:52Z [19478] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T17:15:53Z [19478] WARNING [ef6ca] Database version check failed for '_system': downgrade needed
2021-05-17T17:15:53Z [19478] FATAL [290c2] Database version check failed: downgrade needed
2021-05-17T17:46:57Z [20122] INFO [43396] {authentication} Jwt secret not specified, generating...
2021-05-17T17:46:57Z [20122] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T17:46:57Z [20122] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T17:46:57Z [20122] INFO [3844e] {authentication} Authentication is turned on (system only), authentication for unix sockets is turned on
2021-05-17T17:46:57Z [20122] WARNING [ef6ca] Database version check failed for '_system': downgrade needed
2021-05-17T17:46:57Z [20122] FATAL [290c2] Database version check failed: downgrade needed

#db3
#running processes
root     12912  0.0  0.0 114736 13364 ?        Sl   May17   0:59 /usr/bin/arangodb --starter.data-dir=/var/lib/arangodb34-cluster --starter.join=10.0.18.20:8528
root     12927  0.3  1.5 714192 244988 ?       Sl   May17  15:12  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/agent8531/arangod.conf --database.directory /var/lib/arangodb34-cluster/agent8531/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/agent8531/apps --log.file /var/lib/arangodb34-cluster/agent8531/arangod.log --log.force-direct false --javascript.copy-installation true --agency.activate true --agency.my-address tcp://10.0.18.22:8531 --agency.size 3 --agency.supervision true --foxx.queues false --server.statistics false --agency.endpoint tcp://10.0.18.20:8531 --agency.endpoint tcp://10.0.18.21:8531
root     13066  0.5 11.7 3285020 1868972 ?     Sl   May17  21:54  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/dbserver8530/arangod.conf --database.directory /var/lib/arangodb34-cluster/dbserver8530/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/dbserver8530/apps --log.file /var/lib/arangodb34-cluster/dbserver8530/arangod.log --log.force-direct false --javascript.copy-installation true --cluster.my-address tcp://10.0.18.22:8530 --cluster.my-role PRIMARY --foxx.queues false --server.statistics true --cluster.agency-endpoint tcp://10.0.18.20:8531 --cluster.agency-endpoint tcp://10.0.18.21:8531 --cluster.agency-endpoint tcp://10.0.18.22:8531
root     13206  0.5  1.0 914732 165792 ?       Sl   May17  21:11  \_ /usr/sbin/arangod -c /var/lib/arangodb34-cluster/coordinator8529/arangod.conf --database.directory /var/lib/arangodb34-cluster/coordinator8529/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /var/lib/arangodb34-cluster/coordinator8529/apps --log.file /var/lib/arangodb34-cluster/coordinator8529/arangod.log --log.force-direct false --javascript.copy-installation true --cluster.my-address tcp://10.0.18.22:8529 --cluster.my-role COORDINATOR --foxx.queues true --server.statistics true --cluster.agency-endpoint tcp://10.0.18.20:8531 --cluster.agency-endpoint tcp://10.0.18.21:8531 --cluster.agency-endpoint tcp://10.0.18.22:8531

#systemd service script
[Unit]
  Description=Run the ArangoDB Starter
  After=sysinit.target sockets.target timers.target paths.target slices.target network.target syslog.target
[Service]
  LimitNOFILE=1048576
  Type=forking
  User=root
  Group=root
  TimeoutSec=5min
  Restart=always
  RestartSec=20
  ExecStart=/usr/bin/arangodb start \
     --starter.data-dir=/var/lib/arangodb34-cluster/  \
     --starter.join=10.0.18.20

  ExecStop=/usr/bin/arangodb stop
[Install]
  WantedBy=multi-user.target

#log
2021-05-17T17:15:52Z [8302] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T17:15:52Z [8302] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T17:15:53Z [8302] WARNING [ef6ca] Database version check failed for '_system': downgrade needed
2021-05-17T17:15:53Z [8302] FATAL [290c2] Database version check failed: downgrade needed
2021-05-17T17:46:57Z [8973] INFO [43396] {authentication} Jwt secret not specified, generating...
2021-05-17T17:46:57Z [8973] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T17:46:57Z [8973] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T17:46:57Z [8973] INFO [3844e] {authentication} Authentication is turned on (system only), authentication for unix sockets is turned on
2021-05-17T17:46:57Z [8973] WARNING [ef6ca] Database version check failed for '_system': downgrade needed
2021-05-17T17:46:57Z [8973] FATAL [290c2] Database version check failed: downgrade needed
2021-05-17T19:51:49Z [11303] INFO [43396] {authentication} Jwt secret not specified, generating...
2021-05-17T19:51:49Z [11303] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T19:51:49Z [11303] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T19:51:49Z [11303] INFO [3844e] {authentication} Authentication is turned on (system only), authentication for unix sockets is turned on
2021-05-17T20:06:34Z [12739] INFO [43396] {authentication} Jwt secret not specified, generating...
2021-05-17T20:06:34Z [12739] INFO [144fe] using storage engine 'rocksdb'
2021-05-17T20:06:34Z [12739] INFO [a1c60] {syscall} file-descriptors (nofiles) hard limit is 8192, soft limit is 8192
2021-05-17T20:06:34Z [12739] INFO [3844e] {authentication} Authentication is turned on (system only), authentication for unix sockets is turned on
2021-05-17T20:06:34Z [12739] WARNING [ef6ca] Database version check failed for '_system': downgrade needed
2021-05-17T20:06:34Z [12739] FATAL [290c2] Database version check failed: downgrade needed
