Managing Nodeos P2P Peer Connections

Recently there was a fault at one of the sites of the hosting provider we use at Telos UK. This caused an unclean shutdown of a couple of our nodeos instances, leaving them in a “Database dirty flag set” state. I attempted to replay from a snapshot file with the existing blocks.log file in place, as described in Replay from Snapshot with Blocks Log in my prior post Nodeos Replays, but this time the replay failed with the following error message in the log:

Error unpacking field new_producers
    {"field":"new_producers"}
    nodeos  raw.hpp:367 operator()
error unpacking eosio::chain::signed_block
    {"type":"eosio::chain::signed_block"}
    nodeos  raw.hpp:644 unpack

So, I removed the blocks.log and tried again. This succeeded but of course meant that the blocks required to take us from the snapshot back to being in synchronisation with the chain would have to be entirely sourced via the P2P (peer-to-peer) network rather than being partially replayed from the blocks.log.

At Telos UK, we run a main and standby pair of servers in different data centres. Each server runs two instances of nodeos, a block producing node and a kind of proxy node that synchronises with the peer nodes of other Telos block producers and provides our external API service. Our block producing nodes only synchronise in a mesh with our own internal nodes.

The snapshot that I used to recover both nodes at the affected site was a couple of months old. I set the synchronisation going late at night and checked on its progress the next morning, about eight hours later. To my surprise, whereas the block producing node had already caught up with the chain, the proxy node had only received about a week’s worth of blocks – in ballpark terms, at least 180 times the normal synchronisation rate for the block producing node versus only about 8 times for the proxy node.

I take care not to include hosts in our P2P peer lists that produce Net Plugin errors in our nodeos logs. Even so, I noted that whereas the log for our block producing node reported nothing other than requests for ranges of blocks during catchup, in the proxy node’s log those requests were interspersed with various other Net Plugin info messages which, though I wasn’t confident that I understood them, suggested all was not well.

I also temporarily made the proxy node in catchup unavailable for incoming P2P connections to see if this made any difference to its catchup speed. It did not.

This set me thinking: could it be that:

  1. The info messages that I saw in the log for our proxy node are indeed indicative that all is not well, as they appeared to be?
  2. Dealing with whatever it is that these info messages are reporting explains the poor catchup performance of our proxy node relative to our block producing node?

So, I set about exploring these questions by synchronising a test node with the Telos testnet. I believe that what I found out may be useful to the wider Telos community and indeed other communities who manage chains based on EOSIO. As always, comments that advance, expand on or correct the learning in this post are most welcome.

Starting a Testnet Node from Genesis

I repeated the steps described under Start from Genesis and Successful Shutdown in my prior post Nodeos Replays – note that I am again using my Docker Compose EOSIO services, which is reflected throughout what follows in this post. The successful shutdown and restart of nodeos using the command docker-compose up nodeos was done to bring nodeos up with convenient cleos access, again via my Docker Compose EOSIO services.

As described previously, I used the “Get Peers” tool in my Telos toolbox to generate a fresh list of p2p-peer-address values for Telos testnet and added the statement to enable the EOSIO net API plugin, to produce the config.ini file below:

plugin = eosio::net_api_plugin
# p2p-peer-address list generated at 06:17:32 on 21-Oct-21
# Chain id=1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f
# cryptosuviio
p2p-peer-address=testnet.telos.cryptosuvi.io:9876
# telosunlimit
p2p-peer-address=p2p.testnet.telosunlimited.io:9876
# blindblocart
p2p-peer-address=testnet.blindblocart.io:9876
# persiantelos
p2p-peer-address=mainnet.persiantelos.com:8880
# telosuknodes
p2p-peer-address=testnet-p2p.telosuk.io:9877
# eosphereiobp
p2p-peer-address=peer1-telos-testnet.eosphere.io:9876
# caleosblocks
p2p-peer-address=testnet.telos.caleos.io:9879
# telosgermany
p2p-peer-address=telosgermany-testnet.genereos.io:9876
# telosarabia1
p2p-peer-address=testnet2p2p.telosarabia.net:9876
# katalyotelos
p2p-peer-address=tlos-p2p-test.katalyo.com:29877
# bp.boid
p2p-peer-address=boid-telos-testnet-p2p.animusystems.com:3535
p2p-peer-address=boid-telos-testnet-p2p.animusystems.com:5050
# telosboidbp1
p2p-peer-address=boid-telos-testnet-p2p.animusystems.com:3535
p2p-peer-address=boid-telos-testnet-p2p.animusystems.com:5050
# edeniaedenia
p2p-peer-address=telos-testnet.eosio.cr:9879
# 3boidanimus3
p2p-peer-address=boid-telos-testnet-p2p.animusystems.com:3535
p2p-peer-address=boid-telos-testnet-p2p.animusystems.com:5050

While I have noted the inclusion of the line plugin = eosio::net_api_plugin, I should perhaps highlight the following “Caution” notice on the Net API Plugin page of the EOSIO version 2.0 manual:

This plugin exposes endpoints that allow management of p2p connections. Running this plugin on a publicly accessible node is not recommended as it can be exploited.

Of course, there is no issue enabling this plugin for my isolated test node, but we cannot enable it on our mainnet proxy nodes since they combine both peering and API functions – see Conclusions.

In the usage instructions for the “Get Peers” tool in my Telos toolbox it says:

Note that the report does not attempt nodeos communication with a P2P endpoint as a validation check for the p2p-peer-address entries. Its tests include, for example, validation by telnet to the advertised address, but they do not include anything based on the relevant EOSIO communication protocols. Thus, if an advertised p2p-peer-address passes all the tests that this tool applies, this is not a guarantee that synchronisation via that address will be possible.

In light of this, I always monitor the error messages appearing in the nodeos log when starting a node up. I do this for my test node now using:

docker logs --follow eosio-nodeos 2>&1 >/dev/null | grep error

I monitor this for a few minutes as I’ve observed before that there can be initial connection errors for peers that resolve themselves after a short period of time. Before long, I only see this report for a single peer, appearing every thirty seconds in the log:

error 2021-10-21T06:35:06.046 net-1     net_plugin.cpp:2388           operator()           ] incoming message length unexpected (1347703880)
error 2021-10-21T06:35:06.046 net-1     net_plugin.cpp:2437           operator()           ] Closing connection to: mainnet.persiantelos.com:8880

Fortunately, log entries in the “error” category are usually highlighted in red in the console and so are easy to spot in the nodeos logs even without filtering them.
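Incidentally, the “unexpected” length value in the first of those error lines may itself be a clue. Here is a minimal sketch of decoding it, on the assumption that the Net Plugin reads the four-byte message length prefix as a little-endian integer:

# Convert the reported length to hexadecimal
printf '%08x\n' 1347703880
# -> 50545448
# Read as little-endian, the bytes are 0x48 0x54 0x54 0x50,
# which are the ASCII codes for "H" "T" "T" "P"

If that assumption holds, the peer appears to have answered with an HTTP response where a P2P protocol message was expected, which would explain why the length field looks nonsensical to nodeos.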

I believe that what’s happening here is that the “Get Peers” tool in my Telos toolbox (these steps are sketched in shell after the list below):

  1. Finds a producer record on the Telos testnet chain for persiantelos, with a URL indicated as https://persiantelos.com.
  2. Does a GET to https://persiantelos.com/chains.json, which returns the following JSON:
{
   "chains": {
      "4667b205c6838ef70ff7988f6e8257e8be0e1284a2f59699054a018f743b1d11": "/bp.json",
      "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f": "/bp.json"
   }
}
  3. Matches the Telos testnet chain id (1eaa082…), which directs it to /bp.json – though, as we can see, the entries for both Telos mainnet and Telos testnet in this file direct to /bp.json, which I think is the source of the error here.
  4. Does a GET to https://persiantelos.com/bp.json and retrieves the values of p2p_endpoint from the array of nodes in the JSON returned, which yields a single value of mainnet.persiantelos.com:8880.
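The following is a rough sketch of those steps using curl and jq (illustrative only – the real “Get Peers” tool applies further validation; the chain id and URL are those quoted above):

#!/usr/bin/env bash
# Illustrative sketch of the chains.json / bp.json lookup described above.
# Assumes curl and jq are installed.
TESTNET_CHAIN_ID="1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f"
BP_URL="https://persiantelos.com"

# Find which bp.json path covers the Telos testnet chain id
BP_JSON_PATH=$(curl -s "$BP_URL/chains.json" | jq -r --arg id "$TESTNET_CHAIN_ID" '.chains[$id]')

# Pull out every advertised p2p_endpoint for that chain
curl -s "$BP_URL$BP_JSON_PATH" | jq -r '.nodes[].p2p_endpoint | select(. != null)'
# -> mainnet.persiantelos.com:8880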

I believe that, as the name of this p2p_endpoint would suggest, it’s actually a peering node for Telos mainnet and that is why my attempt to connect to it from a testnet node is producing Net Plugin errors. I should point out here that:

  • The Telos block producer community works collaboratively, not in competition, for the good of Telos.
  • This is Telos testnet, not mainnet, and indeed this post would not have been possible had I not encountered both a p2p_endpoint that gave me Net Plugin errors and another that gave me Net Plugin issues that manifested as info messages.
  • We can see that the inherent resilience that is built into EOSIO readily overcomes any such issues.

Using the EOSIO Net API Plugin

Since my test node has the EOSIO Net API plugin enabled, I am able to use cleos to get the status of all peers connected to it:

docker-compose run cleos net peers

This produces the following output:

[{
    "peer": "testnet.telos.cryptosuvi.io:9876",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1210,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "0d6400bcaa37f115a4c7f64da4ff4f54b6ae9f07e5fbe44cba5efe2f8171a86e",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269583622709",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "Ubuntu-1804-bionic-64-minimal:9876 - 0d6400b",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399828,
      "head_id": "08214bd487895288817e6e27fe4b8fdb8916bec564cddcff4d837f853c530ae0",
      "os": "linux",
      "agent": "\"Cryptosuvi.io\"",
      "generation": 1
    }
  },{
    "peer": "p2p.testnet.telosunlimited.io:9876",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1210,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "686ec1a2c16abda10be4688ac8692a3150cc99a910237068b5c205fdac38e927",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269716439000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "condor2233:9876 - 686ec1a",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399828,
      "head_id": "08214bd487895288817e6e27fe4b8fdb8916bec564cddcff4d837f853c530ae0",
      "os": "linux",
      "agent": "\"telosunlimit\"",
      "generation": 1
    }
  },{
    "peer": "testnet.blindblocart.io:9876",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1210,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "fc0c4915fbb5b08c123c843d96eab1386d7073c9fde36143145b872928ee564b",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798309643881526",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "BBA-TEST.server.com:9876 - fc0c491",
      "last_irreversible_block_num": 136399572,
      "last_irreversible_block_id": "08214ad4ef030578f021e840f1ef4ac9ff76a1b7c2d0b8068ed85e1d875b0bfb",
      "head_num": 136399908,
      "head_id": "08214c24709d309b228fc89447393c4eddadc97aada563c8ee12ae2508f45184",
      "os": "linux",
      "agent": "EOS Test Agent",
      "generation": 8
    }
  },{
    "peer": "mainnet.persiantelos.com:8880",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 0,
      "chain_id": "0000000000000000000000000000000000000000000000000000000000000000",
      "node_id": "0000000000000000000000000000000000000000000000000000000000000000",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": 0,
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "",
      "last_irreversible_block_num": 0,
      "last_irreversible_block_id": "0000000000000000000000000000000000000000000000000000000000000000",
      "head_num": 0,
      "head_id": "0000000000000000000000000000000000000000000000000000000000000000",
      "os": "",
      "agent": "",
      "generation": 0
    }
  },{
    "peer": "testnet-p2p.telosuk.io:9877",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1206,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "9beeaf6e9e957c0c65df7df3f3ea272b7690a4fe7c035bec47fba6c1abe48a07",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269803860000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "telostestnet01:9877 - 9beeaf6",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399828,
      "head_id": "08214bd487895288817e6e27fe4b8fdb8916bec564cddcff4d837f853c530ae0",
      "os": "linux",
      "agent": "\"latin textbook\"",
      "generation": 1
    }
  },{
    "peer": "peer1-telos-testnet.eosphere.io:9876",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1206,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "dae17e1c30aaeb41d4beded401301a1390aff3a421bd1c07be8064717cfa8ed0",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798270161575000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "telos-testnet.eosphere.io:9876 - dae17e1",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399829,
      "head_id": "08214bd5226b86ce3c71cfb5e8464f496255f2d2eb631fafe1ff345de50165ce",
      "os": "linux",
      "agent": "\"EOSphere Telos Testnet\"",
      "generation": 1
    }
  },{
    "peer": "testnet.telos.caleos.io:9879",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1206,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "528d52d68af88e81e7aa9e7fd362ab63351838e45a07b7a8037a70577f0f70b9",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269942640000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "api2.hosts.caleos.io:9876 - 528d52d",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399828,
      "head_id": "08214bd487895288817e6e27fe4b8fdb8916bec564cddcff4d837f853c530ae0",
      "os": "linux",
      "agent": "\"CalEOS testnet state-history\"",
      "generation": 1
    }
  },{
    "peer": "telosgermany-testnet.genereos.io:9876",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1210,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "daf4d0033171e1cb583ae0172478d64448912988ba5f70a12103e7f7534070d6",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269931051000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "tf-telos-testnet-fn-1:9876 - daf4d00",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399829,
      "head_id": "08214bd5226b86ce3c71cfb5e8464f496255f2d2eb631fafe1ff345de50165ce",
      "os": "linux",
      "agent": "\"Telos Germany\"",
      "generation": 1
    }
  },{
    "peer": "testnet2p2p.telosarabia.net:9876",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1206,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "cd23b484b2a790c718c86173080b1168adf8d543b0a5de5f9291453237f9a308",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269862566017",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "65.21.11.228:9876 - cd23b48",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399828,
      "head_id": "08214bd487895288817e6e27fe4b8fdb8916bec564cddcff4d837f853c530ae0",
      "os": "linux",
      "agent": "\"EOS Test Agent\"",
      "generation": 1
    }
  },{
    "peer": "tlos-p2p-test.katalyo.com:29877",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1206,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "b4d29f4ae3abfd9c3cd994ebd76c202eb205581e285d42e28c44aae9b68fcde0",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269932457000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "telos2.katalyo.com:29877 - b4d29f4",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399829,
      "head_id": "08214bd5226b86ce3c71cfb5e8464f496255f2d2eb631fafe1ff345de50165ce",
      "os": "linux",
      "agent": "\"EOS Test Agent\"",
      "generation": 1
    }
  },{
    "peer": "boid-telos-testnet-p2p.animusystems.com:3535",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1210,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "09223c8964768688aebde789f21041620c91a0db75ffc11c9a1ba39330d3103a",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269964773000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "boid-telos-testnet-p2p.animusystems.com:3535 - 09223c8",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399829,
      "head_id": "08214bd5226b86ce3c71cfb5e8464f496255f2d2eb631fafe1ff345de50165ce",
      "os": "linux",
      "agent": "bp.boid",
      "generation": 1
    }
  },{
    "peer": "boid-telos-testnet-p2p.animusystems.com:5050",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1210,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "051cb1d94b02a600d35267db15e299d47ce9aa8a13747caab516204f1b513cbb",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798269945186000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "boid-telos-testnet-p2p.animusystems.com:5353 - 051cb1d",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399829,
      "head_id": "08214bd5226b86ce3c71cfb5e8464f496255f2d2eb631fafe1ff345de50165ce",
      "os": "linux",
      "agent": "bp.boid-backup",
      "generation": 1
    }
  },{
    "peer": "telos-testnet.eosio.cr:9879",
    "connecting": false,
    "syncing": false,
    "last_handshake": {
      "network_version": 1210,
      "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
      "node_id": "7cd6f8576a741522fc62f84d640811225607c78c9c60403ef1f237d87548ce75",
      "key": "EOS1111111111111111111111111111111114T1Anm",
      "time": "1634798270159928000",
      "token": "0000000000000000000000000000000000000000000000000000000000000000",
      "sig": "SIG_K1_111111111111111111111111111111111111111111111111111111111111111116uk5ne",
      "p2p_address": "telos-testnet.eosio.cr:9879 - 7cd6f85",
      "last_irreversible_block_num": 136399500,
      "last_irreversible_block_id": "08214a8c78ffc5939a2cb2b8da0eaa33e8ecb3e8bd92f39d45a15c6fa14b9ecc",
      "head_num": 136399829,
      "head_id": "08214bd5226b86ce3c71cfb5e8464f496255f2d2eb631fafe1ff345de50165ce",
      "os": "linux",
      "agent": "\"telos-testnet.eosio.cr\"",
      "generation": 1
    }
  }
]

Some observations:

  • The peer field of each entry corresponds to a p2p-peer-address in the config.ini file. However, it is the p2p_address, where one can be established, that is reported in the nodeos log files; the p2p_address appears to be what the peer reports about itself once communication has been successfully established via the p2p-peer-address. This listing therefore gives the mapping between the two (see the sketch after this list).
  • The connecting and syncing fields are always reported as false whenever I produce this peers list. However, I am definitely receiving blocks from the peers and so must be communicating with at least one of them. Perhaps these fields will show true after my node has caught up with the chain? I would welcome any insight into the meaning of these values in the comments on this post.
  • The entry for the peer mainnet.persiantelos.com:8880 is distinct from all the others and consistent with what we’re seeing in the nodeos logs.
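As a small convenience, the peer to p2p_address mapping can be pulled straight out of that listing. A minimal sketch, assuming jq is installed and that any non-JSON start-up output from docker-compose stays out of the pipe (it appears to go to stderr):

# Map each configured p2p-peer-address to the p2p_address the peer reports about itself
docker-compose run --rm cleos net peers \
  | jq -r '.[] | "\(.peer)  ->  \(.last_handshake.p2p_address)"'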

At this point, I decided to conduct some little tests to further examine the capabilities of the EOSIO Net API Plugin as follows:

Test

Run:
docker-compose run --rm cleos net disconnect mainnet.persiantelos.com:8880

Result

Error reports for this peer in the nodeos log cease and it is no longer reported by:
docker-compose run --rm cleos net peers

Inference

The net disconnect command takes effect immediately, without having to restart the node.

Test

Shut down the node normally and start it up again.

Result

Error reports for mainnet.persiantelos.com:8880 resume in the nodeos log and it is reported again by docker-compose run --rm cleos net peers.

Inference

The disconnect of this peer is not remembered and the node attempts to connect to all peers identified by p2p-peer-address entries in the config.ini file on each start-up.
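If I wanted the removal to survive a restart, the corresponding p2p-peer-address line would have to be disabled in config.ini before starting the node again. A minimal sketch, assuming the file lives at ./config.ini (adjust the path to your own layout):

# Comment out the offending peer so it is not redialled on the next start-up
sed -i 's|^p2p-peer-address=mainnet.persiantelos.com:8880|# &|' ./config.ini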

Test

Shut down the node normally, comment out the config.ini entry for testnet.telos.cryptosuvi.io:9876 and restart the node.

Run in turn:
docker-compose run --rm cleos net peers
docker-compose run --rm cleos net connect testnet.telos.cryptosuvi.io:9876
docker-compose run --rm cleos net peers

Result

The peer testnet.telos.cryptosuvi.io:9876 is reported the second time that the net peers command is run but not the first time.

Inference

It is possible to use the net connect command to dynamically connect to peers that are not referenced in a p2p-peer-address entry in the node’s config.ini file.
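It is worth noting that cleos is just a convenient wrapper here; the same operations are exposed as HTTP endpoints by the Net API plugin. A sketch of the equivalent calls, assuming the node’s http-server-address is listening on localhost:8888 (adjust to your own configuration):

# Equivalent calls against the Net API plugin's HTTP endpoints
curl -s http://localhost:8888/v1/net/connections     # peer status, as with "cleos net peers"
curl -s --data '"testnet.telos.cryptosuvi.io:9876"' http://localhost:8888/v1/net/connect
curl -s --data '"mainnet.persiantelos.com:8880"' http://localhost:8888/v1/net/disconnect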

Net Plugin Info Messages During Catchup

For this next part of my test, I first make sure that my node is not attempting to connect with any peer that is currently being reported as an error, in this case just the single instance for mainnet.persiantelos.com:8880. This is in line with my normal operational practice. While my node remains in catchup with the chain, i.e. well behind the current head block, the Net Plugin continuously outputs info messages like this to the nodeos log:

info 2021-10-23T07:45:35.158 net-1 net_plugin.cpp:1556 operator() ] requesting range 209465 to 209564, from 65.21.11.228:9876 - cd23b48

It’s clear that while my node is in this catchup mode, it is requesting ranges of historical blocks from the peers that it is connected to. If I look closely, I can see that, for one peer only, messages like this are immediately followed by two other Net Plugin info messages:

info  2021-10-23T07:45:52.323 net-1     net_plugin.cpp:1556           operator()           ] requesting range 212365 to 212464, from BBA-TEST.server.com:9876 - fc0c491
info  2021-10-23T07:45:52.442 net-0     net_plugin.cpp:1670           recv_handshake       ] handshake from BBA-TEST.server.com:9876 - fc0c491, lib 136753022, head 136753356, head id d9613153ec8f34a5.. sync 1
info  2021-10-23T07:45:57.324 net-0     net_plugin.cpp:1619           sync_reassign_fetch  ] reassign_fetch, our last req is 212464, next expected is 212365 peer BBA-TEST.server.com:9876 - fc0c491

Every time my node requests a range of blocks from this peer, the peer responds with a handshake, and then it appears that my node reassigns that fetch to another, later range of blocks. I confess that I don’t understand exactly what is going on here. However, I infer that all is not well in respect of the catchup of blocks from this particular peer node.

In this test, I am only seeing out of the ordinary info messages of this kind for this single peer. During the mainnet catchup that motivated this post, however, I observed out of the ordinary messages for multiple peers during the uncomfortably slow catchup of our external peering node.

I will monitor the catchup speed of my experimental testnet node, first with this peer included in the list of those that I am connecting to and then again with it removed. To do this, I run this command to get a timestamped status report on my node’s catchup with the chain:

date && docker-compose run --rm cleos get info

This produces the following output:

Sat 23 Oct 09:04:52 BST 2021
Starting eosio-keosd ... done
{
  "server_version": "32a47e39",
  "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
  "head_block_num": 216933,
  "last_irreversible_block_num": 216932,
  "last_irreversible_block_id": "00034f64dca30c7eb2ce81a92c68eb83f767f4d71504c510afc5481df55c2ada",
  "head_block_id": "00034f65462965fc4730a39e627fbb6a21c2f089e43d7308bd6a6d63cc470cac",
  "head_block_time": "2019-08-17T04:40:44.000",
  "head_block_producer": "eosio",
  "virtual_block_cpu_limit": 200000000,
  "virtual_block_net_limit": 1048576000,
  "block_cpu_limit": 200000,
  "block_net_limit": 1048576,
  "server_version_string": "v2.0.12",
  "fork_db_head_block_num": 216933,
  "fork_db_head_block_id": "00034f65462965fc4730a39e627fbb6a21c2f089e43d7308bd6a6d63cc470cac",
  "server_full_version_string": "v2.0.12-32a47e396a13b948263b1dc9762a6e2292a2c2ef"
}

After roughly half an hour, I run the same command again, which this time produces this output:

Sat 23 Oct 09:34:39 BST 2021
Starting eosio-keosd ... done
{
  "server_version": "32a47e39",
  "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
  "head_block_num": 468663,
  "last_irreversible_block_num": 468662,
  "last_irreversible_block_id": "000726b6705c27f8973be5e39f11efb923ce5e79b33f3780006514b9f36af9af",
  "head_block_id": "000726b765d2335c9993055a30c2d805daccf36724c2fa0c94d484d3324ded40",
  "head_block_time": "2019-08-18T15:38:29.000",
  "head_block_producer": "eosio",
  "virtual_block_cpu_limit": 200000000,
  "virtual_block_net_limit": 1048576000,
  "block_cpu_limit": 200000,
  "block_net_limit": 1048576,
  "server_version_string": "v2.0.12",
  "fork_db_head_block_num": 468663,
  "fork_db_head_block_id": "000726b765d2335c9993055a30c2d805daccf36724c2fa0c94d484d3324ded40",
  "server_full_version_string": "v2.0.12-32a47e396a13b948263b1dc9762a6e2292a2c2ef"
}

A gain of 251,730 blocks (468,663-216,933) in 1,787 seconds (09:34:39-09:04:52) or approximately 141 blocks per second.
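Rather than doing this arithmetic by hand each time, the measurement can be scripted. A rough sketch, assuming jq is installed and that the docker-compose start-up chatter stays out of the pipe (the 1800-second interval simply mirrors the roughly half-hour gap used above):

#!/usr/bin/env bash
# Sample head_block_num twice and report the catchup rate in blocks per second.
INTERVAL=1800   # seconds between samples

head_block() {
  docker-compose run --rm cleos get info | jq -r '.head_block_num'
}

start_block=$(head_block); start_time=$(date +%s)
sleep "$INTERVAL"
end_block=$(head_block); end_time=$(date +%s)

echo "Gained $(( end_block - start_block )) blocks in $(( end_time - start_time )) seconds"
echo "Rate: $(( (end_block - start_block) / (end_time - start_time) )) blocks per second"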

I now run the command docker-compose run --rm cleos net disconnect testnet.blindblocart.io:9876, which produces the response "connection removed", and I also check that my nodeos log output is “clean”, i.e. that it only reports the expected “requesting range” messages and no others.

I now once again run the command to produce a timestamped synchronisation status report:

Sat 23 Oct 09:40:01 BST 2021
Starting eosio-keosd ... done
{
  "server_version": "32a47e39",
  "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
  "head_block_num": 539173,
  "last_irreversible_block_num": 539172,
  "last_irreversible_block_id": "00083a244ce1e7e412b646bc2a07667ef4c70bcb4a0cc422c6d3dcdb38dc5035",
  "head_block_id": "00083a25142563b990f3a1f777a5fe70fb059ee6396aa54b71a2a58a3a25e77e",
  "head_block_time": "2019-08-19T01:26:04.000",
  "head_block_producer": "eosio",
  "virtual_block_cpu_limit": 200000000,
  "virtual_block_net_limit": 1048576000,
  "block_cpu_limit": 200000,
  "block_net_limit": 1048576,
  "server_version_string": "v2.0.12",
  "fork_db_head_block_num": 539173,
  "fork_db_head_block_id": "00083a25142563b990f3a1f777a5fe70fb059ee6396aa54b71a2a58a3a25e77e",
  "server_full_version_string": "v2.0.12-32a47e396a13b948263b1dc9762a6e2292a2c2ef"
}

And yet again approximately half an hour later:

Sat 23 Oct 10:10:02 BST 2021
Starting eosio-keosd ... done
{
  "server_version": "32a47e39",
  "chain_id": "1eaa0824707c8c16bd25145493bf062aecddfeb56c736f6ba6397f3195f33c9f",
  "head_block_num": 1194673,
  "last_irreversible_block_num": 1194492,
  "last_irreversible_block_id": "001239fc04f6c4ec8322f374722a76edfde0adafc696ca34c417ce7f3f922ed8",
  "head_block_id": "00123ab1cdb18f34d9459f1bbaa428cf28ae4ef6f87a26a1baf0f2d4cf63557b",
  "head_block_time": "2019-08-22T20:37:42.000",
  "head_block_producer": "swedencornet",
  "virtual_block_cpu_limit": 200000000,
  "virtual_block_net_limit": 1048576000,
  "block_cpu_limit": 200000,
  "block_net_limit": 1048576,
  "server_version_string": "v2.0.12",
  "fork_db_head_block_num": 1194673,
  "fork_db_head_block_id": "00123ab1cdb18f34d9459f1bbaa428cf28ae4ef6f87a26a1baf0f2d4cf63557b",
  "server_full_version_string": "v2.0.12-32a47e396a13b948263b1dc9762a6e2292a2c2ef"
}

This time a gain of 655,500 blocks (1,194,673-539,173) in 1,801 seconds (10:10:02-09:40:01) or approximately 364 blocks per second.

So, there has certainly been a significant performance improvement from removing just that single peer from the list of those we were connecting to. This isn’t as dramatic as the difference we saw between the catchup performance of our block producing and proxy nodes on mainnet, but there the proxy node showed peer connection issues, reported by the Net Plugin as info messages, for several peers.

Nodes that are in synchronisation with the chain, and just peering to remain so, only have to process two blocks per second. So, I assume that they are insensitive to such issues.

Conclusions

  1. It is extremely useful to have the Net Plugin API available on a peering node.
  2. Our current arrangement whereby our peering nodes also provide our external API service is problematic. It prevents us from having the Net API plugin enabled on those nodes. A better approach would be to separate these functions and only expose the API service on the peering nodes internally for our own, secure use (see the config sketch after these conclusions).
  3. If you have nodes in catchup mode, as new block producers might, for example, then to get good catchup performance you need to pay attention to any out of the ordinary Net Plugin messages, whether or not they are categorised as errors.
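To illustrate conclusion 2, here is a minimal sketch of how the config.ini for a dedicated peering node might look, with the HTTP interface (and therefore the Net API plugin) bound to localhost so that it is only reachable internally – the addresses are illustrative assumptions, not our production values:

# Sketch only - illustrative values
plugin = eosio::net_api_plugin
plugin = eosio::chain_api_plugin

# Keep the HTTP (and hence Net API) interface internal
http-server-address = 127.0.0.1:8888

# P2P remains externally reachable for peering
p2p-listen-endpoint = 0.0.0.0:9876
# p2p-server-address = your.advertised.address:9876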
