Problems with inter_dc when the first address in the list leads to connection timeout. #470

@define-null

While experimenting with a multi-DC setup on macOS, I ran into the following problems:

  1. Antidote picks up every inet interface on the machine when it collects addresses via inter_dc_pub:getting_addresses/1, including the addresses of utun interfaces (the virtual interfaces macOS creates for VPNs). Connecting to the port on such an address from inter_dc_sub:add_dc then ends in a ZeroMQ connection timeout.

  2. The timeout for gen_server:call in inter_dc_sub:add_dc and the ZeroMQ connection timeout are both set to 5 seconds. So even though there is another address to try, inter_dc_sub fails on the first faulty one, because the call's time budget is used up before the next address is attempted. The retry logic in inter_dc_manager:connect_nodes, in turn, does not look at the reason for the failure. It retries the same node with the same address list, so the first faulty address is picked again and the same failure repeats.
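
To make the second problem concrete, here is a minimal, language-neutral sketch in Python (not Antidote's Erlang code) of one way to keep a single dead address from consuming the whole call budget: split the budget across the candidate addresses and fall through to the next one on failure. The names `connect_first_reachable` and `fake_connect`, the addresses, and the port are all hypothetical. The 5-second default mirrors the timeout mentioned above.

```python
from typing import Callable, Sequence, Tuple

Address = Tuple[str, int]


def connect_first_reachable(
    addresses: Sequence[Address],
    connect: Callable[[Address, float], object],
    call_budget_s: float = 5.0,
) -> Tuple[Address, object]:
    """Try each address in order and return the first one that connects."""
    if not addresses:
        raise ValueError("no addresses to try")
    # Split the caller's budget so one dead address cannot use all of it.
    per_address_timeout_s = call_budget_s / len(addresses)
    errors = []
    for address in addresses:
        try:
            return address, connect(address, per_address_timeout_s)
        except OSError as exc:  # TimeoutError and ConnectionError subclass OSError
            errors.append((address, exc))
    raise ConnectionError(f"all addresses failed: {errors}")


def fake_connect(address: Address, timeout_s: float) -> str:
    """Stand-in for a real connect: the first address always times out."""
    if address[0] == "198.51.100.7":  # hypothetical unreachable VPN address
        raise TimeoutError(f"no answer within {timeout_s}s")
    return f"connected to {address[0]}:{address[1]}"


candidates = [("198.51.100.7", 8086), ("192.0.2.10", 8086)]
chosen, connection = connect_first_reachable(candidates, fake_connect)
```

With two candidates, each attempt gets 2.5 seconds, so the second address is still tried inside the 5-second call. A retry loop built on this could also drop or reorder addresses that failed, rather than starting from the same first entry again.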

The first issue should be easy to fix by making the addresses Antidote listens on configurable. The second calls for a more careful decision about how to handle reconnects.
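
As a rough illustration of the first fix, the sketch below (Python again, not Antidote's Erlang code) lets an explicitly configured address list override interface discovery. The function name `advertised_addresses`, its parameters, and the sample interface data are assumptions made for this example. The fallback of skipping interfaces whose names start with `utun` is also an assumption, not something the issue proposes.

```python
import ipaddress
from typing import Dict, List, Optional, Sequence


def advertised_addresses(
    discovered: Dict[str, List[str]],
    configured: Optional[Sequence[str]] = None,
    skip_prefixes: Sequence[str] = ("utun",),
) -> List[str]:
    """Return the addresses a node should hand out to other DCs."""
    if configured:
        # ip_address() raises ValueError on a malformed entry, so a typo in
        # the configuration fails early instead of timing out later.
        return [str(ipaddress.ip_address(addr)) for addr in configured]
    result: List[str] = []
    for name, addrs in discovered.items():
        # Assumed fallback: ignore VPN-style virtual interfaces by name.
        if name.startswith(tuple(skip_prefixes)):
            continue
        result.extend(addrs)
    return result


# Hypothetical discovery result on a Mac with an active VPN.
interfaces = {"utun2": ["198.51.100.7"], "en0": ["192.0.2.10"]}
auto_selected = advertised_addresses(interfaces)
explicit = advertised_addresses(interfaces, configured=["192.0.2.10"])
```

Making the configured list authoritative keeps the behaviour predictable: when it is set, discovery is not consulted at all.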
