Tuning these improperly can cause Consul to fail in unexpected ways. The default values are appropriate in almost all deployments. Increasing this number causes the gossip messages to propagate across the cluster more quickly at the expense of increased bandwidth. The default is 3. If this is set to zero, non-piggyback gossip is disabled. Lowering this value causes gossip messages to propagate across the cluster more frequently, at the expense of increased bandwidth.
The default is 200ms. Setting this lower (more frequent) will cause the cluster to detect failed nodes more quickly at the expense of increased bandwidth usage. The default is 1s. This should be at least the 99th percentile of RTT (round-trip time) on your network. The default is 500ms and is a conservative value suitable for almost all realistic deployments.
The number of retransmits is scaled using this multiplier and the cluster size. The higher the multiplier, the more likely a failed broadcast is to converge at the expense of increased bandwidth. The default is 4.
This allows the timeout to scale properly with expected propagation delay with a larger cluster size. The higher the multiplier, the longer an inaccessible node is considered part of the cluster before declaring it dead, giving that suspect node more time to refute if it is indeed still alive.
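To make the preceding gossip knobs concrete, here is a sketch of how they might appear for the LAN gossip pool in an agent configuration file; the values shown are the documented defaults restated for illustration, not a tuning recommendation:

```json
{
  "gossip_lan": {
    "gossip_nodes": 3,
    "gossip_interval": "200ms",
    "probe_interval": "1s",
    "probe_timeout": "500ms",
    "retransmit_mult": 4,
    "suspicion_mult": 4
  }
}
```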
The default is 5s. The default is 3s and is a conservative value suitable for almost all realistic deployments. The default is 6. Any endpoint that has a common prefix with one of the entries on this list will be blocked and will return a 403 response code when accessed. Any CLI commands that use blocked endpoints will also no longer function.
It defaults to an empty list, which means all networks are allowed. This is used to make the agent read-only, except for select IP ranges. If disabled, the agent won't use agent caching to answer the request, even when the URL parameter is provided. It does not limit the size of the request body.
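A minimal sketch of how these HTTP restrictions might be combined in an agent configuration file; the specific endpoint prefix and CIDR ranges are illustrative assumptions:

```json
{
  "http_config": {
    "block_endpoints": ["/v1/acl"],
    "allow_write_http_from": ["127.0.0.0/8", "10.0.0.0/8"]
  }
}
```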
If zero or negative, http.DefaultMaxHeaderBytes is used, which equates to 1 megabyte. The default behavior for this feature varies based on whether the agent is running as a client or a server prior to Consul 0.
On agents in client-mode this defaults to true, and for agents in server-mode this defaults to false. See the licensing documentation for more information about Consul Enterprise license management. Added in versions 1. Prior to version 1. Prior to Consul 1. The following parameters are available: See the -node-meta command-line flag for more information. See the Server Performance documentation for more details. Under normal circumstances, this can prevent clients from experiencing "no leader" errors when performing a rolling update of the Consul servers.
Defaults to 5s. Omitting this value or setting it to 0 uses default timing described below. Lower values are used to tighten timing and increase sensitivity while higher values relax timings and reduce sensitivity. Tuning this affects the time it takes Consul to detect leader failures and to perform leader elections, at the expense of requiring more network and CPU resources for better performance.
By default, Consul will use a lower-performance timing that's suitable for minimal Consul servers, currently equivalent to setting this to a value of 5 (this default may be changed in future versions of Consul, depending on whether the target minimum server profile changes). Setting this to a value of 1 will configure Raft to its highest-performance mode, equivalent to the default timing of Consul prior to 0.
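For example, a deployment on dedicated server hardware that wants the fastest leader-failure detection might pin Raft to its highest-performance mode; this is a sketch, not a universal recommendation:

```json
{
  "performance": {
    "raft_multiplier": 1
  }
}
```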
See the note on last contact timing for more details on tuning this parameter. The maximum allowed value is 10. Under normal circumstances, this can prevent clients from experiencing "no leader" errors. Defaults to 7s. All servers and datacenters must agree on the primary datacenter. Takes a list of addresses to use as the mesh gateways for the primary datacenter when authoritative replicated catalog data is not present.
This is a low-level parameter that should rarely need to be changed. Very busy clusters experiencing excessive disk IO may increase this value to reduce disk IO and minimize the chances of all servers taking snapshots at the same time. Increasing this trades off disk IO for disk space, since the log will grow much larger and the space in the raft.db file can't be reclaimed until the next snapshot.
Servers may take longer to recover from crashes or failover if this is increased significantly, as more logs will need to be replayed. Since Consul 1. This should only be adjusted when followers cannot catch up to the leader due to a very large snapshot size and high write throughput causing log truncation before a snapshot can be fully installed on a follower.
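A hedged sketch of how these Raft log and snapshot knobs sit together in a server's configuration; the values are illustrative only and should be derived from your own snapshot sizes and write rates:

```json
{
  "raft_snapshot_interval": "30s",
  "raft_snapshot_threshold": 16384,
  "raft_trailing_logs": 10240
}
```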
If you need to use this to recover a cluster, consider reducing write throughput or the amount of data stored in Consul, as it is likely under a load it is not designed to handle. The default value is 10240, which is suitable for all normal workloads. Added in Consul 1. If this isn't specified, then Consul will automatically reap child processes if it detects it is running as PID 1.
If this is set to true or false, then it controls reaping regardless of Consul's PID (forcing reaping on or off, respectively). This option was removed in Consul 0. For later versions of Consul, you will need to reap processes using a wrapper; please see the Consul Docker image entry point script for an example.
If you are using Docker 1. More info is available in the Docker docs. This defaults to 72 hours, and it is recommended that this be set to at least double the maximum expected recoverable outage time for a node or network partition.
WARNING: Setting this time too low could cause Consul servers to be removed from quorum during an extended node failure or partition, which could complicate recovery of the cluster. The value is a time with a unit suffix, which can be "s", "m", or "h" for seconds, minutes, or hours. For example, a node can use Consul directly as a DNS server, and if the record is outside of the "consul." domain, the query will be resolved upstream. As of Consul 1. IP addresses are resolved in order, and duplicates are ignored. This ensures sessions are not created with TTLs shorter than the specified limit.
It is recommended to keep this limit at or above the default to encourage clients to send infrequent heartbeats. When Consul receives an interrupt signal (such as hitting Control-C in a terminal), Consul will gracefully leave the cluster. Setting this to true disables that behavior.
On agents in client-mode this defaults to false, and for agents in server-mode this defaults to true (i.e. Ctrl-C on a server will keep the server in the cluster and therefore preserve quorum, while Ctrl-C on a client will gracefully leave). If provided, metric management is enabled. By default, this is set to "consul". By default, this is set to "10s" (ten seconds).
The numeric portion of the check._cid field. If check management is enabled, the default behavior is to add new metrics as they are encountered. If the metric already exists in the check, it will not be activated. This setting overrides that behavior. By default, this is set to false. It can be used to maintain metric continuity with transient or ephemeral instances as they move around within an infrastructure.
By default, this is set to hostname:application name (e.g. "host123:consul"). By default, this is set to service:application name (e.g. "service:consul"). This name is displayed in the Circonus UI Checks list. Available in Consul 0. The numeric portion of the broker._cid field. By default, this is not used and a random Enterprise Broker is selected, or the default Circonus Public Broker.
The best use of this is as a hint for which broker should be used based on where this particular instance is running (e.g. a specific geographic location or datacenter, such as dc:sfo). By default, this is left blank and not used. DogStatsD is a protocol-compatible flavor of statsd, with the added ability to decorate metrics with tags and event information. If provided, Consul will send various telemetry information to that instance for aggregation.
This can be used to capture runtime information. Defaults to true, which will allow all metrics when no filters are provided. When set to false with no filters, no metrics will be sent. This was renamed in Consul 1. If there is overlap between two rules, the more specific rule will take precedence.
Blocking will take priority if the same prefix is listed multiple times. The duration can be expressed using the duration semantics and will aggregate all counters for the duration specified (this might have an impact on Consul's memory usage). A good value for this parameter is at least 2 times the Prometheus scrape interval, but you might also set a very high retention time, such as a few days (for instance, 744h to enable retention for 31 days).
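As an illustration, a telemetry stanza combining a prefix filter with a Prometheus retention time might look like the following sketch; the metric prefixes chosen here are arbitrary examples:

```json
{
  "telemetry": {
    "prefix_filter": ["+consul.raft.apply", "-consul.http"],
    "prometheus_retention_time": "60s"
  }
}
```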
The format is natively compatible with Prometheus. Consul does not use the default Prometheus path, so Prometheus must be configured as shown below. Note that using ?format=prometheus in the path won't work, as the ? will be escaped, so the format must be specified as a parameter.
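A sketch of the corresponding Prometheus scrape configuration; the job name and target address are placeholders for your environment:

```yaml
scrape_configs:
  - job_name: "consul"
    metrics_path: "/v1/agent/metrics"
    params:
      format: ["prometheus"]
    static_configs:
      - targets: ["localhost:8500"]
```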
This sends UDP packets only and can be used with statsd or statsite. If provided, Consul will stream various telemetry information to that instance for aggregation. This streams via TCP and can only be used with statsite. This allows the node to be reached within its own datacenter using its local address, and reached from other datacenters using its WAN address, which is useful in hybrid setups with mixed networks.
This is disabled by default. Starting in Consul 0. An X-Consul-Translate-Addresses header will be present on all responses when translation is enabled to help clients know that the addresses may be translated. The TaggedAddresses field in responses also has a lan address for clients that need knowledge of that address, regardless of translation.
Equivalent to the -ui command-line flag. Configuring the UI with this stanza was added in Consul 1. Boolean value, defaults to false. In -dev mode this defaults to true. Replaces ui from before 1. This allows for customization or development. Equivalent to the -ui-dir command-line flag. Equivalent to the -ui-content-path flag. By default metrics are disabled. These files should contain metrics provider implementations and registration enabling UI metric queries to be customized or implemented for an alternative time-series backend.
Security Note: These JavaScript files are included in the UI with no further validation or sandboxing. By configuring them here, the operator fully trusts anyone able to write to them, as well as the original authors, not to include malicious code in the UI being served.
This simplifies deployment where the metrics backend is not exposed externally to UI users' browsers. It may also be used to augment requests with API credentials to allow serving graphs to UI users without them needing individual access tokens for the metrics backend.
Security Note: Exposing your metrics backend via Consul in this way should be carefully considered in production. As Consul doesn't understand the requests, it can't limit access to only specific resources. For example this might make it possible for a malicious user on the network to query for arbitrary metrics about any server or workload in your infrastructure, or overload the metrics infrastructure with queries.
See Metrics Proxy Security for more details. It should be set to the base URL to which the Consul agent should proxy requests for metrics. This may include a path prefix, which will then not be necessary in provider requests to the backend; the proxy will prevent any access to paths without that prefix on the backend.
If a custom provider is used that requires the metrics proxy, the correct allowlist must be specified to enable proxying to the necessary endpoints. See Path Allowlist for more information. It may be used to inject Authorization tokens within the agent without exposing those to UI users.
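Pulling the UI metrics options together, a sketch of a ui_config stanza with the metrics proxy enabled; the backend URL, token, and allowlist paths are assumptions for a Prometheus-style backend:

```json
{
  "ui_config": {
    "enabled": true,
    "metrics_provider": "prometheus",
    "metrics_proxy": {
      "base_url": "http://prometheus.internal:9090",
      "add_headers": [
        { "name": "Authorization", "value": "Bearer <token>" }
      ],
      "path_allowlist": ["/api/v1/query", "/api/v1/query_range"]
    }
  }
}
```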
It is a map with the name of the template as a key. The value is a string URL with optional placeholders. Each template may contain placeholders which will be substituted with the correct values when rendered in the UI. The placeholders available are listed for each template. For more information and examples see UI Visualization. It is shown as part of the Topology Visualization. This configuration key is not required as of Consul version 0.
Specifying this configuration key will enable the web UI. There is no need to specify both ui-dir and ui. Specifying both will result in an error. It is important to note that this option may have different effects on different operating systems.
Linux generally observes socket file permissions, while many BSD variants ignore permissions on the socket file itself. It is important to test this feature on your specific distribution. This feature is currently not functional on Windows hosts. The following options are valid within this construct and apply globally to all sockets created by Consul:
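A sketch of the construct with its ownership and permission options; the user, group, and mode values are illustrative assumptions, not defaults:

```json
{
  "unix_sockets": {
    "user": "consul",
    "group": "consul",
    "mode": "0600"
  }
}
```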
When enabled, Consul client agents will use streaming RPC, instead of the traditional blocking queries, for endpoints which support streaming. All servers must have rpc.enable_streaming set to true before any client can enable use_streaming_backend. See the watch documentation for more detail. Watches can be modified when the configuration is reloaded. This section documents all of the configuration settings that apply to Agent TLS.
We recommend using a dedicated CA which should not be used with any other systems. Any certificate signed by the CA will be allowed to communicate with the cluster and a specially crafted certificate signed by the CA can be used to gain full access to Consul.
The certificate is provided to clients or servers to verify the agent's authenticity. The key is used with the certificate to verify the agent's authenticity. It can be used to ensure that the certificate name matches the hostname we declare. Accepted values are "tls10", "tls11", "tls12", or "tls13".
This defaults to "tls12". Applicable to TLS 1. The list of all supported ciphersuites is available through this search. Note: The ordering of cipher suites will not be guaranteed from Consul 1. See this post for details. Note: This config will be deprecated in Consul 1. By default, this is false, and Consul will not enforce the use of TLS or verify a client's authenticity. If the UI is served, the same checks are performed. By default, this is false, and Consul will not make use of TLS for outgoing connections.
This applies to clients and servers as both will make outgoing connections. By default, this is false, and Consul does not verify the hostname of the certificate, only that it is signed by a trusted CA.
This setting is critical to prevent a compromised client from being restarted as a server and having all cluster state including all ACL tokens and Connect CA root keys replicated to it.
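Putting the preceding TLS settings together, a minimal sketch of a server agent's TLS configuration; the file paths are placeholders:

```json
{
  "ca_file": "/etc/consul.d/tls/consul-agent-ca.pem",
  "cert_file": "/etc/consul.d/tls/server.pem",
  "key_file": "/etc/consul.d/tls/server-key.pem",
  "verify_incoming": true,
  "verify_outgoing": true,
  "verify_server_hostname": true,
  "tls_min_version": "tls12"
}
```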
This is new in 0. Security Note: From versions 0. See CVE for more details. Security Note: all three verify options should be set to true to enable secure mTLS communication, enabling both encryption and authentication. We recommend using 8501 for https, as this default will automatically work with some tooling. Review the required ports table for a list of required ports and their default settings. Reloading configuration does not reload all configuration items. If there are any, plan accordingly before continuing.
One server at a time, shut down version A via consul leave and restart with version B. Wait until the server is healthy and has rejoined the cluster before moving on to the next server. Once all the servers are upgraded, begin a rollout of clients following the same process. You are now running the latest Consul agent. You can verify this by running consul members to make sure all members have the latest build and highest protocol version.
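On each server in turn, the rolling replacement might look like the following sketch; how the binary is swapped and the agent restarted depends on your environment:

```shell
consul leave        # gracefully remove this server from the cluster
# install the version B binary and restart the agent via your init system
consul members      # confirm the server rejoined and all members are alive
```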
Operating a Consul datacenter that is multiple major versions behind the current major version can increase the risk incurred during upgrades. We encourage our users to remain no more than two major versions behind the current release. If you find yourself in a situation where you are many major versions behind and need to upgrade, please review our Upgrade Instructions page for information on how to perform those upgrades.
In some cases, a backwards-incompatible update may be released. This has not been an issue yet, but to support upgrades we support setting an explicit protocol version. This disables incompatible features and enables a 2-phase upgrade. For the steps below, assume you're running version A of Consul, and then version B comes out. First, on each node, install version B and restart the agent with the -protocol flag set to the protocol version spoken by version A. Once all nodes are running version B, go through every node and restart the version B agent without the -protocol flag; again, wait for each server to rejoin the cluster before continuing.
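The first phase might look like the following sketch; the protocol version shown is a placeholder for whatever version A actually speaks (discoverable with consul -v):

```shell
consul agent -config-dir=/etc/consul.d -protocol=2   # pin the previous protocol version
```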
You're now running the latest Consul agent speaking the latest protocol. You can verify this is the case by running consul members to make sure all members are speaking the same, latest protocol version.
See the licensing documentation for more information about how to configure the license. Client agents previously retrieved their license from the servers in the cluster within 30 minutes of starting and the snapshot agent would similarly retrieve its license from the server or client agent it was configured to use.
As of Consul Enterprise 1. Both agents can still perform automatic retrieval of their license, but with a few extra stipulations. First, license auto-retrieval now requires that ACLs are enabled and that the client or snapshot agent is configured with a valid ACL token. If those stipulations are not met, attempting to start the client or snapshot agent will result in it immediately shutting down. For the step-by-step upgrade procedures see the Upgrading to 1. For answers to common licensing questions, please refer to the FAQ.
Consul versions 1. Consul 1. Both protocol variants are supported in this Consul version to facilitate upgrading Consul and Envoy in a stairstep order to avoid downtime. In a future version of Consul, support for the v2 State of the World protocol will be removed. Any escape hatches that are defined will likely need to be switched from using xDS v2 to xDS v3 structures. Mostly this involves migrating off of deprecated and now-removed fields and switching untyped config to typed config with type attributes set appropriately. As an example, here's a Zipkin integration before and after.
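A sketch of that migration for a Zipkin tracing escape hatch, based on the common Envoy v2-to-v3 pattern; the field values are illustrative and should be checked against your Envoy version. Before, with untyped config under xDS v2:

```json
{
  "http": {
    "name": "envoy.zipkin",
    "config": {
      "collector_cluster": "zipkin",
      "collector_endpoint": "/api/v1/spans"
    }
  }
}
```

After, with typed config under xDS v3:

```json
{
  "http": {
    "name": "envoy.tracers.zipkin",
    "typedConfig": {
      "@type": "type.googleapis.com/envoy.config.trace.v3.ZipkinConfig",
      "collector_cluster": "zipkin",
      "collector_endpoint": "/api/v2/spans",
      "collector_endpoint_version": "HTTP_JSON"
    }
  }
}
```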
Upgrade Envoy sidecars to the latest version of Envoy that is supported by the currently running version of Consul as well as Consul 1. Determine if you are using the escape hatch feature. If so, rewrite the escape hatch to use the xDS v3 syntax and update the service registration to reflect the updated escape hatch configuration by re-registering.
This should purge v2 elements from any configs. Perform a normal upgrade of both Consul servers and clients to 1. At this point the existing Envoy instances will continue to speak the v2 State of the World protocol to the new Consul instances without issue.
Once a Consul client is upgraded, use an updated CLI binary to re-bootstrap and restart Envoy using consul connect envoy. This will ensure it switches over to the v3 Incremental xDS protocol.
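For instance, the re-bootstrap might look like the following sketch; the service name is a placeholder:

```shell
# one step: bootstrap and exec Envoy in a single command
consul connect envoy -sidecar-for web

# or two steps: emit the bootstrap config, then run Envoy yourself
consul connect envoy -sidecar-for web -bootstrap > envoy-bootstrap.json
envoy -c envoy-bootstrap.json
```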
Depending upon how you have chosen to run Envoy, this is either one step (consul connect envoy) or two steps (consul connect envoy -bootstrap followed by running Envoy directly), as sketched above. Optionally upgrade Envoy to the latest version supported in Consul 1. This Service should be added before performing the upgrade. This will allow services to be managed by a central component, called endpoints-controller, which will enable features like transparent proxy.
After the upgrade is performed, all Pods of a service will need to be restarted. The service will be up and health checks will continue to work without restarting the service, but a restart is required so the Pods can be re-injected with the latest container configuration.
Consul has defaulted to using Raft protocol 3 since version 1. Users still running Raft protocol 2 should upgrade to a previous release supporting both protocol versions and update their configuration to use Raft protocol 3 before continuing their upgrade to Consul 1. By default this will now only list the intentions in a specific namespace, rather than listing all intentions across all namespaces. To achieve the same results as earlier Consul versions, query using the wildcard namespace (ns=*). Upgrading to Consul 1. This process will wait until all of the Consul servers in the primary datacenter are running Consul 1.
All write requests via either the Intentions API endpoints or Config Entry API endpoints for a service-intentions kind will be blocked until the migration process is complete after the upgrade.
Reads will function normally throughout the migration, so authorization enforcement will be unaffected. Secondary datacenters will perform their own one-time migration operations after the primary datacenter completes its migration and all of the Consul servers in the secondary datacenter are running Consul 1. It is safe to upgrade the datacenters in any order. Once the underlying config entry representation is edited, the intention transitions into the newer format, where some fields are no longer present.
Once this transition occurs, those intentions can no longer be used with the ID-based endpoints unless they are re-created via the old endpoints. Fields that are being removed or changing behavior: ID, after migration, is stored in the LegacyID field; after transitioning, this field is cleared. CreatedAt, after migration, is stored in the LegacyCreateTime field. UpdatedAt, after migration, is stored in the LegacyUpdateTime field. Meta, after migration, is stored in the LegacyMeta field.
To complete the transition, this field must be cleared manually and the metadata moved up to the enclosing config entry's Meta field. This is not done automatically since it is potentially a lossy operation.
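For reference, a minimal sketch of the config entry form that migrated intentions take; the service names are illustrative:

```json
{
  "Kind": "service-intentions",
  "Name": "web",
  "Sources": [
    {
      "Name": "api",
      "Action": "allow"
    }
  ]
}
```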
Queries to view or renew sessions from agents on earlier versions will be rejected. This impacts features and products including: Vault, the Enterprise snapshot agent, and locks.
The issue occurs when clients are still running 1. For this reason, we recommend you upgrade directly to 1. Previously, Consul would simply ignore the unknown fields. You will need to ensure that your API usage only uses supported fields, which are those documented in the example payloads in the API documentation.
Consul will now return the canonical service name in response to PTR queries. For OSS users, the change is that the datacenter will be present where it was not before. For Consul Enterprise users, both the datacenter and the service's namespace will be present. For example, where a PTR record would previously have contained web.
Consul has changed the semantics of query counts in its telemetry. The consul. The default value was , but Vault could use up to , which caused problems. If you want to use Vault with Consul 1. Starting with Consul 1. Managed proxies (which have been deprecated since Consul 1. ) have now been removed.
Before upgrading, you will need to migrate any managed proxy usage to sidecar service registrations. There are two major features in Consul 1. Note: As with most major version upgrades, you cannot downgrade once the upgrade to 1. is complete. As always, it is strongly recommended that you test the upgrade first outside of production and ensure you take backup snapshots of all datacenters before upgrading. The "ACL datacenter" in 1.
All configuration is backwards compatible and shouldn't need to change prior to upgrade, although it's strongly recommended to migrate ACL configuration to the new syntax soon after upgrade. Datacenters can be upgraded in any order, although secondaries will remain in Legacy ACL mode until the primary datacenter is fully upgraded.
Each datacenter should follow the standard rolling upgrade procedure. When a 1. The server advertises its ability to support 1. In the primary datacenter, the servers all wait in legacy ACL mode until they see every server in the primary datacenter advertise 1. Once this happens, the leader will complete the transition out of "legacy ACL mode" and write this into the state, so future restarts don't need to go through the same transition. In a secondary datacenter, the same process happens, except that servers additionally wait for all servers in the primary datacenter, making it safe to upgrade datacenters in any order.
It should be noted that even if you are not upgrading, starting a brand new 1. As soon as all servers in the primary datacenter have been upgraded to 1. This process completes in the background and is rate-limited to ensure it doesn't overload the leader. It completes upgrades in batches of tokens and will not upgrade more than one batch per second, so on a cluster with 10,000 tokens, this may take several minutes.