- Replication >
- Replication Concepts >
- Replica Set Read and Write Semantics >
- Read Preference Processes
Read Preference Processes¶
MongoDB drivers use the following procedures to direct operations to replica sets and sharded clusters. To determine how to route their operations, applications periodically update their view of the replica set’s state, identifying which members are up or down, which member is primary, and verifying the latency to each mongod instance.
Member Selection¶
Clients, by way of their drivers, and mongos instances for sharded clusters, periodically update their view of the replica set’s state.
When you select non-primary read preference, the driver will determine which member to target using the following process:
Assembles a list of suitable members, taking into account member type (i.e. secondary, primary, or all members).
Excludes members not matching the tag sets, if specified.
Determines which suitable member is the closest to the client in absolute terms.
Builds a list of members that are within a defined ping distance (in milliseconds) of the “absolute nearest” member.
Applications can configure the threshold used in this stage. The default “acceptable latency” is 15 milliseconds, which you can override in the drivers with their own secondaryAcceptableLatencyMS option. For mongos you can use the --localThreshold or localPingThresholdMs runtime options to set this value.
Selects a member from these hosts at random. The member receives the read operation.
Drivers can then associate the thread or connection with the selected member. This request association is configurable by the application. See your driver documentation about request association configuration and default behavior.
Request Association¶
Important
Request association is configurable by the application. See your driver documentation about request association configuration and default behavior.
Because secondary members of a replica set may lag behind the current primary by different amounts, reads for secondary members may reflect data at different points in time. To prevent sequential reads from jumping around in time, the driver can associate application threads to a specific member of the set after the first read, thereby preventing reads from other members. The thread will continue to read from the same member until:
- The application performs a read with a different read preference,
- The thread terminates, or
- The client receives a socket exception, as is the case when there’s a network error or when the mongod closes connections during a failover. This triggers a retry, which may be transparent to the application.
When using request association, if the client detects that the set has elected a new primary, the driver will discard all associations between threads and members.
Auto-Retry¶
Connections between MongoDB drivers and mongod instances in a replica set must balance two concerns:
- The client should attempt to prefer current results, and any connection should read from the same member of the replica set as much as possible. Requests should prefer request association (e.g. pinning).
- The client should minimize the amount of time that the database is inaccessible as the result of a connection issue, networking problem, or failover in a replica set.
As a result, MongoDB drivers:
Reuse a connection to a specific mongod for as long as possible after establishing a connection to that instance. This connection is pinned to this mongod.
Attempt to reconnect to a new member, obeying existing read preference modes, if the connection to mongod is lost.
Reconnections are transparent to the application itself. If the connection permits reads from secondary members, after reconnecting, the application can receive two sequential reads returning from different secondaries. Depending on the state of the individual secondary member’s replication, the documents can reflect the state of your database at different moments.
Return an error only after attempting to connect to three members of the set that match the read preference mode and tag set. If there are fewer than three members of the set, the client will error after connecting to all existing members of the set.
After this error, the driver selects a new member using the specified read preference mode. In the absence of a specified read preference, the driver uses primary.
After detecting a failover situation, [1] the driver attempts to refresh the state of the replica set as quickly as possible.
Changed in version 3.0.0: mongos instances take a slightly different approach. mongos instances return connections to secondaries to the connection pool after every request. As a result, the mongos reevaluates read preference for every operation.
[1] | When a failover occurs, all members of the set close all client connections that produce a socket error in the driver. This behavior prevents or minimizes rollback. |
Read Preference in Sharded Clusters¶
In most sharded clusters, each shard consists of a replica set. As such, read preferences are also applicable. With regard to read preference, read operations in a sharded cluster are identical to unsharded replica sets.
Unlike simple replica sets, in sharded clusters, all interactions with the shards pass from the clients to the mongos instances that are actually connected to the set members. mongos is then responsible for the application of read preferences, which is transparent to applications.
There are no configuration changes required for full support of read preference modes in sharded environments, as long as the mongos is at least version 2.2. All mongos maintain their own connection pool to the replica set members. As a result:
A request without a specified preference has primary, the default, unless, the mongos reuses an existing connection that has a different mode set.
To prevent confusion, always explicitly set your read preference mode.
All nearest and latency calculations reflect the connection between the mongos and the mongod instances, not the client and the mongod instances.
This produces the desired result, because all results must pass through the mongos before returning to the client.