I’ve been working in the web development industry since 1999; and, before 2015, I had never heard the term “feature flag” (or “feature toggle”, or “feature switch”). When my Director of Product, Christopher Andersson, pulled me aside and suggested that feature flags might help us with our company’s downtime problem, I didn’t know what he was talking about.
Cut to me in 2023, after 8 years of trial, error, and experimentation, and I can’t imagine building another product or platform without feature flags. They’ve become a critical part of my success. I put feature flags in the same category as I do Logs and Metrics: the essential services on which all product performance and stability are built.
But, it wasn’t love at first sight. In fact, when Christopher Andersson described feature flags to me in 2015, I didn’t see the value. After all, a feature flag is just an if statement:
if ( featureIsEnabled() ) {
    // Execute NEW logic.
} else {
    // Execute OLD logic.
}
I already had control flow that looked like this in my applications (see The Status Quo). As such, I didn’t understand why adding yet another dependency to our tech-stack would make any difference in our code, let alone have a positive impact on our downtime.
What I failed to see then was the fundamental difference underlying the two techniques. In my approach, changing the behavior of the if statement meant updating the code and re-deploying it to production. But, in the case of feature flags, changing the behavior of the if statement meant flipping a switch.
That’s it.
No code updates. No deployments. No latency. No waiting.
That’s the magic of feature flags: having the ability to dynamically change the behavior of your application at runtime. This is what sets feature flags apart from environment variables, build flags, and any other type of deploy-time or dev-time setting.
To stress this point: if you can’t dynamically change the behavior of your application without touching the code or the servers, you’re not using “feature flags”. The dynamic runtime nature isn’t a nice-to-have; it’s the fundamental driver that brings both psychological safety and inclusion to your organization.
This dynamic nature means that in one moment, our feature flag settings can look like this:

Which means that our application’s control flow operates like this:
if ( featureIsEnabled() /* false */ ) {
    // ... dormant code ...
} else {
    // This code is executing!
}
The featureIsEnabled() function is currently returning false, directing all incoming traffic through the else block.
Then, if we flip the switch on in the next moment, our feature flag settings look like this:

And, our application’s control flow operates like this:
if ( featureIsEnabled() /* true */ ) {
    // This code is executing!
} else {
    // ... dormant code ...
}
Instantly (or thereabouts) the featureIsEnabled() function starts returning true; and, the incoming traffic is diverted away from the else block and into the if block, changing the behavior of our application in near real-time.
But, turning a feature flag on is only half the story. It’s equally important that, at any moment, a feature flag can be turned off. Which means that, should we need to in case of emergency, we can instantly disable the feature flag settings:

Which will immediately revert the application’s control flow back to its previous state:
if ( featureIsEnabled() /* false */ ) {
    // ... dormant code ...
} else {
    // This code is executing (again)!
}
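To make the on/off/on sequence concrete, here is a minimal sketch in Python (not the CFML-style syntax used in the snippets above). The FlagStore class, the flag name "new-logic", and handle_request() are hypothetical names invented for illustration; a real feature flag service would hold this state outside the application process.

```python
class FlagStore:
    """Hypothetical in-memory stand-in for a feature flag service."""

    def __init__(self):
        self._flags = {}

    def set_flag(self, name, enabled):
        # Flipping the switch: no code change, no deployment.
        self._flags[name] = enabled

    def is_enabled(self, name):
        # Unknown flags default to off, so dormant code stays dormant.
        return self._flags.get(name, False)


flags = FlagStore()


def handle_request():
    # The same if statement as in the text, evaluated on every request.
    if flags.is_enabled("new-logic"):
        return "NEW logic"
    return "OLD logic"


before = handle_request()            # flag has never been enabled
flags.set_flag("new-logic", True)
during = handle_request()            # switch flipped on at runtime
flags.set_flag("new-logic", False)
after = handle_request()             # switch flipped off again
```

The application code never changes between the three calls; only the state held by the flag store does.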
Even with the illustration above, this is still a fairly abstract concept. To convey the power of feature flags more concretely, let’s dip down into the actual use-case that opened my eyes to the possibilities: refactoring a SQL database query.
The efficiency of a SQL query changes over the lifetime of a product. As the number of rows increases and the access patterns evolve, some SQL queries start to slow down. This is why database index design is just as much art as it is science.
Traditionally, a refactoring of this type might involve running an EXPLAIN query locally, looking at the query plan bottlenecks, and then updating the SQL in order to better leverage existing table indices. The query code, once updated, is then deployed to the production server. And, what you hope to see is a latency graph that looks like this:

In this case, the SQL refactoring was effective in bringing the query latency times back down. But, this is the best case scenario. In the worst case scenario, deploying the refactored query leads to a latency graph that looks more like this:

In this case, something went terribly wrong! For any number of reasons, the new SQL query that performed well in your development environment does not perform well in production. The query latency rockets upward, consuming most of the database’s available CPU. This, in turn, slows down all queries executing against the database. Which, in turn, leads to a spike in concurrent queries. Which, in turn, starves the thread pool. Which, in turn, crashes the database.
If you see this scenario beginning to unfold in your metrics, you might try to roll back the deployment; or perhaps, try to revert the code and redeploy it. But, in either case, it’s a race against time. Pulling down images, spinning up new nodes, warming up containers, starting applications, running builds, executing unit tests: it all takes time, and that is time you don’t have.
Now, imagine that, instead of completely refactoring your code and deploying it, you design an optimized SQL query and gate it behind a feature flag. Code in your data-access layer might look like this:
public array function generateReport( userID ) {
    if ( featureIsEnabled() ) {
        return( getData_withOptimization( userID ) );
    }
    return( getData( userID ) );
}
In this approach, both the existing SQL query and the optimized SQL query get deployed to production. However, the optimized SQL query won’t be released to the users until the feature flag is enabled. And, at that point, the if statement will short-circuit the control flow and all new requests will use the optimized SQL query.
With this feature flag gating the new query, the worst case scenario looks strikingly different:

The same unexpected SQL performance issues exist in this scenario. However, the outcome is very different. First, notice (in the figure) that the deployment itself had no effect on the latency of the query. That’s because the optimized SQL query was deployed in a dormant state behind the feature flag. Then, the feature flag was enabled, causing traffic to route through the optimized SQL query. At this point, the latency starts to go up; but, instead of the database crashing, the feature flag is turned off, immediately re-gating the code and diverting traffic back to the original SQL query.
You just avoided an outage. The dynamic runtime capability of your feature flag gave you the power to react immediately, before the database (and your application) became overwhelmed and unresponsive.
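The deploy, enable, and disable sequence described above can be sketched in Python (rather than the CFML-style syntax of the chapter’s snippets). Everything here is simulated: the flag name "optimized-report-query", the two query functions, and the latency numbers are made-up assumptions, chosen only to mirror the shape of the worst-case graph.

```python
# Hypothetical flag state; the optimized query ships dormant (flag off).
report_flags = {"optimized-report-query": False}


def get_data(user_id):
    # Simulated existing query with a steady (made-up) latency.
    return {"rows": [user_id], "latency_ms": 120}


def get_data_with_optimization(user_id):
    # Simulated "optimized" query that misbehaves in production.
    return {"rows": [user_id], "latency_ms": 900}


def generate_report(user_id):
    # Both queries are deployed; the flag decides which one runs.
    if report_flags["optimized-report-query"]:
        return get_data_with_optimization(user_id)
    return get_data(user_id)


timeline = []
timeline.append(generate_report(7)["latency_ms"])   # deployed, still dormant
report_flags["optimized-report-query"] = True       # feature flag enabled
timeline.append(generate_report(7)["latency_ms"])   # latency climbs
report_flags["optimized-report-query"] = False      # feature flag turned off
timeline.append(generate_report(7)["latency_ms"])   # back to the old query
```

The deployment itself does not move the latency; only the flag change does, and turning the flag off restores the original behavior without a rollback.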
Are you beginning to see the possibilities?
Knowing that you can disable a feature flag in case of emergency is empowering. This alone creates a huge amount of psychological safety. But, it’s only the beginning. Even better is to completely avoid an emergency in the first place. And, to do that, we have to dive deeper into the robust runtime functionality of feature flags.
In the previous thought experiment, our feature flag was either entirely on or entirely off. This is a huge improvement over the status quo; but, this isn’t really how feature flags get applied. Instead, a feature flag is typically rolled out incrementally in order to minimize risk.
But, before we can think incrementally, we have to understand a few new concepts: targeting and variants. Targeting is the act of identifying which users will receive a given variant. And, a variant is the value returned by evaluating a feature flag in the context of a given request.
To help clarify these concepts, let’s take the first if statement (from earlier in the chapter) and factor out the featureIsEnabled() call. This will help separate the feature flag evaluation from the subsequent control flow and consumption:
var booleanVariant = featureIsEnabled();
if ( booleanVariant == true ) {
    // Execute NEW logic.
} else {
    // Execute OLD logic.
}
In this example, our feature flag uses a Boolean data type, which can only ever represent two possible values: true and false. These values are the variants associated with the feature flag. Targeting for this feature flag then means figuring out which requests receive the true variant and which requests receive the false variant.

Boolean feature flags are, by far, the most common. However, a feature flag can represent any kind of data type: Booleans, strings, numbers, dates, JSON (JavaScript Object Notation), etc. The non-Boolean data types may compose any number of variants and unlock all manner of compelling functionality. But, for the moment, let’s stick to our Booleans.
Targeting, the act of funneling requests into a specific variant, requires us to provide identifying information as part of the feature flag evaluation. There’s no “right type” of identifying information, since each evaluation is going to be context-specific; but, I find that User ID and User Email are a good place to start (for user-facing functionality):
var booleanVariant = featureIsEnabled(
    userID = request.user.id,
    userEmail = request.user.email
);
if ( booleanVariant == true ) {
    // Execute NEW logic.
} else {
    // Execute OLD logic.
}
Throughout this book, I’ll refer to the request object as a means to access information about the incoming HTTP request. The request object has nothing to do with feature flags; and, is here only to provide the values that we need in order to illustrate targeting:
- request.user: contains information about the authenticated user making the request. This may include properties like id and email (as shown above).
- request.client: contains information about the browser making the request. This may include properties like ipAddress.
- request.server: contains information about the server that is currently processing the request. This may include properties like host.
When we incorporate this identifying information into our feature flag evaluation, we can begin to differentiate one request from another. This is where things start to get exciting. Instead of our feature flag being entirely on for all users, perhaps we only want it to be on for an allow-listed set of User IDs. One implementation of such a featureIsEnabled() function might look like this:
public boolean function featureIsEnabled(
    numeric userID = 0,
    string userEmail = ""
) {
    switch ( userID ) {
        case 1:
        case 2:
        case 3:
        case 4:
            return( true );
        break;
        default:
            return( false );
        break;
    }
}
Or, perhaps we only want the feature flag to be on for users with an internal company email address:
public boolean function featureIsEnabled(
    numeric userID = 0,
    string userEmail = ""
) {
    if ( userEmail contains "@bennadel.com" ) {
        return( true );
    }
    return( false );
}
Or, perhaps we only want the feature flag to be enabled for a small percentage of users:
public boolean function featureIsEnabled(
    numeric userID = 0,
    string userEmail = ""
) {
    // Remainders range from 0 to 99, one bucket per 1% of users.
    var userPercentile = ( userID % 100 );
    // Buckets 0 through 4 make up exactly 5% of users.
    if ( userPercentile < 5 ) {
        return( true );
    }
    return( false );
}
In this case, we’re using the modulo operator to consistently translate the User ID into a numeric value between 0 and 99. This numeric value gives us a way to consistently map users onto a percentile: each additional remainder represents an additional 1% of users. Here, we’re enabling our feature flag for a consistently-segmented 5% of users.
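As a quick check of the percentile arithmetic, here is a small Python sketch (not the chapter’s CFML-style syntax). The function names and the sample range of user IDs 1 through 1,000 are assumptions made for illustration; real user IDs are rarely this evenly distributed.

```python
def user_percentile(user_id: int) -> int:
    # Map a user ID onto one of 100 buckets (0 through 99).
    return user_id % 100


def feature_is_enabled(user_id: int, rollout_percent: int = 5) -> bool:
    # Strict "<": a 5% rollout selects buckets 0, 1, 2, 3, and 4.
    return user_percentile(user_id) < rollout_percent


# Hypothetical population of 1,000 sequential user IDs.
sample_user_ids = range(1, 1001)
enabled_count = sum(feature_is_enabled(uid) for uid in sample_user_ids)
enabled_share = enabled_count / len(sample_user_ids)
```

Because the bucket depends only on the user ID, a given user lands in the same bucket on every request, which is what makes the segment consistent. A strict less-than comparison selects remainders 0 through 4, exactly five of the hundred buckets.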
We can even combine several different targeting concepts at once in order to exert more granular control. Imagine that we only want to target internal company users; and, of those targeted users, only enable the feature for 25% of them:
public boolean function featureIsEnabled(
    numeric userID = 0,
    string userEmail = ""
) {
    // First, target based on email.
    if ( userEmail contains "@bennadel.com" ) {
        var userPercentile = ( userID % 100 );
        // Second, target based on percentile (buckets 0 through 24).
        if ( userPercentile < 25 ) {
            return( true );
        }
    }
    return( false );
}
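The combined email and percentile targeting can be exercised with a short Python sketch (Python rather than the CFML-style syntax above). Two details are assumptions of this sketch and not part of the original snippet: the email check uses a case-insensitive suffix match instead of a substring match, and the sample user IDs and addresses are invented.

```python
INTERNAL_DOMAIN = "@bennadel.com"  # domain used in the example above


def feature_is_enabled(user_id: int = 0, user_email: str = "") -> bool:
    # First, target based on email. A suffix match is stricter than a
    # substring match (an assumption of this sketch).
    if not user_email.lower().endswith(INTERNAL_DOMAIN):
        return False
    # Second, target based on percentile: buckets 0 through 24 are 25%.
    return (user_id % 100) < 25


internal_in_rollout = feature_is_enabled(7, "ben@bennadel.com")
internal_outside_rollout = feature_is_enabled(30, "ben@bennadel.com")
external_user = feature_is_enabled(7, "someone@example.com")
```

Only the first call returns true: the second user is internal but falls outside the 25% segment, and the third user is in the segment but is not internal.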
User targeting, combined with a %-based rollout, is an incredibly powerful part of the feature flag workflow. Now, instead of enabling a potentially risky feature for all users at one time, imagine a much more graduated rollout using feature flags:
- Deploy dormant code to production servers.
- Enable feature flag for your user ID.
- Test feature in production.
- Discover a bug.
- Fix bug and redeploy code (still only active for your user).
- Examine error logs.
- Enable feature flag for internal company users.
- Examine error logs and metrics.
- Discover bug(s).
- Fix bug(s) and redeploy code (still only active for internal company users).
- Enable feature flag for 10% of all users.
- Examine error logs and metrics.
- Enable feature flag for 25% of all users.
- Examine error logs and metrics.
- Enable feature flag for 50% of all users.
- Examine error logs and metrics.
- Enable feature flag for 75% of all users.
- Examine error logs and metrics.
- Enable feature flag for all users.
- Celebrate!
Few deployments will need this much rigor. But, when the risk level is high, the control is there; and, almost all of the risk associated with your deployment can be mitigated.
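The percentage stages in the list above (10%, 25%, 50%, 75%, 100%) have a useful property when the bucketing is based on a stable value such as the user ID: each stage only adds users, so nobody who already has the feature loses it as the rollout widens. The Python sketch below illustrates this under assumed conditions; the helper name, the modulo-based bucketing, and the population of 200 sequential user IDs are all hypothetical.

```python
ROLLOUT_STAGES = [10, 25, 50, 75, 100]  # percentages from the list above


def enabled_users(user_ids, rollout_percent):
    # Users whose bucket (0 through 99) falls below the rollout percentage.
    return {uid for uid in user_ids if (uid % 100) < rollout_percent}


all_user_ids = range(1, 201)  # hypothetical population of 200 users

stage_sizes = []
is_monotonic = True
previous = set()
for percent in ROLLOUT_STAGES:
    current = enabled_users(all_user_ids, percent)
    # Every user enabled at the previous stage must still be enabled.
    is_monotonic = is_monotonic and previous.issubset(current)
    stage_sizes.append(len(current))
    previous = current
```

With this population, each stage enables exactly its stated share of users, and the set of enabled users grows without ever dropping anyone.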
Are you beginning to see the possibilities?
So far, for the sake of simplicity, I’ve been hard-coding the dynamic logic within our featureIsEnabled() function. But, in order to facilitate the graduated deployment outlined above, this encapsulated logic must also be dynamic. This is, perhaps, the most elusive part of the feature flags mental model.
The feature flag evaluation process is powered by a rules engine. You provide inputs, identifying the request context (for example, User ID and User Email). And, the feature flag service then applies its rules to your inputs and returns a variant.
There’s nothing random about this process: it’s pure, deterministic, and repeatable. The same rules applied to the same inputs will always result in the same variant output. Therefore, when we talk about the dynamic runtime nature of feature flags, it’s really the rules, within the rules engine, that are actually dynamic.
Consider the earlier version of our featureIsEnabled() function that ran against the userID:
public boolean function featureIsEnabled(
    numeric userID = 0,
    string userEmail = ""
) {
    switch ( userID ) {
        case 1:
        case 2:
        case 3:
        case 4:
            return( true );
        break;
        default:
            return( false );
        break;
    }
}
Instead of a switch statement, let’s refactor this function to use a rule data structure that reads a bit more like a rule configuration. We’re going to define an array of values; and then, check to see if the userID is one of the values contained within that array:
public boolean function featureIsEnabled(
    numeric userID = 0,
    string userEmail = ""
) {
    // Our "rule configuration" data structure.
    var rule = {
        input: "userID",
        operator: "IsOneOf",
        values: [ 1, 2, 3, 4 ],
        variant: true
    };
    if (
        ( rule.operator == "IsOneOf" ) &&
        rule.values.contains( arguments[ rule.input ] )
    ) {
        return( rule.variant );
    }
    return( false );
}
The outcome here is exactly the same, but the mechanics have changed. We’re still taking the userID and we’re still looking for it within a set of defined values; but, the static values and the resulting variant have been pulled out of the evaluation logic.
At this point, we can move the rule definition out of the featureIsEnabled() function and into its own function, getRuleDefinition():
public boolean function featureIsEnabled(
    numeric userID = 0,
    string userEmail = ""
) {
    var rule = getRuleDefinition();
    if (
        ( rule.operator == "IsOneOf" ) &&
        rule.values.contains( arguments[ rule.input ] )
    ) {
        return( rule.variant );
    }
    return( false );
}
public struct function getRuleDefinition() {
    return({
        input: "userID",
        operator: "IsOneOf",
        values: [ 1, 2, 3, 4 ],
        variant: true
    });
}
Here, we’ve completely decoupled the consumption of our feature flag rule from the definition of our feature flag rule. Which means, if we wanted to change the outcome of the featureIsEnabled() call, we wouldn’t change the logic in the featureIsEnabled() function. Instead, we’d update getRuleDefinition().
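To see the decoupling in action, the following Python sketch (not the chapter’s CFML-style syntax) ports the two functions and then changes only the rule definition. The module-level RULE_DEFINITION dictionary and the choice of user ID 9 are assumptions made for illustration; in a real system the rule would live in an external store.

```python
# Hypothetical rule definition, kept outside the evaluation logic.
RULE_DEFINITION = {
    "input": "userID",
    "operator": "IsOneOf",
    "values": [1, 2, 3, 4],
    "variant": True,
}


def get_rule_definition():
    return RULE_DEFINITION


def feature_is_enabled(user_id=0, user_email=""):
    # Evaluation logic: it knows how to apply a rule, not what the rule says.
    context = {"userID": user_id, "userEmail": user_email}
    rule = get_rule_definition()
    if rule["operator"] == "IsOneOf" and context[rule["input"]] in rule["values"]:
        return rule["variant"]
    return False


before = feature_is_enabled(user_id=9)        # 9 is not in the allow-list
RULE_DEFINITION["values"] = [1, 2, 3, 4, 9]   # update the rule, not the logic
after = feature_is_enabled(user_id=9)         # same call, new outcome
```

The body of feature_is_enabled() is untouched between the two calls; only the data returned by get_rule_definition() changed.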
But, everything is still hard-coded. In order to make our feature flag system dynamic, we need to replace the hard-coded data structure with something to the effect of:
- A database query.
- A Redis GET command.
- A reference to a shared in-memory cache (being updated in the background).
Which creates an application architecture like this:

The implementation details will depend on your chosen solution. But, each approach reduces down to the same set of concepts: a feature flag administration system that can update the active rules being used within the feature flag rules engine that is currently operating in a given environment. This is what makes the dynamic runtime behavior possible.
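As one possible shape for the shared in-memory cache option, here is a minimal Python sketch. It is not taken from any particular feature flag product: the RuleCache class, the load_rules_from_store() loader, and the dictionary standing in for the remote store are all hypothetical, and a production version would need error handling around the load.

```python
import copy
import threading

# Hypothetical remote store (a stand-in for a database or Redis).
remote_store = {
    "new-workflow-optimization": {
        "input": "userID",
        "operator": "IsOneOf",
        "values": [1, 2, 3, 4],
        "variant": True,
    }
}


def load_rules_from_store():
    # Snapshot the rules so later store edits do not leak into the cache.
    return copy.deepcopy(remote_store)


class RuleCache:
    """In-memory copy of the active rules, refreshed in the background."""

    def __init__(self, load_rules, interval_seconds=5.0):
        self._load_rules = load_rules
        self._interval = interval_seconds
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._rules = load_rules()

    def refresh(self):
        fresh = self._load_rules()
        with self._lock:
            self._rules = fresh

    def get(self, feature):
        with self._lock:
            return self._rules.get(feature)

    def start(self):
        def loop():
            # wait() returns False on timeout, True once stop() is called.
            while not self._stop.wait(self._interval):
                self.refresh()

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()


cache = RuleCache(load_rules_from_store)
```

Request handling reads rules from the cache with get(), while start() keeps the cache in sync on a timer; calling refresh() directly is handy in tests. An administration system then only needs to write to the store for the change to reach every running application instance on its next refresh.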
At first blush, it may seem that integrating feature flags into your application logic includes a lot of low-level complexity. But, don’t be put off by this: you don’t actually need to know how the rules engine works in order to extract the value. I only step down into the weeds here because having even a cursory understanding of the low-level mechanics can make it much easier to understand how feature flags fit into your product development ecosystem.
The reality is, any feature flags implementation that you choose will abstract away most of the complexity that we’ve discussed. All of the variants and the user-targeting and %-based rollout configuration will be moved out of your application and into the feature flags administration, leaving you with relatively simple code that looks like this:
var useNewWorkflow = featureFlags.getVariant(
    feature = "new-workflow-optimization",
    context = {
        userID: request.user.id,
        userEmail: request.user.email
    }
);
if ( useNewWorkflow ) {
    // Execute NEW logic.
} else {
    // Execute OLD logic.
}
This alone can have a meaningful impact on your product stability and uptime. But, it’s only the beginning: the knock-on effects of a feature-flag-based development workflow will echo throughout your entire organization. It will transform the way you think about product development; it will transform the way you interact with customers; and, it will transform the very nature of your company culture.
Get the full book at featureflagsbook.com →