Give everything a default traffic weight of, say, 10. Make it alterable as above. Divide traffic among all versions in proportion to their weights.
The reason to do it this way is that when you go from 5 versions to 4, then 4 to 3, then 3 to 2, you just eliminate versions and never have to recalculate percentages to rebalance. Trust me, calculating percentages gets very old, very fast.
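Concretely, the assignment can be as simple as hashing the user into a range whose size is the sum of the weights. A minimal sketch in Ruby, assuming a versions hash of name => weight (version_for and the hash key are made-up names, not any particular library's API):

    require 'zlib'

    # Each version carries a weight (default 10). A user's version is chosen in
    # proportion to the weights, so dropping a version rebalances automatically.
    VERSIONS = { 'a' => 10, 'b' => 10, 'c' => 10 }

    def version_for(user_id, versions = VERSIONS)
      total = versions.values.sum
      # Deterministic hash so a given user always lands in the same bucket.
      point = Zlib.crc32("experiment-1:#{user_id}") % total
      versions.each do |name, weight|
        return name if point < weight
        point -= weight
      end
    end

Changing a weight or deleting a version just changes the total; nothing else needs recalculating.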
While you're at it, add the ability to mark versions as not to be reported on. That lets people run three versions: test, control, and unreported control. You start with most of the traffic in unreported control and ramp up the test by dropping the weight of the unreported control.
+1 on the unreported control. It's a common use case to start an A/B test at a small test percentage, then ramp up over time. Without an unreported control, you can't do that and have valid samples for each group.
(E.g. your sample mix will be invalid if you start at 10 test / 90 control and then ramp up to 30/70, because the users added to test mid-experiment come out of the control pool and the two groups stop being comparable populations. But if you start at 10 test / 10 control / 80 unreported and then ramp to 30/30/40, test and control grow in lockstep and your samples stay valid.)
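In weight terms, the ramp might look like the following (names and numbers purely illustrative); test and control always carry equal weight, so they grow in lockstep and only the unreported pool shrinks:

    # Illustrative ramp-up using weights rather than percentages.
    week_1 = { 'test' => 10, 'control' => 10, 'unreported_control' => 80 }
    week_2 = { 'test' => 20, 'control' => 20, 'unreported_control' => 60 }
    week_3 = { 'test' => 30, 'control' => 30, 'unreported_control' => 40 }
    # Only 'test' vs 'control' are reported on; 'unreported_control' sees the
    # control experience but is excluded from the analysis.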
Let's hypothetically say you have a new feature Foo. Foo is under active development and works on the test and staging environments, but you're concerned it might not be ready for prime time. You first release Foo to your staff's accounts on the production servers. After they've had a while to break Foo, you roll it out to 10% of the user base, selected randomly, while watching your automated instrumentation to see how it reacts (does it blow anything up? do users care about it? does anyone actually use the thing?). After you've proven Foo out, you release it to the entire user base. Should you at some point have a problem with Foo, you want the ability to yank it back from all users while you get back to tinkering on it privately.
Feature flags are a way to do that. By happy coincidence, they share semantics almost verbatim with A/B testing. At a high level of abstraction, the most interesting API is basically User#should_see?(feature_name_goes_here). They typically have a bit more going on in the API than that -- for example, the ability to assign users to groups (like, say, "our employees", "friends & family", "our relentlessly dedicated True Fans (TM) who are willing to suffer the odd bug", "10% of people who signed up last Monday", etc.) and to mark groups as able to see a feature. There is often a UI for managing that.
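As a toy sketch of that shape (written as a standalone store rather than a method on User; the class and method names here are invented, not any particular library's API):

    require 'zlib'

    # A feature is visible to a user if the user belongs to an activated group
    # or falls under the feature's rollout percentage.
    class FeatureFlags
      def initialize
        @groups      = Hash.new { |h, k| h[k] = [] }  # feature => activated group names
        @membership  = Hash.new { |h, k| h[k] = [] }  # group name => user ids
        @percentages = Hash.new(0)                    # feature => rollout percentage
      end

      def add_to_group(group, user_id);      @membership[group] << user_id; end
      def activate_group(feature, group);    @groups[feature] << group;     end
      def activate_percentage(feature, pct); @percentages[feature] = pct;   end

      def should_see?(user_id, feature)
        in_group = @groups[feature].any? { |g| @membership[g].include?(user_id) }
        # Stable hash so a user's answer doesn't flip between requests.
        in_group || Zlib.crc32("#{feature}:#{user_id}") % 100 < @percentages[feature]
      end
    end

    flags = FeatureFlags.new
    flags.add_to_group('staff', 42)
    flags.activate_group(:foo, 'staff')     # staff accounts see Foo first
    flags.activate_percentage(:foo, 10)     # then 10% of everyone
    flags.should_see?(42, :foo)             # => true

Yanking Foo back from everyone is then just a matter of zeroing its percentage and clearing its groups.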
I'm broadly in favor of any technology that increases the number of firms able to make A/B testing a routine practice at their organizations. Tooling like this removes one of the barriers that keeps many shops from getting there. It being available as OSS is therefore unmitigated good news, though I probably won't use it myself.