Over the course of Putin’s 17-year reign, Russian defense spending has increased 20-fold. Arms procurement grew by 60 percent in 2015 alone. Kremlin rhetoric over the past several years has also shifted in a disturbingly confrontational direction. Putin’s recent justification for the infamous Molotov-Ribbentrop pact between Nazi Germany and the Soviet Union epitomizes the moral failure of Russian elites to come to terms with the Soviet past: standing alongside a stunned Merkel, he stated that the agreement, which divided up Eastern Europe between the two totalitarian powers, “ensur[ed] the security of the USSR.” Other Russian officials, meanwhile, engage in shockingly loose talk about using nuclear weapons, and Russian military exercises frequently end with simulated nuclear strikes on NATO capitals.

A resistance band is an elastic band used for strength training. Resistance bands are also commonly used in physical therapy, specifically by patients recovering from muscular injuries, including cardiac rehab patients, to allow strength to be rebuilt slowly.


Resistance bands originated in the early 20th century, when they were made from surgical tubing and the exercises were performed for muscle rehabilitation. Resistance band training is now widely used as part of general fitness and strength training. Their flexibility in use and light weight are significant advantages for many users.

Typically, the bands are color-coded to show different levels of resistance, and users need to select an appropriate level.
The color codes vary between brands.

Also available are loop bands, tubing without handles, and bands fitted with handles (a common option for many purchasers).

The posts should appear around Sunday. Eventually, we hope that the posts will also form the basis of a new book on bandits that we are very excited about.

For now, we would like to invite everyone interested in bandit problems to follow this site, give us feedback by commenting on these pages, ask questions, make suggestions for other topics, or criticize what we write. In other words, we wish to leverage the wisdom of the crowd in this adventure to help us make the course better.

So this is the high-level background.
Today, in the remainder of this post I will first briefly motivate why anyone should care about bandits and look at where the name comes from. Next, I will introduce the formal language that we will use later and finish by peeking into what will happen in the rest of the semester. This is pretty basic stuff.

Sometimes, this setting is also called “stochastic i.i.d.” bandits since, for a given action, any reward for that action is independent of all the other rewards of the same action and they are all generated from identical distributions (that is, there is one distribution per action). Since “stochastic, stationary” and “stochastic i.i.d.” are a bit too long, in the future we will just refer to this setting as that of “stochastic bandits”. This is the problem setting that we will start discussing next week.
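To make the “one distribution per action” idea concrete, here is a minimal sketch. It assumes Bernoulli rewards purely for illustration (the setting itself allows any fixed distribution per action), and the names `StochasticBandit`, `means` and `pull` are hypothetical, not part of the course material.

```python
import random


class StochasticBandit:
    """Hypothetical stationary stochastic bandit with Bernoulli rewards."""

    def __init__(self, means, seed=0):
        # One fixed distribution per action: Bernoulli(means[a]).
        # The Bernoulli choice is an assumption made for this sketch.
        self.means = list(means)
        self.rng = random.Random(seed)

    def pull(self, action):
        # Every reward of a given action is drawn independently from the
        # same distribution, whatever the round and whatever happened before.
        return 1 if self.rng.random() < self.means[action] else 0


bandit = StochasticBandit(means=[0.2, 0.8])
rewards = [bandit.pull(1) for _ in range(10_000)]
empirical_mean = sum(rewards) / len(rewards)
```

With many pulls of the same action, `empirical_mean` should settle near that action's mean, which is exactly what stationarity and independence buy us.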

For some applications the assumption that the rewards are generated in a stochastic and stationary way may be too restrictive.
In particular, stochastic assumptions can be hard to justify in the real world: the world, as far as we know, is deterministic, even if it is hard to predict and often looks chaotic.

Thus, in step 1 of round $t$, the learner’s decision is based on the history of interaction up to the end of round $t-1$, i.e., on $H_{t-1} = (A_1,X_1,\dots,A_{t-1},X_{t-1})$.

Our goal is to equip the learner with a learning algorithm to maximize its reward. Most of the time, the word “algorithm” will not be taken too seriously, in the sense that a “learning algorithm” will be viewed as a mapping from possible histories to actions (possibly randomized). Nevertheless, throughout the course we will keep an eye on whether such maps can be efficiently implemented on computers (justifying the name “algorithms”).
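The view of a learning algorithm as a map from histories to actions can be sketched in a few lines. The code below is only an illustration under stated assumptions: rewards are Bernoulli, the history is stored as a list of `(action, reward)` pairs, and `greedy_policy` is a deliberately naive rule chosen for brevity, not an algorithm recommended by the course. All names are hypothetical.

```python
import random


def greedy_policy(history, num_actions):
    """Map a history [(A_1, X_1), ..., (A_{t-1}, X_{t-1})] to the next action."""
    totals = [0.0] * num_actions
    counts = [0] * num_actions
    for action, reward in history:
        totals[action] += reward
        counts[action] += 1
    # Try every action once before trusting the averages.
    for a in range(num_actions):
        if counts[a] == 0:
            return a
    # Otherwise pick the action with the best average reward so far.
    return max(range(num_actions), key=lambda a: totals[a] / counts[a])


def interact(policy, means, horizon, seed=0):
    """Run the interaction for `horizon` rounds (Bernoulli rewards assumed)."""
    rng = random.Random(seed)
    history = []  # H_0 is the empty history
    for _ in range(horizon):
        # The decision in round t depends only on H_{t-1}.
        action = policy(history, len(means))
        # The environment then reveals the reward X_t of the chosen action.
        reward = 1 if rng.random() < means[action] else 0
        history.append((action, reward))
    return history


history = interact(greedy_policy, means=[0.3, 0.7], horizon=100)
```

Because `policy` is just a function of the history, swapping in a different learner means passing a different function; whether that function is cheap to evaluate is the efficiency question raised above.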

The next question to discuss is how to evaluate a learner (to simplify language, we identify learners with learning algorithms). One idea is to measure learning speed by what is called the regret.

Those on the fringes of both sides (the “ban all guns” folks as well as the “they’re gonna take our guns” folks) do NOT represent the majority of Americans. But because the algorithm thrives on enragement for engagement, it can often seem like those amplified views represent the majority.

With that in mind, I imagine I’m unable to consider some other reasonable solutions that gun advocates regularly see in their feeds simply because they don’t show up in mine.

Getting Out of Our Comfortable Bubbles

Being willing to leave our insulated echo chambers is only the first step, but it’s a necessary one for opening a dialogue. Each of us needs to put the outrage aside and really take the time to hear what the other side has to say.

I’m convinced there is consensus to be found on this issue.

Only now, after Russia’s audacious interference in the American presidential election, have Obama and his allies in the Democratic Party belatedly awoken to the ideological challenge posed by Putin’s counter-Enlightenment, one that exports kleptocracy and disorder through a European fifth column of front organizations, political parties, media organs, reactivated KGB networks and plain hired hands.

Russian President Vladimir Putin, left, meets Hungarian Prime Minister Viktor Orbán | Sean Gallup/Getty Images

The avatar of the Kremlin-friendly conservative is Hungarian Prime Minister Viktor Orbán, who, over the last quarter century, has undergone one of the more remarkable transformations in European politics, from liberal, anti-communist firebrand to Putin’s closest ally in the EU.

Of course, stochasticity has been enormously successful in explaining mass phenomena and patterns in data, and for some this may be sufficient reason to keep it as the modeling assumption. But what if the stochastic assumptions fail to hold? What if they are violated for a single round? Or just for one action, at some rounds? Will all our results suddenly become vacuous? Or will the algorithms developed be robust to smaller or larger deviations from the modeling assumptions? One approach, which is admittedly quite extreme, is to drop all the assumptions on how rewards are assigned to arms. More precisely, some minimal assumptions are kept, such as that the rewards lie in a bounded interval, or that the reward assignment is done before the interaction begins, or simultaneously with the choice of the learner’s action.
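As a rough illustration of this assumption-free setting, the sketch below fixes the whole reward sequence before the interaction begins and only requires the rewards to lie in $[0,1]$ (the interval is an assumption chosen for the example). The names `play_fixed_rewards`, `reward_table` and `round_robin` are hypothetical, and the placeholder policy is there only to make the code runnable.

```python
def play_fixed_rewards(policy, reward_table):
    """Play against rewards that were all chosen before the interaction.

    reward_table[t][a] is the reward of action a in round t + 1; values are
    assumed to lie in [0, 1] (the bounded-interval assumption).
    """
    num_actions = len(reward_table[0])
    history = []
    total = 0.0
    for row in reward_table:
        if not all(0.0 <= x <= 1.0 for x in row):
            raise ValueError("rewards must lie in [0, 1]")
        action = policy(history, num_actions)
        reward = row[action]  # the learner sees only the chosen action's reward
        history.append((action, reward))
        total += reward
    return total, history


def round_robin(history, num_actions):
    # Hypothetical placeholder policy: cycle through the actions in order.
    return len(history) % num_actions


# No distribution is involved: the table is an arbitrary fixed sequence.
table = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
total, history = play_fixed_rewards(round_robin, table)
```

Nothing in `table` needs to look random or stationary, which is the point: any guarantee proved in this setting has to hold for every such table.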

Clinton. The assailant came to this conclusion after marinating in a stew of conspiracy websites that developed the story based upon email correspondence stolen by Russian hackers from Democratic Party servers. While this was a lone wolf incident, it is not difficult to fathom the prospect of more aimless, politically malleable young men in the West (a demographic disproportionately supportive of Trump and other far-right movements) “self-radicalizing” through the path of inflammatory material propagated by Russia or its proxies on the internet, à la Islamic jihadists.

The annexation of Crimea and the invasion of Eastern Ukraine are a warning shot across the bow of the West, a message, written in blood, that the old ways of doing business are over.

Less implausible is Russia’s ability to alter the political trajectory of Western politics in a way that suits its geopolitical aims.

