One problem with using medians is that you can’t calculate a confidence interval for them the same way as you calculate one for a mean. There’s no “standard error of the median”. However, it turns out there is a way to calculate confidence intervals for them.

Let’s be clear about the context. When we calculate a confidence interval for a mean, we’re saying that our data is a sample from some population and that the confidence interval is related to this population mean. Similarly, when we calculate a confidence interval for a median, we’re saying our data is a sample. When there’s a ton of data, a point estimate tells a reasonable story about the population, but when there’s less data, knowing how accurate your estimate is can be important.

I found a nifty bit of math explaining how to calculate them here. I wrote a little R code to implement it here. **You can see it here** with source for the Shiny app here. Note that you’ll need a test file; here is a small one.

Fun fact: the same algorithm lets you generate a confidence interval for percentiles other than the 50th (the median). The code I wrote lets you set the percentile and the desired confidence level.
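
The nifty bit of math boils down to order statistics: the number of sample points below the p-th quantile follows a Binomial(n, p) distribution, so you can pick two order statistics that bracket the quantile with at least the desired confidence. Here's a minimal sketch in R (the function and argument names are mine, not the linked app's):

```r
## Distribution-free confidence interval for the p-th quantile.
## The count of observations below the quantile is Binomial(n, p),
## so order statistics chosen with qbinom bracket it.
QuantileCI <- function(x, p = 0.5, conf = 0.95) {
  n <- length(x)
  x <- sort(x)
  alpha <- (1 - conf) / 2
  lower <- qbinom(alpha, n, p)          # index of the lower order statistic
  upper <- qbinom(1 - alpha, n, p) + 1  # index of the upper order statistic
  c(x[max(lower, 1)], x[min(upper, n)])
}

QuantileCI(c(2, 4, 5, 7, 8, 9, 12, 15, 18, 20))  # 95% CI for the median
```

Because the interval endpoints are actual data points, no normality assumption is needed; the price is that the achieved confidence is at least, not exactly, the requested level.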

If you find this useful, let me know!

Suppose I was in LA during a drought and it was a weird, one-off shower. You’d say that I got wet because it rained.

Suppose I was in Seattle during an especially rainy season and, uncharacteristically, I forgot my umbrella. You’d say that I got wet because I didn’t have an umbrella.

In both cases, the cause is a necessary condition. In LA and Seattle both, it is necessary (1) for it to rain and (2) for me to not have an umbrella. So a cause seems to be a necessary condition.

There are other necessary conditions in our story. For example, the sun must have risen. Without the sun, the Earth would freeze (or be blown away, it depends on why the sun was gone) so I certainly wouldn’t be getting wet. We don’t say (with seriousness) that the continued existence of the sun caused me to get wet. It’s necessary, but it’s too likely to be considered a cause.

**The cause is the least likely necessary condition.**

In LA, it was unlikely to rain and very likely (in a drought) that I wouldn’t have an umbrella, so the rain is the cause. In Seattle, it was very likely to rain but very unlikely I’d forget my umbrella, so the lack of umbrella is the cause.

There are times when we won’t agree on identifying a cause: (1) when we consider different sets of necessary conditions, and (2) when we don’t agree on the relative likelihood of two necessary conditions. Let’s consider some examples.

**Disagreeing About Cause, Case 1**

Suppose I told you that the Vogons were going to make an interstellar highway and that they were supposed to have destroyed the Earth today. At the last second someone got the date wrong and they waited until tomorrow. Now it’s very unlikely that the sun would have risen (no Earth, no sunrise) but it did, just by luck. You could credibly argue that that was the least likely necessary condition because the Vogons are very detail-oriented and are unlikely to get the date wrong. If we don’t know about the Vogons, we won’t attribute my sogginess to an alien clerical error but instead to the rain or my lack of umbrella, depending on the city. We can’t identify all of the necessary conditions, so which is least likely depends on the set of conditions we consider.

**Disagreeing About Cause, Case 2**

In our LA example we attribute my dampness to the weather because it was less likely than me (quite normally) not bringing an umbrella. However, my wife knows that she saw the weather forecast predicting rain and that she had told me over breakfast that I should take my umbrella. Typically I heed my much smarter spouse, so it was exceedingly unlikely that I’d forget my umbrella. And yet I did. Knowing that, my wife blames me for getting wet. You, not knowing that, blame the weather. (Someone else might ask about what distracted me from taking my umbrella, and blame that.) You and my wife have different subjective beliefs about the likelihood I’d forget my umbrella, so you disagree about the cause. Subjective probability is a central idea here. A little more context can always shift the blame.

This means that cause itself is not objective but rather is subjective, dependent on the conditions identified by the observer and the observer’s beliefs about relative likelihood.

If we can’t be sure of agreement, why bother? One reason is that when we do agree on the relative likelihood of necessary conditions, we can come to agreement about causes. Additionally, even without agreeing on all the necessary conditions to consider it’s possible to agree that something in particular is *not* the cause.

Suppose your loved one just died in a car accident right after you argued with them. Would you blame yourself? Probably. But suppose your loved one lost control when the brakes failed and they, distracted from the argument, didn’t regain control quickly enough to avoid the crash. Brakes fail like that *very* infrequently. Unfortunately, you probably argue with your loved one more often than brakes fail like that. Even if you get along very well, they’ve probably driven after arguing with you before and weren’t in an accident. The brake failure is less likely, so you can console yourself that you didn’t cause the accident.

This might not relieve you completely. You might consider whether you contributed. There are two things to consider. First, was your argument a *necessary* condition? Would they have crashed without the distraction? If so, relax. But what if you think they wouldn’t have crashed without the distraction? Then secondly you might consider how much you contributed…but let’s leave that for another time.

Here are some questions the reader might consider:

- What does partial culpability (cause) mean using this definition of cause?
- Does it make a difference whether the potential causes are human? Sometimes we’re looking for *who* is responsible.
- How can we agree “enough” about the set of necessary conditions being considered to conclude something specific is the cause?

How much is it worth for a user to listen to an episode? Clearly listening to more of an episode is better than listening to less of an episode (Assumption 1). Almost as clear is the idea that listening to all of a longer episode shows more engagement than listening to all of a shorter episode (Assumption 2).

Suppose we have two podcast episodes: A is 5 minutes long and B is 30 minutes long.

Alice listens to all of both A and B. We could give A a value of 5 and B a value of 30, the listening time in minutes (supporting Assumption 2). However, basing recommendations on these values rewards longer episodes far too heavily over shorter ones; here, B is valued at six times what A is. We could instead give both A and B ratings of 1 (in line with Assumption 1), indicating that Alice listened to 100% of both episodes, but this violates Assumption 2, which says B should get a higher rating than A.

Bob listens to all five minutes of A and only five minutes of B. We could give both of them ratings of 5, the listening time. However, finishing A shows more satisfaction than skipping out of B 17% of the way into it, which goes against the spirit of Assumption 1. We could give A a value of 1 and B a value of 5/30 = .17. This follows Assumption 1 but emphasizes episode A too much by giving it six times the rating of B when they both captured the same amount of the user’s time.

What we need is a rating that is somewhere between listening time and percent completed.

Percent time completed is calculated as

    percent completed = listening time / duration^1

We can “calculate” listening time as

    listening time = listening time / duration^0

To get something in between, we need to find an exponent e on duration with 0 < e < 1:

    Time Scaled Value (TSV) = listening time / duration^e

Here’s the trick: we need to choose two pairs of (listening time, duration) that we want to have the same value. This gives us an equation

    listening time 1 / duration 1^e = listening time 2 / duration 2^e

that we can solve for e.

Suppose you want the same Time Scaled Value for listening to 90% of a two minute story and half of a thirty minute podcast. Listening time 1 is 1.8 minutes and listening time 2 is 15 minutes. Solving gives us e = .783. Using this, consider the TSV for listening 1.8 minutes to episodes of varying length. We get

| Duration (minutes) | TSV |
|---:|---:|
| 2 | 1.046 |
| 5 | .511 |
| 15 | .216 |
| 30 | .126 |

This makes sense: as the duration increases, the fraction of the episode heard decreases, so the value decreases. Yet, it’s not as fast a decrease as if we just used the percent time completed. You can see this by looking at, say, the ratio of the TSVs for 5 minutes and 30 minutes: .511 / .126 = 4.07. If we used listening time, the ratio would be 1; if we used percent time completed the ratio would be 6.
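
Putting the pieces together as code (a sketch; the function names here are mine):

```r
## Time Scaled Value: listening time divided by duration^e, with 0 < e < 1.
TSV <- function(listen, duration, e) listen / duration^e

## Given two (listening time, duration) pairs that should score the same,
## l1 / d1^e = l2 / d2^e  implies  e = log(l1 / l2) / log(d1 / d2).
SolveExponent <- function(l1, d1, l2, d2) log(l1 / l2) / log(d1 / d2)

e <- SolveExponent(1.8, 2, 15, 30)      # 90% of 2 min == 50% of 30 min
round(e, 3)                             # 0.783
round(TSV(1.8, c(2, 5, 15, 30), e), 3)  # 1.046 0.511 0.216 0.126
```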

Does this make sense? Listening to 1.8 minutes of a 30 minute episode shows that you heard the intro and the very beginning of the story, but passed on the rest. It shows interest, but not much commitment. Listening to 1.8 minutes of a two minute story means you heard most of the story, but 1.8 minutes isn’t very long; you might have just toughed it out rather than reach for your phone to skip the story.

Empirically, we’ve found that an exponent of .912 works well for the app we’re developing. The ranking of TSVs mirrors other measures of success like the rate at which stories are shared on social media.

If you’re looking for something that gives more weight to longer content but not so much that it swamps the percent of the content, Time Scaled Values are worth a look.

Suppose we are doing a difference of means* on the time a user spends on a site. To the frequentist, the difference of means is entirely a function of the data. This means that as we run the test and check as we go, the first time the data (randomly?) suggests significance, we probably don’t really have a significant result. A p-value threshold of .05 means that when there’s no effect, any single look will be wrong 5% of the time; by checking regularly, we can easily turn what should be a null result into what appears to be a significant result. Such is folly.

Now, consider the Bayesian using an informative prior that the difference of means is zero. The result is a balance between the data and the prior. If there is no effect, the result is a balance between the prior suggesting the null result and the data randomly fluctuating. By balancing it with the null prior, the result of the test fluctuates less and is not likely to give a false positive result.

Note that the prior has to be informative. If we use a flat prior, there is nothing to balance out the fluctuations in the data and we just get the (false positive) frequentist result. The prior has to be informative to protect us from false positives.

For my A/B tests of the difference of means* I like to use a normal prior for the difference having mean zero and having a standard deviation so that there is only a 5% chance of it having an effect larger than the minimum meaningful effect. R code to calculate the standard deviation might look like

```r
## Prior sd such that P(|effect| > MinimumMeaningfulEffect) = .05 a priori
prior.sd <- -MinimumMeaningfulEffect / qnorm(.05/2)
```

What’s the downside? Your results are the weighted average of the data and the prior, so it will take more data to get a positive result. If you have very little data, this is not the approach for you: design your study well using power calculations and be patient. However, if you have a lot of data, this can be just the ticket.
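
To see the balancing act concretely, here’s a minimal conjugate-normal sketch (the function and the numbers are illustrative, not from a real test): the posterior mean is a precision-weighted average of the prior mean (zero) and the observed difference, so noisy estimates get pulled toward the null.

```r
## Prior N(0, prior.sd^2) on the difference; likelihood N(est, se^2).
PosteriorDiff <- function(est, se, prior.sd) {
  post.var  <- 1 / (1 / prior.sd^2 + 1 / se^2)  # precisions add
  post.mean <- post.var * (est / se^2)          # prior mean of zero adds nothing
  c(mean = post.mean, sd = sqrt(post.var))
}

prior.sd <- 2 / qnorm(1 - .05/2)  # minimum meaningful effect of 2 minutes
PosteriorDiff(est = 1.5, se = 0.5, prior.sd)  # mean pulled toward zero
```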

Trust your A/B tests *and* check them as you go — but do it right.

* Or medians — they’re just as easy to model in a Bayesian context.

1. A beat B or B beat A, so you can stop.
2. Neither A nor B beat the other, so you can stop.
3. We can’t conclude #1 or #2, but you’ll need about m more data points to conclude one of them.

The tools I’ve found for analyzing A/B tests can all answer #1. Some of the better ones can answer #3. **None of the tools I’ve seen will answer #2** and tell you that A and B are not meaningfully different and that you have enough data to be pretty sure about that.

This has to do with the way most people use hypothesis testing. Stats students are taught to test the simple hypothesis “Is the amount by which B improves over A positive?” They get a p-value (related to the notion of the effect actually being negative) and go from there. **The problem is that the probability that the effect is precisely zero is zero.**

**Here’s a fix for that: Choose a minimum meaningful effect and test the hypothesis that the absolute value of the effect is smaller than that value.**

Here’s an example. Recently I was testing the hypothesis that version B of an app improved daily listening times over version A of the same app. Daily listening times for this app are around 40 minutes and the cost difference for implementing B over A is small, so our product owner and I decided that any change less than 30 seconds was not meaningful. This left me with three hypotheses to test:

- Listening times for B are more than 30 seconds longer per day than for A (B wins).
- Listening times for A are more than 30 seconds longer per day than for B (A wins).
- Listening times for B and A are within 30 seconds of each other (tie).

I can test **all** of these hypotheses with standard hypothesis tests. If none of the tests is conclusive, I can assume the estimated mean difference in times is correct (it *is* our best estimate of the mean difference, given the data) and do a power calculation (although this is not the standard calculation, it’s pretty straightforward) to tell how much more data I need.
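
A rough sketch of the three tests in R, using Welch t-tests with the null shifted by the minimum meaningful effect (the function and variable names are mine; the “tie” test is the familiar two-one-sided-tests approach):

```r
## a, b: per-user daily listening times; mme: minimum meaningful effect
ThreeHypotheses <- function(a, b, mme, alpha = 0.05) {
  p.b.wins <- t.test(b, a, mu = mme, alternative = "greater")$p.value
  p.a.wins <- t.test(a, b, mu = mme, alternative = "greater")$p.value
  # Tie: reject both "b - a >= mme" and "b - a <= -mme"
  p.tie <- max(t.test(b, a, mu =  mme, alternative = "less")$p.value,
               t.test(b, a, mu = -mme, alternative = "greater")$p.value)
  c(B.wins = p.b.wins < alpha, A.wins = p.a.wins < alpha, tie = p.tie < alpha)
}

set.seed(1)
a <- rnorm(10000, mean = 40, sd = 5)  # minutes per day on version A
b <- rnorm(10000, mean = 40, sd = 5)  # version B: no real difference
ThreeHypotheses(a, b, mme = 0.5)      # only the tie hypothesis should win
```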

**All three questions answered.**

I implemented this in R using the `shiny` package to make it an interactive Web-based tool. A live demo is here and sample code is here on GitHub. You’ll need a server with shiny-server installed to use and test it, or you can run it on ShinyApps.io (like my demo). I found it trivial to install on the Ubuntu server I run at work for internal use.

GitHub is superior to Subversion in notable ways, but that’s not our topic here. GitHub does make it easy to read source code directly from the site as plain text. Here’s an example of an address for a bit of code I use almost daily to give me a clean R session.

https://raw.github.com/shaptonstahl/R/master/Decruft/Decruft.R

Anyone see the problem? Two things: (1) the URL for the plain-text version of code hosted on GitHub is reached via a secure connection, and (2) R can’t source via https without the use of an external library. I’m a big fan of R’s external libraries, but requiring one doesn’t fit the purpose of this code. This code usually sits at the top of just about every `.R` file I write:

```r
## Start fresh!
source("http://address.of/Decruft.R")
```

Isn’t that pretty? Short, sweet, easy to remember. This is how I used to do business. Unfortunately, this is the most concise way that I could find for doing it with GitHub:

```r
### Fugly code
if( !any("devtools" == installed.packages()[, "Package"]) )
  install.packages("devtools")
library(devtools)
source_url("https://github.com/crikey/thats/as/long/as/the/code/Im/sourcing.R")
```

I checked the Google and such but nobody seemed to be asking precisely what I was: how can I read code stored on GitHub in plain text *using http, not https*? We will not be discussing how long it took me to come up with a satisfactory solution. Let’s just say it took long enough that I really don’t want anyone else to have to go through it.

Here’s my solution:

- Have a Web server running PHP that allows you to create and use `.htaccess` files.
- Choose a URL for the stem of where your code will appear to be.
- Use `.htaccess` to point `404 Not Found` errors to a custom error page.
- The PHP-based error page uses https to get the live file from GitHub and feeds it to the person requesting it.

It could be worse. I decided that I would request pages from (nonexistent) subfolders of `http://www.haptonstahl.org/R/` in order to read code stored under `https://raw.github.com/shaptonstahl/R/`. So I put two little files in the document root of my site. The first is named `.htaccess` and contains this:

ErrorDocument 404 /404.php

The other file is the `404.php` file mentioned in `.htaccess`. You can download the PHP file here. This is a copy of the actual one I am using. Now to get a clean R session I just type the following:

source("http://www.haptonstahl.org/R/Decruft/Decruft.R")

Easy peasy. Perhaps the best part is that I’m done. I never have to update or modify this if I want to source other public code in that GitHub repository. For example, without changes I can source

http://www.haptonstahl.org/R/RoundBoundsNicely/RoundBoundsNicely.R

to get

https://raw.github.com/shaptonstahl/R/master/RoundBoundsNicely/RoundBoundsNicely.R

Lesson to take home:

- Putting code you reuse up on teh webz makes it easy for you to *use* it over and over instead of *writing* it over and over.
- GitHub rocks, now even more since I can source my R code from the live GitHub versions.
- Safety first, kids!

Those in society’s minority who did well in math courses are “shocked” at the suggestion that we change the typical math curriculum. The teaching may be “dismal” but algebra is a “foundation stone” in developing critical thinking skills. “It teaches one how to think.” It’s a little amusing but mostly disheartening to see folks who claim to support more challenging math standards fall back on strawman arguments, condescension, sarcasm and, my favorite, math errors in their arguments.

Those in society’s majority who did poorly in math tended to respond with relief at the suggestion of dropping algebra, although there are a few PMSD (post-mathematics stress disorder) victims whose career paths were altered by failing math and who still carry the associated baggage and resentment.

Let’s set aside the hysterics (“We are breeding a nation of morons“) and give both sides of this debate a fair shake, shall we?

**Arguments in Favor of Eliminating Algebra as a Course Required for All**

We definitely teach too much algebra and do so mindlessly, without considering whether it’s useful. As a teacher of math courses from arithmetic through calculus at a community college, I fought the losing fight to remove useless topics from the curriculum. For example, Cramer’s rule is a relic from the days before computers and is as practical as a slide rule, but trying to remove it from the required topic list elicited resistance and deep resentment from many of my fellow faculty. Hacker’s suggestion that we reconsider the requirement for so much symbolic manipulation is sensible.

While teaching algebra I tried to limit my syllabi to (1) the topics that would be used in later courses, (2) topics that might be useful outside the classroom, and (3) some examples of true beauty. I emphasized (1) but snuck in some of (2) and (3). Still, I did teach some material that would *only* be useful in later math courses and never in *any* kind of applied setting. We could easily cut more topics by curtailing the length of the required math sequence, at least for the math subjects taught.

Algebra is not the only way to teach disciplined thinking. One can teach precisely the same thinking skills while removing the abstraction that makes math seem useless and difficult to many students. One idea (not mine): we could integrate math beyond middle school into the science curriculum and use applications as motivation. Then there’s no need to learn how to apply math to story problems; the stories *are* the original problems. This also prevents folks from running around with “hammers” looking for “nails”. Unfortunately I am not sure how to get there from here; teachers of other subjects would have to cover the needed material and that would require revamping the way teachers are taught.

We don’t need to learn algebra to develop our intuition about rates of change, interest rates, probability, statistics, and other topics that typically follow algebra. There are ways (videos, interactive widgets, simulation using simple programming) to develop the intuition of calculus — often the only calculus needed in a job like medicine — without approaching it the rigorous, analytic, symbol-pushing way we typically do. Even for those who eventually will need algebra, we can teach more advanced symbol manipulation skills later as needed.

Nobody (well, almost nobody) is saying that learning algebra has no value whatsoever. However, as long as we have limited resources the pertinent question is, does algebra give us more benefit than spending that time elsewhere? I suggest that **programming, statistics, and finance** are better uses of most students’ time. Programming is how the nearly countless computers in our lives work; a basic understanding of how they do their magic would be invaluable. Statistics are essential for making sense out of the sea of information around us. Finance is challenging and vital for artists and engineers alike as long as they want to buy a home or save for retirement. If we remove the symbol-pushing exercises of algebra and replace that class time with simple programming, statistics, and finance, we’ll gain more than we’ll lose.

**Arguments in Favor of Keeping Algebra a Course Required for All (with occasional rejoinders)**

Let’s be more specific about “algebra”. A first course (“Algebra I”) often includes basic linear algebra (lines, graphing them, solving systems of linear equations, and matrices) plus evaluating polynomial functions, graphing quadratic functions, and solving single quadratic equations. A second course (“Algebra II”) builds on this with lots of factoring polynomials, exponential and logarithmic functions, quadratic inequalities and other algebraic prep for calculus.

An Algebra I course like this is incredibly useful. The concepts generalize to every science, from physics to political science. With this foundation a student can learn the intuition behind calculus, statistics, and other tools that are *useful to have seen* but usually *not useful to retain.* **I strongly recommend keeping Algebra I part of the core curriculum.** With moderate resources, this material can be covered in middle school for most if not all students.

Algebra II is where common responses to “Why am I learning this?” jump the tracks.

- “It builds your brain like exercise builds muscle.” I used this one regularly; it’s true, but only a half truth. Programming, statistics and finance can do the same with the added bonus of being unquestionably practical.
- “It helps students understand where more advanced math comes from.” Playing with simulations is even better for most students in understanding why more advanced math techniques work the way they do.
- “It teaches structured thinking.” Programming is even better for that *and* it is easier for students to see why structure matters. Mess up the structure and programs do odd things.

Algebra II is full of topics that you don’t need in order to *understand the intuition* of common useful advanced ideas; you need Algebra II when you will try to *master* more advanced ideas. **I strongly recommend making Algebra II something that fewer students take.**

**Known Unknown**

My recommendation to remove Algebra II from the universal curriculum is contingent on at least one assumption: *Students who need more algebra will have sufficient time to learn it later.* To see if this is true we would need to take average to bright students interested in technical fields and wait until college to teach them algebra. This is not common. Interestingly, I have some experience that is close to this: teaching math to political science graduate students. While there are a few notable exceptions, most political science undergraduates take very little math. Graduate students work very hard to learn the math necessary for advanced statistics and game theory, and generally they succeed. This evidence is circumstantial but shifts the burden of proof onto those who might argue college is too late to learn algebra.

**Bottom Line: Question Mathematical Authority**

Hacker thoughtfully asked a good question: are we teaching what we should be teaching? One cannot decry the educational establishment as ossified while resisting any attempt to change for the better. Hacker may be throwing out the baby. My ideas might not be the best. What are your ideas? Let’s discuss it.

* This blogger is an award-winning teacher of math, statistics, and programming at the high-school through graduate levels, holds graduate degrees in math and political science, and works in the defense industry as a data scientist.

Patil takes up the necessary and generally thankless task of writing a “big-think piece”. It’s necessary because, with all the recent talk about Data Scientists, it would be easy for some to see “Data Science” as a recent entry in the long history of business fads. Business fads tend to indulge in one of two sins: oversimplifying, or being so general that anything counts. Both are sins to which Data Science advocates are susceptible. An advocate of Data Science might oversimplify by giving a recipe or IT shopping list for doing Data Science; Patil avoids this triteness by emphasizing the wide diversity of the good teams he’s built or worked with. Alternatively, one could sin by making anything count as Data Science. Commenter “Verbose” anticipates this problem with his erudite critique of BDST: “Same technical data analysis, new bullshit name”. Patil avoids this and provides a blueprint for others to do the same.

When discussing the etymology of “Data Scientist” Patil writes that “Research Scientist” would not be as appropriate a term for this profession:

> “Research scientist” was a reasonable job title used by companies like Sun, HP, Xerox, Yahoo, and IBM. However, we felt that most research scientists worked on projects that were futuristic and abstract, and the work was done in labs that were isolated from the product development teams. It might take years for lab research to affect key products, if it ever did. Instead, *the focus of our teams was to work on data applications that would have an immediate and massive impact on the business.* (emphasis added) The term that seemed to fit best was data scientist: those who use both data and science to create something new.

Elsewhere Patil provides a solid definition of Data Scientist, but this paragraph encapsulates the concept just as well: A Data Scientist uses data and science to have an immediate and massive impact on the business. Just moving data around? Not data science. Have an impact in the vague future? Not data science. Improving entirely on the margins? Not (all of) data science. Holding the Data Scientist’s feet to the fire — asking “How does this immediately and massively impact our business?” — provides accountability and hence focus for the team.

Writing a “big-think piece” is also a thankless task: the breadth of the topic means that it’s easy for critics to find something to criticize as being presented too simply. This ignores the contribution of providing a view of the new discipline from space, showing all of it as a piece, and showing (if briefly) how the disparate parts fit together. I was glad for a look at a map for this road I’m traveling.

Overall BDST is short, shorter than I would have liked. (I’m glad that Patil is sharing more of his experience through other venues.) The advice Patil gives about Data Science, forging teams, and hiring Data Scientists seems both specific and useful; I’ll post again as I have occasion to use his advice.

**Recommendation**: “Building Data Science Teams” is short, but with enough good ideas as to be required for anyone in business intelligence, internal data analysis, or applied computational modeling and prediction.

In theory, a panel takes place in a room where 3-300 (median = 10) people watch three to five papers get presented by their authors. Then a discussant, who reads the papers in advance, comments on the papers both to draw connections among them and to stimulate conversation among the attendees. Then the audience asks questions and offers feedback to the authors. The whole panel takes about 1 3/4 hours.

Although panels comprise most of the scheduled events at a conference, they are not the best reason for scientists to attend conferences and they are far from the most rewarding part of a conference. Panels are often poorly attended. The papers in a panel often have very little to do with each other. The discussant may not receive the papers until days or moments before the panel, if at all, and even so the comments may focus more on typography than on big ideas.

Panels are a party game. They are an excuse to get smart people, who are interested in similar things, together in a room talking. Put a bunch of clever folks together and strange, wonderful, unpredictable things happen. A conference is **mass planned serendipity**.

The largest conference benefits to my research have happened when I have not been at panels: between panels, skipping panels, into the evening and the night. It’s the networking, but not “networking” in the Machiavellian, sales-person sense. It’s the comment on my paper that someone was a little too shy to offer in front of everyone, the comment that helps me recast the paper so it will place higher. It’s running into the same person at three panels and finally discovering we would love to work together on some research. It’s the dinner outing that leads to an invited talk or an interview. It’s the shared coffee followed up with a Facebook friending that leads to a new real friendship.

All of this is made possible by panels, but it’s not the direct result of the panels. So when someone tells me they didn’t go to a lot of panels, I understand that they probably got a lot of professional good out of the conference.

Also, I had a lot of fun at the Space Needle.

It is regular sport for Bayesians to criticize frequentist confidence intervals as unintuitive, usually misinterpreted, and based on what are usually unjustifiable assumptions. There are good reasons for this: they are unintuitive, usually misinterpreted, and based on what are usually unjustifiable assumptions.

Confidence intervals take into account the uncertainty we have in trying to describe a population given that we only observe a random sample. Maybe our sample is representative of the population, maybe not. The smaller the sample, the more likely it is that, through random variation, we got a sample that suggests a relationship (causes us to reject the null hypothesis) when really there is no relationship (the null is true.) If we take 100 random samples of the same size and for each construct a (different) confidence interval, how many will contain the true value of the parameter? We expect that 95 of those constructed intervals will contain the true value.
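
A quick simulation (with an invented population) makes the repeated-sampling interpretation concrete:

```r
## Coverage of 95% t-intervals over many samples from a known population
set.seed(7)
true.mean <- 10
covered <- replicate(1000, {
  x  <- rnorm(50, mean = true.mean, sd = 2)
  ci <- mean(x) + c(-1, 1) * qt(.975, df = 49) * sd(x) / sqrt(50)
  ci[1] <= true.mean && true.mean <= ci[2]
})
mean(covered)  # close to 0.95
```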

We don’t have 100 samples of the same size, so this is not the question we want to answer. Here’s what we want to know: What is the probability that the true value is in the interval?

There are a variety of very good, sound reasons why the frequentist approach does not make sense. I understand the difference between believing that the true parameter is “fixed and known to God” (per the frequentist assumption) and a random variable (the Bayesian assumption.) I agree completely that a confidence interval answers the wrong question.

It doesn’t matter.

For large samples and given the regularity conditions of maximum likelihood estimators, the marginal posterior distribution for a single parameter is approximately normal with a mean equal to the MLE and standard deviation equal to the standard error. Under these conditions, maximum likelihood and Bayesian estimators give you the same inferences.

**Bayesian researcher perspective:** Suppose I want a credible interval for a parameter where I have a lot of data and the model is well-behaved but I don’t have convenient code for drawing posteriors. I could estimate the MLE and use the resulting confidence interval as an approximate credible interval. My friend and colleague Jeff Gill calls this the “lazy Bayesian” approach. Lazy, efficient, tomato, tomahto.
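
A tiny illustration of the approximation with simulated data (I use a flat Beta prior here so the exact posterior has a closed form to compare against):

```r
## With lots of data, the Wald (MLE) interval and the exact Bayesian
## credible interval practically coincide.
set.seed(1)
n <- 10000
x <- rbinom(1, size = n, prob = 0.3)    # successes in n Bernoulli trials

p.hat <- x / n                          # the MLE
se    <- sqrt(p.hat * (1 - p.hat) / n)  # its standard error
wald  <- p.hat + c(-1, 1) * qnorm(.975) * se

# Flat Beta(1, 1) prior gives a Beta(x + 1, n - x + 1) posterior
cred <- qbeta(c(.025, .975), x + 1, n - x + 1)

rbind(wald = wald, credible = cred)     # nearly identical intervals
```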

**Bayesian consumer perspective:** Suppose you are reading an article where there is plenty of data and a well-behaved model, but the author provides frequentist confidence intervals. You can treat them as approximate credible intervals. Easy peasy.

**Pragmatist perspective:** As long as the conditions are met, you can go on “misinterpreting” confidence intervals.

This “lazy Bayesian” approach is not limited to making simple inferences about single parameters. King, Tomz, and Wittenberg explain how to generate draws from the approximate posterior (as opposed to approximate draws from the actual posterior via MCMC) and make any inference a Bayesian can using the output from maximum likelihood estimation.

As a Bayesian, I advocate strongly for increasing the use of Bayesian methods. However, we should be careful to avoid overselling the advantages.
