Statistical Confessions

Let’s say I want to know how many students in my class are cheating on exams. Maybe I’d like to know who the individual cheaters are, maybe I don’t, but let’s say the only way I can find out the number of cheaters is to ask the students themselves to report whether or not they cheated. I have a problem: no matter how hard I try to convince them otherwise, they will assume that a confession will get them in trouble.

Since I cannot persuade them of my incentives, instead I need to convince them that it would be impossible for me to use their confession as evidence against them even if I wanted to. But these two requirements are contradictory:

The students tell the truth.
A confession is not proof of their guilt.

So I have to abandon one of them. That’s when you notice that I don’t really need every student to tell the truth. Since I just want the aggregate cheating rate, I can live with false responses as long as I can use the response data to infer the underlying cheating rate. If the students randomize whether they tell me the truth or lie, then a confession is not proof that they cheated. And if I know the probabilities with which they tell the truth or lie, then with a large sample I can infer the aggregate cheating rate.

That’s a trick I learned about from this article. (Glengarry glide: John Chilton.) The article describes a survey designed to find out how many South African farmers illegally poached leopards. The farmers were given a six-sided die and told to privately roll the die before responding to the question. They were instructed that if the die came up a 1 they should say yes, that they killed leopards. If it came up a 6 they should say that they did not. And if it came up a 2-5 they should tell the truth.

A farmer who rolls a 2-5 can safely tell the researcher that he killed leopards because his confession is indistinguishable from a case in which he rolled a 1 and was just following instructions. It is statistical evidence against him at worst, probably not admissible in court. And assuming the farmers followed instructions, those who killed leopards will say yes with probability 5/6 (any roll but a 6) and those who did not will say yes with probability 1/6 (only on a 1). In a large sample, the fraction of confessions will be a weighted average of those two numbers, with the weights telling you the desired aggregate statistic: if a fraction p of farmers killed leopards, the expected fraction of yes answers is (5/6)p + (1/6)(1 − p), so p can be recovered as (6 × fraction of yes answers − 1)/4.
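To make the inference step concrete, here is a minimal sketch in Python, assuming every respondent follows the die instructions exactly. The function names (`estimate_rate`, `simulate_yes_fraction`), the sample size, and the 30% true rate are my own illustrative choices, not anything from the article.

```python
import random

P_FORCED_YES = 1 / 6  # die shows 1: answer "yes" regardless of the truth
P_TRUTH = 4 / 6       # die shows 2-5: answer truthfully


def estimate_rate(yes_fraction):
    """Invert q = 1/6 + (4/6) * p to recover the underlying rate p."""
    return (yes_fraction - P_FORCED_YES) / P_TRUTH


def simulate_yes_fraction(true_rate, n, seed=0):
    """Simulate n respondents who follow the die protocol exactly."""
    rng = random.Random(seed)
    yes_count = 0
    for _ in range(n):
        guilty = rng.random() < true_rate
        roll = rng.randint(1, 6)
        if roll == 1:
            answer = True      # forced "yes"
        elif roll == 6:
            answer = False     # forced "no"
        else:
            answer = guilty    # truthful answer on a 2-5
        yes_count += answer
    return yes_count / n


if __name__ == "__main__":
    # Hypothetical population in which 30% are guilty.
    q = simulate_yes_fraction(true_rate=0.3, n=200_000)
    print(f"fraction answering yes: {q:.3f}")
    print(f"estimated underlying rate: {estimate_rate(q):.3f}")
```

Note that the estimate can fall below 0 or above 1 whenever the observed fraction of yes answers is below 1/6 or above 5/6, which can happen in small samples or when respondents deviate from the instructions.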

5 comments


September 1, 2011 at 3:54 am

Anonymous

You don’t even need a large sample for this to work.

Using the dice method, you could tell your students to also provide (separately and anonymously) the results of their dice throws, allowing you to remove the exact number of false positives / negatives due to the 1s / 6s.

September 1, 2011 at 8:33 am

Michael Webster

This seems very interesting. I would love to know how it worked for you. It sounds very neat in theory, but I would like to look more deeply at a real example.

September 1, 2011 at 11:07 am

emir

Supposedly, when they actually used this method to find out how many people have ever done cocaine (heads: tell the truth, tails: say you’ve done it), substantially fewer than half the people ended up saying they had done it. In other words, the implied percentage was negative. Unfortunately, I don’t have a reference for this fact. On the other hand, if you simply ask people in a survey whether they have done cocaine, plenty of them will willingly admit to it (for this I not only don’t have a reference, this is mere speculation). One potential explanation would be that the very fact you are using such an elaborate procedure to allow me to hide my personal truth means that doing cocaine must be seen in a bad light. Given that, why would I want your posterior to move toward thinking that I’ve done cocaine (which it will if tails comes up and I follow the procedure)?

September 8, 2011 at 5:29 pm

rjd100

I recently debated a self-righteous law student on LinkedIn about whether law students were more ethical than MBA students.

The JD asserted that law students were more ethical because law students are barred from the profession if caught cheating, while MBA students could still be hired by a business, who might even appreciate their ability to get results no matter what.

I asserted that the highly competitive people who go to law or business schools are most likely cheating at similar rates and countered with some data:

http://www.bloomberg.com/apps/news?pid=newsarchive&sid=aw7s9m0BmcBo&refer=home

The article says 56% of MBA students cheat while only 45% of law students cheat.

While you might initially think that the article supports the JD’s position, if you think about it, it is not so clear. Admitting cheating is far less costly for an MBA student, since getting caught ends the career of a law student. Therefore the law students have more incentive to lie about not cheating on any survey, even if promised anonymity.

Moreover, since admission to the bar requires (as I was told) an affirmation by the lawyer that he has not cheated, and we know law schools are not expelling anywhere near 45% of their students, we can know for sure that there are a large number of lawyers who are not only cheating in school, but lying about it on their admission to the Bar. Combine this with the relatively small percentage difference in cheaters, and most people decided I won the argument.

September 8, 2011 at 7:52 pm

Eric

A sufficiently anonymous survey would probably get the results you want without resorting to using dice. A student who lies in an anonymous survey may just as easily fail to follow instructions in another method.

The other way to measure cheating is to catch it – that is often quite easy these days.
