How a Bubble Form Is Like a Fingerprint

Ah, the bubble form. It mostly brings back memories of elementary school: flustered teachers passing out sharp number-two pencils while I sadly bubbled "ELIZABET" into the eight spaces allotted for my first name. Though bubble forms become less frequent as we age, their stakes get higher. We bubble in SAT answers and vote in presidential elections. It's assumed that bubbles are anonymous; once separated from our (truncated) names, they're like ones and zeros that say nothing about the person who filled them in. But researchers at Princeton say that unfortunately--or fortunately--that's not true.

In a paper that will be presented at a security research conference this summer, the computer scientists describe how they extracted identifying information from bubble forms. They created a system that could identify who had filled in a bubble, and detect when a person had fraudulently filled in another person's form.

To pull all the information they could out of each bubble, the researchers scanned, at high resolution, a set of questionnaires that high schoolers had filled out. A computer examined each pencil blob and calculated its center, its average radius, how much its radius varied (because most of us don't create perfect circles), and its "center of mass." Then each filled-in circle was divided into 24 pie slices, and the shapes of those slices were evaluated in the same way. Finally, the coloring of each pie slice--how darkly or lightly it had been penciled in--was analyzed.
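For the curious, here is a rough sketch of what that kind of measurement might look like in Python. It is not the researchers' code. The input format (a list of `(x, y, darkness)` pixels), the function name `bubble_features`, and the choice to treat the "center" as the midpoint of the blob's bounding box are all my own assumptions, and each pie slice is summarized only by how far the pencil mark reaches and how dark it is on average.

```python
import math
from statistics import mean, pstdev

def bubble_features(pixels, n_slices=24):
    """Summarize one pencil blob.

    pixels: list of (x, y, darkness) tuples for the marked pixels, with
    darkness between 0 (white) and 1 (black). This input format is
    hypothetical; at least one pixel must have darkness above 0.
    """
    xs = [x for x, _, _ in pixels]
    ys = [y for _, y, _ in pixels]

    # Assumption: the "center" is the midpoint of the blob's bounding box.
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2

    # Center of mass: darker pixels pull the average harder.
    total = sum(d for _, _, d in pixels)
    mass_x = sum(x * d for x, _, d in pixels) / total
    mass_y = sum(y * d for _, y, d in pixels) / total

    # Sort pixels into pie slices by their angle around the center.
    reach = [0.0] * n_slices              # farthest marked pixel per slice
    shades = [[] for _ in range(n_slices)]
    for x, y, d in pixels:
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        k = min(int(angle / (2 * math.pi) * n_slices), n_slices - 1)
        reach[k] = max(reach[k], math.hypot(x - cx, y - cy))
        shades[k].append(d)

    return {
        "center": (cx, cy),
        "center_of_mass": (mass_x, mass_y),
        "mean_radius": mean(reach),
        "radius_spread": pstdev(reach),   # how much the radius varies
        "slice_darkness": [mean(s) if s else 0.0 for s in shades],
    }
```

Even this toy version turns a single bubble into more than two dozen numbers, which hints at how the real system ended up with hundreds.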

The researchers were now armed with hundreds of pieces of information about each bubble, and the identities of the 92 students who had made them. They trained their computer with 12 bubbles from each student, then showed it groups of 8 bubbles it hadn't seen before. For each new group of bubbles, the computer guessed which of the 92 students had filled them in. It picked the right student just over 50% of the time. And 75% of the time, the correct student was in the computer's top three guesses.
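To make the "top guess" and "top three guesses" scoring concrete, here is a minimal sketch. It uses a deliberately simple stand-in, a nearest-centroid classifier, which is not necessarily what the Princeton team used. The function names and the tiny two-number feature vectors are invented for illustration.

```python
import math

def train_centroids(training):
    """training: {student: [feature_vector, ...]} -> {student: average vector}."""
    centroids = {}
    for student, vectors in training.items():
        n = len(vectors)
        centroids[student] = [sum(col) / n for col in zip(*vectors)]
    return centroids

def distance(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def rank_students(centroids, group):
    """Rank students from most to least likely for a group of unseen bubbles."""
    def avg_distance(student):
        return sum(distance(v, centroids[student]) for v in group) / len(group)
    return sorted(centroids, key=avg_distance)

def top_k_accuracy(rankings, truths, k):
    """Fraction of groups whose true student appears in the top k guesses."""
    hits = sum(truth in ranking[:k] for ranking, truth in zip(rankings, truths))
    return hits / len(truths)

# Made-up data: three students, two training bubbles each.
training = {
    "ann": [[0.0, 0.0], [0.2, 0.0]],
    "bob": [[5.0, 5.0], [5.0, 5.2]],
    "cy": [[10.0, 0.0], [10.0, 0.2]],
}
centroids = train_centroids(training)
ranking = rank_students(centroids, [[0.1, 0.1], [0.0, 0.2]])
print(ranking)
```

In the study's terms, `top_k_accuracy(..., k=1)` corresponds to the "just over 50%" figure and `k=3` to the 75% figure, computed over many held-out groups rather than the single toy group shown here.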

So a bubble isn't as good as a fingerprint. This method can't match bubbles one-to-one; it needs a group of bubbles to compare. The researchers did try testing their system on one bubble at a time (after training it on the large group), and it didn't totally fail: it picked the right student 5% of the time, which is better than the roughly 1% (1 in 92) it would get from guessing randomly. But to guess the identity of a bubble maker with any confidence, the computer would need a larger group of bubbles.
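The random-guessing baseline is easy to check. The short sketch below assumes the guesser picks uniformly among the 92 students; the helper name `chance_top_k` is mine.

```python
def chance_top_k(n_students, k=1):
    """Chance that a uniformly random ranking puts the right student in the top k."""
    return k / n_students

n = 92
single_bubble_accuracy = 0.05  # figure quoted in the article

print(f"random top-1: {chance_top_k(n):.2%}")
print(f"random top-3: {chance_top_k(n, 3):.2%}")
print(f"single-bubble lift over chance: {single_bubble_accuracy / chance_top_k(n):.1f}x")
```

So even a lone bubble does several times better than chance, though 5% is still far from a reliable identification.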

Still, most of the scenarios in which we fill out bubble forms involve just that: a large group of bubbles. This means that someone with the right programming (and determination) might be able to identify who filled out a standardized test form. The authors point out that a computer program could automatically scan standardized test forms--SATs, MCATs, LSATs--and compare the bubbles to a known set of bubbles for each test-taker. By flagging the tests that don't seem to match, such a program could identify cheaters who have filled out a test under someone else's name.

The findings have negative implications, though, for voting. Here in Chicago, we vote by connecting two sides of an arrow. It's bizarre and frankly a little confusing, but probably doesn't leave a lot of clues for identifying a voter. But in some other parts of the country, voters fill in bubble forms. The authors note that counties such as Humboldt County, California, release scanned images of their ballots after an election. A devious potential employer, or someone who was coercing you to vote a certain way, could find out how you voted, provided this person had the right programming (and, ideally, a copy of your SAT). To make ballots more secure, the authors suggest printing forms on a gray background or having voters fill in their bubbles with ink stampers.

Since this study only used one set of bubble-form surveys from high schoolers at one point in time, it's possible that we aren't that consistent, after all. Our bubbles may all look alike on one day, while we're using one pencil and sitting at one desk, but they might vary from test to test. Or our bubbling-in style might gradually change as we age. If anyone wants to do that follow-up study, I'll be the first to volunteer my elementary-school standardized tests. It would be nice to think they served some purpose.

Image: J. Calandrino, W. Clarkson and E. Felten, Princeton University.

1 comment:

  1. Oh, I remember the limited number of bubbles for names, too! I always remember being slightly bewildered by the long, long row of spaces for last names, though. Especially since my last name only had 6 letters. :)



