Testing in the US is dominated by multiple-choice questions. Together with the time limit, this encourages students to stop thinking and go for guessing. I recently wrote an essay, “AMC, AIME, USAMO Contradiction,” in which I complained about the lack of proofs in the first two rounds of math competitions.
Is there a way to improve the situation? I grew up in the USSR, where each round of the math competition had the same format: you were given several hours to write proofs for three or four difficult problems. There are two concerns with organizing a competition in this way. First, the Russian system is much more expensive, whereas the US’s multiple choice tests can be inexpensively checked by a computer. Second, the Russian system is prone to unfairness. You need many math teachers to check all these papers on the highest level. Some of these teachers might not be fully qualified, and it is difficult to ensure uniform checking. This system can’t easily be adopted in the US. I am surprised I haven’t heard of lawsuits challenging USAMO results, but if we were to start having proofs at the AMC level with several hundred thousand participants, we would get into lots of trouble.
An interesting compromise was introduced at the Streamline Olympiad. The problems were multiple choice, but students were also requested to write proofs. Students got two points for a correct multiple choice answer, and if the choice was correct the proof was checked. Students could get up to three points for a correct proof. This idea solves two issues. The writing of proofs is rewarded at an early stage and the work of the judges is not as overwhelming as it would have been, had they needed to check every proof. However, there is one problem that I discussed in previous posts that this method doesn’t solve: with multiple choice, minor mistakes cost you the whole problem, even though you might have been very close to a solution. If we want to reward thinking more than accuracy, the proof system allows us to give credit for partial solutions.
I can suggest another approach. If the Russians require proofs for all problems and the Americans don’t require proofs for any problem, why not compromise and require a proof for one problem out of the set?
But I actually have a bigger idea in mind. I think that current developments in artificial intelligence may soon help us check proofs with the aid of a computer. Artificial intelligence is still far from ready to validate that a mathematical text a human has produced constitutes a proof. But in this particular case, we have two things working for us. First, we can use humans and computers together. Second, we do not need to check the validity of any random proof; we need to check the validity of a specific proof of a simple problem that we know in advance, thus allowing us to prepare the computers.
Let us assume that we already can convert student handwriting into computer-legible text or that students write directly in LaTeX.
Here is the plan. Suppose that for every problem, we create a database of sample right, wrong and partial solutions with corresponding scores. The computer checks each student’s solution against these samples. Hopefully, the computer can recognize small typos and deviations that shouldn’t change the point value. If the computer encounters a solution that is significantly different from the ones in the database, it sends the solution to human judges. The judges decide how to score it, and the solution and its score are added to the sample database.
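To make the plan concrete, here is a minimal sketch of the loop in Python. Everything in it is my assumption for illustration: the class name `SampleDatabase`, the 0.9 threshold, the point values, and above all the use of plain text similarity from the standard `difflib` module as a stand-in for whatever smarter comparison a real system would need. Text similarity does not check that a proof is valid; it only shows how solutions would flow between the computer and the judges.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Crude textual similarity in [0, 1]; a placeholder, not a proof checker."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


class SampleDatabase:
    """Hypothetical store of already-scored sample solutions for one problem."""

    def __init__(self, threshold=0.9):
        self.samples = []           # list of (solution_text, score) pairs
        self.threshold = threshold  # assumed cutoff for "close enough"

    def add(self, text, score):
        self.samples.append((text, score))

    def grade(self, text, human_judge):
        """Return (score, who_graded). Unfamiliar solutions go to the human judge."""
        best_score, best_sim = None, 0.0
        for sample_text, score in self.samples:
            sim = similarity(text, sample_text)
            if sim > best_sim:
                best_sim, best_score = sim, score
        if best_score is not None and best_sim >= self.threshold:
            return best_score, "computer"
        # Significantly different from every sample: ask a human,
        # then remember the verdict for future submissions.
        score = human_judge(text)
        self.add(text, score)
        return score, "human"
```

In this sketch, a submission that differs from a stored sample only by a typo inherits that sample's score, while a genuinely new solution is judged by a human once and then joins the database.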
For this system to work, computers should be smart enough not to send too many solutions to humans. So how many is too many? My estimate is based on the idea that we wouldn’t want the budget of AMC to go too much higher than the USAMO budget. Since USAMO has 500 participants, judges check just a few hundred solutions to any particular problem. With several hundred thousand participants in AMC, the computer would have to be able to cluster all the solutions into not more than a few hundred groups. The judges only have to check one solution in each group.
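The grouping step could look something like the following sketch. It is a simple greedy clustering, again using `difflib` text similarity as an assumed placeholder; the function name `greedy_clusters` and the 0.9 threshold are hypothetical. Each solution joins the first group whose representative it resembles, and otherwise starts a new group.

```python
from difflib import SequenceMatcher


def greedy_clusters(solutions, threshold=0.9):
    """Group solutions; the first element of each cluster is its representative."""
    clusters = []
    for text in solutions:
        for cluster in clusters:
            # Compare only against the representative of each existing cluster.
            if SequenceMatcher(None, cluster[0], text).ratio() >= threshold:
                cluster.append(text)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([text])
    return clusters


solutions = [
    "n = 2k so n^2 = 4k^2",
    "n = 2k so n^2 = 4 k^2",
    "try n = 2 and n = 4, both work",
]
clusters = greedy_clusters(solutions)
# The judges would read only the representatives: one solution per cluster.
representatives = [cluster[0] for cluster in clusters]
```

The number of clusters is the number of solutions the judges have to read, so this is the quantity that would need to stay within a few hundred. The result of a greedy pass depends on the order of the solutions and on the threshold, and a real system would need a much faster and smarter comparison than this one.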
As a bonus, we can create a system where for a given solution that is not in the database, the computer finds the closest solution and highlights the difference, thus simplifying the human’s job.
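A rough sketch of this bonus feature, under the same assumptions as before: closeness is measured by plain text similarity from `difflib`, and the function names and the bracket notation for marking changes are mine. The difference is computed word by word, so the judge sees which words were dropped from the known solution and which were added.

```python
from difflib import SequenceMatcher


def closest_sample(text, samples):
    """Return the stored sample that is textually most similar to the submission."""
    return max(samples, key=lambda s: SequenceMatcher(None, s, text).ratio())


def highlight_difference(reference, submission):
    """Mark words removed from the reference as [-...-] and added words as {+...+}."""
    ref_words, sub_words = reference.split(), submission.split()
    matcher = SequenceMatcher(None, ref_words, sub_words)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.extend(ref_words[i1:i2])
            continue
        if i2 > i1:  # words present only in the reference
            parts.append("[-" + " ".join(ref_words[i1:i2]) + "-]")
        if j2 > j1:  # words present only in the submission
            parts.append("{+" + " ".join(sub_words[j1:j2]) + "+}")
    return " ".join(parts)


samples = ["n is even so n = 2k", "by induction on n the claim holds"]
submission = "n is odd so n = 2k"
reference = closest_sample(submission, samples)
print(highlight_difference(reference, submission))
# prints: n is [-even-] {+odd+} so n = 2k
```

Here the judge is shown at a glance that the only change is “even” to “odd,” which happens to be exactly the kind of small difference that should change the score.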
In order to improve math education, we need to add proofs when teaching math. My idea might also work for SATs and for other tests.
Now that there is more money available for education research, would anyone like to explore this?