Dennis, a professor of computer science at New York University, is the author of The Puzzling Adventures of Dr. Ecco (Dover, 1998); Codes, Puzzles, and Conspiracy (W.H. Freeman & Co., 1992); Database Tuning: A Principled Approach (Prentice Hall, 1992); (coauthored with Jason Wang and Bruce Shapiro) Pattern Discovery in Biomolecular Data: Tools, Techniques, and Applications Oxford University Press, 1999); and (coauthored with Cathy Lazere) Out of Their Minds: The Lives and Discoveries of 15 Great Computer Scientists (Springer Verlag, 1998). He can be contacted at [email protected].
She introduced herself as Elena. Looking younger than her 27 years, she had been a child prodigy, one of only two female physics students at the Moscow Physics Institute, and then a researcher at the Atomic Energy Institute before converting to biology and emigrating to the United States.
"The Atomic Energy Institute has most of its labs in Moscow, nuclear reactors and all. It's mostly for weapons research. I can't even tell you what I worked on. It was sensitive enough that the government in Russia didn't want to let me go. But here I am. There are all kinds of ways to do things in Russia," she said with a dreamy smile I could not read.
Suddenly all business, she continued, "I am here to ask for help in figuring out a protein network. Though we don't understand the genetic details of the network, we know that they involve the most important cells in our model organism -- that one-millimeter worm, the C elegans."
"How exciting," 12-year-old Liane said with undisguised mockery.
"I don't like them any more than you do, girl," Elena said, "but it turns out that if I understand this network, I can perhaps attack certain cancers in people."
Liane sat up, apparently deciding to hear Elena out.
"A network or a pathway is a collection of causal relationships that link the production of chemicals (proteins, metabolites, whatever). If a certain critical pathway goes wrong, it can cause cancer and other diseases. So, understanding the causality in a pathway is essential to understanding what can go wrong.
"Here's how it works. Suppose there is a path of positive causality from one protein P1 to another P2. P1 itself is produced, and none of the proteins along that path in between have been eliminated. Then P2 will be produced also, though the amount that is produced may vary if other paths between P1 and P2 have eliminations along the way.
"If we have a linear pathway
A B C D E
then eliminating A will eliminate all proteins. Eliminating B will eliminate B, C, D, and E. Eliminating C will eliminate C, D, and E. I hope you see the pattern."
We nodded.
Elena went on. "Sometimes there can be branching. For example, A may branch to B and C. Then both B and C may induce D (see Figure 1). In that case, eliminating B partly reduces the output of D.
"The partial reduction means the circuit is analog, not digital," Ecco said with delight. "At least nature rebels against the intellectual boxcar of our age!"
"But, wait, what if you don't know the circuit?" Liane interrupted.
"That's exactly the problem," Elena replied. "We don't know the circuit. We just have certain evidence and we want to discover the most likely circuit. In my case, there are eight proteins in the first circuit and 10 in the second. We know that all causalities are positive. In the first case, we know there is no feedback. In the second, we suspect the feedback is there. Positive feedback -- that is a cancer circuit."
"What data do you have?" Ecco asked.
"In each mutagenesis experiment, we directly modify one of the proteins, thereby eliminating it. The others are modified through causality as I just explained. There is one piece of bad news: We don't know which one we have modified, only that we have modified one."
Here are the results of the mutagenesis experiments. Since no two of these results are the same, we know we have directly eliminated each protein in exactly one experiment (and possibly others by causality). The default behavior (without mutagenesis) is that all proteins are produced at their full concentrations. Reduced concentration means that a protein is produced at a smaller concentration than normal. Zero concentration means that a protein is not produced at all (that is, it is eliminated). So, the first line here says that A and G are produced at full concentration, D and E at reduced concentration, and B, C, F, and H at zero concentration.
Full: A G Reduced: D E Zero: B C F H
Full: A B C D F G H Reduced: Zero: E
Full: A B C E G Reduced: D Zero: F H
Full: A B C E F G H Reduced: Zero: D
Full: A Reduced: B C D E F H Zero: G
Full: A B C D E G H Reduced: Zero: F
Full: A B D E F G H Reduced: Zero: C
Full: Reduced: Zero: A B C D E F G H
Reader: All possible mutagenesis experiments are performed above, though we do not know which protein was eliminated in each case. Can you figure out the circuit from that information? Remember that all links are positive.
Now for the case that may involve some feedback.
Full: A B F G I J L Reduced: C D H Zero: E K
Full: A C D E F G H I J K L Reduced: Zero: B
Full: B C D E F G H I J K L Reduced: Zero: A
Full: A B F G I J L Reduced: C E K Zero: D H
Full: A B D E F G H I J K L Reduced: Zero: C
Full: A B C D E G H I J K L Reduced: Zero: F
Full: A B C E F G H I J K L Reduced: Zero: D
Full: D E H J K Reduced: C Zero: A B F G I L
Full: A B F G I J L Reduced: C D H K Zero: E
Full: Reduced: Zero: A B C D E F G H I J K L
Full: A B C D E H I J K L Reduced: Zero: F G
Full: A B D E F G H J K L Reduced: C Zero: I
Reader: Would you like to try to design a circuit for this? Remember that full means as much as in the case where there is no mutagenesis. Apparently, the reactions don't spin off into infinite production because of some other dampening factors not involved.
After a few hours, during which time Elena clobbered me in chess three times, Ecco and Liane came up with two diagrams. After verifying that the results were consistent, Elena asked, "Are you sure no other circuits are possible?"
Ecco and Liane had to admit they weren't sure.
Reader: Can you find several?
Last Month's Solutions
1. For three coins: 23, 5, and 1. This gives an average of 5.3131 coins needed to pay any bill from 1 to 99 cents. For four coins: 38, 11, 3, and 1. This gives an average of 4.141. For five coins: 44, 17, 8, 3, and 1. This gives an average of 3.535354 using intuitive counting. However, the set is nonintuitive; that is, 24 is best served by three 8 coins. For six coins: 45, 21, 8, 5, 2, 1. This gives an average of 3.242424 using intuitive counting. But this set is nonintuitive because, for example, 63 is best counted out by three coins of 21.
2. Ecco thinks that nonintuitive sets are necessary.
3. If all prices are to the nearest 5 cents. No pennies allowed. For a three-denomination set: 40, 15, and 5. The average number of coins needed is less than 2.53. For a four-denomination set coins: 65, 25, 10, and 5. The average number of coins is less than 2.16. For a five-denomination set: 65, 30, 25, 10, and 5. The average number of coins is less than 1.95 using intuitive counting.
Reader Notes
The following readers solved the first two parts of the Chimps problem (DDJ, September 2000), in which each chimp will share a wall with all the chimps it loves, but will share neither wall nor corner nor column with a chimp it hates): Steve Kietzman, Larry Lewis, Mark Taylor, Dennis Cumro, Patrick Schonfeld, Rodney Meyer, James Heginbottom, and Robert Morrison.
Larry Lewis found 16 different (but more or less symmetrical) solutions to the problem after making his Pentium work for 12 hours. Mark Taylor agreed with this and showed that they are symmetric variations of one another based on reflections and rotations. He also showed that in the third problem: "Chimps that hate each other shouldn't be on the same level" cannot be solved. Patrick Schonfeld and Robert Morrison showed that the minimum number of conflicting pairs in such a situation is 24, while still solving the first two problems.
DDJ