Bad Bioinformatics
Posted by sspiro on September 23, 2007
A recent paper reports some of the most naive sequence-based bioinformatics I have ever come across. I could, but won’t, deconstruct the analysis line by line; instead, let this one quotation serve as an example:
Looking at the similar domain architecture in NorR of E. coli and DIP1512 of C. diphtheriae we can safely assume that they are functional homologues, inspite of the non-finding of GAF and HTH domain in DIP1512.
Now, NorR is a three-domain transcriptional activator. It has a GAF domain required for signal sensing, a AAA+ domain that interacts with RNA polymerase, and an HTH domain required for DNA binding. The above statement (ignoring its glaring internal contradiction) concludes that a protein lacking both the GAF domain and the HTH domain is nevertheless a ‘functional homologue’ of NorR. Besides sharing only one of NorR’s three domains, the DIP1512 protein also lacks a conserved sequence motif that is absolutely required for the interaction with RNA polymerase.
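To make the mismatch concrete, here is a minimal sketch that compares the two domain architectures as plain sets. The NorR domains are the three named above. The DIP1512 set is an assumption drawn only from this post (it shares one of NorR's three domains, which must be the AAA+ domain since the GAF and HTH domains are absent), and the function name `compare_architectures` is hypothetical. It says nothing about the missing RNA polymerase interaction motif, which would need a sequence-level check.

```python
# Domain sets as described in this post. DIP1512 is an assumption: the post
# only says it lacks GAF and HTH and shares one of NorR's three domains.
NORR = {"GAF", "AAA+", "HTH"}
DIP1512 = {"AAA+"}


def compare_architectures(reference, candidate):
    """Return the shared domains, the reference domains missing from the
    candidate, and the fraction of reference domains that are shared."""
    shared = reference & candidate
    missing = reference - candidate
    fraction = len(shared) / len(reference)
    return shared, missing, fraction


shared, missing, fraction = compare_architectures(NORR, DIP1512)
print("shared: ", sorted(shared))
print("missing:", sorted(missing))
print("fraction of NorR domains present:", round(fraction, 2))
```

Under these assumptions, only one of NorR's three domains is present, and the two missing ones are those responsible for signal sensing and DNA binding.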
Being active in a related area of research, I find this all rather aggravating, but I’m not sure what to do about it other than venting here (which feels somewhat like stepping outside my office and yelling into the Texas wind).
For an excellent bioinformatic treatment of NorR and other proteins that respond to nitric oxide, I thoroughly recommend the work of Dmitry Rodionov and colleagues, as reported here.
Manual sequence analysis - some common mistakes « Freelancing science said
[...] will come back to on many occasions. Publication with very wrong sequence analysis like the one Stephen Spiro pointed out on his blog is not an exception. I may agree that large scale analysis can stand quick and dirty treatment of [...]