At the end of the day

Something that you say before you say what you believe to be the most important fact of a situation

Bad Bioinformatics

Posted by sspiro on September 23, 2007

A recent paper reports some of the most naive sequence-based bioinformatics I have ever come across. I could deconstruct the analysis line by line, but I won’t; instead, let this one quotation serve as an example:

Looking at the similar domain architecture in NorR of E. coli and DIP1512 of C. diphtheriae we can safely assume that they are functional homologues, inspite of the non-finding of GAF and HTH domain in DIP1512.

Now, NorR is a three-domain transcriptional activator. It has a GAF domain required for signal sensing, a AAA+ domain that interacts with RNA polymerase and an HTH domain required for DNA binding. The statement quoted above (ignoring its glaring internal contradiction) concludes that a protein lacking both the GAF domain and the HTH domain is nevertheless a ‘functional homologue’ of NorR. Besides sharing only one of NorR’s three domains, the DIP1512 protein also lacks a conserved sequence motif that is absolutely required for the interaction with RNA polymerase.
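
To make the mismatch concrete, here is a minimal sketch of the comparison, assuming each protein's architecture can be reduced to a plain set of domain names. The NorR set comes from the description above; the DIP1512 set lists only the one shared domain and is an assumption made for illustration (the real protein may carry other domains), as is the helper name `compare_architectures`. This is not the analysis used in either paper.

```python
# Domain architectures reduced to sets of domain names.
# NORR_DOMAINS follows the description in this post.
# DIP1512_DOMAINS is an illustrative assumption: it lists only the one
# domain the two proteins are said to share.
NORR_DOMAINS = {"GAF", "AAA+", "HTH"}
DIP1512_DOMAINS = {"AAA+"}


def compare_architectures(reference, candidate):
    """Return (shared, missing, jaccard) for two sets of domain names.

    `compare_architectures` is a hypothetical helper, not a library function.
    """
    shared = reference & candidate    # domains present in both proteins
    missing = reference - candidate   # reference domains absent from the candidate
    union = reference | candidate
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, missing, jaccard


shared, missing, jaccard = compare_architectures(NORR_DOMAINS, DIP1512_DOMAINS)
print("shared:", sorted(shared))
print("missing from candidate:", sorted(missing))
print("Jaccard similarity:", round(jaccard, 2))
```

Even this crude check flags two of the three NorR domains as missing, and it says nothing about the absent RNA polymerase interaction motif, which a set of domain names cannot capture.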

Being active in a related area of research, I find this all rather aggravating, but I’m not sure what to do about it other than vent here (which feels somewhat like stepping outside my office and yelling into the Texas wind).

For an excellent bioinformatic treatment of NorR and other proteins that respond to nitric oxide, I thoroughly recommend the work of Dmitry Rodionov and colleagues, as reported in Rodionov et al. (2005), listed below.

Gupta, S., Bansal, S., Deb, J.K. and Kundu, B. (2007) Interplay between DtxR and nitric oxide reductase activities: a functional genomics approach indicating involvement of homologous protein domains in bacterial pathogenesis. International Journal of Experimental Pathology 88: 377-385.

Rodionov, D.A., Dubchak, I.L., Arkin, A.P., Alm, E.J. and Gelfand, M.S. (2005) Dissimilatory metabolism of nitrogen oxides in bacteria: comparative reconstruction of transcriptional networks. PLOS Computational Biology 1: e55.
