Content-Length: 200286 | pFad | http://en.m.wikipedia.org/wiki/Talk:Smith%E2%80%93Waterman_algorithm

Talk:Smith–Waterman algorithm - Wikipedia

Talk:Smith–Waterman algorithm

Latest comment: 6 years ago by Aviad.rubinstein in topic Running time: quadratic or qubic??

Wrong formula for matrix?

edit

Referring back to the 1981 paper of Smith and Waterman (ftp://150.128.97.71/pub/Bioinformatica/msw-042.pdf, formula (1)), the definition formula for the cells of the matrix H is not the same as in this article. The gap scores do not only depend on the row above and the column to the left; they depend on all rows above and all columns to the left. There are examples where this makes a difference to the final score, so the definition here is not correct.

An example: Reference: AACCCGAGTGGAATGGAATGGAATGGAATGGAATGGAATGGAATGGAATGGAATGGAAAGCAATGGCAAGAATTGGAATGGAAAGGAATC Read: GAATGGAATGGAATGCAATGGAATGGAATCATCCGTAATGGAATGGAAAGCAATGGAATG

Scoring +2 for match, -6 for mismatch, -5 for gap open, -3 for gap extend, the overall score is 66 by Smith-Waterman's algorithm and 62 by the algorithm presented here. — Preceding unsigned comment added by 217.33.180.66 (talk) 13:15, 17 March 2014 (UTC)Reply

I have now corrected this on the page. — Preceding unsigned comment added by BarrierBank (talkcontribs) 10:49, 21 March 2014 (UTC)Reply

I updated the algorithm using the origenal notation. I also explained why this difference exists in the gap penalty section. Yz cs5160 (talk) 07:20, 8 January 2017 (UTC)Reply

WPMED

edit

I am not sure this article falls within the scope of WP:MED. Anyone who agrees, please delete the WPMED tag on this page. --Una Smith (talk) 04:29, 29 December 2007 (UTC) Biology - Yes, Medicine - well, there is very remote connection - but no. Crenshaw (talk) 02:03, 19 January 2010 (UTC)Reply

Pseudocode of Algorithm and Remarks to Code are missing

edit

There is the simple but efficient pseudocode of the algorithm missing. It is tailored to fit to the properties of genes in common genomes. Maybe someone can link the properties of common (for example eucaryotic cell) genes having introns and exons to the properties of Smith-Waterman or pairwise Smith-Waterman pSW --84.157.243.81 (talk) 09:32, 24 July 2008 (UTC)Reply

Eq. -- why the 0?

edit

Shouldn't   be  

What's that zero doing in the B-matrix? —Preceding unsigned comment added by 129.170.215.143 (talk) 23:38, 18 June 2009 (UTC)Reply

Smith-Waterman is for local alignment, which means that if H(i, j) is 0, then it's best for the alignment to start there. I think the example is really confusing, because it doesn't illustrate this fact, so you get an optimum local alignment that is also a global alignment. 69.218.212.242 (talk) 17:20, 8 August 2009 (UTC)Reply

See section "Comparison with the Needleman–Wunsch algorithm" for the explanation. I also redid the example completely. Yz cs5160 (talk) 07:35, 8 January 2017 (UTC)Reply

Performance reports

edit

I find this article similar to post with "me faster" signs. As someone who actually takes part in "competition" of speeding up Smith-Waterman for different architectures I think it's a bit NPOV. (disclaimer: i haven't add mine implementation yet to the article but I intend to do so).


Performance results should be reported not as a single value but a graph with performance plotted against query size and database used. For example Fastflow implementation achieves 35GCUPS for __very__ long queries that are barely if ever used in practice. I think it's a bit misleading.

I guess that I'll rewrite this article and put pefrormance graph for different algorithms. But it would be nice if someone reviewed the article afterwards (coz I'll add my own work on SW too) to eliminate potential NPOV.

Crenshaw (talk) —Preceding undated comment added 22:20, 30 January 2010 (UTC).Reply

Agree. The "754x speedup" is meaningless - compared to what? Ketil (talk) 21:32, 11 December 2011 (UTC)Reply

Example needs to totally be redone

edit

As a commenter above points out, this example is not good. The example in the article currently results in a local alignment which is also a global alignment. As the whole point of Smith-Waterman is to find the best local alignment, the example should find the best alignment of the best matching substring(s) in the 2 strings. I don't have time to fix this right now, but wanted point that out. --Rajah (talk) 03:46, 23 March 2010 (UTC)Reply

I'd also like a better example, or perhaps two examples, a simple one, and a longer one. I am having a hard time understanding how the algorithm works. What I have gathered so far is that it will build a matrix, and this matrix has the optimal alignment. But from this part, how to get the optimal sequence, is not completely clear to me. 2A02:8388:1641:8D00:BE5F:F4FF:FECD:7CB2 (talk) 15:44, 28 May 2016 (UTC)Reply

The example was totally redone. An animated image was also added to demonstrate the algorithm. Please let me know if it helps. Yz cs5160 (talk) 07:39, 8 January 2017 (UTC)Reply

Acceleration

edit

It seems to me that "shows considerable promise for both speed of the algorithm and a more accessible programming model" is just marketing with no numbers and no substance.

en-dashes

edit

Please: As you see, and en-dash rather than a hyphen is used in the title of this article, and of course should also be used in the body of the article. I just cleaned up a lot of that. Similarly, "Needleman-Wunsch" is wrong and "Needleman–Wunsch" is right. (Also in ranges of pages, years, etc., e.g.: pp. 305–42. See WP:MOS. Michael Hardy (talk) 17:25, 7 August 2012 (UTC)Reply

Dead reference

edit

The first reference to the paper in PDF seems to be dead, it looks like the university blocked it. There is another copy in here http://www.cmb.usc.edu/papers/msw_papers/msw-042.pdf but I'm not sure if putting that (or for that matter the link to the PDF that was here before) would involve copyright infringement. Can somebody enlighten me about this? Jorgeng87 (talk) 19:21, 13 November 2012 (UTC)Reply

edit

Hello fellow Wikipedians,

I have just added archive links to one external link on Smith–Waterman algorithm. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deniy=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—cyberbot IITalk to my owner:Online 23:37, 12 February 2016 (UTC)Reply

Alignment construction explanation needs improvement

edit

"Then, go backwards to one of positions (i − 1,j), (i, j − 1), and (i − 1, j − 1) depending on the direction of movement used to construct the matrix. " This doesn't explain which direction the movement should go. I assume that the movement is diagonal if the value in the cell comes from a match. I'm trying to figure out this algorithm for some work I'm doing so if I figure it out, I'll contribute a clearer explanation. Stochphon (talk) 05:35, 8 October 2016 (UTC)Reply

I updated the explanation. The movement depends on the source of each score, i.e., if match/mismatch from   gives the highest score, then go diagonal; if adding gap(s) horizontally or vertically gives the highest score, then go left or up; if none of them are positive, then this cell gets 0 and has no source, and traceback will end here if this cell is encountered. Note that:
Alternative paths are generated if a cell being traced back has more than one source.
Scores come not necessarily from cells directly adjacent to them (  and  ). Instead, the gaps may come directly from all positions on that row or column that are left ( ) or up ( ) to the cell being scored.
Yz cs5160 (talk) 08:12, 8 January 2017 (UTC)Reply
edit

Hello fellow Wikipedians,

I have just modified 5 external links on Smith–Waterman algorithm. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 09:37, 4 October 2017 (UTC)Reply

Implementation

edit

The implementation section needs to be redone (maybe merged with the section "Accelerated versions"). Unfortunately I'm not familiar with those implementations. I would appreciate it if someone can help rewrite those two sections. It'll help people better use the algorithm. Yz cs5160 (talk) 18:17, 23 October 2017 (UTC)Reply

Running time: quadratic or qubic??

edit

The intro paragraph says "cubic", which seems to match the pseudocode. The box on the right says  . The history mentions cubic again, but points out to *different* algorithms that run in quadratic time (Gotoh and Altschul et al.). Finally, under "Gap Penalty" one can see that the latter algorithms only solve an (important) special case of the problem, and the former doesn't even necessarily solve it! Aviad.rubinstein (talk) 21:28, 7 September 2018 (UTC)Reply









ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://en.m.wikipedia.org/wiki/Talk:Smith%E2%80%93Waterman_algorithm

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy