Binary Search Explanation
Last week I got an email from a worried user saying that some information was missing from a
sales report. After a few hours of exhaustive debugging with a few time-outs in between, I
realized that an obscure READ TABLE... command was coming back with SY-SUBRC = 4. It was
looking for a combination of material number and customer number in an internal table. Both
numbers were right, leading zeroes and all. I had the whole table in front of me in the debugging
window and the record that READ was supposed to find was indeed present in it. “What
do you mean sy-subrc is 4?! Here is that record, right there, you dumbass!” I almost yelled
at the poor innocent Dell monitor.
Here I should probably mention that the READ TABLE command had the BINARY SEARCH addition.
As I’ve learned from my very long programming (not ABAP) experience, sometimes just making
things simpler actually solves the problem. So I commented out the
BINARY SEARCH part and ran the program again. Now it worked like a charm. OK, now I had
to get to the bottom of this.
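To do that, I put together a small test program that fills an internal table
(PERFORM populate_table.) and then reads from it.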
So what’s the deal with this damn binary search? I really like the simple explanation that
one guy gave in an SDN post: “Let’s say you have numbers 1..to ..100 in a table and you are
searching for 34. It would read the 50th record and if it is say 50 next it would read the 25th
record and if it is say 25 it would carry to read the 38th record and so on.” In other words,
every read throws away the half of the table that cannot contain the value, and that only
works if the table is sorted by the field you are searching on.
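The halving idea from that quote can be sketched by hand. This is only an illustration of the principle, not a claim about how the ABAP kernel implements BINARY SEARCH internally; the report name and all variable names are made up for the example.

```abap
REPORT zdemo_binary_search_steps.

* Hypothetical demo: a table holding the numbers 1 to 100 in ascending order.
DATA: lt_numbers TYPE STANDARD TABLE OF i WITH DEFAULT KEY,
      lv_value   TYPE i,
      lv_target  TYPE i VALUE 34,
      lv_low     TYPE i VALUE 1,
      lv_high    TYPE i,
      lv_mid     TYPE i.

DO 100 TIMES.
  APPEND sy-index TO lt_numbers.
ENDDO.

lv_high = lines( lt_numbers ).

* Keep halving the range [lv_low, lv_high] until the target is found
* or the range becomes empty.
WHILE lv_low <= lv_high.
  lv_mid = ( lv_low + lv_high ) DIV 2.
  READ TABLE lt_numbers INTO lv_value INDEX lv_mid.
  WRITE: / 'Probing row', lv_mid.
  IF lv_value = lv_target.
    WRITE: / 'Found at row', lv_mid.
    EXIT.
  ELSEIF lv_value < lv_target.
    lv_low = lv_mid + 1.    " target can only be in the upper half
  ELSE.
    lv_high = lv_mid - 1.   " target can only be in the lower half
  ENDIF.
ENDWHILE.
```

With this particular midpoint rule the probes are rows 50, 25, 37, 31 and 34, so the third step lands on 37 rather than the 38 from the quote. The exact rows depend on how the middle is rounded, but the point stands: five reads instead of thirty-four.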
Back to my example. There were 11 entries in my test table. The binary search started by
splitting the table in half and got the middle record (B B C). “OK,” thought the computer.
“Since I’m looking for Z and A, let me look at the second half of the list (because Z > B). Oh,
now I see C C A, we are getting closer! Let’s look at what’s left after that.” Naturally, at this
point the only records left to search were C C A, Z B B and Z A A. So it split the list in half
again and got Z B B. “OMG, I went too far! Let me get back real quick. Hmm... I see C C A. C
is less than Z, which means that there is no record with Z and A. Oh well... SY-SUBRC = 4.
Buhbye!”
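Here is a minimal sketch of how the same kind of miss can be provoked on purpose. It is not my original test program: the report name, the one-character stand-ins for material and customer number, and the five rows are all invented for illustration. The table ends up sorted by the customer field, while the READ searches on material first.

```abap
REPORT zdemo_binary_search_unsorted.

TYPES: BEGIN OF ty_row,
         matnr TYPE c LENGTH 1,   " stand-in for a material number
         kunnr TYPE c LENGTH 1,   " stand-in for a customer number
       END OF ty_row.

DATA: lt_rows TYPE STANDARD TABLE OF ty_row WITH DEFAULT KEY,
      ls_row  TYPE ty_row.

* Hypothetical data, five rows only.
ls_row-matnr = 'B'. ls_row-kunnr = 'C'. APPEND ls_row TO lt_rows.
ls_row-matnr = 'Z'. ls_row-kunnr = 'A'. APPEND ls_row TO lt_rows.
ls_row-matnr = 'D'. ls_row-kunnr = 'E'. APPEND ls_row TO lt_rows.
ls_row-matnr = 'A'. ls_row-kunnr = 'B'. APPEND ls_row TO lt_rows.
ls_row-matnr = 'C'. ls_row-kunnr = 'D'. APPEND ls_row TO lt_rows.

* Sorted by the "wrong" field: the matnr column now reads Z A B C D.
SORT lt_rows BY kunnr.

* Linear search checks every row, so it finds Z/A in the first row.
READ TABLE lt_rows INTO ls_row WITH KEY matnr = 'Z' kunnr = 'A'.
WRITE: / 'Linear read,   sy-subrc =', sy-subrc.

* Binary search assumes ascending order by matnr and kunnr. On this
* table the result is not reliable: it can miss the row that is there.
READ TABLE lt_rows INTO ls_row WITH KEY matnr = 'Z' kunnr = 'A'
     BINARY SEARCH.
WRITE: / 'Binary search, sy-subrc =', sy-subrc.
```

The linear read returns sy-subrc = 0. For the binary search I would expect 4 here, since a probe in the middle of the table sees B, decides that Z must be further down, and never looks at the first row. But the result for an unsorted table is exactly the thing you cannot rely on.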
As I finally found out, the problem with the sales report was that the internal table was first
sorted by one field, which would have worked fine with the READ, but then re-sorted by
another field somewhere in the middle. It looks like a good idea to sort the table right
before the binary search, which I will do in the future.
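In code, that habit looks roughly like the sketch below. The structure, field names and report name are placeholders, not the ones from the sales report.

```abap
REPORT zdemo_sort_before_read.

TYPES: BEGIN OF ty_row,
         matnr TYPE c LENGTH 18,   " placeholder material number
         kunnr TYPE c LENGTH 10,   " placeholder customer number
       END OF ty_row.

DATA: lt_rows TYPE STANDARD TABLE OF ty_row WITH DEFAULT KEY,
      ls_row  TYPE ty_row.

* ... lt_rows gets filled (and possibly re-sorted) somewhere above ...

* Sort by exactly the fields used in the READ key, in the same order.
* SORT is ascending unless DESCENDING is specified.
SORT lt_rows BY matnr kunnr.

READ TABLE lt_rows INTO ls_row
     WITH KEY matnr = 'Z'
              kunnr = 'A'
     BINARY SEARCH.
IF sy-subrc = 0.
  WRITE: / 'Found:', ls_row-matnr, ls_row-kunnr.
ELSE.
  WRITE: / 'No such record.'.
ENDIF.
```

If the READ sits inside a loop, the SORT belongs just before the loop rather than inside it, otherwise the sorting eats up whatever the binary search saves.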
Obviously, with BINARY SEARCH what you see is not always what you get. To get the right
result, the table must be sorted in ascending order by the same fields that the READ uses as
its key. If this is not done properly, binary search might still work correctly some of the
time, depending on what data is inside the table. But then you might wish it didn’t work at
all, because an error that shows up only for some data is a major pain in the back to find.
While I was at it, I also ran the runtime analysis a few times. With the small amount of data
in my test program, an ordinary READ actually worked even faster than READ ... BINARY
SEARCH. However, with thousands of records and about 10 fields (as in my sales report),
BINARY SEARCH performs much better. I’m pretty sure that a hashed table would be even
more efficient (unfortunately, it cannot be used in that specific report).
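For reports where it is an option, a hashed table lookup might look like the sketch below. Again, the names are placeholders. Keep in mind that a hashed table needs a unique key, so it only fits when each material and customer combination occurs at most once.

```abap
REPORT zdemo_hashed_read.

TYPES: BEGIN OF ty_row,
         matnr TYPE c LENGTH 18,   " placeholder material number
         kunnr TYPE c LENGTH 10,   " placeholder customer number
       END OF ty_row.

DATA: lt_hashed TYPE HASHED TABLE OF ty_row
                WITH UNIQUE KEY matnr kunnr,
      ls_row    TYPE ty_row.

* Rows go in with INSERT ... INTO TABLE; no SORT is needed afterwards.
ls_row-matnr = 'Z'.
ls_row-kunnr = 'A'.
INSERT ls_row INTO TABLE lt_hashed.

* WITH TABLE KEY uses the hash key, so the order of rows does not matter.
READ TABLE lt_hashed INTO ls_row
     WITH TABLE KEY matnr = 'Z'
                    kunnr = 'A'.
IF sy-subrc = 0.
  WRITE: / 'Found:', ls_row-matnr, ls_row-kunnr.
ENDIF.
```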
http://friendlyabaper.blogspot.com/2006/10/pure-and-simple-truth-about-binary.html