Re: Bug in random search - MacOS v4.5 plugin ...

This WebDNA talk-list message (number 45540) is from 2002. It keeps the original formatting.
Associated Messages, from the most recent to the oldest:

    
  1. Re: Bug in random search - MacOS v4.5 plugin ... (Kenneth Grome 2002)
  2. Re: Bug in random search - MacOS v4.5 plugin ... (Frank Nordberg 2002)
  3. Re: Bug in random search - MacOS v4.5 plugin ... (Kenneth Grome 2002)
  4. Re: Bug in random search - MacOS v4.5 plugin ... (Brian Fries 2002)
  5. Re: Bug in random search - MacOS v4.5 plugin ... (Kenneth Grome 2002)
  6. Bug in random search - MacOS v4.5 plugin ... (Kenneth Grome 2002)
Hi Brian,

Thanks for running the test I posted.

I agree that sometimes I could get false matches because of the inclusion of the tags. But that does not explain the repetition of certain sequences of results in several independent tests -- so there's no question that there is a bug in the ra sorting capabilities in this version of webdna ... :(

Thanks for your code suggestion. I tested it and got what *appear* to be random sequences in the beginning, when limiting the number of loops to 100 or 200. At least it doesn't repeat any results in series like my ra tests did ...

But when I change the string to three characters and increase the loops to 500, nearly 80% of the last 50 results have already been found inside the first 450 results. This is not what one would expect when there are 17,576 possible combinations of three-letter values. Instead, only about 2.5% (not 80%) of the last 50 results should be repeated, statistically speaking, since each one has roughly a 450-in-17,576 chance of matching an earlier result. So this technique is not the best solution either, although it works a lot better than using ra in a search, at least for the first 100 or so results ... :)

Before you posted your suggested code, I created a database of all possible 3-letter combinations of the letters a-z -- 17,576 records. That's more than enough for my current needs. I have decided to search this db, then use the first 3-letter string found -- that is not already used as a unique record value in my other db -- for the next unique value in my other db.

Of course, I could use the [cart] value to create a unique string, but cart values create other problems that I'm trying to avoid. These values are not all the same length -- which means that I cannot use certain coding techniques that I want to use. And they waste a lot of space in the db when I'm trying to store as little information as possible.

If I add the 10 numeric digits to my 26 character database (thus using 36 unique characters in various 3-character combinations) the number of unique values jumps from 17,576 to 46,656.

And if I use the same 36 unique characters in 4-character combinations, the number of unique values rises to a whopping 1.68 million combinations -- which means I can create a really huge number of unique values with only 4 characters per value.

Of course it would be nice if I could find a way to create these values on the fly -- without having to maintain a database of 1.68 million unique 4-character values -- but that's a task I have not yet attempted ... :)

>Ken -
>
>I ran your test on my dev server - OSX 10.1.5 / WebDNA 4.5.0 - and
>it worked fine for me. In 30+ attempts, I never saw the repeating
>result problem. It could be a bug in the particular version you're
>running.
>
>As another test, I wrote code to pull characters at random from a
>text variable instead of a DB. The results appeared equally well
>distributed (though I didn't run a full statistical analysis) and it
>ran about 30% faster than the DB method - generally 25-30 ticks vs.
>35-40 per execution. You might try this version instead...
>
>[text]abc=abcdefghijklmnopqrstuvwxyz[/text]
>[loop start=1&end=200]
>
>[math show=F]x=ceil([random format=float]*26)[/math]
>[text]firstLetter=[getchars start=[x]&end=[x]][abc][/getchars][/text]
>
>[math show=F]x=ceil([random format=float]*26)[/math]
>[text]secondLetter=[getchars start=[x]&end=[x]][abc][/getchars][/text]
>
>[showif [url][twoletterlist][/url]^[url][firstLetter][secondLetter][/url]]
>[text]twoletterlist=[twoletterlist]<font color=red>[firstLetter][secondLetter]</font> [/text]
>[/showif]
>
>[hideif [url][twoletterlist][/url]^[url][firstLetter][secondLetter][/url]]
>[text]twoletterlist=[twoletterlist][firstLetter][secondLetter] [/text]
>[/hideif]
>
>[/loop]
>
>
>... Note: The only flaw I see in your logic is that you could get
>false matches on your [showif]... if you get a character combination
>of "fo", for example, that would match from the word "font" if there
>had already been a duplicate in your twoletterlist
>
>- brian
>
>At 6:41 PM 11/19/02, Kenneth Grome wrote:
>>Here are some more test results. After 30 tests of my code, I found
>>that webdna got stuck repeating the following strings of
>>two-character results. Some of these strings appeared in as many
>>as seven different tests (with only 30 attempts), so obviously the
>>results are *repeatable*.
>>
>>In fact, only half of my tests gave me the results they should
>>have: no repeating strings in the results.
>>
>>I think this is strong evidence that there is a definite bug in the
>>ra sort capability of the webdna software.
>>
>>:(
>>
>>
>>ys bi li - repeated in 7 different tests
>>
>>iy sb il - repeated in 3 different tests
>>
>>jp xj ui lx gz bi ys ve kp tq qp dz od gg yq ms wh xl nz bv ez qe
>>sz kh ql uq jr jj it wh lu xs dg mm th pv ti pc hs oq in zd -
>>repeated in 2 different tests
>>
>>pc hs oq in zd jp xj ui lx gz bi ys ve kp tq qp dz od gg yq ms wh
>>xl nz bv ez qe sz kh ql uq jr jj it wh lu xs dg mm th pv ti -
>>repeated in 1 test
>>
>>ly te ih ki ss ix sl yt ei hk is si xs - repeated in 1 test
>>
>>dj px ju il xg zb iy sv ek pt qq pd zo dg gy qm sw hx ln zb ve zq
>>es zk hq lu qj rj ji tw hl ux sd gm mt hp vt ip ch so qi nz -
>>repeated in 1 test
>>
>>pt qq pd zo dg gy qm sw hx ln zb ve zq es zk hq lu qj rj ji tw hl
>>ux sd gm mt hp vt ip ch so qi nz dj px ju il xg zb iy sv ek -
>>repeated in 1 test
>>
>>tq qp dz od gg yq ms wh xl nz bv ez qe sz kh ql uq jr jj it wh lu
>>xs dg mm th pv ti pc hs oq in zd jp xj ui lx gz bi ys ve kp -
>>repeated in 1 test
>>
>>
>>Sincerely,
>>Kenneth Grome
>
>
>
>-------------------------------------------------------------
>This message is sent to you because you are subscribed to
> the mailing list .
>To unsubscribe, E-mail to:
>To switch to the DIGEST mode, E-mail to
>
>Web Archive of this list is at: http://search.smithmicro.com/

Sincerely,
Kenneth Grome

---------------------------------------------------
WebDNA Professional Training and Development Center
175 J. Llorente Street          +63 (32) 255-6921
Cebu City, Cebu 6000          kengrome@webdna.net
Philippines                 http://www.webdna.net
---------------------------------------------------
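
Two numerical points in the message above can be checked with a short sketch. It is written in Python rather than WebDNA, and every name in it (`ALPHABET`, `expected_repeat_fraction`, `code_for`) is hypothetical and not part of WebDNA. The first function estimates how often a result should repeat an earlier one, assuming every string is equally likely. The second shows one possible way to create fixed-width values on the fly: it encodes a running record counter in base 36, so no table of 1.68 million values is needed. That is an assumption about what "on the fly" could look like, not the method used in this thread.

```python
import string

# 36 characters: the digits 0-9 followed by the letters a-z.
ALPHABET = string.digits + string.ascii_lowercase


def expected_repeat_fraction(pool_size: int, earlier_draws: int) -> float:
    """Chance that one uniform draw matches any of the earlier uniform draws."""
    return 1.0 - (1.0 - 1.0 / pool_size) ** earlier_draws


def code_for(counter: int, width: int = 4) -> str:
    """Encode a record counter as a fixed-width base-36 string."""
    if not 0 <= counter < len(ALPHABET) ** width:
        raise ValueError("counter does not fit in the requested width")
    chars = []
    for _ in range(width):
        # Peel off the least significant base-36 digit each time.
        counter, digit = divmod(counter, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))


if __name__ == "__main__":
    # Sizes of the three value spaces mentioned in the message.
    print(26 ** 3, 36 ** 3, 36 ** 4)  # 17576 46656 1679616
    # Expected share of the last 50 results repeating one of the first 450.
    print(f"{expected_repeat_fraction(26 ** 3, 450):.1%}")  # 2.5%
    # Consecutive counters give distinct, equal-length values.
    print(code_for(0), code_for(35), code_for(36), code_for(36 ** 4 - 1))
    # 0000 000z 0010 zzzz
```

Under those assumptions the expected repeat rate is about 2.5%, still far below the 80% observed. Counter-based values are unique by construction and always the same length, but they are sequential rather than random, so they only suit cases where predictability of the next value does not matter.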
