[WebDNA] Limits? -and- Hot Merge

This WebDNA talk-list message is from 2009. It keeps the original formatting.
numero = 102339
interpreted = N
texte = Moving this to its own thread because I have to open up the 'GREP question' thread every time to read it! - Thanks Bob! ;-)

Matthew A Perosi wrote:
> Donovan, you might remember from our telephone conversations from Spring
> 2007 that I've been pushing the limits for many years now.
>
> I see no reason to migrate anything if it's not broken. Unless you tell
> me WSC is officially advising it.
> I have 2 CMS systems that use the RAM databases. They are totally rock
> solid in their design and functionality.
>
> Sometimes I log into the WebCatalog admin screen and keep refreshing the
> Show Database page just to watch how all the DBs are loading and
> flushing. I find it quite amazing. The information I have learned by
> watching that screen has helped me design more efficient code and
> relational databases, as well as helped me identify which customers are
> resource hogs who I can charge higher hosting fees to :-)

Right, I think "efficient" code is the key to pushing the limits. No, I have no "official" recommendation; I just have theories. :-)

I absolutely love the WebDNA database system.. it is *the* key feature that differentiates WebDNA from its competitors the most. It's a tool that has a lot of value in our field and is not available elsewhere. However, when you get into 100s of MBs of data, I personally tend to start thinking about MySQL. This is not a "WSC official recommendation".. it's just my inclination.. especially if I *don't* need to search the data all the time. ;-) (WebDNA's [search..] context is hard to replace.)

I also, personally, have been trying to branch out more and experiment with things. For instance, I recently had a situation where I needed to create a "Hot Merge" of sorts, and in the process I came up with a slightly different way of doing replaces to WebDNA databases.

Basically, a WebDNA *merge* database lands on the server. My system is then charged with checking the integrity of that merge table and merging it into a database already residing on the server. The experimental part had to do with the fact that these merges were at least 7,000 records and could potentially scale to 70,000 records, so obviously I was worried about server load.

This is the basic logic I came up with to do these kinds of replaces, fully automated (I just get an email if it succeeds or fails):

----start code snippet-------------------------------------------------
[!]--------------------------------------------------------------
** Hot Merge Example by Donovan **

** Note: This snippet lives inside a [listfiles..] context that looks
   in a "hotfolder/" directory and is fired every 10 min. by a trigger.
   It runs after integrity checks on the uploaded merge file.

** Note: The [tNumRecs_Merge] value comes from a [numfound] on the
   merge file during those integrity checks.
--------------------------------------------------------------[/!]

[!] ** Merge 500 records at a time ** [/!]
[text]tNumLoops=[math]floor([tNumRecs_Merge]/500)[/math][/text]

[!] ** Set some vars ** [/!]
[text multi=T]hmt_index=1[!]
[/!]&tKeyField=MERGE_KEY_FIELD[!]
[/!]&tDestDB_Key=DEST_KEY_FIELD[!]
[/!]&tDestDBPath=your/db/destination.db[!]
[/!]&tNumSecs=10[!]
           ^ Edit the above up or down depending on the size of the merge.
[/!][/text]

[!] ** Need to set some text vars because some WebDNA is not allowed in [spawn] ** [/!]
[text]tFilename=[filename][/text]

[!] ** Spawn a new process for each block of 500 ** [/!]
[loop start=1&end=[math][tNumLoops]+1[/math]&advance=1]

[spawn]
[search db=hotfolder/[tFilename][!]
[/!]&ne[tKeyField]datarq=find_all[!]
[/!]&startat=[hmt_index][!]
[/!]&max=500]
[founditems]
[replace db=[tDestDBPath][!]
[/!]&eq[tDestDB_Key]datarq=[interpret][[tKeyField]][/interpret]][!]
[/!]&DestFirstField=[url][Merge1Field][/url][!]
[/!]&DestSecondField=[url][Merge2Field][/url][!] etc..
[/!][/replace]
[/founditems]
[/search]
[/spawn]

[!] ** Wait [tNumSecs] seconds before starting the next block ** [/!]
[waitforfile file=non_existent_file.txt[!]
[/!]&timeout=[tNumSecs]][/waitforfile]

[!] ** Set startat to the next block of 500 ** [/!]
[text]hmt_index=[math]([index]*500)+1[/math][/text]

[/loop]
----end code-----------------------------------------------------------

The basic idea was that I didn't really care how long the merge took; rather, I wanted to make sure the processor wasn't overloaded. My idea was to use [spawn] plus a waiting technique built on [waitforfile] to "spread out" the task. This turned out to work really well, I think. I used 'top -u apache' to monitor the process on a merge with 10,000 records, and I didn't see *any* noticeable heightened processor usage using this code.

Just thought I'd pass this experiment along to the list!

Donovan

disclaimer: :) the above code was snipped out of a live working system, but to make it legible and universal, I rewrote a bit of it above, so there could be some syntax errors from the rewrite.

--
Donovan D. Brooke
PH: 1 (608) 770-3822
------------------------------------------------
VP WebDNA Software Corporation
16192 Coastal Highway
Lewes, DE 19958
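
For reference, the snippet assumes an outer wrapper that the comments only describe: a trigger requests a template every 10 minutes, the template scans "hotfolder/" with [listfiles], and an integrity pass sets [tNumRecs_Merge] from a [numfound]. Below is a minimal sketch of what that wrapper might look like. This is an editorial reconstruction, not Donovan's code: the find_all idiom and the MERGE_KEY_FIELD name are borrowed from the snippet above, while the [showif] guard and the exact shape of the integrity pass are assumptions.

----sketch: hypothetical outer wrapper (editor's addition)-------------
[listfiles path=hotfolder/]
[!] ** Integrity pass (simplified): count the records in the uploaded
    merge file. A real check would also validate field names, key
    uniqueness, etc. before letting the merge run. ** [/!]
[search db=hotfolder/[filename]&neMERGE_KEY_FIELDdatarq=find_all]
[text]tNumRecs_Merge=[numfound][/text]
[/search]

[showif [tNumRecs_Merge]>0]
[!] ** ...the Hot Merge snippet above runs here... ** [/!]
[/showif]
[/listfiles]
----end sketch----------------------------------------------------------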
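
The pacing trick is worth calling out on its own: WebDNA has no sleep command, so pointing [waitforfile] at a file that will never appear simply blocks until the timeout expires. In isolation the idiom looks like this (10 seconds here, mirroring the [tNumSecs] value above):

----sketch: the pause idiom in isolation (editor's addition)-----------
[!] ** Acts as a 10-second pause: the file never shows up, so the
    context just waits out its timeout before the page continues. ** [/!]
[waitforfile file=non_existent_file.txt&timeout=10][/waitforfile]
----end sketch----------------------------------------------------------

Run against Donovan's own test numbers: 10,000 records gives tNumLoops = floor(10000/500) = 20, so the loop makes 21 passes, and the 10-second pause between blocks spreads the merge over roughly 210 seconds of wall-clock time. That is why 'top -u apache' shows no spike: each spawned block is small, and the blocks never pile up.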

    
Associated Messages, from the most recent to the oldest:

  1. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  2. Re: [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
  3. Re: [WebDNA] Limits? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  4. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  5. Re: [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
  6. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  7. Re: [WebDNA] Limits? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  8. [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)


Related Readings:

NewCart+Search with one click ? (1997) Mauthcapture vs mauthonly (2002) [WebDNA] jQuery validation and webdna (2015) Download URL & access on the fly ? (1997) Cache headers - WebDNA 5 (2005) RAM variables (1997) Multiple Passwords (1997) typhoon... (1997) Math Function (1997) Pithy questions on webcommerce & siteedit (1997) FYI: Apache Module perchild (2002) Generating unique SKU from [cart] - FIXED! (1997) More DateMath problems (1997) Calculating multiple shipping... (1997) NEW NetProfessional Revealed (1998) WebCatalog 4.0 ? (2000) Major bug report on rootbeer (1997) [WebDNA] User sessions - cookies only or cookies and a sessions.db? (2016) Looking up two prices in database? (1997) What am I missing (1997)