Re: [WebDNA] Limits? -and- Hot Merge

This WebDNA talk-list message is from 2009. It keeps the original formatting.

Message number: 102341
Donovan thinks out of the box again! Where was this idea last year when I had a [spawn] blow up on me because I was processing a 5,000-record import table into a database? I gave up on the project because I could not come up with a solution, but this might work.

Now, can someone tell me how to use the supposed built-in feature that converts CSV files into TAB tables? If I could get that feature working, my life would be easier too!

Matthew A Perosi
JewelerWebsites.com by Psi Prime
Senior Web Developer
323 Union Blvd.
Totowa, NJ 07512
Pre-Sales: 888.872.0274
Service: 973.413.8213
Training: 973.413.8214
Fax: 973.413.8217
http://www.jewelerwebsites.com
http://en.wikipedia.org/wiki/Psi_Prime%2C_Inc
http://www.psiprime.com
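The thread never identifies the built-in WebDNA feature Matthew asks about, so the following is a generic stand-in in Python rather than that feature itself; the csv_to_tab helper and the file paths are illustrative assumptions.

----start sketch--------------------------------------------------------
import csv

def csv_to_tab(csv_path, tab_path):
    """Convert a CSV file into a tab-delimited table (WebDNA .db files
    are tab-delimited text with one header row)."""
    with open(csv_path, newline="") as src, \
         open(tab_path, "w", newline="") as dst:
        reader = csv.reader(src)                  # handles quoted commas correctly
        writer = csv.writer(dst, delimiter="\t")  # tab-delimited output
        for row in reader:
            # Flatten stray tabs/newlines inside fields so the flat-file
            # table keeps exactly one record per line.
            writer.writerow(field.replace("\t", " ").replace("\n", " ")
                            for field in row)

csv_to_tab("hotfolder/import.csv", "hotfolder/import.db")  # hypothetical paths
----end sketch----------------------------------------------------------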
Donovan Brooke wrote:
> Moving this to its own thread because I have
> to open up the 'GREP question' every time to
> read it! - Thanks Bob! ;-)
>
> Matthew A Perosi wrote:
> > Donovan, you might remember from our telephone conversations from Spring
> > 2007 that I've been pushing the limits for many years now.
> >
> > I see no reason to migrate anything if it's not broken, unless you tell
> > me WSC is officially advising it.
> > I have 2 CMS systems that use the RAM databases. They are totally rock
> > solid in their design and functionality.
> >
> > Sometimes I log into the WebCatalog admin screen and keep refreshing the
> > Show Database page just to watch how all the DBs are loading and
> > flushing. I find it quite amazing. The information I have learned by
> > watching that screen has helped me design more efficient code and
> > relational databases, as well as helped me identify which customers are
> > resource hogs who I can charge higher hosting fees to :-)
>
> Right, I think "efficient" code is the key to pushing the limits.
> No, I have no "official" recommendation. I just have theories. :-)
> I absolutely love the "WebDNA database" system.. it is *the* key feature
> that differentiates WebDNA from its competitors the most.
> It's a tool with a lot of value in our field that is not available
> elsewhere.
>
> However, when you get into hundreds of MBs of data, I personally tend to
> start thinking about MySQL. This is not a "WSC official
> recommendation".. it's just my inclination.. especially if I *don't*
> need to search the data all the time. ;-) (WebDNA's [search..] context
> is hard to replace.)
>
> I also, personally, have been trying to branch out more and experiment
> with things. For instance, I recently had a situation where I needed
> to create a "Hot Merge" of sorts. In this effort, I came up with a
> slightly different way of doing replaces to WebDNA databases.
>
> Basically, a WebDNA *merge* database lands on the server. Then my
> system is charged with checking the integrity of that merge table and
> merging it into a database already residing on the server.
>
> The experimental part had to do with the fact that these merges were
> at least 7,000 records and could potentially scale to 70,000 records,
> so I was obviously worried about server load. This is the basic logic
> I came up with to do these kinds of replaces, fully automated
> (I just get an email if it succeeds or fails):
>
> ----start code snippet-------------------------------------------------
> [!]--------------------------------------------------------------
> ** Hot Merge Example by Donovan **
> ** Note: This snippet is within a [listfiles..] context that
>    looks in a "hotfolder/" directory and is fired every 10
>    min. by a trigger. This snippet also runs after integrity
>    checks on the uploaded merge file.
> ** Note: The [tNumRecs_Merge] value is from a [numfound] on the
>    merge file after the integrity checks.
> -------------------------------------------------------------[/!]
> [!] ** Merge 500 records at a time: floor(N/500)+1 blocks,
>        so the remainder block is covered ** [/!]
> [text]tNumLoops=[math]floor([tNumRecs_Merge]/500)[/math][/text]
>
> [!] ** Set some vars ** [/!]
> [text multi=T]hmt_index=1[!]
> [/!]&tKeyField=MERGE_KEY_FIELD[!]
> [/!]&tDestDB_Key=DEST_KEY_FIELD[!]
> [/!]&tDestDBPath=your/db/destination.db[!]
> [/!]&tNumSecs=10[!]
>    ^ Edit the above more or less depending on the size of the merge.
> [/!][/text]
>
> [!] ** Need to set some text vars because some WebDNA is not
>        allowed in spawn ** [/!]
> [text]tFilename=[filename][/text]
>
> [!] ** Spawn a new process for each block of 500 ** [/!]
> [loop start=1&end=[math][tNumLoops]+1[/math]&advance=1]
> [spawn]
> [search db=hotfolder/[tFilename][!]
> [/!]&ne[tKeyField]datarq=find_all[!]
> [/!]&startat=[hmt_index][!]
> [/!]&max=500]
> [founditems]
> [replace db=[tDestDBPath][!]
> [/!]&eq[tDestDB_Key]datarq=[interpret][[tKeyField]][/interpret]][!]
> [/!]&DestFirstField=[url][Merge1Field][/url][!]
> [/!]&DestSecondField=[url][Merge2Field][/url][!]
>    etc..
> [/!][/replace]
> [/founditems]
> [/search]
> [/spawn]
>
> [!] ** Wait <[tNumSecs]> seconds to start the next block ** [/!]
> [waitforfile file=non_existent_file.txt[!]
> [/!]&timeout=[tNumSecs]][/waitforfile]
>
> [!] ** Set the index to the start of the next 500-record block ** [/!]
> [text]hmt_index=[math]([index]*500)+1[/math][/text]
> [/loop]
> ----end code------------------------------------------------------------
>
> The basic idea was that I didn't really care how long the
> merge took; rather, I wanted to make sure the processor
> wasn't overloaded. My idea was to use SPAWN and a waiting
> technique using WAITFORFILE to "spread out" the task.
> This turned out to work really well, I think.
>
> I used 'top -u apache' to monitor the process on a merge with
> 10,000 records, and I didn't see *any* noticeable heightened
> processor usage using this code.
>
> Just thought I'd pass this experiment along to the list!
>
> Donovan
>
> Disclaimer :) : the above code was snipped out of a live working
> system, but to make it legible and universal I rewrote a bit of it
> above, so there could be some syntax errors from the rewrite.
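For scale: with 7,000 records the arithmetic above gives floor(7000/500) + 1 = 15 blocks, where the final block simply finds nothing when the count divides evenly. For readers outside WebDNA, here is a minimal Python sketch of the same throttled batch-replace idea under simplifying assumptions: it runs the batches sequentially with a sleep rather than spawning them, it assumes the merge and destination tables share the same key field name and column layout (which Donovan's snippet does not require), and the hot_merge helper, batch size, and file paths are illustrative, not part of Donovan's system.

----start sketch--------------------------------------------------------
import csv
import time

BATCH_SIZE = 500   # mirrors the 500-record blocks above
PAUSE_SECS = 10    # mirrors tNumSecs; tune to the size of the merge

def load_table(path):
    """Read a tab-delimited table (one header row) into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f, delimiter="\t"))

def hot_merge(merge_path, dest_path, key):
    """Replace matching records in the destination table with records
    from the merge table, one batch at a time, pausing between batches
    so the work is spread out instead of spiking the processor."""
    merge_rows = load_table(merge_path)
    dest_rows = load_table(dest_path)
    position = {row[key]: i for i, row in enumerate(dest_rows)}  # key -> row index

    for start in range(0, len(merge_rows), BATCH_SIZE):
        for row in merge_rows[start:start + BATCH_SIZE]:
            if row[key] in position:          # like [replace db=...&eq...datarq=...]
                dest_rows[position[row[key]]] = row
        time.sleep(PAUSE_SECS)                # stand-in for the [waitforfile] timeout

    with open(dest_path, "w", newline="") as f:  # write the merged table back
        writer = csv.DictWriter(f, fieldnames=dest_rows[0].keys(), delimiter="\t")
        writer.writeheader()
        writer.writerows(dest_rows)

# Hypothetical call, reusing the placeholder names from the snippet above:
hot_merge("hotfolder/merge.db", "your/db/destination.db", "DEST_KEY_FIELD")
----end sketch----------------------------------------------------------

The pause serves the same purpose as the [waitforfile] trick: trading total wall-clock time for a flat load curve.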

    
Associated Messages, from the most recent to the oldest:

  1. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  2. Re: [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
  3. Re: [WebDNA] Limits? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  4. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  5. Re: [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
  6. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  7. Re: [WebDNA] Limits? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  8. [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
