Re: [WebDNA] Limts? -and- Hot Merge

This WebDNA talk-list message is from 2009.


Donovan thinks out of the box again! Where was this idea last year, when
I had a [spawn] blow up on me because I was processing a 5,000-record
import table into a database? I gave up on the project because I could
not come up with a solution, but this might work.

Now, can someone tell me how to use the supposed built-in feature that
converts CSV files into TAB tables? If I could get that feature working,
my life would be easier too!

Matthew A Perosi
JewelerWebsites.com ------------------------------ by Psi Prime -------
Senior Web Developer
323 Union Blvd.
Totowa, NJ 07512
Pre-Sales: 888.872.0274
Service:   973.413.8213
Training:  973.413.8214
Fax:       973.413.8217
http://www.jewelerwebsites.com
http://en.wikipedia.org/wiki/Psi_Prime%2C_Inc
http://www.psiprime.com

Donovan Brooke wrote:
> Moving this to its own thread because I have to open up the
> 'GREP question' every time to read it! - Thanks Bob! ;-)
>
> Matthew A Perosi wrote:
> > Donovan, you might remember from our telephone conversations in
> > Spring 2007 that I've been pushing the limits for many years now.
> >
> > I see no reason to migrate anything if it's not broken, unless you
> > tell me WSC is officially advising it. I have 2 CMS systems that
> > use the RAM databases. They are totally rock solid in their design
> > and functionality.
> >
> > Sometimes I log into the WebCatalog admin screen and keep refreshing
> > the Show Database page just to watch how all the DBs are loading and
> > flushing. I find it quite amazing. The information I have learned by
> > watching that screen has helped me design more efficient code and
> > relational databases, as well as helped me identify which customers
> > are resource hogs whom I can charge higher hosting fees to :-)
>
> Right, I think "efficient" code is the key to pushing the limits.
> No, I have no "official" recommendation; I just have theories. :-)
> I absolutely love the "WebDNA database" system.. it is *the* feature
> that most differentiates WebDNA from its competitors. It's a tool
> that has a lot of value in our field and is not available elsewhere.
>
> However, when you get into hundreds of MBs of data, I personally tend
> to start thinking about MySQL. This is not a "WSC official
> recommendation".. it's just my inclination.. especially if I *don't*
> need to search the data all the time. ;-) (WebDNA's [search] context
> is hard to replace.)
>
> I have also been trying to branch out and experiment with things.
> For instance, I recently had a situation where I needed to create a
> "Hot Merge" of sorts, and in that effort I came up with a slightly
> different way of doing replaces to WebDNA databases.
>
> Basically, a WebDNA *merge* database lands on the server. My system
> is then charged with checking the integrity of that merge table and
> merging it into a database already residing on the server.
>
> The experimental part had to do with the fact that these merges were
> at least 7,000 records and could potentially scale to 70,000 records,
> so I was obviously worried about server load. This is the basic logic
> I came up with to do these kinds of replaces, fully automated (I just
> get an email telling me whether it succeeded or failed):
>
> ----start code snippet-------------------------------------------------
> [!]--------------------------------------------------------------
>   ** Hot Merge Example by Donovan **
>   ** Note: This snippet sits inside a [listfiles ...] context that
>      looks in a "hotfolder/" directory and is fired every 10 min.
>      by a trigger. It also runs after integrity checks on the
>      uploaded merge file.
>   ** Note: the [tNumRecs_Merge] value comes from a [numfound] on
>      the merge file after the integrity checks.
> --------------------------------------------------------------[/!]
>
> [!] ** Merge 500 records at a time; e.g. 7,000 records gives
>        floor(7000/500) = 14 full blocks, plus one extra pass
>        to catch any remainder ** [/!]
> [text]tNumLoops=[math]floor([tNumRecs_Merge]/500)[/math][/text]
>
> [!] ** Set some vars ** [/!]
> [text multi=T]hmt_index=1[!]
> [/!]&tKeyField=MERGE_KEY_FIELD[!]
> [/!]&tDestDB_Key=DEST_KEY_FIELD[!]
> [/!]&tDestDBPath=your/db/destination.db[!]
> [/!]&tNumSecs=10[!]
>    ^ Edit the above more or less depending on the size of the merge.
> [/!][/text]
>
> [!] ** Need to set some text vars because some WebDNA is not
>        allowed inside [spawn] ** [/!]
> [text]tFilename=[filename][/text]
>
> [!] ** Spawn a new process for each block of 500 ** [/!]
> [loop start=1&end=[math][tNumLoops]+1[/math]&advance=1]
> [spawn]
> [search db=hotfolder/[tFilename][!]
> [/!]&ne[tKeyField]datarq=find_all[!]
> [/!]&startat=[hmt_index][!]
> [/!]&max=500]
> [founditems]
> [replace db=[tDestDBPath][!]
> [/!]&eq[tDestDB_Key]datarq=[interpret][[tKeyField]][/interpret]][!]
> [/!]&DestFirstField=[url][Merge1Field][/url][!]
> [/!]&DestSecondField=[url][Merge2Field][/url][!]
>    etc..
> [/!][/replace]
> [/founditems]
> [/search]
> [/spawn]
>
> [!] ** Wait [tNumSecs] seconds before starting the next block ** [/!]
> [waitforfile file=non_existent_file.txt[!]
> [/!]&timeout=[tNumSecs]][/waitforfile]
>
> [!] ** Advance the index to the start of the next 500-record
>        block (501, 1001, ...) ** [/!]
> [text]hmt_index=[math][index]*500+1[/math][/text]
> [/loop]
> ----end code-------------------------------------------------------
>
> The basic idea was that I didn't really care how long the merge took;
> rather, I wanted to make sure the processor wasn't overloaded. My idea
> was to use [spawn] plus a waiting technique built on [waitforfile] to
> "spread out" the task. This turned out to work really well, I think.
>
> I used 'top -u apache' to monitor the process on a merge with 10,000
> records, and I didn't see *any* noticeable rise in processor usage
> with this code.
>
> Just thought I'd pass this experiment along to the list!
>
> Donovan
>
> Disclaimer :) The above code was snipped out of a live working
> system, but to make it legible and universal I rewrote a bit of it
> above, so there could be some syntax errors from the rewrite.
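
Where the [tNumRecs_Merge] count comes from: as the note in the snippet
says, it is a [numfound] taken on the merge file after the integrity
checks. Below is a minimal sketch of that step, assuming the same
hotfolder layout and the MERGE_KEY_FIELD placeholder used above; this is
a reconstruction for illustration, not Donovan's actual integrity-check
code.

[!] ** Count all records in the uploaded merge file. The
       "ne...datarq=find_all" trick matches every record whose key
       field is not the literal string "find_all" - i.e. all of
       them - same as in the snippet above ** [/!]
[search db=hotfolder/[tFilename]&neMERGE_KEY_FIELDdatarq=find_all]
[text]tNumRecs_Merge=[numfound][/text]
[/search]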
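
On the CSV-to-TAB question: the built-in converter Matthew asks about is
not shown in this thread, but because WebDNA .db files are plain
tab-delimited text, a naive conversion can be hand-rolled with [grep].
Below is a minimal sketch, assuming a simple CSV with no commas and no
WebDNA [brackets] inside field values; the file names and the tTab
helper variable are hypothetical.

[!] ** Naive CSV-to-TAB conversion. WARNING: this breaks on quoted
       fields that contain commas ** [/!]
[text]tTab=[unurl]%09[/unurl][/text]
[writefile file=hotfolder/import.txt][!]
[/!][grep search=,&replace=[tTab]][!]
[/!][include file=hotfolder/import.csv][!]
[/!][/grep][/writefile]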

Associated Messages, from the most recent to the oldest:
  1. Re: [WebDNA] Limts? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  2. Re: [WebDNA] Limts? -and- Hot Merge (Donovan Brooke 2009)
  3. Re: [WebDNA] Limts? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  4. Re: [WebDNA] Limts? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  5. Re: [WebDNA] Limts? -and- Hot Merge (Donovan Brooke 2009)
  6. Re: [WebDNA] Limts? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  7. Re: [WebDNA] Limts? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  8. [WebDNA] Limts? -and- Hot Merge (Donovan Brooke 2009)

Related Readings:

access denied problem (1997) can you take a look (2003) Updating checkboxes made easy !!! (1998) WebDNA 5.1 and Mac OS X Server v10.3 - Working ??? (2003) [urgent] Phone number at SM (2006) Question about replacing words (1998) why won't this work, please tell me??? (2001) searchable list archive (1997) Database Updates (1997) search form problem.. (1997) OT- AS/400 and Macs (2003) Sort of a Dilema! (1998) Problem with CC problem ? (1997) keep W* in front applescript? (1998) WebCat2_Mac RETURNs in .db (1997) Thanks ! (1997) Exclamation point (1997) Email (1998) [WebDNA] png support in webDNA (2011) WebCat2b15MacPlugin - [protect] (1997)