Re: [WebDNA] Limits? -and- Hot Merge

This WebDNA talk-list message is from 2009. It keeps the original formatting.

Message number: 102341
Donovan thinks out of the box again! Where was this idea last year when I had a [spawn] blow up on me because I was processing a 5,000-record import table into a database? I gave up on the project because I could not come up with a solution, but this might work.

Now, can someone tell me how to use the supposed built-in feature that converts CSV files into TAB tables? If I could get that feature working, my life would be easier too!

Matthew A Perosi
JewelerWebsites.com by Psi Prime
Senior Web Developer
323 Union Blvd.
Totowa, NJ 07512
Pre-Sales: 888.872.0274
Service: 973.413.8213
Training: 973.413.8214
Fax: 973.413.8217
http://www.jewelerwebsites.com
http://en.wikipedia.org/wiki/Psi_Prime%2C_Inc
http://www.psiprime.com
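The thread never identifies the built-in WebDNA feature Matthew asks about, so the following is a generic stand-in in Python rather than that feature itself; the csv_to_tab helper and the file paths are illustrative assumptions.

----start sketch--------------------------------------------------------
import csv

def csv_to_tab(csv_path, tab_path):
    """Convert a CSV file into a tab-delimited table (WebDNA .db files
    are tab-delimited text with one header row)."""
    with open(csv_path, newline="") as src, \
         open(tab_path, "w", newline="") as dst:
        reader = csv.reader(src)                  # handles quoted commas correctly
        writer = csv.writer(dst, delimiter="\t")  # tab-delimited output
        for row in reader:
            # Flatten stray tabs/newlines inside fields so the flat-file
            # table keeps exactly one record per line.
            writer.writerow(field.replace("\t", " ").replace("\n", " ")
                            for field in row)

csv_to_tab("hotfolder/import.csv", "hotfolder/import.db")  # hypothetical paths
----end sketch----------------------------------------------------------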
Donovan Brooke wrote:
> Moving this to its own thread because I have
> to open up the 'GREP question' every time to
> read it! - Thanks Bob! ;-)
>
> Matthew A Perosi wrote:
> > Donovan, you might remember from our telephone conversations from Spring
> > 2007 that I've been pushing the limits for many years now.
> >
> > I see no reason to migrate anything if it's not broken, unless you tell
> > me WSC is officially advising it.
> > I have 2 CMS systems that use the RAM databases. They are totally rock
> > solid in their design and functionality.
> >
> > Sometimes I log into the WebCatalog admin screen and keep refreshing the
> > Show Database page just to watch how all the DBs are loading and
> > flushing. I find it quite amazing. The information I have learned by
> > watching that screen has helped me design more efficient code and
> > relational databases, as well as helped me identify which customers are
> > resource hogs who I can charge higher hosting fees to :-)
>
> Right, I think "efficient" code is the key to pushing the limits.
> No, I have no "official" recommendation. I just have theories. :-)
> I absolutely love the "WebDNA database" system.. it is *the* key feature
> that differentiates WebDNA from its competitors the most.
> It's a tool with a lot of value in our field that is not available
> elsewhere.
>
> However, when you get into hundreds of MBs of data, I personally tend to
> start thinking about MySQL. This is not a "WSC official
> recommendation".. it's just my inclination.. especially if I *don't*
> need to search the data all the time. ;-) (WebDNA's [search..] context
> is hard to replace.)
>
> I also, personally, have been trying to branch out more and experiment
> with things. For instance, I recently had a situation where I needed
> to create a "Hot Merge" of sorts. In this effort, I came up with a
> slightly different way of doing replaces to WebDNA databases.
>
> Basically, a WebDNA *merge* database lands on the server. Then my
> system is charged with checking the integrity of that merge table and
> merging it into a database already residing on the server.
>
> The experimental part had to do with the fact that these merges were
> at least 7,000 records and could potentially scale to 70,000 records,
> so I was obviously worried about server load. This is the basic logic
> I came up with to do these kinds of replaces, fully automated
> (I just get an email if it succeeds or fails):
>
> ----start code snippet-------------------------------------------------
> [!]--------------------------------------------------------------
> ** Hot Merge Example by Donovan **
> ** Note: This snippet is within a [listfiles..] context that
>    looks in a "hotfolder/" directory and is fired every 10
>    min. by a trigger. This snippet also runs after integrity
>    checks on the uploaded merge file.
> ** Note: The [tNumRecs_Merge] value is from a [numfound] on the
>    merge file after the integrity checks.
> -------------------------------------------------------------[/!]
> [!] ** Merge 500 records at a time: floor(N/500)+1 blocks,
>        so the remainder block is covered ** [/!]
> [text]tNumLoops=[math]floor([tNumRecs_Merge]/500)[/math][/text]
>
> [!] ** Set some vars ** [/!]
> [text multi=T]hmt_index=1[!]
> [/!]&tKeyField=MERGE_KEY_FIELD[!]
> [/!]&tDestDB_Key=DEST_KEY_FIELD[!]
> [/!]&tDestDBPath=your/db/destination.db[!]
> [/!]&tNumSecs=10[!]
>    ^ Edit the above more or less depending on the size of the merge.
> [/!][/text]
>
> [!] ** Need to set some text vars because some WebDNA is not
>        allowed in spawn ** [/!]
> [text]tFilename=[filename][/text]
>
> [!] ** Spawn a new process for each block of 500 ** [/!]
> [loop start=1&end=[math][tNumLoops]+1[/math]&advance=1]
> [spawn]
> [search db=hotfolder/[tFilename][!]
> [/!]&ne[tKeyField]datarq=find_all[!]
> [/!]&startat=[hmt_index][!]
> [/!]&max=500]
> [founditems]
> [replace db=[tDestDBPath][!]
> [/!]&eq[tDestDB_Key]datarq=[interpret][[tKeyField]][/interpret]][!]
> [/!]&DestFirstField=[url][Merge1Field][/url][!]
> [/!]&DestSecondField=[url][Merge2Field][/url][!]
>    etc..
> [/!][/replace]
> [/founditems]
> [/search]
> [/spawn]
>
> [!] ** Wait <[tNumSecs]> seconds to start the next block ** [/!]
> [waitforfile file=non_existent_file.txt[!]
> [/!]&timeout=[tNumSecs]][/waitforfile]
>
> [!] ** Set the index to the start of the next 500-record block ** [/!]
> [text]hmt_index=[math]([index]*500)+1[/math][/text]
> [/loop]
> ----end code------------------------------------------------------------
>
> The basic idea was that I didn't really care how long the
> merge took; rather, I wanted to make sure the processor
> wasn't overloaded. My idea was to use SPAWN and a waiting
> technique using WAITFORFILE to "spread out" the task.
> This turned out to work really well, I think.
>
> I used 'top -u apache' to monitor the process on a merge with
> 10,000 records, and I didn't see *any* noticeable heightened
> processor usage using this code.
>
> Just thought I'd pass this experiment along to the list!
>
> Donovan
>
> Disclaimer :) : the above code was snipped out of a live working
> system, but to make it legible and universal I rewrote a bit of it
> above, so there could be some syntax errors from the rewrite.
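For scale: with 7,000 records the arithmetic above gives floor(7000/500) + 1 = 15 blocks, where the final block simply finds nothing when the count divides evenly. For readers outside WebDNA, here is a minimal Python sketch of the same throttled batch-replace idea under simplifying assumptions: it runs the batches sequentially with a sleep rather than spawning them, it assumes the merge and destination tables share the same key field name and column layout (which Donovan's snippet does not require), and the hot_merge helper, batch size, and file paths are illustrative, not part of Donovan's system.

----start sketch--------------------------------------------------------
import csv
import time

BATCH_SIZE = 500   # mirrors the 500-record blocks above
PAUSE_SECS = 10    # mirrors tNumSecs; tune to the size of the merge

def load_table(path):
    """Read a tab-delimited table (one header row) into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f, delimiter="\t"))

def hot_merge(merge_path, dest_path, key):
    """Replace matching records in the destination table with records
    from the merge table, one batch at a time, pausing between batches
    so the work is spread out instead of spiking the processor."""
    merge_rows = load_table(merge_path)
    dest_rows = load_table(dest_path)
    position = {row[key]: i for i, row in enumerate(dest_rows)}  # key -> row index

    for start in range(0, len(merge_rows), BATCH_SIZE):
        for row in merge_rows[start:start + BATCH_SIZE]:
            if row[key] in position:          # like [replace db=...&eq...datarq=...]
                dest_rows[position[row[key]]] = row
        time.sleep(PAUSE_SECS)                # stand-in for the [waitforfile] timeout

    with open(dest_path, "w", newline="") as f:  # write the merged table back
        writer = csv.DictWriter(f, fieldnames=dest_rows[0].keys(), delimiter="\t")
        writer.writeheader()
        writer.writerows(dest_rows)

# Hypothetical call, reusing the placeholder names from the snippet above:
hot_merge("hotfolder/merge.db", "your/db/destination.db", "DEST_KEY_FIELD")
----end sketch----------------------------------------------------------

The pause serves the same purpose as the [waitforfile] trick: trading total wall-clock time for a flat load curve.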

    
Associated Messages, from the most recent to the oldest:

  1. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  2. Re: [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
  3. Re: [WebDNA] Limits? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  4. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  5. Re: [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
  6. Re: [WebDNA] Limits? -and- Hot Merge ("Psi Prime, Matthew A Perosi " 2009)
  7. Re: [WebDNA] Limits? -and- Hot Merge (christophe.billiottet@webdna.us 2009)
  8. [WebDNA] Limits? -and- Hot Merge (Donovan Brooke 2009)
