Re: [WebDNA] Server load

This WebDNA talk-list message is from 2008. It keeps the original formatting.
numero = 100499
interpreted = N
texte = Frank, once again, thanks for the challenging questions. My thoughts are mixed in below...

On Aug 7, 2008, at 8:56 AM, Frank Nordberg wrote:

> I've come to a point where I have to recode most of the WebDNA for
> my sites. Almost all my code is written to be compact - that is, to
> require as little memory and as few databases as possible. That made
> a lot of sense when I originally opened the sites - back in the old
> days when disc space and RAM were expensive and WebDNA's cache could
> only hold a limited number of databases.

There is nothing wrong with continuing this. I see many sites that need multiple servers just to serve medium-sized websites because of sloppy coding.

> Here are a few of the issues I've run into the last few days:
>
> 1. When is one big database better than several smaller ones?
> Example: I currently have one database of recommended CDs shared by
> several sites (all of them music related). Would it be better to
> have one smaller database for each site? It would reduce file size
> considerably, but the *total* memory requirement would increase since
> much of the data from the old big database would be duplicated in
> several of the small ones.
>
> Obviously what is the best solution depends on how much of the
> content has to be duplicated, but is there a way to guesstimate at
> which point one big database becomes better than a bunch of small
> ones?

This question is more complicated than it seems. First, is there actually a problem with how it's being done now? Yes, it takes a lot of memory, but: Are searches fast or slow? Have you had data corruption issues? Is the database outgrowing the maximum RAM that can be installed in the server?

Given that your database is HUGE, I suggest looking at running the database on a SQL server (MySQL, for example), because database size really doesn't matter on a modern SQL server - hits per second matters. This won't be a small project, but it gives you a lot of options moving forward.

One other idea, and this one is a lot more abstract, is to shard the database. This only makes sense if you want to break the database down because access speeds (searches) are too slow. You start by looking at whether your data can be broken down by category into different databases (country, top 40, classical). When you search, you can either offer a pull-down menu by the search box that picks the database, or search the databases sequentially until you get results, so that most of the time you never have to scan the entire record set.

True sharding is typically done in SQL environments where you have the logging database on one server, the site's data (less sessions and carts) on another server, and a server just for transient data. Well, you get the idea. It all comes down to breaking up the data to spread the access load across several servers - not exactly what you want to do here, but nothing wrong with planning for the future.

> 2. When is [SEARCH] better than [LOOKUP]?
> Example: http://www.musicaviva.com/encyclopedia/display.html?phrase=waldzither
> This page gets data from two different databases via [LOOKUP]
> commands, one big (20+ MB) base for the main body and a much smaller
> one (1-2 MB) for the page title. It wouldn't be too difficult to add
> a page title field to the big database and get all the data from
> there with a [SEARCH] context, but is that a good idea? (I need the
> small database for search results anyway, but I don't *have* to use
> it on this display page.)

[SEARCH] is better than a [LOOKUP] when you need multiple records or multiple fields; otherwise it isn't. For a lookup of one field from one record, [LOOKUP] is always better because it stops scanning the record set at the first matching hit. A search will continue to look at every record even if you know there will be only one match. On a small database, who cares; on a database of 250,000 records getting a few searches a minute, it makes a world of difference.
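To make the difference concrete, here is a rough sketch. The database and field names (titles.db, encyclopedia.db, title, bodytext) are made up for illustration - only the [phrase] URL parameter comes from your example page - so treat it as a pattern, not your actual schema:

  [!] One field from one record: [LOOKUP] stops at the first matching record [/!]
  [text]pagetitle=[lookup db=titles.db&lookInField=phrase&value=[phrase]&returnField=title&notFound=Untitled][/text]

  [!] Several fields (or several records): [SEARCH] walks the whole file and hands matches to [FOUNDITEMS] [/!]
  [search db=encyclopedia.db&eqphrasedatarq=[phrase]&max=1]
    [founditems]
      <h1>[title]</h1>
      [bodytext]
    [/founditems]
    [showif [numfound]=0]<p>No entry found for [phrase].</p>[/showif]
  [/search]

Even with max=1, the search still has to evaluate every record in the 20+ MB file, which is exactly the cost difference described above.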
> 3. Is it always better to set a variable than to recalculate a value
> each time it's needed?
> Example: In a search results page like this:
> http://www.musicaviva.com/encyclopedia/list.html?alpha=f
> I need the index number for the last item three times. It's before
> the [FOUNDITEMS] context, so I have to calculate it by adding the
> number of entries listed to the startat value. Is it better to store
> this value as a variable or to calculate it three times?
>
> The "next-previous" navigation links on the same page raise a
> similar question. The code for it is rather long, includes quite a
> lot of calculations and appears both at the top and the bottom of
> the page. Should I store it as a variable or include the code twice?

YES. One technique I use when programming is to imagine that every time I look at a record in a database, it costs a penny. Pretty cheap, right? Now imagine searching a database with 50,000 records. A lookup can stop at, say, record 12,873 (if appropriate); a true search will look at every record (50,000 evaluations), and three searches on the same page for the same thing = 150,000 row evaluations. Anyway, back to the point: yes, you should always store the result in a text variable, or a table, if you are going to need it more than once while rendering a page.

> 4. Is it a good idea to use variables for static text that occurs
> several times on the page?
>
> Back to the display page:
> Example: http://www.musicaviva.com/encyclopedia/display.html?phrase=waldzither
> The "back to top" links could easily be set as a variable. That
> would mean more calculation work for the server but also less data
> to handle. Will this increase or decrease the server load? (Unlike
> the two examples above, there is no WebDNA in the text snippet
> itself, just static html.)

Without seeing what's going on on the back end, I'm not sure. It is very simple for the server to drop a variable into a web page, so if the amount of work to build the "back to top" link is even remotely complicated, then yes, you should store it as a variable and just reuse the variable.
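For what it's worth, here is a sketch of both ideas. The names ([perpage], [lastitem], [backtotop]) are my own, and I'm assuming [startat] arrives from the URL as it does on your list page; the point is just that the arithmetic runs once and the static snippet is defined once:

  [!] Do the arithmetic once, store it, reuse it [/!]
  [text]perpage=20[/text]
  [text]lastitem=[math][startat]+[perpage]-1[/math][/text]
  Showing items [startat] to [lastitem]
  [!] ...later on the same page... [/!]
  <a href="list.html?alpha=f&startat=[math][lastitem]+1[/math]">Next</a>

  [!] Static snippet defined once, dropped in wherever it's needed [/!]
  [text]backtotop=<p><a href="#top">Back to top</a></p>[/text]
  [backtotop]

Every later [lastitem] or [backtotop] is a plain variable substitution, which is about as cheap an operation as WebDNA has.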
> 5. Does anybody know of a good file serving service?
> I know this is a bit off topic, but moving all the non-html/WebDNA
> files to another server would of course help. Not much, but every
> little bit counts. Can anybody recommend such a pure file server
> service?
> What I need is:
> + A fast and - even more important - reliable server
> + FTP access
> + Subdomain names pointing to the file server
> + File types: .gif, .jpg, .pdf, .mid, .kar, .mp3, .wav and .abc
> (.kar is "karaoke," just MIDI files with embedded lyrics; .abc is a
> text-based music notation file format)
> + Automatic redirects to the main site for anybody trying to browse the
> file directories manually

Look at Amazon's S3 service. You can use Jungle Disk to get the files there ( http://www.jungledisk.com/ ). A lot of sites now accept files via web upload, verify and resize the file, then through scripting send the files to S3 and rewrite their links to point to the outsourced hosting. Looks very inexpensive (your mileage may vary).

"Brian B. Burton"

Associated Messages, from the most recent to the oldest:

    
  1. Re: [WebDNA] Server load (Frank Nordberg 2008)
  2. Re: [WebDNA] Server load (Paul Willis 2008)
  3. Re: [WebDNA] Server load ("Brian B. Burton" 2008)
  4. Re: [WebDNA] Server load (Frank Nordberg 2008)
  5. Re: [WebDNA] Server load (Terry Wilson 2008)
  6. Re: [WebDNA] Server load (Patrick Junkroski 2008)
  7. Re: [WebDNA] Server load (Frank Nordberg 2008)
  8. Re: [WebDNA] Server load ("Brian B. Burton" 2008)
  9. [WebDNA] Server load (Frank Nordberg 2008)



Related Readings:

List Address Changed! (1998)
Use of Back and Reload Buttons on ShoppingCart page? (1997)
process SSI (1998)
ODBC Performance? (2001)
Upgrade to WebCat2 from Commerce Lite (1997)
more on my bbs (1997)
WebCatalog can't find database (1997)
WebCat2b13 Command Reference Doc error (1997)
WC 2.0 frames feature (1997)
Another question about credit cards (1997)
b12 cannot limit records returned and more. (1997)
international time (1997)
Error, 101 a DNS problem ? (1997)
Possible Bug in 2.0b15.acgi (1997)
Hiding HTML in an [include] file... (2004)
Authenticate (1999)
Replace Statement (1997)
Virtual Postcard almost complete... (1998)
Here we go again... (2006)
Add to Cart & List of Products (1997)