Re: Caching pages...again

This WebDNA talk-list message is from 2001.


It keeps the original formatting.
Associated Messages, from the most recent to the oldest:

    
  1. Re: Caching pages...again (Glenn Busbin 2001)
  2. Re: Caching pages...again (John Peacock 2001)
  3. Re: Caching pages...again (Christer Olsson 2001)
  4. Re: Caching pages...again (John Peacock 2001)
  5. Re: Caching pages...again (Bob Minor 2001)
  6. Re: Caching pages...again (Christer Olsson 2001)
  7. Re: Caching pages...again (Bob Minor 2001)
  8. Re: Caching pages...again (Christer Olsson 2001)
  9. Re: Caching pages...again (Donovan Brooke 2001)
  10. Re: Caching pages...again (Christer Olsson 2001)
  11. Re: Caching pages...again (Christer Olsson 2001)
  12. Re: Caching pages...again (Bob Minor 2001)
  13. Re: Caching pages...again (John Peacock 2001)
  14. Re: Caching pages...again (Christer Olsson 2001)
  15. Re: Caching pages...again (Donovan Brooke 2001)
  16. Re: Caching pages...again (Christer Olsson 2001)
  17. Re: Caching pages...again (Anup Setty 2001)
  18. Re: Caching pages...again (John Peacock 2001)
  19. Re: Caching pages...again (John Peacock 2001)
  20. Re: Caching pages...again (Christer Olsson 2001)
  21. Re: Caching pages...again (Glenn Busbin 2001)
  22. Re: Caching pages...again (Paul Uttermohlen 2001)
  23. Re: Caching pages...again (John Peacock 2001)
  24. Re: Caching pages...again (Christer Olsson 2001)
  25. Re: Caching pages...again (Paul Uttermohlen 2001)
  26. Re: Caching pages...again (John Peacock 2001)
  27. Re: Caching pages...again (Christer Olsson 2001)
  28. Caching pages...again (Glenn Busbin 2001)
Glenn Busbin wrote:
>
> > > This has nothing to do with caching.
>
> Google says it sure has to do with caching on proxy servers.

ROBOTS != caching servers

Most HTTP caching servers operate in transparent mode; by this I mean that the client requests a page through the server and the server checks whether it has that page cached. If you are unlucky (or gullible) enough to use AOL, all of your accesses are through a caching server.

Robots or spiders, on the other hand, do not (as a rule) keep a cache of the pages they visit (think of the storage requirements). They only apply whatever indexing algorithm they use to the page and store the indexed information for later searching.

Google, unlike most search engines, does keep a copy of the original page available for viewing _when the original page is no longer available_. Here is a representative search from Google:

    University Press of America, Inc.: Catalog/Advanced Search
    Click here for details on Web Discount. University Press of America, Inc. Catalog / Advanced Search. Click Here for Search Instructions ...
    www.univpress.com/Catalog/ - 18k - Cached - Similar pages

------

If you click on the first line, you get the actual site; if you click on the hyperlink (unlined) labeled Cached, you get a copy of the page as it was when the spider walked the site.

This is not a feature of caching proxy servers at all; this is a feature of Google specifically. QED

> > > Use MIME headers correctly if you don't want any caching done.
> >
> > Will all servers obey MIME headers for cache rules?

The combination of the two headers I described in the other thread is the only way _I_ know of to consistently defeat (or correctly apply) caching proxy services. If you do not use both, you are letting yourself in for whatever heuristic the caching server uses to determine whether the page should be fresh or cached.

John

-- 
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747

-------------------------------------------------------------
Web Archive of this list is at: http://search.smithmicro.com/
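The closing point of the message, that a page without explicit caching instructions is left to "whatever heuristic the caching server uses", can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the 10% fraction is only the figure RFC 2616 mentions as a typical setting, and real proxies are free to behave differently. The message does not say which two headers it recommends, so the sketch does not guess; it uses the standard `Expires` and `Last-Modified` headers purely to contrast an explicit lifetime with a guessed one.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

# Assumption: fraction of the document's age used as a guessed lifetime.
HEURISTIC_FRACTION = 0.1


def _parse_http_date(value: str) -> Optional[datetime]:
    """Parse an HTTP date such as 'Mon, 01 Jan 2001 00:00:00 GMT'."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # malformed dates are treated as absent


def freshness_lifetime(headers: Mapping[str, str],
                       fetched_at: datetime) -> Optional[timedelta]:
    """How long a hypothetical proxy would serve its stored copy.

    Returns None when the proxy has nothing to base a decision on.
    """
    if "Expires" in headers:
        # Explicit instruction from the origin server: no guessing needed.
        expires = _parse_http_date(headers["Expires"])
        if expires is None:
            return timedelta(0)  # unparseable Expires counts as already stale
        return max(expires - fetched_at, timedelta(0))

    if "Last-Modified" in headers:
        # No explicit lifetime: guess from how long the page has been unchanged.
        modified = _parse_http_date(headers["Last-Modified"])
        if modified is not None:
            return max((fetched_at - modified) * HEURISTIC_FRACTION, timedelta(0))

    return None


def is_fresh(headers: Mapping[str, str],
             fetched_at: datetime, now: datetime) -> bool:
    """True if the proxy would answer from its cache instead of refetching."""
    lifetime = freshness_lifetime(headers, fetched_at)
    return lifetime is not None and (now - fetched_at) < lifetime


if __name__ == "__main__":
    fetched = datetime(2001, 1, 11, tzinfo=timezone.utc)
    guessed = {"Last-Modified": "Mon, 01 Jan 2001 00:00:00 GMT"}
    explicit = {"Expires": "Thu, 11 Jan 2001 00:00:00 GMT"}
    print("guessed lifetime:", freshness_lifetime(guessed, fetched))
    print("explicit lifetime:", freshness_lifetime(explicit, fetched))
```

With only `Last-Modified` present, the stored copy stays in service for a period the page author never chose; with an explicit expiry at or before the fetch time, the lifetime is zero and the proxy has to go back to the origin server.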

