Re: Search Engine questions ...

This WebDNA talk-list message is from 2002. It keeps the original formatting.
numero = 44801
interpreted = N
Associated Messages, from the most recent to the oldest:

    
  1. Re: Search Engine questions ... (Pedro Rivera 2002)
  2. Re: Search Engine questions ... (Wendell Kozak 2002)
  3. Re: Search Engine questions ... (dale's stuff 2002)
  4. Re: Search Engine questions ... (Gary Krockover 2002)
  5. Re: Search Engine questions ... (Gary Krockover 2002)
  6. Re: Search Engine questions ... (Kenneth Grome 2002)
  7. Re: Search Engine questions ... (Kenneth Grome 2002)
  8. Re: Search Engine questions ... (Kenneth Grome 2002)
  9. Re: Search Engine questions ... (Glenn Busbin 2002)
  10. Re: Search Engine questions ... (Kenneth Grome 2002)
  11. Re: Search Engine questions ... (Glenn Busbin 2002)
  12. Re: Search Engine questions ... (Kenneth Grome 2002)
  13. Re: Search Engine questions ... (Brian Fries 2002)
  14. Re: Search Engine questions ... (Glenn Busbin 2002)
  15. Re: Search Engine questions ... (Andrew Simpson 2002)
  16. Re: Search Engine questions ... (Glenn Busbin 2002)
  17. Re: Search Engine questions ... (Kenneth Grome 2002)
  18. Re: Search Engine questions ... (Glenn Busbin 2002)
  19. Re: Search Engine questions ... (Kenneth Grome 2002)
  20. Re: Search Engine questions ... (Glenn Busbin 2002)
  21. Re: Search Engine questions ... (dale's stuff 2002)
  22. Re: Search Engine questions ... (Glenn Busbin 2002)
  23. Re: Search Engine questions ... (John Peacock 2002)
  24. Re: Search Engine questions ... (Glenn Busbin 2002)
  25. Re: Search Engine questions ... (Alain Russell 2002)
  26. Re: Search Engine questions ... (Glenn Busbin 2002)
  27. Re: Search Engine questions ... (Dan Strong 2002)
  28. Re: Search Engine questions ... (Oleg Kremiansky 2002)
  29. Re: Search Engine questions ... (Glenn Busbin 2002)
  30. Re: Search Engine questions ... (Alain Russell 2002)
  31. Re: Search Engine questions ... (Kenneth Grome 2002)
  32. Re: Search Engine questions ... (Glenn Busbin 2002)
  33. Re: Search Engine questions ... (Glenn Busbin 2002)
  34. Re: Search Engine questions ... (Glenn Busbin 2002)
  35. Re: Search Engine questions ... (Glenn Busbin 2002)
  36. Re: Search Engine questions ... (Clayton Randall 2002)
  37. Re: Search Engine questions ... (Kenneth Grome 2002)
  38. Re: Search Engine questions ... (Glenn Busbin 2002)
  39. Re: Search Engine questions ... (Donovan Brooke 2002)
  40. Re: Search Engine questions ... (Alain Russell 2002)
  41. Re: Search Engine questions ... (Donovan Brooke 2002)
  42. Re: Search Engine questions ... (Alain Russell 2002)
  43. Search Engine questions ... (Kenneth Grome 2002)
Glenn Busbin wrote:

> I hope we're not talking about two different things here. Having a static link that contains ?Cart=[cart] is one thing. That can be followed by the major engines these days. But having a link that is created on the fly only when the page is hit is another animal altogether.

No, they are not. The spider has no idea whether the link is generated dynamically or is static, as long as the URL conforms to what the search engine expects. Some will follow pages with parameters, some will not.

> Google and FAST follow static links containing a ? and carry any variables, including cart numbers, with them. They haven't read the dynamically generated text or links on my pages, though.

As I said, there is no way for the remote browser (whether it is a user or a spider) to know that the page is dynamically generated. It is far more likely that the page is not suitable for spidering because it lacks the MIME headers that mark the page as cacheable. If you include appropriate MIME headers (e.g. Date, Last-Modified, and Expires), the spiders will [probably] walk the site.

If you use the MIMEHeaders.inc file I have posted below, your pages will be cacheable and will be likely to be spidered. There are two files listed below; I recommend you store them together in the Globals/Includes directory. Beware of odd linebreaks due to stupid e-mail clients. If I could attach the files to the message, it wouldn't be a problem, but SMSI's listserv software is set to exclude attachments.

----------------------------------------------------------------------------
[!]
------- MIMEHeaders.inc -------
[/!][if [expires]=[raw][expires][/raw]][!]
---- Check for input variable
[/!][then][!]
[/!] [math show=f]expires=0[/math][/then][!]
[/!][/if][!]
[/!][SETMIMEHEADER name=Date&value=[include file=^Includes/GMTdate.inc]]
[SETMIMEHEADER name=Last-modified&value=[include file=^Includes/GMTdate.inc]]
[SETMIMEHEADER name=Expires&value=[include file=^Includes/GMTdate.inc&offset=[expires]]]
----------------------------------------------------------------------------
[!]
------- GMTdate.inc -------
Takes optional parameter offset in minutes and
returns current date/time in GMT long format
Replace the {5:00:00} with your non-DST offset from GMT
[/!][math time&show=f]GMT={[date %X]}+{5:00:00}[showif [date format=%Z]^Daylight]-{1:00:00}[/showif][/math][!]
[/!][math time&show=f]NOW={[date %X]}[/math][!]
[/!][if [offset]=[raw][offset][/raw]][!]
[/!][then][!]
[/!] [math show=f]offsettime=0[/math][/then][!]
[/!][else][!]
[/!] [math show=f]offsettime=[offset]*60[/math][/else][!]
[/!][/if][!]
--- Create the GMTdate variable
[/!][math time&show=f]GMTdate=[GMT]+[offsettime][/math][!]
--- Create the MIMEexpires string
[/!][text show=t]MIMEexpires=[!]
[/!][if ([GMT]>[NOW]) and ([GMTdate]>[GMT]) and ([GMTdate]<86400)][!]
---- all in the same day ----
[/!][then][!]
[/!][date format=%a, %d %b %Y][!]
[/!][/then][!]
---- at least one day later ----
[/!][else][!]
[/!][format days_to_date %a, %d %b %Y][math]{[date]}+{00/[!]
[/!][showif [GMTexpires]>86400][math]floor([GMTexpires]/86400)[/math][/showif][!]
[/!][showif [GMTexpires]<86401]01[/showif][!]
[/!]/0000}[/math][/format][!]
[/!][/else][!]
[/!][/if][!]
[/!] [Format Seconds_To_Time][GMTdate][/Format][!]
[/!] GMT[/text]
----------------------------------------------------------------------------

HTH

John

p.s. I don't know where people got the idea I knew everything there is to know about spiders. ;~)

--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747

-------------------------------------------------------------
This message is sent to you because you are subscribed to the mailing list .
To unsubscribe, E-mail to:
To switch to the DIGEST mode, E-mail to
Web Archive of this list is at: http://search.smithmicro.com/

John Peacock
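
For readers who want to see what the two includes in the message are meant to emit, here is a minimal sketch in Python (not WebDNA, and not the author's code). It assumes the goal is three header values in the standard HTTP date format, with Expires pushed forward by an optional number of minutes that defaults to 0, as the `expires` and `offset` parameters do in the includes. The function name `cache_headers` and its `now` argument are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(expires_minutes=0, now=None):
    """Return Date, Last-Modified and Expires values as HTTP date strings.

    expires_minutes stands in for the optional expires/offset parameter
    of the includes (minutes, default 0). `now` is a hypothetical hook so
    the result can be checked against a fixed, timezone-aware datetime.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Work in UTC so no manual GMT or daylight-saving correction is needed.
    now = now.astimezone(timezone.utc)
    stamp = format_datetime(now, usegmt=True)
    expires = format_datetime(now + timedelta(minutes=expires_minutes), usegmt=True)
    return {"Date": stamp, "Last-Modified": stamp, "Expires": expires}


if __name__ == "__main__":
    # A page served late in the evening that expires one hour later,
    # so the Expires value rolls over to the next day.
    fixed = datetime(2002, 1, 7, 23, 30, 0, tzinfo=timezone.utc)
    for name, value in cache_headers(60, fixed).items():
        print(f"{name}: {value}")
```

Because the arithmetic is done on a UTC datetime, the rollover past midnight that GMTdate.inc handles with a separate branch falls out of the addition itself.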
