Re: quick GREP question
This WebDNA talk-list message is from 2001
It keeps the original formatting.
Steven Jarvis wrote:
>
> I know jack about grep, though I'm planning to learn it. I *think* it's what
> I want to use in this situation, but I'm open to any other options, too.

Get Mastering Regular Expressions from O'Reilly (ISBN 1-56592-257-2).
Ignore all of the discussion of Perl extensions to regex engines (it will
just make you jealous ;~) since the WebCat grep is pretty basic.

>
> I have to format some stories with WebCat and export them to a text file,
> and I need to cut some HTML tags and their contents out of stories if they
> are present.
>

Can I call your attention to the following context which is designed
specifically for your problem:

http://betadoc.smithmicro.com/RemoveHTMLContext.html

In general, you cannot use [grep] to always strip out markup tags, due to
line breaks and nesting. You really need to have a simple state machine to
correctly parse nested HTML tags; if you can make certain assumptions about
your tags, you can deal with it with grep, but you need to be very careful.

HTH

John

--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747

-------------------------------------------------------------
This message is sent to you because you are subscribed to the mailing list .
To unsubscribe, E-mail to:
To switch to the DIGEST mode, E-mail to
Web Archive of this list is at: http://search.smithmicro.com/
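
To illustrate the state-machine point in the message above, here is a minimal sketch in Python. It is not WebDNA, and it is not a description of how the [RemoveHTML] context works. The function name `strip_tags` and its three states are made up for this example. It assumes every `<` in the input opens a tag, and it ignores comments, scripts, and character entities. It drops the tags themselves and keeps the text between them.

```python
def strip_tags(text):
    """Drop markup tags from text using a small three-state scanner.

    Hypothetical helper for illustration only. Assumes every '<' starts
    a tag; comments, <script> bodies, and entities are not handled.
    """
    TEXT, TAG, QUOTE = range(3)
    state = TEXT
    quote_char = ""
    out = []
    for ch in text:
        if state == TEXT:
            if ch == "<":
                state = TAG          # entering a tag: stop copying
            else:
                out.append(ch)       # ordinary text is kept
        elif state == TAG:
            if ch == ">":
                state = TEXT         # tag closed: resume copying
            elif ch in "\"'":
                state = QUOTE        # attribute value: '>' is literal here
                quote_char = ch
        else:  # QUOTE
            if ch == quote_char:
                state = TAG          # closing quote: back inside the tag
    return "".join(out)


if __name__ == "__main__":
    sample = 'a <b\nclass="x>y">bold</b> c'
    print(strip_tags(sample))
```

Because the scanner works character by character, a tag that spans a line break or carries a `>` inside a quoted attribute is still skipped as a whole, which is where a single pattern such as `<[^>]*>` tends to go wrong. Cutting an element together with its contents would need an extra depth counter on top of this.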