Re: quick GREP question
This WebDNA talk-list message is from 2001
It keeps the original formatting.
Steven Jarvis wrote:
>
> I know jack about grep, though I'm planning to learn it. I *think* it's what
> I want to use in this situation, but I'm open to any other options, too.

Get Mastering Regular Expressions from O'Reilly (ISBN 1-56592-257-2). Ignore all of the discussion of Perl extensions to regex engines (it will just make you jealous ;~) since the WebCat grep is pretty basic.

>
> I have to format some stories with WebCat and export them to a text file,
> and I need to cut some HTML tags and their contents out of stories if they
> are present.
>

Can I call your attention to the following context, which is designed specifically for your problem:

http://betadoc.smithmicro.com/RemoveHTMLContext.html

In general, you cannot use [grep] to always strip out markup tags, due to line breaks and nesting. You really need to have a simple state machine to correctly parse nested HTML tags; if you can make certain assumptions about your tags, you can deal with it with grep, but you need to be very careful.

HTH

John

--
John Peacock
Director of Information Research and Technology
Rowman & Littlefield Publishing Group
4720 Boston Way
Lanham, MD 20706
301-459-3366 x.5010
fax 301-429-5747

-------------------------------------------------------------
This message is sent to you because you are subscribed to the mailing list.
Web Archive of this list is at: http://search.smithmicro.com/
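To illustrate the point about line breaks and nesting, here is a rough sketch of the kind of simple state machine John describes. It is written in Python rather than WebDNA, and it is not how [RemoveHTML] or the WebCat [grep] context works internally. The function name `remove_element` and the sample strings are made up for this example. It drops one named element together with its contents, counts nesting depth, and does not care whether a tag spans a line break or has a `>` inside a quoted attribute.

```python
import re


def remove_element(text, name):
    """Drop every <name>...</name> element and its contents (hypothetical helper).

    Nested elements of the same name are tracked with a depth counter.
    Other tags are kept unless they sit inside a removed element.
    """
    name = name.lower()
    out = []      # pieces of text we keep
    depth = 0     # how many <name> elements are currently open
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch != "<":
            # Plain text: keep it only when we are outside the element.
            if depth == 0:
                out.append(ch)
            i += 1
            continue
        # Inside a tag: scan to the closing '>', ignoring any '>' that
        # appears inside a quoted attribute value. Newlines are just
        # ordinary characters here.
        j = i + 1
        quote = None
        while j < n:
            c = text[j]
            if quote:
                if c == quote:
                    quote = None
            elif c in "\"'":
                quote = c
            elif c == ">":
                break
            j += 1
        if j >= n:
            # Unterminated tag: treat the rest as plain text.
            if depth == 0:
                out.append(text[i:])
            break
        tag = text[i:j + 1]
        body = tag[1:-1].strip()
        closing = body.startswith("/")
        self_closing = body.endswith("/")
        words = body.strip("/").split()
        tag_name = words[0].lower() if words else ""
        if tag_name == name:
            if closing:
                depth = max(depth - 1, 0)
            elif not self_closing:
                depth += 1
            # The matching tag itself is always dropped.
        elif depth == 0:
            out.append(tag)
        i = j + 1
    return "".join(out)


# A nested element whose inner tag is split across a line break.
story = 'a <font color="red">b <font\nsize=2>c</font> d</font> e'
cleaned = remove_element(story, "font")

# For comparison: a non-greedy pattern stops at the first closing tag.
naive = re.sub(r"<font.*?>.*?</font>", "", story, flags=re.DOTALL)
```

On this sample the state machine removes the whole outer element, while the non-greedy pattern leaves ` d</font>` behind because it cannot tell which closing tag pairs with which opening tag. If you can assume your tags never nest and never break across lines, a pattern like that can be enough, which is the "certain assumptions" case mentioned above.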