September, 2007

  • September 29, 2007

    Most symbol characters, like + = © ™ ← → ☺ ♣ ♠, need to be stripped out of web page text before it is processed by a search engine or text analysis tool. International text contains thousands of symbol characters, and some should be removed in one context but not in another. This tip shows how.

  • September 15, 2007

    When processing text for a search engine or analysis tool, code needs to strip out punctuation, formatting, spacing, and control characters to reveal indexable text. International text contains hundreds of these characters, and some should be removed in one context but not in another. This tip shows how.

  • September 1, 2007

    The HTML tags on a web page must be stripped away to get clean text for a PHP search engine, keyword extractor, or other page analysis tool. PHP's standard strip_tags() function does part of the job, but you need to strip out styles, scripts, embedded objects, and other unwanted page code first. This tip shows how.
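
The three tips above describe stages of one cleanup pipeline, which can be sketched roughly as follows. This is a minimal illustration and not the code from the tips themselves: the function name extract_indexable_text() is hypothetical, the list of elements removed before strip_tags() is an assumption, and the sketch removes every symbol, punctuation, and control character, whereas the tips note that some of these should be kept in certain contexts. It also assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper for illustration only; assumes valid UTF-8 input.
function extract_indexable_text(string $html): string
{
    // Remove elements whose contents are code rather than readable text.
    // strip_tags() alone would drop the tags but keep what is inside them.
    $text = preg_replace(
        '#<(script|style|object|applet|noscript)\b[^>]*>.*?</\1\s*>#is',
        ' ',
        $html
    );

    // Put a space before each tag so adjacent blocks do not run together,
    // then remove the remaining tags.
    $text = strip_tags(str_replace('<', ' <', $text));

    // Turn entities such as &copy; into the characters they stand for.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Replace Unicode symbols (\p{S}), punctuation (\p{P}), and control,
    // format, and other non-printing characters (\p{C}) with spaces.
    $text = preg_replace('/[\p{S}\p{P}\p{C}]+/u', ' ', $text);

    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/u', ' ', $text));
}

$page = '<p>Caf&eacute; &copy; 2007</p><script>var x = 1;</script><p>Tom + Jerry!</p>';
echo extract_indexable_text($page), "\n"; // prints "Café 2007 Tom Jerry"
```

The order matters: script and style blocks go first because strip_tags() would otherwise leave their contents in the text, and entities are decoded before the symbol pass so that characters such as © are caught by it.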

Nadeau software consulting