Numbers in prices, quantities, dates, times, phone numbers, and addresses are often of no interest when processing a web page for a PHP search engine or keyword analysis tool. International text contains around 900 different digits, currency symbols, and unit-of-measure marks that need to be removed. This tip shows how to remove numbers and number-related characters.
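As a rough illustration of the idea, the sketch below removes numbers and currency symbols with PCRE Unicode properties. The function name `strip_numbers` is hypothetical, the input is assumed to be valid UTF-8, and unit-of-measure marks are not handled here, so treat it as a starting point rather than a complete solution.

```php
<?php
// Hypothetical helper: remove digits, number-like runs, and currency symbols.
// Assumes $text is valid UTF-8 (preg_replace returns null otherwise).
function strip_numbers(string $text): string
{
    // \p{Sc} = any currency symbol
    // \p{N}+ = a run of Unicode numbers (digits, fractions, numerals)
    // The optional group swallows separators between digit runs,
    // such as 19.99, 12/31, 10:30, or 555-1234.
    $pattern = '/\p{Sc}|\p{N}+(?:[.,:\/\-]\p{N}+)*/u';
    $out = preg_replace($pattern, ' ', $text);
    if ($out === null) {
        return $text; // leave the text untouched on a regex error
    }
    // Collapse the whitespace left behind by the removals.
    return trim(preg_replace('/\s+/', ' ', $out));
}

echo strip_numbers('Only $19.99 for 3 items'), "\n"; // Only for items
echo strip_numbers('Call 555-1234 by 12/31'), "\n";  // Call by
```

Replacing each match with a space, rather than deleting it, keeps neighboring words from being joined together.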
When processing text for a search engine or analysis tool, code needs to strip out punctuation, formatting, spacing, and control characters to reveal the indexable text. International text contains hundreds of these characters, and some should be removed in one context but kept in another. This tip shows how.
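One possible way to sketch this in PHP is with the Unicode property classes for punctuation (`\p{P}`), separators (`\p{Z}`), and control and format characters (`\p{C}`). The function name `strip_punctuation` and its `$keepInWord` flag are hypothetical; the flag illustrates one context-dependent choice, keeping apostrophes and hyphens that sit between two letters. Valid UTF-8 input is assumed.

```php
<?php
// Hypothetical helper: replace punctuation, separator, and control
// characters with spaces. Assumes $text is valid UTF-8.
function strip_punctuation(string $text, bool $keepInWord = true): string
{
    $class = '[\p{P}\p{Z}\p{C}]';
    // When $keepInWord is true, skip an apostrophe or hyphen that has a
    // letter on both sides (as in "It's" or "well-known").
    $guard = $keepInWord ? '(?!(?<=\p{L})[\'’\-](?=\p{L}))' : '';
    $out = preg_replace('/' . $guard . $class . '/u', ' ', $text);
    if ($out === null) {
        return $text; // leave the text untouched on a regex error
    }
    // Every removed character became a space, so collapse the runs.
    return trim(preg_replace('/ {2,}/', ' ', $out));
}

echo strip_punctuation("Hello, world! It's well-known.\tYes"), "\n";
// Hello world It's well-known Yes
echo strip_punctuation("It's well-known.", false), "\n";
// It s well known
```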
Most symbol characters, such as + = © ™ ← → ☺ ♣ ♠, need to be stripped out of web page text before it is processed by a search engine or text analysis tool. International text contains thousands of symbol characters, and some should be removed in one context but kept in another. This tip shows how.
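A minimal sketch of symbol removal, assuming valid UTF-8 input, can rely on the Unicode symbol property `\p{S}`, which covers math, currency, modifier, and other symbols. The function name `strip_symbols` and its `$keep` parameter are hypothetical; `$keep` lists symbols to preserve in a given context, such as the plus signs in "C++".

```php
<?php
// Hypothetical helper: replace Unicode symbol characters with spaces,
// except for any characters listed in $keep. Assumes valid UTF-8.
function strip_symbols(string $text, string $keep = ''): string
{
    if ($keep === '') {
        $pattern = '/\p{S}/u';
    } else {
        // A negative lookahead excludes the symbols we want to keep.
        $pattern = '/(?![' . preg_quote($keep, '/') . '])\p{S}/u';
    }
    $out = preg_replace($pattern, ' ', $text);
    if ($out === null) {
        return $text; // leave the text untouched on a regex error
    }
    // Collapse the whitespace left behind by the removals.
    return trim(preg_replace('/\s+/', ' ', $out));
}

echo strip_symbols('Copyright © 2024 Acme™ → 1 + 1 = 2'), "\n";
// Copyright 2024 Acme 1 1 2
echo strip_symbols('C++ © 2024', '+'), "\n";
// C++ 2024
```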