Having to maintain old programs written in VB6, I find myself having this issue.
I need to find an efficient way to search a string for all characters OUTSIDE the Windows-1252 set and replace them with "_". I can do this in C#
So far I have done this by creating a string with all 1252 characters, is there a faster way?
I may have to do this for a few million records in a text file
string 1252chars = ""!\""#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶•¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖרÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿŸžœ›š™˜—–•""’’ŽŽ‹Š‰vˆ‡†…„ƒ‚€ ""
//Replace all characters not in the string above...
The
Encoding
class can achieve this, most likely very efficiently. When converting to and from the encoding, a replacement character can be specified.Output