[TipJar] Common Punk: replace my text


The SOA server I am currently working with has a nasty quirk in its services that I haven't yet figured out how to fix: it fails on requests that contain an XML comment. We use SoapUI to trigger requests, and the quirk forces most of us to strip the comments that the tool generates automatically. The quirk does, however, give me a good segue into this IT tipjar: how to leverage pattern matching to batch-remove comments. It should also serve as an introduction to other pattern-matching applications when dealing with text/ASCII content.

A SOAP request generated from a template will normally contain XML comments marking optional items, such as <!--Optional:--> sitting on its own line just before the optional element.

Most programmer’s text editors have an option to use regular expressions in file search and replace. As can be seen above, XML comments follow a pattern: they start with “<!--” and end with “-->” (that is two hyphens each time, not a single long dash). In these examples each comment sits on a single line, which makes it easier to replace them.

Now to the useful stuff. Here are a few ways to quickly remove those comments:

In your programmer’s text editor (e.g. PSPad, Geany, etc.):

  • Open the search-and-replace dialog.
  • Tick the option for regular expressions.
  • Enter the expression <!--.*--> in the search field and leave the replacement field empty. (The dot and asterisk in the middle are important!)
  • Press OK, or whatever button makes it go, and…

  • TADA!

Another beauty of regular expressions is that a lot of utilities support them. If you have access to a Unix/Linux shell, then chances are you have access to the sed (stream editor) and grep (global/regular expression/print) utilities.
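Before changing anything, grep can show which lines the <!--.*--> pattern would hit. This is only a minimal sketch: request.xml and the element names inside it are hypothetical stand-ins for a SoapUI-generated request.

```shell
# Hypothetical sample request; the element names are made up for illustration.
cat > request.xml <<'EOF'
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
   <soapenv:Body>
      <getCustomer>
         <!--Optional:-->
         <name>Juan</name>
         <!--Optional:-->
         <city>Manila</city>
      </getCustomer>
   </soapenv:Body>
</soapenv:Envelope>
EOF

# -n prefixes every hit with its line number, so you can see exactly
# which lines a search for the comment pattern would select.
matches=$(grep -n '<!--.*-->' request.xml)
printf '%s\n' "$matches"
```

With this sample, only lines 4 and 6 are listed, since those are the two comment lines.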

A quick cat command to display the contents of the original request:
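As a rough sketch of that step and the sed call that follows it, assuming a hypothetical request.xml (the file names and element names here are invented for illustration, and one comment is deliberately placed inline):

```shell
# Hypothetical sample request with one standalone and one inline comment.
cat > request.xml <<'EOF'
<getCustomer>
   <!--Optional:-->
   <name>Juan</name>
   <city>Manila</city> <!--Optional:-->
</getCustomer>
EOF

# Display the original request, comments included.
cat request.xml

# Replace every "<!-- ... -->" run with nothing and write the result
# to a new file, leaving the original untouched.
sed 's/<!--.*-->//g' request.xml > cleaned.xml

# Display the cleaned request.
cat cleaned.xml
```

Note that a line that held only a comment is left behind as whitespace; if you would rather drop such lines entirely, sed '/<!--.*-->/d' deletes every matching line instead, at the cost of also deleting any real content sharing that line.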

The sed option is nice as it works even if the comments are inline with other text. Be wary, though: if there are multiple --> on the same line, the <!--.*--> pattern will greedily match up to the last one and take everything in between with it. If you are pretty sure the string to be searched for is unique, then you can even get away with a partial match, such as only the opening <!--.
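Both points can be illustrated with a small sketch. The input strings below are made up for the purpose, and using grep -v for the partial match is one possible way to do it, not necessarily the only one:

```shell
# Hypothetical one-line input with two comments around real content.
line='<a>1</a><!--first--><b>2</b><!--second--><c>3</c>'

# Greedy match: .* runs from the first "<!--" to the LAST "-->",
# so the <b>2</b> element in between is removed as well.
greedy=$(printf '%s\n' "$line" | sed 's/<!--.*-->//g')
printf '%s\n' "$greedy"    # prints <a>1</a><c>3</c>

# Partial match: when comments sit on their own lines, matching just the
# opening "<!--" is enough for grep -v to drop those lines entirely.
filtered=$(printf '%s\n' '<name>Juan</name>' '   <!--Optional:-->' '<city>Manila</city>' | grep -v '<!--')
printf '%s\n' "$filtered"
```

The first result shows why the greedy pattern is only safe with one comment per line; the second keeps the name and city lines and discards the comment line.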

Those limited to a very restricted Windows operating system need not worry. Microsoft ships the findstr utility, which offers limited support for pattern matching; its /v switch prints only the lines that do not match, much like grep -v.

Microsoft has since introduced PowerShell, which has more robust support for regular expressions and offers counterparts to many Unix utilities. A grep-like filter can be built by piping the lines of a file through a filter block, for instance with Where-Object. Take note of the ! before the $_.Contains call: it serves as a negation, so only non-matching lines are printed.

This is just a basic introduction to text replacement via regular expressions. Regexps can do a lot more for those who have the interest and time to learn some of their quirks.
