Friday, April 13, 2012

VIM Tip: Not Containing Pattern (3)

I have found that lookaround zero-width assertions are very powerful and useful. The more I use them, the more I like them. Think of this search strategy as matching a pattern together with an additional zero-width, or hidden, pattern. The hidden pattern has to be satisfied, but it is not part of the match, so it can filter out results in a way that matching a pattern alone cannot.

I have used this technique to resolve many find-and-replace problems. Normally, I search first to make sure the results meet my expectations, and then I use a substitute command to replace the matches with the content I want. Here are some examples. As you can see, the technique is very productive. It does take a lot of brain energy to think a pattern through and try it out, but that sharpens your brain. I really enjoy learning and using VIM.

Examples


Let me take the pseudo code I used in my previous blog as an example. A simple search is to find the character 's' where the next character is 't', with 't' as a zero-width look-ahead. The search command is:

/st\@=

This says that the first token is 's'. The search engine searches for that token, and when it is found, the engine stops at that position in the string. Then the engine checks the next token, 't'. The look-ahead tells the engine to test the second token against the text immediately after the first token, without consuming it. If 't' is there, the complete pattern is found and only the first token 's' is returned as the match; if it is not, the match fails at this position and the engine moves on to the next 's'.
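For readers who want to experiment outside VIM, the same look-ahead can be sketched with Python's re module, where VIM's st\@= is written s(?=t). This is only an analogy under that assumption, and the sample string is made up for illustration; it is not the pseudo code from the previous post.

```python
import re

# Hypothetical sample text, not the pseudo code from the previous post.
text = "string s = list.first;"

# s(?=t) is the Python spelling of Vim's st\@= : match 's' only when the
# next character is 't', without making 't' part of the match.
matches = [(m.start(), m.group()) for m in re.finditer(r"s(?=t)", text)]
print(matches)
```

Every reported match is the single character 's'; the standalone 's' in the middle of the string is skipped because it is followed by a space.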



The next simple example is a look-behind zero-width assertion:

/\(s\)\@<=t

The pattern to be matched is 't', and the look-behind token is the group 's'. At each candidate position, the look-behind tells the search engine to check the text immediately before that position. If 's' is found right behind the position and 't' follows, the match succeeds, and only 't' is returned as the result. The following snapshot shows three results of 'ring's:



More Complex Examples


The following text contains several foo...bar blocks:

foo
  test baz
  something for you
  gave me your beer
bar

foo
  test ba
  something for you
  gave me your beer
bar

foo
  test bae
  foo
  something for you
    gave me your beer
  bar
bar

Here is a command that searches for a foo...bar block containing 'baz':

/foo\(\_.\{-}baz\)\@=\_.\{-}bar

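Notice that the text within a foo...bar block may span multiple lines. Here \_. matches any character, including a line break, and \{-} is a non-greedy match, which means it matches 0 or more of the preceding atom, as few as possible. Note that the look-ahead is not limited to the current block: it succeeds if 'baz' appears anywhere after 'foo', so the command relies on the blocks being laid out as in the sample text, where the block containing 'baz' comes first. The above search can be described as searching for: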

'foo' as the start; next, as a zero-width pattern, zero or more characters (possibly spanning lines) up to 'baz'; then zero or more characters, again possibly spanning lines, until 'bar' is hit
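As a rough cross-check of that description, here is a sketch of the same search in Python's re module, run against the three sample blocks. The translation is my assumption: \_.\{-} is written [\s\S]*? and \@= is written (?=...).

```python
import re

# The three sample blocks from the post.
text = """foo
  test baz
  something for you
  gave me your beer
bar

foo
  test ba
  something for you
  gave me your beer
bar

foo
  test bae
  foo
  something for you
    gave me your beer
  bar
bar
"""

# 'foo', then a zero-width check that 'baz' follows somewhere, then as few
# characters as possible (line breaks included) up to the first 'bar'.
pattern = re.compile(r"foo(?=[\s\S]*?baz)[\s\S]*?bar")

found = [m.group() for m in pattern.finditer(text)]
print(len(found))
print(found[0])
```

Only the first block is reported, because it is the only 'foo' that has a 'baz' somewhere after it.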



You can verify the command by breaking it into two parts and searching for each part separately. The first part is:

/foo\(\_.\{-}baz\)\@=



The command to search for a foo...bar block not containing 'baz':

/foo\(\_.\{-}baz\)\@!\_.\{-}bar
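The negative look-ahead can be tried the same way in Python, where \@! is written (?!...). The sketch below uses condensed, made-up variants of the sample blocks. It also includes a bounded pattern that is my own assumption rather than anything from this post: it refuses to step over 'baz' or 'bar', so the check cannot leak into a later block.

```python
import re

# Condensed, hypothetical variants of the sample blocks.
baz_first = "foo\n  test baz\nbar\n\nfoo\n  test ba\nbar\n"
baz_last = "foo\n  test ba\nbar\n\nfoo\n  test baz\nbar\n"

# Python spelling of the Vim command: \@! becomes (?!...).
unbounded = re.compile(r"foo(?![\s\S]*?baz)[\s\S]*?bar")

# Assumed alternative, not from the post: step one character at a time and
# refuse to step over 'baz' or 'bar', so the check stays inside one block.
bounded = re.compile(r"foo(?:(?!baz|bar)[\s\S])*bar")

for name, text in (("baz_first", baz_first), ("baz_last", baz_last)):
    print(name, unbounded.findall(text), bounded.findall(text))
```

With the 'baz' block first, both patterns report the block without 'baz'. With the 'baz' block last, the unbounded pattern reports nothing, because every 'foo' has a 'baz' somewhere after it, while the bounded one still finds the block.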



The command to search for the innermost foo...bar block:

/foo\(\_.\{-}foo\_.\{-}bar\)\@!\_.\{-}bar
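To see which 'foo' this pattern selects, here is a sketch of a Python analog applied to the nested sample block only. As before, the translation of \_.\{-} to [\s\S]*? and of \@! to (?!...) is my assumption.

```python
import re

# The third (nested) sample block from the post.
text = """foo
  test bae
  foo
  something for you
    gave me your beer
  bar
bar
"""

# Reject any 'foo' that is followed by another 'foo' ... 'bar'; the first
# 'foo' that survives is matched up to the nearest 'bar'.
pattern = re.compile(r"foo(?![\s\S]*?foo[\s\S]*?bar)[\s\S]*?bar")

found = pattern.findall(text)
print(found)
```

The outer 'foo' is rejected because another foo...bar follows it, so the only match is the inner block.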



Sometimes I want to add line break tags to the end of each line, but not to empty lines. This command adds <br/><br/> to the end of every non-empty line:

:%s:.\@<=$:<br/><br/>:g
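The effect of the look-behind before the end of line can be sketched in Python as well. This assumes (?<=.)$ with re.MULTILINE as the counterpart of VIM's .\@<=$, and the input text is made up.

```python
import re

# Hypothetical input: two non-empty lines separated by an empty line.
text = "first line\n\nsecond line\n"

# An end of line that has at least one character before it on that line.
# '.' does not match a line break, so empty lines are skipped.
result = re.sub(r"(?<=.)$", "<br/><br/>", text, flags=re.MULTILINE)
print(result)
```

The empty line stays empty because the character before its end is a line break, which the look-behind does not accept.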



The next example is a very useful one. I often use VIM to convert program code into HTML. Sometimes I need to convert a run of spaces into &nbsp;s, except for the first space. This is an excellent case for a look-behind zero-width assertion. I figured it out, and it has become one of my favorite search and replace commands.

/\(\s\)\@<=\(\s\)\+



After examining the results, I use the following replacement command to do the conversion:

:%s:\s\@<=\s:\&nbsp;:g
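For comparison, here is a sketch of the same substitution in Python. It is an analog under one stated assumption: because Python's \s would also match the line breaks in a multi-line string, the class is narrowed to spaces and tabs. The code snippet being converted is made up.

```python
import re

# Hypothetical two-line snippet with indentation and a double space.
code = "if x:\n    return  1"

# Replace every space or tab that directly follows another space or tab.
# The first one in each run has no such neighbor, so it is kept as is.
html = re.sub(r"(?<=[ \t])[ \t]", "&nbsp;", code)
print(html)
```

The look-behind is checked against the original text, not the partly replaced result, which is why every space after the first in a run is converted.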


