Tuesday, April 3, 2007

Removing Extra Lines in MS Word

When working on an MS Word document, you’ll sometimes find that there are extraneous blank lines between paragraphs getting in the way. This happens frequently when text is pasted in from another source, such as an ASCII file or an e-mail, where it is common practice to hit Enter twice between paragraphs (as opposed to indenting the first line of each paragraph). In Word, however, those extra lines are not only unnecessary but intrusive. The paragraph formatting options in Word allow for refined control of inter-paragraph spacing, paragraph indentation, and so on.

[Image: MS_Word_Paragraph_Format.jpg]

In the example shown above, we have Word automatically inserting a 6pt spacer above each paragraph plus a 3pt spacer below (for a total of 9pt between paragraphs, which is slightly shorter than a typical text line containing a 10pt font with room for descenders). The problem is that Word will treat a blank line as an entire paragraph, so we actually end up with about 30pt worth of spacing between paragraphs: 9pt of spacing, roughly 12pt for the empty line itself, and then another 9pt of spacing. Not good.

If there’s only a handful of such blank lines, then it’s no problem to simply place the cursor on each blank line, one at a time, and hit the Delete key. But what if these extraneous blank lines go on for 50 pages?

[Image: MS_Word_Visible_Marks.jpg]

In Word, every paragraph ends with a special paragraph mark, and it turns out that you can perform a search-and-replace on these paragraph marks just as if they were regular text. Normally these paragraph marks are invisible, but you can make them appear by clicking on the button with the paragraph symbol (¶) in the toolbar. Click on the button a second time to render them invisible again. (Note: the paragraph symbols do not need to be visible in order to perform a search and replace against them. I just wanted to show you how to make them visible in case you were curious.)

[Image: MS_Word_Special_Paragraphs.jpg]

In the search/replace dialog, the notation that refers to a paragraph mark is ^p (the caret symbol, followed by a lowercase p). Don’t worry if you can’t remember this code. Simply click on the More/Less button to expand the dialog box, then click on the Special button, and select Paragraph Mark. It will automatically type the ^p for you (into either the Find What or the Replace With box, whichever had focus last).

Now, here’s the tricky part. You might think that, in order to delete the blank lines, you’d want to set Find What to ^p, leave Replace With blank, and click the Replace All button. That will certainly delete the paragraph marks on those blank lines, effectively removing the blank lines, but it will also delete the paragraph marks at the end of the legitimate paragraphs. So, you’d end up with your entire document being one huge paragraph.

Instead, set Find What to ^p^p, and set Replace With to ^p. This tells Word that anywhere it sees two paragraph marks together, with nothing in between them, it should replace them with a single paragraph mark, which has the net effect of leaving the first paragraph mark alone while deleting the second one. Note that if your file has multiple blank lines between paragraphs, then you’ll have to click the Replace All button repeatedly to collapse all of the blank lines to nothing.
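The difference between the two replacements can be modeled outside of Word. The following Python sketch is only an illustration. It assumes that a paragraph mark can be stood in for by a newline character, and that one click of Replace All behaves like a single left-to-right, non-overlapping pass (which is how Python's str.replace works). The names replace_all and collapse_blank_lines are made up for this example and are not part of Word.

```python
# A paragraph mark (^p) is modeled here as "\n"; this is an assumption
# made for illustration, not how Word stores documents.
MARK = "\n"


def replace_all(text, find_what, replace_with):
    """Model one click of Replace All: return the new text and the count."""
    count = text.count(find_what)  # non-overlapping matches, left to right
    return text.replace(find_what, replace_with), count


def collapse_blank_lines(text):
    """Replace ^p^p with ^p until a pass makes zero replacements."""
    clicks = 0
    while True:
        text, count = replace_all(text, MARK * 2, MARK)
        if count == 0:
            return text, clicks
        clicks += 1


doc = "First.\n\n\n\nSecond.\n\nThird.\n"

# The naive approach (Find What: ^p, Replace With: empty) merges everything.
merged, _ = replace_all(doc, MARK, "")

# The ^p^p -> ^p approach keeps one mark per paragraph.
cleaned, clicks = collapse_blank_lines(doc)
```

In this model, merged comes out as one run-on paragraph, while cleaned keeps three separate paragraphs. The three blank lines after the first paragraph take two productive passes to collapse, which mirrors the need to click Replace All more than once.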

Now, say instead that the source material did not have blank lines between paragraphs but did indent the first line of each paragraph by five spaces. Here again, those spaces would interfere with the paragraph formatting. To delete those leading spaces, open the search and replace dialog, set Find What to ^p followed by five spaces, and set Replace With to ^p. After you click the Replace All button, the five leading spaces will be removed from every paragraph that follows a paragraph mark. (Note: This counts on you knowing that the paragraphs with leading spaces begin with exactly five spaces. If you are unsure of the number of spaces, or if the number varies, then just have the search and replace dialog find ^p followed by a single space, still replacing it with ^p alone. Then keep clicking the Replace All button until Word reports that it made zero replacements.)
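The single-space variant can be sketched the same way. Again, this is a rough Python model rather than anything Word itself runs: a newline stands in for the paragraph mark, each loop iteration stands in for one click of Replace All, and the function name strip_leading_spaces is invented for the example.

```python
MARK = "\n"  # stand-in for Word's paragraph mark (^p); an assumption


def strip_leading_spaces(text):
    """Replace ^p + one space with ^p until nothing is left to replace."""
    clicks = 0
    while MARK + " " in text:
        # Each pass removes one leading space from every indented paragraph.
        text = text.replace(MARK + " ", MARK)
        clicks += 1
    return text, clicks


# Indents of varying width: five spaces, then three.
doc = "Intro.\n     Second.\n   Third.\n"
cleaned, clicks = strip_leading_spaces(doc)

# When every indent is known to be exactly five spaces, one pass is enough.
fixed_width = "Intro.\n     Second.\n     Third.\n".replace(MARK + " " * 5, MARK)
```

Each pass strips one space per indented paragraph, so the number of productive clicks equals the widest indent (five in this sample). Note that in this model the very first paragraph of the text is left alone if it is indented, because no paragraph mark precedes it.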

