Use of diff and patch commands in Linux

The diff and patch commands form a powerful combination. They are widely used to capture the differences between original files and updated files in such a way that people who only have the original files can turn them into the updated files with a single patch file containing only the differences. This tutorial explains the basics of how to use these two commands.

Difficulty: Medium

This tutorial assumes some basic knowledge of Linux and command lines, such as changing directories, copying files and editing text files.

Using diff to create a simple patch

The simplest way to use diff is to get the differences between two files, an original file and an updated file. For example, you could write a few words in a plain text file, make some modifications, and then save the modified content to a second file. You can then compare these files with diff, like this:

[rechosen@localhost ~]$ diff originalfile updatedfile

Of course, replace originalfile and updatedfile with the appropriate file names for your case. You will probably get output like this:

1c1
< These are some words.
\ No newline at end of file
---
> These are just a few words.
\ No newline at end of file

Note: to demonstrate the creation of a simple patch, I used the file originalfile with the content “These are some words.” and the file updatedfile with the content “These are just a few words.” You can create these files yourself if you want to run the commands in the tutorial and get the same output.

The 1c1 is a way to indicate line numbers and specify what should be done. Note that those line numbers can also be line ranges (12,15 means line 12 to line 15). The “c” tells patch to replace the content of the lines. Two other characters have a meaning: “a” and “d”, with “a” meaning “add” (or “append”) and “d” meaning “delete”. The syntax is (line number or range)(c, a or d)(line number or range), although when using “a” or “d”, one of the two parts may only contain a single line number.

  •  When “c” is used, the line numbers on the left are the lines in the original file that should be replaced with the text contained in the patch, and the line numbers on the right are the lines that content should occupy in the patched version of the file.
  •  When “a” is used, the line number on the left may only be a single number, naming the line of the original file after which the new lines are added, and the line numbers on the right are the lines that content should occupy in the patched version of the file.
  •  When “d” is used, the line numbers on the left are the lines that must be deleted to create the patched version of the file, and the line number on the right may only be a single number, telling where the lines would have been in the patched version of the file if they had not been deleted. You might think that this last number is redundant, but remember that patches can also be applied in reverse. I will explain more about that later in this tutorial.

The “<” means that patch must delete the characters after this sign, and the “>” means that the characters after this sign must be added. When content is replaced (a “c” between the line numbers), you will see both the < and the > signs. When content is added (an “a” between the line numbers), you will only see the > sign, and when content is deleted (a “d” between the line numbers), only the < sign.
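To see the “a” and “d” commands described above in action, here is a minimal sketch. The file names old.txt and new.txt and their contents are made up for this illustration, and it assumes a diff that produces the normal format by default (as GNU diff does).

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# old.txt loses its second line; new.txt gains a line at the end.
printf 'one\ntwo\nthree\n' > old.txt
printf 'one\nthree\nfour\n' > new.txt

# diff exits with status 1 when the files differ; that is expected here.
diff old.txt new.txt || true
# Prints:
# 2d1
# < two
# 3a3
# > four
```

The 2d1 says that line 2 of old.txt is deleted and would have followed line 1 of new.txt; the 3a3 says that after line 3 of old.txt, the content of line 3 of new.txt is added.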

The “\”, followed by “No newline at end of file”, is only there because I did not press Enter after typing the words. In general, it is good practice to add a final newline to every text file you create. Certain pieces of software cannot do without it. That is why the absence of a final newline is reported so explicitly by diff. Adding final newlines to the files makes the output a lot shorter:

1c1
< These are some words.
---
> These are just a few words.

As you may have noticed, I omitted explaining what the three -'s are for. They mark the end of the lines to be replaced and the beginning of the lines that replace them. They separate the old from the new. You will only see them when content is replaced (a “c” between the line numbers).

If we want to create a patch, we should put the diff output in a file. Of course, you could do this by copying the output from your console, pasting it into your favorite text editor and saving the file, but there is a shorter way. We can let bash write the diff output to a file like this:

[rechosen@localhost ~]$ diff originalfile updatedfile > patchfile.patch

Again, replace the file names with the appropriate ones for your case. You may want to know that telling bash to write the output of a command to a file using > works with every command. This can be very useful for saving the output of a command to a (log) file.
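The steps so far can be tried end to end with the following sketch. It assumes the sample contents used in this tutorial and works in a temporary directory (the workdir variable is only part of this illustration).

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# printf with \n makes sure both files end with a final newline.
printf 'These are some words.\n' > originalfile
printf 'These are just a few words.\n' > updatedfile

# diff exits with status 1 when the files differ; that is expected here.
diff originalfile updatedfile > patchfile.patch || true

# Show what ended up in the patch file.
cat patchfile.patch
```

Because both files end with a newline, the patch file should contain only the 1c1 line, the old line, the three -'s and the new line.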

Applying the simple patch that we created

Well, did we just create a patch? The short answer is: yes, we did. We can use the patch file to turn a copy of originalfile into a copy of updatedfile. Of course, it would not make much sense to apply the patch to the very files we created it from. Therefore, copy the original file and the patch file to another place, and go to that place. Then try applying the patch this way:

[rechosen@localhost ~]$ patch originalfile -i patchfile.patch -o updatedfile

Again, replace the file names where necessary. If all went well, the updatedfile just created by patch should be identical to the one you had at the beginning, when creating the patch with diff. You can check this with the -s option of diff:

[rechosen@localhost ~]$ diff -s updatedfile [/path/to/the/original/updatedfile]/updatedfile

Replace the part between [ and ] with the path to the original updated file. For example, if the updatedfile you used when creating the patch is in the parent directory of your current directory, replace “[/path/to/the/original/updatedfile]” with “..” (bash understands this as the parent directory of the current working directory). And, of course, also replace the file names again where appropriate.

Congratulations! If diff reported the files to be identical, you have just created and used a patch successfully! However, the patch format we just used is not the only one. In the next chapter, I will explain another patch format.
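A self-contained sketch of the apply-and-verify round trip above is shown below. The name expectedfile is invented for this illustration (it plays the role of the original updatedfile), and the options are placed before the file name, which should behave the same as the command above with GNU patch.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

printf 'These are some words.\n' > originalfile
printf 'These are just a few words.\n' > expectedfile

# Create the normal-format patch (exit status 1 just means "files differ").
diff originalfile expectedfile > patchfile.patch || true

# Apply the patch to originalfile and write the result to updatedfile;
# with -o, originalfile itself is left unchanged.
patch -i patchfile.patch -o updatedfile originalfile

# -s makes diff report identical files explicitly.
diff -s updatedfile expectedfile
```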

Contextual patching

In the first chapter, we created a patch using the normal diff format. However, this format does not provide any lines of context around the ones to be replaced, so a change in the line numbers (one or more extra newlines somewhere, or some deleted lines) would make it very difficult for the patch program to determine which lines to change instead. Also, if a different file that is patched by accident contains the same lines as the original file in the right places, patch will happily apply the changes from the patch file to that file. This could result in broken code and other unwanted side effects. Fortunately, diff supports other formats than the normal one. Let's create a patch for the same files, but this time using the context output format:

[rechosen@localhost ~]$ diff -c originalfile updatedfile

By now, it should be clear that you must replace the file names where necessary =). You should get output like this:

*** originalfile 2007-02-03 22:15:48.000000000 +0100
--- updatedfile 2007-02-03 22:15:56.000000000 +0100
***************
*** 1 ****
! These are some words.
--- 1 ----
! These are just a few words.

As you can see, the file names are included. This will save us some typing when applying the patch. The timestamps next to the file names are the date and time of each file's last modification. The line with 15 *'s indicates the beginning of a hunk. A hunk describes which changes, such as replacements, additions and deletions, should be made in a certain block of text. The two 1's are line numbers (again, these can also be line ranges: 12,15 means line 12 to line 15), and ! means that the line must be replaced. The line with a ! before the three -'s (hey, where did we see those before?) should be replaced by the second line with a !, after the three -'s (of course, the ! itself is not included; it is context format syntax).

As you can see, there are no c's, a's or d's here. The action to perform is determined by the character in front of the line. The !, as explained, means that the line must be replaced. The other available characters are +, - and “ ” (a space). The + means add (or append), the - means delete, and the “ ” means nothing: patch will only use it as context to make sure it is modifying the right part of the file.

Applying this patch is a bit easier: under the same circumstances as before (let bash write the diff output to a file again, then copy the patch file and the original file to another location), you only need to run:

[rechosen@localhost ~]$ patch -i patchfile.patch -o updatedfile

You are probably thinking now: why do we still have to specify the new file name? Well, that is because patch was made with the intention of updating existing files in mind, not creating new updated files. This usually comes in handy when patching the source trees of programs, which is pretty much the main use of patch. And that brings us to our next topic: to patch an entire source tree, several files have to be included in the patch file. The next chapter explains how to do this.
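As a rough sketch of the context-format workflow above: the directory name elsewhere is made up for this illustration, and the behavior described in the comments is that of GNU patch, which picks the file to patch from the names in the patch header.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

printf 'These are some words.\n' > originalfile
printf 'These are just a few words.\n' > updatedfile

# Create a context-format patch (exit status 1 just means "files differ").
diff -c originalfile updatedfile > patchfile.patch || true

# Copy only the original file and the patch to another location.
mkdir elsewhere
cp originalfile patchfile.patch elsewhere/
cd elsewhere

# No file name on the command line: patch reads it from the patch header.
# Only originalfile exists here, so that is the file it reads.
patch -i patchfile.patch -o updatedfile

cat updatedfile
```

Leaving out -o would instead update originalfile in place, which is what patch was designed for.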

Getting the differences between multiple files

The easiest way to get the differences between several files is to put them all in a directory and let diff compare the whole directories. You can simply specify directories instead of files; diff detects automatically whether you are giving it a file or a directory:

[rechosen@localhost ~]$ diff originaldirectory/ updateddirectory/

Note: if the directories you are comparing also include subdirectories, you must add the -r option to make diff compare the files in subdirectories, too.

This should give an output like this:

diff originaldirectory/file1 updateddirectory/file1
1c1
< This is the first original file.
---
> This is the first updated file.
diff originaldirectory/file2 updateddirectory/file2
1c1
< This is the second original file.
---
> This is the second updated file.
14d13
< We are going to add something to this file and delete this line.
26a26
> This line has been added to this updated file.

Note: for this example, I created some sample files. You can download an archive containing these files here: http://www.linuxtutorialblog.com/post/introduction-using-diff-and-patch-tutorial/diffpatchexamplefiles.tar.gz.

As you can see, the normal output format only specifies file names when comparing multiple files.   You can also see examples of adding and removing lines.
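The directory comparison, including the -r option from the note above, can be tried with this small sketch. The subdirectory sub and the file file3 are invented for the illustration; they are not part of the tutorial's sample archive.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p originaldirectory/sub updateddirectory/sub
printf 'This is the first original file.\n' > originaldirectory/file1
printf 'This is the first updated file.\n' > updateddirectory/file1
printf 'Old nested content.\n' > originaldirectory/sub/file3
printf 'New nested content.\n' > updateddirectory/sub/file3

# Without -r, diff only reports that the subdirectory exists on both sides.
diff originaldirectory updateddirectory || true

# With -r, the files inside sub/ are compared as well.
diff -r originaldirectory updateddirectory || true
```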

Now, let’s look at the output of the same comparison in the context format:

diff -c originaldirectory/file1 updateddirectory/file1
*** originaldirectory/file1 2007-02-04 16:17:57.000000000 +0100
--- updateddirectory/file1 2007-02-04 16:18:33.000000000 +0100
***************
*** 1 ****
! This is the first original file.
--- 1 ----
! This is the first updated file.
diff -c originaldirectory/file2 updateddirectory/file2
*** originaldirectory/file2 2007-02-04 16:19:37.000000000 +0100
--- updateddirectory/file2 2007-02-04 16:20:08.000000000 +0100
***************
*** 1,4 ****
! This is the second original file.
  
  S
  O
--- 1,4 ----
! This is the second updated file.
  
  S
  O
***************
*** 11,17 ****
  C
  E
  
- We are going to add something to this file and delete this line.
  
  S
  O
--- 11,16 ----
***************
*** 24,28 ****
--- 23,28 ----
  C
  E
  
+ This line has been added to this updated file.
  
  Something will be added above this line.

The first thing you should notice is the increase in length; the context format provides more information than the normal format. This was not so visible in the first example, as there was no context to include. This time, however, there was context, and that surely lengthens the patch a lot. You have probably also noticed that the file names are mentioned twice each time. This is probably done either to make it easier for patch to recognize when to start patching the next file, or to provide better backward compatibility (or both).

The other way to let diff compare multiple files is to write a shell script that runs diff several times and correctly appends all the output to one file, including the lines with the diff commands. I will not explain how to do this, since the other way (putting the files in a directory) is much easier and more widely used.

Creating this patch with diff was considerably easy, but the use of directories brings up a new problem: will patch simply patch the mentioned files in the current working directory and forget about the directory they were in when the patch was created, or will it patch the files inside the directories specified in the patch? Have a look at the next chapter to find out!

Patching multiple files

In the previous chapter, we created a patch that can be used to patch multiple files. If you have not done so already, save the diff output to an actual patch file like this:

[rechosen@localhost ~]$ diff -c originaldirectory/ updateddirectory/ > patchfile.patch

Note: here we will use the context format patch, since it is generally a good practice to use a format that provides context.

It’s time to try using our patch file. Copy the original directory and the patch file to another location, go to that location and apply the patch with this command:

[rechosen@localhost ~]$ patch -i patchfile.patch

Oops! It reports that it cannot find the file to patch! Yes, that is right. It is trying to find the file file1 in the current directory (by default, patch strips away all directories in front of the file name). Of course, this file is not there, because we are trying to update the file in the directory originaldirectory. For this reason, we must tell patch not to strip away any directory from the file names. That can be done this way:

[rechosen@localhost ~]$ patch -p0 -i patchfile.patch

Note: you might think you could also just move into originaldirectory and run the patch command there. Don't! This is bad practice: if the patch file includes any files to patch in subdirectories, patch will look for them in the working directory and, obviously, not find them or find the wrong ones. Use the -p option to make patch look in subdirectories as it should.

The -p option tells patch how many slashes (including what is before them, usually directories) it should strip away from the front of the file name (note that, when using the option -p0, patch looks for the files to patch in both originaldirectory and updateddirectory, in our case). In this case, we set it to 0 (do not strip away any slashes), but you can also set it to 1 (to strip away the first slash including anything before it), or 2 (to strip away the first two slashes including everything before them), or any other amount. This can be very useful if you have a patch that uses a different directory structure than yours. For example, if you have a patch that uses a directory structure like this:

(...)
*** /home/username/sources/program/originaldirectory/file1 2007-02-04 16:17:57.000000000 +0100
--- /home/username/sources/program/updateddirectory/file1 2007-02-04 16:18:33.000000000 +0100
(...)

You can simply count the slashes (/ (1) home/ (2) username/ (3) sources/ (4) program/ (5)) and give that value with the -p option. If you use -p5, patch would look for both originaldirectory/file1 and updateddirectory/file1. Note that patch considers two slashes next to each other (as in /home/username//sources) as a single slash. This is because scripts sometimes (accidentally or not) put an extra slash between directories.
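The effect of the -p value can be tried with the sketch below. The directories elsewhere and flat are made up for this illustration, and the comments describe the behavior of GNU patch.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir originaldirectory updateddirectory
printf 'This is the first original file.\n' > originaldirectory/file1
printf 'This is the first updated file.\n' > updateddirectory/file1
diff -c originaldirectory/ updateddirectory/ > patchfile.patch || true

# Case 1: the directory layout matches the names in the patch.
mkdir elsewhere
cp -r originaldirectory patchfile.patch elsewhere/
cd elsewhere
# -p0 strips nothing, so patch looks for originaldirectory/file1.
patch -p0 -i patchfile.patch
cat originaldirectory/file1

# Case 2: file1 sits directly in the working directory.
cd "$workdir"
mkdir flat
cp originaldirectory/file1 patchfile.patch flat/
cd flat
# -p1 strips the first slash and everything before it, leaving just file1.
patch -p1 -i patchfile.patch
cat file1
```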

Reversing an applied patch

Sometimes a patch is applied when it should not have been. For example: a patch introduces a new bug in some code, and a fixed patch is released. However, you have already applied the old, buggy patch, and you cannot think of a quick way to get the original files back (maybe they have already been patched dozens of times). You can then apply the buggy patch in reverse. The patch command will try to undo all the changes it made by swapping the hunks. You can tell patch to try reversing by passing it the -R option:

[rechosen@localhost ~]$ patch -p0 -R -i patchfile.patch

In general, this operation will succeed, and you will get back the original files you had. By the way, there is another reason why you might want to reverse a patch: sometimes (especially when sleepy), people release a patch with the files swapped. There is a good chance that patch will detect this automatically and ask whether you want it to try patching in reverse. Sometimes, however, patch will not detect it, and you will wonder why the files do not seem to match. You can then try applying the patch in reverse manually, by passing the -R option to patch. It is good practice to make a backup before trying this, as patch may mess up and leave you with irrecoverably damaged files.
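The following sketch applies a patch and then reverses it again. The directory name elsewhere is invented for the illustration, and GNU patch is assumed.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir originaldirectory updateddirectory
printf 'This is the first original file.\n' > originaldirectory/file1
printf 'This is the first updated file.\n' > updateddirectory/file1
diff -c originaldirectory/ updateddirectory/ > patchfile.patch || true

# Patch a copy, as recommended earlier in this tutorial.
mkdir elsewhere
cp -r originaldirectory patchfile.patch elsewhere/
cd elsewhere

# Apply the patch: file1 now has the updated content.
patch -p0 -i patchfile.patch
cat originaldirectory/file1

# Apply the same patch in reverse: file1 is back to the original content.
patch -p0 -R -i patchfile.patch
cat originaldirectory/file1
```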

The unified format

The diff command can also output the differences in another format: the unified format. This format is more compact, as it omits redundant context lines and groups things such as the line number instructions. However, at the time of writing, this format is only supported by GNU diff and GNU patch. If you release a patch in this format, you should be sure that it will only be applied by users of GNU patch. Nearly all Linux distributions ship GNU patch.

The unified format is similar to the context format, but it is far from exactly the same.   You can create a patch in the unified format in this way:

[rechosen@localhost ~]$ diff -u originaldirectory/ updateddirectory/

The output should be something like this:

diff -u originaldirectory/file1 updateddirectory/file1
--- originaldirectory/file1 2007-02-04 16:17:57.000000000 +0100
+++ updateddirectory/file1 2007-02-04 16:18:33.000000000 +0100
@@ -1 +1 @@
-This is the first original file.
+This is the first updated file.
diff -u originaldirectory/file2 updateddirectory/file2
--- originaldirectory/file2 2007-02-04 16:19:37.000000000 +0100
+++ updateddirectory/file2 2007-02-04 16:20:08.000000000 +0100
@@ -1,4 +1,4 @@
-This is the second original file.
+This is the second updated file.
 
 S
 O
@@ -11,7 +11,6 @@
 C
 E
 
-We are going to add something to this file and delete this line.
 
 S
 O
@@ -24,5 +23,6 @@
 C
 E
 
+This line has been added to this updated file.
 
 Something will be added above this line.

As you can see, the line numbers/ranges are grouped and placed between @'s. Also, there is no extra space after + or -. This saves some bytes. Another difference: the unified format does not have a special replacement sign. It simply deletes (the - sign) the old line and adds (the + sign) the altered line instead. The only difference between adding/deleting and replacing can be found in the line numbers/ranges: when a line is replaced, they are the same, and when adding or deleting, they differ.
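A replaced line in the unified format can be reproduced with this small sketch. The three-line file contents are made up for the illustration, and the two header lines are skipped because they contain timestamps that differ on every run.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

printf 'one\ntwo\nthree\n' > originalfile
printf 'one\n2\nthree\n' > updatedfile

# Create a unified-format patch (exit status 1 just means "files differ").
diff -u originalfile updatedfile > patchfile.patch || true

# Print everything from the third line on, i.e. without the --- and +++ headers.
tail -n +3 patchfile.patch
# Prints:
# @@ -1,3 +1,3 @@
#  one
# -two
# +2
#  three
```

The replacement shows up as a - line followed by a + line, and both ranges between the @'s are 1,3.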

Comparison of formats

After reading about three formats, you are probably wondering which one to choose. Here is a small comparison:

  •  The normal format has the best compatibility: almost every diff/patch-like command should recognize it. However, the lack of context is a big disadvantage.
  •  The context format is widely supported, though not every diff/patch-like command knows it. However, the advantage of being able to include context makes up for that.
  •  The unified format also includes context, and is more compact than the context format, but it is only supported by a single brand of diff/patch-like commands.

If you are sure that the patch will only be used by GNU diff/patch users, unified is the best choice, as it keeps your patch as compact as possible. In most other cases, however, the context format is the best choice. The normal format should only be used if you are sure that there is a user without context format support.

Varying the number of context lines

It is possible to make diff include fewer lines of context around the lines that should be changed. Especially in large patch files, this can strip away a lot of bytes and make your patch file more portable. However, if you include too few lines of context, patch may not work correctly. Quoting the GNU diff man page: “For proper operation, patch typically needs at least two lines of context.”

The number of context lines can be specified in several ways:

  •  If you want to use the context format, you can combine format and number in one option, the -C option. Example:

[rechosen@localhost ~]$ diff -C 2 originaldirectory/ updateddirectory/

The above command would use the context format with 2 context lines.

  •  If you want to use the unified format, you can combine format and number in one option, the -U option. Example:

[rechosen@localhost ~]$ diff -U 2 originaldirectory/ updateddirectory/

The above command would use the unified format with 2 context lines.

  •  Regardless of the format you choose, you can specify the number of lines like this:

[rechosen@localhost ~]$ diff -2 originaldirectory/ updateddirectory/

However, this will only work if you also specify a format that supports context. You should combine this option with -c or -u.
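To see how the number of context lines changes a patch, here is a minimal sketch using the -U option. The nine-line sample files are made up for this illustration; the header lines are skipped again because of their timestamps.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Nine lines each; only line 5 differs.
printf '%s\n' 1 2 3 4 5 6 7 8 9 > originalfile
printf '%s\n' 1 2 3 4 five 6 7 8 9 > updatedfile

# One line of context on each side of the change.
diff -U 1 originalfile updatedfile | tail -n +3 || true
# Prints:
# @@ -4,3 +4,3 @@
#  4
# -5
# +five
#  6

# Three lines of context: the same change, but a longer hunk.
diff -U 3 originalfile updatedfile | tail -n +3 || true
```

With one context line the hunk covers lines 4 to 6; with three it covers lines 2 to 8, which makes the patch longer but gives patch more to match against.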

Last words

Although this tutorial describes many features and functions of diff and patch, it by far does not describe everything you can do with these powerful tools. It is an introduction in the form of a tutorial. If you want to know more about these commands, you can read, for example, their manual pages and the GNU documentation on diff and patch.

Well, I hope this tutorial helped you. Thank you for reading! If you liked this tutorial, browse this blog and see if there are more you like. Please help this blog grow by leaving a link here and there, and let other people benefit from the increasing amount of knowledge on this site. Thanks in advance and happy patching!
