Experiment 05
Aim
To perform advanced text processing, data filtering, file comparison, and input/output redirection in Linux.
Theory
- cmp (Compare): Compares two files byte by byte and reports the byte offset and line number of the first difference
  - Syntax: cmp [options] file1 file2
- paste (Paste): Merges the corresponding lines of two or more files side by side, producing one column per input file
  - Syntax: paste [options] file1 file2 ...
  - Options: -d LIST uses the characters in LIST as delimiters instead of TABs
- grep (Global Regular Expression Print): Searches a file's contents for lines that match a string or pattern and prints them
  - Syntax: grep "pattern" [filename]
- sort (Sort) & uniq (Unique): sort arranges lines alphabetically or numerically; uniq filters out adjacent duplicate lines
  - Syntax: sort [filename] / uniq [filename]
- sed (Stream Editor) & awk (AWK): sed is a stream editor used for finding and replacing text (s/old/new/g); awk is a pattern-scanning language used for column extraction
- Redirection (>, >>) & Piping (|): > routes output to a file (overwriting it); >> appends to the file. The pipe | takes the output of the first command and uses it as the input for the second command
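The theory above describes awk column extraction and the difference between > and >>. A minimal sketch of both follows, assuming a POSIX shell. The file names (fruits.txt, names.txt, log.txt, quantities.txt) and the sample data are hypothetical and chosen only for illustration. printf is used in place of echo -e because its handling of \n is the same across shells.

```shell
# Hypothetical sample data: a name and a quantity separated by a space
printf 'apple 3\nbanana 5\norange 2\n' > fruits.txt

# awk splits each line into whitespace-separated fields; $1 is the first column
awk '{ print $1 }' fruits.txt > names.txt

# > creates or overwrites log.txt; >> adds a line to the end of it
echo "first run" > log.txt
echo "second run" >> log.txt

# The pipe passes awk's output (the second column) to sort,
# which orders the quantities numerically, largest first
awk '{ print $2 }' fruits.txt | sort -rn > quantities.txt
```

After this runs, names.txt holds only the first column, log.txt holds both lines because the second echo appended instead of overwriting, and quantities.txt lists the second column in descending order.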
Commands
$ echo -e "apple\nbanana\norange" > list1.txt
$ echo -e "apple\ngrape\norange" > list2.txt
$ cmp list1.txt list2.txt
list1.txt list2.txt differ: byte 7, line 2
$ paste -d "," list1.txt list2.txt > combined.csv
$ cat combined.csv
apple,apple
banana,grape
orange,orange
$ echo -e "dog\ncat\ndog\nbird" > animals.txt
$ sort animals.txt | uniq > unique_animals.txt
$ cat unique_animals.txt
bird
cat
dog
$ grep "cat" unique_animals.txt
cat
$ sed 's/dog/wolf/g' animals.txt
wolf
cat
wolf
bird
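Because uniq removes only adjacent duplicates, the sort step in the pipeline above matters. The sketch below illustrates this with the same animals.txt data. It assumes a POSIX shell; uniq -c (prefix each line with its count) and cmp -s (silent, exit status only) are standard options, while the output file names are hypothetical.

```shell
# Recreate the sample file used in this experiment
printf 'dog\ncat\ndog\nbird\n' > animals.txt

# Without sort, the two "dog" lines are not adjacent, so uniq keeps both
uniq animals.txt > unsorted_uniq.txt

# After sorting, duplicates are adjacent; -c prefixes each line with its count,
# and the final sort -rn puts the most frequent entry first
sort animals.txt | uniq -c | sort -rn > counts.txt

# cmp -s prints nothing; its exit status is 0 only when the files are identical
if cmp -s animals.txt unsorted_uniq.txt; then
    echo "uniq alone removed nothing" > cmp_result.txt
else
    echo "uniq alone changed the file" > cmp_result.txt
fi
```

Here cmp_result.txt records that uniq on the unsorted file changed nothing, and the first line of counts.txt shows dog with a count of 2.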
Conclusion
Advanced command-line text parsing and manipulation were carried out successfully. Files were compared (cmp), merged side by side (paste), filtered (grep), sorted and de-duplicated (sort, uniq), and modified in-stream (sed), demonstrating the data pipeline capabilities of the Linux shell.