Look-ahead and look-behind in Perl

Thursday, June 05, 2014 , 2 Comments

With the look-ahead and look-behind constructs, you can "roll your own" zero-width assertions to fit your needs. You can look forward or backward in the string being processed, and you can require that a pattern match succeed (positive assertion) or fail (negative assertion) there.
Every extended pattern is written as a parenthetical group with a question mark as the first character. The notation for the look-arounds is fairly mnemonic, but there are some other, experimental patterns that are similar, so it is important to get all the characters in the right order.
(?=pattern)
is a positive look-ahead assertion
(?!pattern)
is a negative look-ahead assertion
(?<=pattern)
is a positive look-behind assertion
(?<!pattern)
is a negative look-behind assertion
EXAMPLES
Look-Ahead:
echo $mytmp2
uvw_abc uvw_def uvw_acb
Positive:
echo $mytmp2 | perl -pe 's/uvw_(?=(abc|def))/xyz_/g'
xyz_abc xyz_def uvw_acb
Description: replace every occurrence of uvw_ with xyz_ where uvw_ is followed by abc or def
Negative:
echo $mytmp2 | perl -pe 's/uvw_(?!(abc|def))/xyz_/g'
uvw_abc uvw_def xyz_acb
Description: replace every occurrence of uvw_ with xyz_ where uvw_ is not followed by abc or def
Look-Behind:
echo $mytmp
abc_uvw def_uvw acb_uvw
Positive:
echo $mytmp | perl -pe 's/(?<=(abc|def))_uvw/_xyz/g'
abc_xyz def_xyz acb_uvw
Description: replace every occurrence of _uvw with _xyz where _uvw is preceded by abc or def
Negative:
echo $mytmp | perl -pe 's/(?<!(abc|def))_uvw/_xyz/g'
abc_uvw def_uvw acb_xyz
Description: replace every occurrence of _uvw with _xyz where _uvw is not preceded by abc or def
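The same four assertions are available in other regex engines too. Below is a minimal sketch in Java (not part of the original Perl examples) that repeats the replacements above with String.replaceAll. The class and method names are made up for illustration, and the capturing parentheses inside the assertions are dropped because nothing refers to the captured group.

```java
public class LookAroundDemo {
    // uvw_ followed by abc or def
    static String positiveAhead(String s) {
        return s.replaceAll("uvw_(?=abc|def)", "xyz_");
    }

    // uvw_ NOT followed by abc or def
    static String negativeAhead(String s) {
        return s.replaceAll("uvw_(?!abc|def)", "xyz_");
    }

    // _uvw preceded by abc or def
    static String positiveBehind(String s) {
        return s.replaceAll("(?<=abc|def)_uvw", "_xyz");
    }

    // _uvw NOT preceded by abc or def
    static String negativeBehind(String s) {
        return s.replaceAll("(?<!abc|def)_uvw", "_xyz");
    }

    public static void main(String[] args) {
        String ahead = "uvw_abc uvw_def uvw_acb";
        String behind = "abc_uvw def_uvw acb_uvw";
        System.out.println(positiveAhead(ahead));   // xyz_abc xyz_def uvw_acb
        System.out.println(negativeAhead(ahead));   // uvw_abc uvw_def xyz_acb
        System.out.println(positiveBehind(behind)); // abc_xyz def_xyz acb_uvw
        System.out.println(negativeBehind(behind)); // abc_uvw def_uvw acb_xyz
    }
}
```

Because the assertions are zero-width, only uvw_ (or _uvw) is consumed and replaced; the text that was looked at stays untouched.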


Split a string by anything other than spaces

Monday, May 19, 2014 0 Comments

Have you ever tried this? Don't go on writing a big chunk of Perl code for it. Here's a simple solution.
my @arr=split /\S+/,$str;
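where
$str is your string.
\s matches a whitespace character, while \S matches a non-whitespace character.
So \S+ matches a run of at least one non-whitespace character, and splitting on it returns the runs of whitespace that sat between the words (plus an empty leading field when the string starts with a non-space character). If you want the words themselves, split on \s+ instead.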
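To make the difference between the two separators concrete, here is a small sketch in Java, whose String.split treats \S+ and \s+ the same way for this input. The class name, method names and sample string are assumptions for illustration only.

```java
import java.util.Arrays;

public class SplitDemo {
    // Split on runs of non-whitespace: what is left are the gaps between words.
    static String[] byNonSpace(String s) {
        return s.split("\\S+");
    }

    // Split on runs of whitespace: what is left are the words.
    static String[] bySpace(String s) {
        return s.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String str = "one two  three";
        // An empty leading piece, then one space, then two spaces.
        System.out.println(Arrays.toString(byNonSpace(str)));
        // The three words.
        System.out.println(Arrays.toString(bySpace(str)));
    }
}
```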


Find and replace a string in c++

Monday, May 19, 2014 , 0 Comments

This comes in handy many times when you are working on a C++ application. There is no direct method in the standard library to do this unless you use a library such as Boost. Below is a simple function that I use regularly in my applications.
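#include <string>

// Replaces every occurrence of `find` in `source` with `replace`
// and returns the number of replacements made.
template<class T>
inline int findAndReplace(T& source, const T& find, const T& replace)
{
    int num = 0;
    if (find.empty()) return num;   // an empty pattern would match forever
    const typename T::size_type fLen = find.size();
    const typename T::size_type rLen = replace.size();
    // Resume after the inserted text so the replacement is never rescanned.
    for (typename T::size_type pos = 0;
         (pos = source.find(find, pos)) != T::npos; pos += rLen)
    {
        num++;
        source.replace(pos, fLen, replace);
    }
    return num;
}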


Inserting lines in a file using Perl

Wednesday, May 14, 2014 0 Comments

I have an input file that looks like:
cellIdentity="42901"
cellIdentity="42902"
cellIdentity="42903"
cellIdentity="52904"
The numbers inside the quotes can be anything. The output needed is each original line followed by a copy of the same line, except that the last digit of the number is replaced by the next value in the repeating series 5, 6, 7. So the output should look like below:
cellIdentity="42901"
cellIdentity="42905"
cellIdentity="42902"
cellIdentity="42906"
cellIdentity="42903"
cellIdentity="42907"
cellIdentity="52904"
cellIdentity="52905"
Below is the Perl command that I have written.
perl -pe 'BEGIN{$n=4}$n++;
          $n>7?$n=5:$n;
          $a=$_;print $a;
          s/(\d).$/$n."\""/ge'
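For readers who find the one-liner dense, the same logic can be spelled out step by step. This is only a sketch in Java, with a made-up class and method name; it assumes every line ends with a digit followed by a closing double quote, as in the sample input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InsertLines {
    // Emits each line followed by a copy whose last digit cycles through 5, 6, 7.
    static List<String> expand(List<String> lines) {
        List<String> out = new ArrayList<>();
        int n = 4; // same starting value as the BEGIN block
        for (String line : lines) {
            n = (n >= 7) ? 5 : n + 1;  // 5, 6, 7, 5, 6, 7, ...
            out.add(line);             // the original line
            // Replace the final digit and closing quote with n and a quote.
            out.add(line.replaceFirst("\\d\"$", n + "\""));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "cellIdentity=\"42901\"", "cellIdentity=\"42902\"",
                "cellIdentity=\"42903\"", "cellIdentity=\"52904\"");
        for (String line : expand(input)) {
            System.out.println(line);
        }
    }
}
```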


Comparing two files using awk - An assignment

Monday, May 12, 2014 , 0 Comments

This is an awk assignment given to one of my friends. It's quite challenging. We have two files:
File1:(List of companies)
Joe's Garage
Pip Co
Utility Muffin Research Kitchen
File2:(List of payments and dues of the companies in File1)
Pip Co                          $20.13   due
Pip Co                          $20.3   due
Utility Muffin Research Kitchen $2.56    due
Utility Muffin Research Kitchen 2.56    due
Joe's Garage                    $120.28  due
Joe's Garage                    $100.24 payment
Now the challenge is to create an output file that states the total amount due from each company. There is one more requirement: we need to handle format errors in File2.
The format errors to be handled are:
  1. The dollar symbol is not present in the amount.
  2. There are not exactly 2 digits after the decimal point.
If any of the above format errors is encountered, the complete line should be ignored and processing should continue with the next line.
The expected output here is:
Joe's Garage $20.04
Utility Muffin Research Kitchen $2.56
Pip Co $20.13
Below is the awk script that I have written for this, and it's working on my side.
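{
   # First file (File1): remember each company name, keyed by its
   # comma-joined words, with a starting marker value of 1.
   if (FNR == NR)
   {
      for (i = 1; i <= NF; i++)
         str = str "," $i;
      a[str] = 1; str = "";
      next;
   }
   # Second file (File2): the amount is the second-to-last field.
   if ($(NF-1) !~ /^\$/)
   {
      print "Format Error!-No dollor sign"FNR,FILENAME,$(NF-1);
      next;
   }
   if ($(NF-1) !~ /\.[0-9][0-9]$/)
   {
      print "Format Error!-should have 2 digits after a decimal point"FNR,FILENAME,$(NF-1);
      next;
   }
   # Rebuild the company key from every field before the amount.
   for (i = 1; i < (NF-1); i++) str = str "," $i;
   # "in" tests membership without creating an entry for unknown names.
   if (str in a) {
      gsub(/\$/, "", $(NF-1));
      if ($NF ~ /payment/) {
         a[str] -= $(NF-1); }
      else if ($NF ~ /due/) {
         a[str] += $(NF-1); }
   }
   str = "";
}
END {
   for (i in a)
   {
      t = i;
      gsub(/,/, " ", t);
      # Subtract the marker value of 1 added when the name was stored.
      print t, "$"(a[i]-1);
   }
}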
I am sure that this can be optimized. I kept it long so that it's easier for everyone to follow. Below is the way to execute it. I am using nawk on Solaris; others can use awk itself. Copy the above code into a file, name it mycode.awk, and then execute the awk command as below:
nawk -f mycode.awk File1 File2
The output that I got with the above command is:
> nawk -f temp.awk temp2 temp1
Format Error!-should have 2 digits after a decimal point2 temp1 $20.3
Format Error!-No dollor sign4 temp1 2.56
 Joe's Garage $20.04
 Utility Muffin Research Kitchen $2.56
 Pip Co $20.13
>

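As a cross-check of the awk logic above, the same tally can be sketched in Java. This is an illustration under stated assumptions rather than a port of the script: the class and method names are made up, the amount check is stricter than the awk one (a dollar sign, digits, a dot and exactly two digits), and BigDecimal is used so that 120.28 minus 100.24 comes out as exactly 20.04.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DueTotals {
    // company name, amount token, then the word "due" or "payment"
    private static final Pattern LINE =
            Pattern.compile("(.*?)\\s+(\\S+)\\s+(due|payment)\\s*");
    // assumed format: dollar sign, digits, dot, exactly two digits
    private static final Pattern AMOUNT = Pattern.compile("\\$\\d+\\.\\d{2}");

    static Map<String, BigDecimal> totals(List<String> companies, List<String> ledger) {
        Map<String, BigDecimal> result = new LinkedHashMap<>();
        for (String company : companies) {
            result.put(company.trim(), BigDecimal.ZERO);
        }
        for (String line : ledger) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) continue;                          // not a ledger line
            if (!AMOUNT.matcher(m.group(2)).matches()) continue; // format error: skip line
            String name = m.group(1).trim();
            if (!result.containsKey(name)) continue;             // company not in File1
            BigDecimal amount = new BigDecimal(m.group(2).substring(1));
            boolean due = m.group(3).equals("due");
            result.merge(name, due ? amount : amount.negate(), BigDecimal::add);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> companies = Arrays.asList(
                "Joe's Garage", "Pip Co", "Utility Muffin Research Kitchen");
        List<String> ledger = Arrays.asList(
                "Pip Co                          $20.13   due",
                "Pip Co                          $20.3   due",
                "Utility Muffin Research Kitchen $2.56    due",
                "Utility Muffin Research Kitchen 2.56    due",
                "Joe's Garage                    $120.28  due",
                "Joe's Garage                    $100.24 payment");
        for (Map.Entry<String, BigDecimal> e : totals(companies, ledger).entrySet()) {
            System.out.println(e.getKey() + " $" + e.getValue());
        }
    }
}
```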

Joining lines using Awk

Friday, May 02, 2014 1 Comments

Let's say I have an input file which looks like below:
Apr 24 2014;
is;
a;
sample;
;
Jun 24 2014 123
may 25 2014;
is;
b;
sample;
;
Dec 21 2014 987
I want to merge 6 lines at a time. Which means my output should look like:
Apr 24 2014;is;a;sample;;Jun 24 2014 123
may 25 2014;is;b;sample;;Dec 21 2014 987
Below is a simple command that I would use:
awk '{ORS=(NR%6?"":RS)}1' file
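Explanation:
By doing,
ORS=(NR%6?"":RS)
I am setting the output record separator (ORS) to the input record separator (RS, a newline by default) only when the line number is a multiple of 6. For every other line ORS is empty, so the line is printed with nothing after it and the next line is appended directly. The trailing 1 is an always-true pattern that triggers awk's default action of printing the current line.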
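The grouping rule can also be written out explicitly. Here is a minimal sketch in Java, with a made-up class and method name, that joins every groupSize lines into one. Unlike the awk one-liner, it keeps a trailing partial group as its own line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinLines {
    // Concatenates consecutive lines in groups of groupSize.
    static List<String> joinEvery(List<String> lines, int groupSize) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < lines.size(); i++) {
            current.append(lines.get(i));
            // Same test as NR % 6: close the group on every groupSize-th line.
            if ((i + 1) % groupSize == 0) {
                out.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            out.add(current.toString()); // leftover text from an incomplete group
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "may 25 2014;", "is;", "b;", "sample;", ";", "Dec 21 2014 987");
        System.out.println(joinEvery(input, 6)); // [may 25 2014;is;b;sample;;Dec 21 2014 987]
    }
}
```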


Iterating a string through each character

Tuesday, April 29, 2014 , 0 Comments

In general, if we need to iterate through a string character by character, we normally split the string using a statement like:
@chars=split("",$var);
After the array is created, we iterate through it. But an easy way of doing this in Perl without creating an array is:
while ($var =~ /(.)/sg) {
   my $char = $1;
   print $char."\n"
}
Below is the explanation for the same:
$var =~ /(.)/sg
Match any character throughout the string; the round braces "()" capture the matched character into $1.
/s 
Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.
/g
Match all occurrences of the regexp throughout the line instead of only the first occurrence.
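The same match-in-a-loop idea carries over to other regex APIs. Below is a sketch in Java, where Pattern.DOTALL plays the role of /s and repeated calls to Matcher.find() play the role of /g. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EachChar {
    // Returns every character of text, newlines included, one match at a time.
    static List<String> chars(String text) {
        Pattern anyChar = Pattern.compile(".", Pattern.DOTALL); // like /(.)/s
        Matcher m = anyChar.matcher(text);
        List<String> out = new ArrayList<>();
        while (m.find()) {          // like /g: continue from the previous match
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String c : chars("ab\nc")) {
            System.out.println("[" + c + "]");
        }
    }
}
```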


Butterfly in Perl command line "}{"

Saturday, April 26, 2014 0 Comments

I recently came to know about this and I thought it's worth sharing. I will try to keep it very simple.
Let's say I have a file as below:
1
2
3
4
5
I need to join all the lines with a pipe so that my output should look like below:
1|2|3|4|5
Normally I use the below command to achieve the same:
perl -lne 'push @a,$_;END{print join "|",@a}' File
Now, here's another option for you below:
perl -lne 'push @a,$_;}{ print join "|",@a' File
The change here is:
}{
This is called the butterfly option in Perl. It is not a real operator: the -n switch wraps your code in a while (<>) { ... } loop, so the } closes that loop and the { opens a new block that runs once, after the loop has read every line.


Restrict a java text field length

Sunday, April 13, 2014 0 Comments

There are many different ways to do this, and you might write your own once you figure out exactly what has to be done. I also wanted the input to be numeric, so below is what I came up with.
private class NumericAndLengthFilter extends DocumentFilter {

        /**
         * Number of characters allowed.
         */
        private int length = 0;

        /**
         * Restricts the number of characters that can be entered to the given length.
         * @param length Number of characters allowed.
         */
        public NumericAndLengthFilter(int length) {
            this.length = length;
        }

        @Override
        public void insertString(FilterBypass fb, int offset, String string,
                AttributeSet attr) throws
                BadLocationException {
            if (isNumeric(string)) {
                if (this.length > 0 && fb.getDocument().getLength() + string.
                        length()
                        > this.length) {
                    return;
                }
                super.insertString(fb, offset, string, attr);
            }
        }

        @Override
        public void replace(FilterBypass fb, int offset, int length, String text,
                AttributeSet attrs) throws
                BadLocationException {
            if (isNumeric(text)) {
                // The replaced range is removed first, so it does not count
                // toward the resulting document length.
                if (this.length > 0 && fb.getDocument().getLength() - length
                        + text.length()
                        > this.length) {
                    return;
                }
                super.replace(fb, offset, length, text, attrs);
            }
        }

        /**
         * This method tests whether given text can be represented as number.
         * This method can be enhanced further for specific needs.
         * @param text Input text.
         * @return {@code true} if given string can be converted to number; otherwise returns {@code false}.
         */
        private boolean isNumeric(String text) {
            if (text == null || text.trim().equals("")) {
                return false;
            }
            for (int iCount = 0; iCount < text.length(); iCount++) {
                if (!Character.isDigit(text.charAt(iCount))) {
                    return false;
                }
            }
            return true;
        }
    }
}


Restricting a Java Text field to just a range of integers

Friday, April 11, 2014 0 Comments

This time I came across a slightly different way than extending DocumentFilter: writing my own PlainDocument.
class IntegerRangeDocument extends PlainDocument {

  int minimum, maximum;

  int currentValue = 0;

  public IntegerRangeDocument(int minimum, int maximum) {
    this.minimum = minimum;
    this.maximum = maximum;
  }

  public int getValue() {
    return currentValue;
  }

  public void insertString(int offset, String string, AttributeSet attributes)
      throws BadLocationException {

    if (string == null) {
      return;
    } else {
      String newValue;
      int length = getLength();
      if (length == 0) {
        newValue = string;
      } else {
        String currentContent = getText(0, length);
        StringBuffer currentBuffer = new StringBuffer(currentContent);
        currentBuffer.insert(offset, string);
        newValue = currentBuffer.toString();
      }
      try {
        currentValue = checkInput(newValue);
        super.insertString(offset, string, attributes);
      } catch (Exception exception) {
        Toolkit.getDefaultToolkit().beep();
      }
    }
  }

  public void remove(int offset, int length) throws BadLocationException {
    int currentLength = getLength();
    String currentContent = getText(0, currentLength);
    String before = currentContent.substring(0, offset);
    String after = currentContent.substring(length + offset, currentLength);
    String newValue = before + after;
    try {
      currentValue = checkInput(newValue);
      super.remove(offset, length);
    } catch (Exception exception) {
      Toolkit.getDefaultToolkit().beep();
    }
  }

  public int checkInput(String proposedValue) throws NumberFormatException {
    int newValue = 0;
    if (proposedValue.length() > 0) {
      newValue = Integer.parseInt(proposedValue);
    }
    if ((minimum <= newValue) && (newValue <= maximum)) {
      return newValue;
    } else {
      throw new NumberFormatException();
    }
  }
}
Now you can attach this document to your text field as below to achieve the requirement.

Document rangeOne = new IntegerRangeDocument(0, 255);
JTextField textFieldOne = new JTextField();
textFieldOne.setDocument(rangeOne);
    


Restricting a Java Text field to alphabet

Thursday, April 10, 2014 0 Comments

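This is roughly the opposite of what we have seen in one of the previous posts: the DocumentFilter below strips digits from whatever is typed or pasted. Note that it removes digits only; other non-letter characters still pass through.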


class MyDocFilter extends DocumentFilter {
   private static final String REMOVE_REGEX = "\\d";
   private boolean filter = true;

   public boolean isFilter() {
      return filter;
   }

   public void setFilter(boolean filter) {
      this.filter = filter;
   }

   @Override
   public void insertString(FilterBypass fb, int offset, String text,
         AttributeSet attr) throws BadLocationException {
      if (filter) {
         text = text.replaceAll(REMOVE_REGEX, "");
      }
      super.insertString(fb, offset, text, attr);

   }

   @Override
   public void replace(FilterBypass fb, int offset, int length, String text,
         AttributeSet attrs) throws BadLocationException {
      // text may be null, e.g. after setText(null)
      if (filter && text != null) {
         text = text.replaceAll(REMOVE_REGEX, "");
      }
      super.replace(fb, offset, length, text, attrs);

   }
}


Restricting a Java Text field to Integers

Thursday, April 10, 2014 0 Comments

I have read a lot about this, and the best way I found is to extend the DocumentFilter class and install the new filter on the text field's document.
class MyIntFilter extends DocumentFilter {
   @Override
   public void insertString(FilterBypass fb, int offset, String string,
         AttributeSet attr) throws BadLocationException {

      Document doc = fb.getDocument();
      StringBuilder sb = new StringBuilder();
      sb.append(doc.getText(0, doc.getLength()));
      sb.insert(offset, string);

      if (test(sb.toString())) {
         super.insertString(fb, offset, string, attr);
      } else {
         // warn the user and don't allow the insert
      }
   }

   private boolean test(String text) {
      try {
         Integer.parseInt(text);
         return true;
      } catch (NumberFormatException e) {
         return false;
      }
   }

   @Override
   public void replace(FilterBypass fb, int offset, int length, String text,
         AttributeSet attrs) throws BadLocationException {

      Document doc = fb.getDocument();
      StringBuilder sb = new StringBuilder();
      sb.append(doc.getText(0, doc.getLength()));
      sb.replace(offset, offset + length, text);

      if (test(sb.toString())) {
         super.replace(fb, offset, length, text, attrs);
      } else {
         // warn the user and don't allow the replacement
      }

   }

   @Override
   public void remove(FilterBypass fb, int offset, int length)
         throws BadLocationException {
      Document doc = fb.getDocument();
      StringBuilder sb = new StringBuilder();
      sb.append(doc.getText(0, doc.getLength()));
      sb.delete(offset, offset + length);

      if (test(sb.toString())) {
         super.remove(fb, offset, length);
      } else {
         // warn the user and don't allow the removal
      }

   }
}
After this, use the filter with your JTextField like below:
PlainDocument doc = (PlainDocument) textField.getDocument();
doc.setDocumentFilter(new MyIntFilter());
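A filter like this can be tried out without building a window, by attaching it to a PlainDocument and calling insertString directly. The sketch below is self-contained, so it uses a cut-down, hypothetical IntOnlyFilter that overrides insertString only. FilterDemo and feed are likewise made-up names. Keep in mind that a real JTextField usually routes typing through replace, so this exercises only the insert path.

```java
import javax.swing.text.AttributeSet;
import javax.swing.text.BadLocationException;
import javax.swing.text.DocumentFilter;
import javax.swing.text.PlainDocument;

public class FilterDemo {
    // Hypothetical stand-in for the filter above: an insert is accepted only
    // if the text that would result still parses as an int.
    static class IntOnlyFilter extends DocumentFilter {
        @Override
        public void insertString(FilterBypass fb, int offset, String string,
                AttributeSet attr) throws BadLocationException {
            String current = fb.getDocument().getText(0, fb.getDocument().getLength());
            String proposed = new StringBuilder(current).insert(offset, string).toString();
            try {
                Integer.parseInt(proposed);
            } catch (NumberFormatException e) {
                return; // reject the insert
            }
            super.insertString(fb, offset, string, attr);
        }
    }

    // Appends each piece to a fresh filtered document and returns what survived.
    static String feed(String... pieces) {
        PlainDocument doc = new PlainDocument();
        doc.setDocumentFilter(new IntOnlyFilter());
        try {
            for (String piece : pieces) {
                doc.insertString(doc.getLength(), piece, null);
            }
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(feed("12", "ab", "3")); // prints 123
    }
}
```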


A new addition to this blog - Java

Thursday, April 10, 2014 0 Comments

I recently started working on Java and so from now on I would like to share all the interesting things that I have come across about this beautiful programming language. So I am starting a new tag by name JAVA in this blog which will direct all the posts related to java. Happy coding to myself and all.


Searching multiple strings in multiple files in a directory

Tuesday, March 11, 2014 , 2 Comments

I have a list of strings in a file separated by a new line.
for example:
input.txt
temp1
temp2
temp3
Now I have a directory with multiple dat files like:
>ls -1 *.dat
one.dat
two.dat
three.dat
There are many more dat files like the above, with random names. Now I want to search for every string in input.txt in all the dat files present in a directory (let's say the current working directory). This is what I came up with:
Create the Perl script given below and name it anything you wish (I named it temp.pl). Place the file input.txt in the current working directory.
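#!/usr/bin/perl -w

open (INP,"input.txt") or die $!;
while(<INP>)
{
 chomp;             # drop the trailing newline before building the command
 next if /^\s*$/;   # skip blank lines
 # Quote the search string so that embedded spaces do not split it.
 my $cmd="find . -name \"*.dat\"|xargs grep -w -i \"$_\"";
 my $output=`$cmd`;
 if($output!~/^\s*$/)
 {
 print $_."\n";
 print "------------------\n";
 print $output."\n";
 print "-------------------\n";
 }
}
close(INP);
exit;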
Run this script as :
>./temp.pl
This solved my need. I hope it solves yours too :)


Creating a graph using the STL in C++

Monday, February 17, 2014 , 0 Comments

Below is an implementation of a graph data structure in C++ as an adjacency list.

I have used an STL vector to hold each vertex's outgoing edges, an STL pair to denote an edge's cost and destination vertex, and an STL map to look up vertices by name.



#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>
using namespace std;

struct vertex{
 typedef pair<double,vertex*> ve;
 vector<ve> adj; //cost of edge, destination vertex
 string name;
 vertex(string s)
 {
  name=s;
 }
};

class graph
{
 public:
  typedef map<string, vertex *> vmap;
  vmap work;
  void addvertex(const string&);
  void addedge(const string& from, const string& to, double cost);
};

void graph::addvertex(const string &name)
{
 vmap::iterator itr=work.find(name);
 if(itr==work.end())
 {
  vertex *v;
  v= new vertex(name);
  work[name]=v;
  return;
 }
  cout<<"\nVertex already exists!";
}

void graph::addedge(const string& from, const string& to, double cost)
{
 vmap::iterator fi=work.find(from);
 vmap::iterator ti=work.find(to);
 // Both endpoints must have been added with addvertex first.
 if(fi==work.end() || ti==work.end())
 {
  cout<<"\nVertex not found!";
  return;
 }
 vertex *f=fi->second;
 vertex *t=ti->second;
 pair<double,vertex*> edge = make_pair(cost,t);
 f->adj.push_back(edge);
}
