str_replace() - String Replace without RegExp

In Perl, one replaces strings without putting much thought into it. You use code like '$str =~ s/replace_this/with_this/g;' - a very efficient and easy way to replace a string. However, this statement uses a regular expression - which is much more processor intensive than a simple string replace. In many situations I have faced, I had to use a regular expression replace where a plain string replace would have been better. That is because Perl doesn't have a built-in function to do a plain string replace.

So I have decided to make one for myself...

Code

#Replace a string without using RegExp.
sub str_replace {
	my $replace_this = shift;
	my $with_this  = shift; 
	my $string   = shift;
	
	my $length = length($string);
	my $target = length($replace_this);
	
	for(my $i=0; $i<$length - $target + 1; $i++) {
		if(substr($string,$i,$target) eq $replace_this) {
			$string = substr($string,0,$i) . $with_this . substr($string,$i+$target);
			return $string; #Comment this out if you want a global replace
		}
	}
	return $string;
}

Usage

string str_replace ( string search, string replace, string subject );

The function uses the same format as the str_replace() function of PHP. The first argument is the search string that is to be replaced, the second argument is the string that it must be replaced with and the third argument is the string in which the replacement must be done. The result will be returned.

Example

$str_regreplace = "Hello World";
$string = "Hello World";
print $string . "\n\n";
$str_regreplace =~ s/Hello/Goodbye/;
print $str_regreplace . "\n"; #Prints 'Goodbye World'

print str_replace('Hello','Goodbye',$string);  #Prints 'Goodbye World'

Is there a better way to do this?

Comments

Anonymous at 18 Jan, 2007 03:35

I applaud you for trying, but that was one of the worst Perl programs I have ever seen. The general coding of *everything* is just horrible, for instance you are looping through the string letter by letter and checking if the following X letters match what you are looking for, when there is an index() function for that. You should just delete what you have written and upload the following text written by me:


use strict;
use warnings;

#
#    Function:
#    str_replace ( string search, string replace, string subject [, int count] )
#
#    Description:
#    This function returns a string with all occurrences of
#    $search in $subject replaced with the given $replace value. If you
#    don't need fancy replacing rules (like regular expressions), you
#    should always use this function instead. Ported to Perl from PHP.
#
#    @PARAM $search String that you want to replace.
#    @PARAM $replace Replacement string.
#    @PARAM $subject The string that we are operating on.
#    @PARAM $count (optional) Limit the number of instances to replace.
#
#    Return values:
#    This function returns a string. Additionally, it returns -1
#    in case you forgot to provide the three basic parameters.
#
sub str_replace
{
  my $search = shift;							# what to find
  my $replace = shift;							# what to replace it with
  my $subject = shift;							# the scalar we are operating on
  if (! defined $subject) { return -1; }		# return -1 unless all three required parameters were given
  my $count = shift;							# number of occurrences to replace
  if (! defined $count) { $count = -1; }		# set $count to -1 (infinite) if undefined

  # start iterating
  my ($i,$pos) = (0,0);
  while ( (my $idx = index( $subject, $search, $pos )) != -1 )	# find next index of $search, starting from our last position
  {
    substr( $subject, $idx, length($search) ) = $replace;		# replace $search with $replace

    $pos=$idx+length($replace);		# jump forward by the length of $replace as it may be
									# longer or shorter than $search was, and if we don't
									# compensate for this we end up in a different portion
									# of the string.

    if ($count>0 && ++$i>=$count) { last; }				# stop iterating if we have reached the limit ($count)
  }

  return $subject;
}

# This file has been over-commented so that beginners may understand it.
#For more help, see the examples at www.php.net/str_replace
[Comment edited to preserve code formatting]
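
To make the optional $count argument concrete, here is a minimal sketch of the same index()/substr() loop with two calls. The name str_replace_limited and the guard against an empty $search are assumptions added for illustration, not part of the comment above; without such a guard an empty search string would match at every position and the loop would never end.

```perl
use strict;
use warnings;

# Sketch only: the name and the empty-search guard are assumptions.
sub str_replace_limited {
    my ($search, $replace, $subject, $count) = @_;

    # Nothing sensible to do without a subject or a non-empty search string.
    return $subject unless defined $subject && defined $search && length $search;
    $count = -1 unless defined $count;          # -1 means "no limit"

    my ($done, $pos) = (0, 0);
    while ((my $idx = index($subject, $search, $pos)) != -1) {
        substr($subject, $idx, length $search) = $replace;
        $pos = $idx + length $replace;          # continue after the inserted text
        last if $count > 0 && ++$done >= $count;
    }
    return $subject;
}

print str_replace_limited('a', 'bb', 'banana'), "\n";      # bbbnbbnbb
print str_replace_limited('a', 'bb', 'banana', 2), "\n";   # bbbnbbna
```

Because $pos jumps past the replacement text, a replacement that contains the search string is not scanned again.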
Anonymous at 18 Jan, 2007 04:04

For completeness' sake, the same code, formatted for pros instead:


sub str_replace
{
	my $search = shift;
	my $replace = shift;
	my $subject = shift;
	if (! defined $subject) { return -1; }
	my $count = shift;
	if (! defined $count) { $count = -1; }
	
	my ($i,$pos) = (0,0);
	while ( (my $idx = index( $subject, $search, $pos )) != -1 )
	{
		substr( $subject, $idx, length($search) ) = $replace;
		$pos=$idx+length($replace);
		if ($count>0 && ++$i>=$count) { last; }
	}
	
	return $subject;
}
[Comment edited to preserve code formatting]
Anonymous at 03 Feb, 2007 09:22
I used the Benchmark module on the 3 functions:

str_replace_1 => the old, regexp method
str_replace_2 => the one recommended by the author
str_replace_3 => the one submitted by anonymous above

and this is what I get:

Benchmark: timing 1000000 iterations of str_replace_1, str_replace_2, str_replace_3...
str_replace_1: 9 wallclock secs ( 9.64 usr + 0.00 sys = 9.64 CPU) @ 103734.44/s (n=1000000)
str_replace_2: 7 wallclock secs ( 5.88 usr + 0.00 sys = 5.88 CPU) @ 170212.77/s (n=1000000)
str_replace_3: 13 wallclock secs (12.45 usr + 0.00 sys = 12.45 CPU) @ 80301.94/s (n=1000000)

If you want to check out the code, well, here they are:


use Benchmark;

timethese(
	1000000,
	{
		'str_replace_1' => sub{
			my ($from, $to, $string) = ('Hello', 'Goodbye', 'Hello World');
			$string =~ s/$from/$to/;
			return $string;
		},
		'str_replace_2' => sub{
			my $replace_this = 'Hello';
			my $with_this  = 'Goodbye'; 
			my $string   = 'Hello World';
	
			my $length = length($string);
			my $target = length($replace_this);
	
			for(my $i=0; $i<$length - $target + 1; $i++) {
				if(substr($string,$i,$target) eq $replace_this) {
					$string = substr($string,0,$i) . $with_this . substr($string,$i+$target);
					return $string; #Comment this out if you want a global replace
				}
			}
			return $string;
		},
		'str_replace_3' => sub{
			my $search = 'Hello';
			my $replace = 'Goodbye';
			my $subject = 'Hello World';
			if (! defined $subject) { return -1; }
			my $count = -1;
			if (! defined $count) { $count = -1; }
	
			my ($i,$pos) = (0,0);
			while ( (my $idx = index( $subject, $search, $pos )) != -1 ){
				substr( $subject, $idx, length($search) ) = $replace;
				$pos=$idx+length($replace);
				if ($count>0 && ++$i>=$count) { last; }
			}
	
			return $subject;
		}
	}
);
Binny V A at 04 Feb, 2007 10:02
My actual intention in writing this function was not to create a more optimized version of the replace function. I wanted to replace strings that contain regular expression metacharacters - if I used a regular expression to do that, I would have to escape a lot of characters. So I ended up making this function.
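
For reference, Perl's \Q...\E escape (the same quoting that the built-in quotemeta function performs) covers this case without manual escaping: it quotes the metacharacters in an interpolated variable so the pattern matches literally. A minimal sketch, where literal_replace is a hypothetical helper name and not part of the original post:

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first literal occurrence of $search.
sub literal_replace {
    my ($search, $replace, $subject) = @_;
    # \Q...\E quotes regex metacharacters in $search, so '.' and '|' match themselves.
    $subject =~ s/\Q$search\E/$replace/;
    return $subject;
}

# Without \Q the pattern a.b would also match 'aXb'; with it only 'a.b' matches.
print literal_replace('a.b', '-', 'aXb a.b'), "\n";   # aXb -
```

Adding the /g modifier to the substitution would turn this into a global replace.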

All three functions have their own advantages...

s///g; - Perl's native method - requires the least effort.

My Method - Arguably the fastest based on the benchmark provided.

The function provided in the comment - This function basically re-writes my function using better code. I have no idea why it is the slowest.
Anonymous at 15 Feb, 2007 02:43
That makes no sense. Why would it be faster to loop through the string letter by letter, looking at the following X letters to see if they match (which means a LOT of memory work), than to use Perl's built-in, compiled index() function? Maybe the advantages of the function by Anonymous show when using a large file: try loading a 100 MB file and searching and replacing a 1 MB string or something. Actually, that should be it: yours is faster here only because the benchmark used a replace of 'Hello' => 'Goodbye' in 'Hello World'. For large chunks of data it should be the slowest by a large amount.
nico at 13 Jul, 2007 07:32
If you think using a 100 MB file will show the "real" performance of each function, just do it and let us know what you find.

However, I think there's no reason to write "that was one of the worst Perl programs I have ever seen"

be humble, my son... be humble.
SomeoneElse at 27 Jan, 2008 11:03
In case someone cares, I just tested with a much bigger string, with the "Hello" term inserted a few times, and here are the new results:

Benchmark: timing 10000 iterations of str_replace_1, str_replace_2, str_replace_3...
str_replace_1: 0 wallclock secs ( 0.06 usr + 0.00 sys = 0.06 CPU) @ 161290.32/s (n=10000)
(warning: too few iterations for a reliable count)
str_replace_2: 12 wallclock secs (11.66 usr + 0.02 sys = 11.67 CPU) @ 856.68/s (n=10000)
str_replace_3: 0 wallclock secs ( 0.13 usr + 0.00 sys = 0.13 CPU) @ 80000.00/s (n=10000)
(warning: too few iterations for a reliable count)

So yes, it depends on the data you have, and the second option is by far the worst for my example, which happened to be a long string with few occurrences to replace.
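
A benchmark along these lines could be sketched as follows. The input (10,000 filler characters followed by 'Hello World') and the sub names are assumptions chosen for illustration, not the data used for the numbers above, and the regex variant uses \Q...\E so that it matches literally.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test data: long filler with a single match near the end.
my $subject = ('x' x 10_000) . 'Hello World';

sub by_regex {
    my ($search, $replace, $string) = @_;
    $string =~ s/\Q$search\E/$replace/;
    return $string;
}

sub by_substr_loop {
    my ($search, $replace, $string) = @_;
    my $target = length $search;
    # Compare a $target-sized window at every offset, as in the original post.
    for my $i (0 .. length($string) - $target) {
        if (substr($string, $i, $target) eq $search) {
            return substr($string, 0, $i) . $replace . substr($string, $i + $target);
        }
    }
    return $string;
}

sub by_index {
    my ($search, $replace, $string) = @_;
    my $idx = index($string, $search);
    substr($string, $idx, length $search) = $replace if $idx != -1;
    return $string;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    regex       => sub { by_regex('Hello', 'Goodbye', $subject) },
    substr_loop => sub { by_substr_loop('Hello', 'Goodbye', $subject) },
    index       => sub { by_index('Hello', 'Goodbye', $subject) },
});
```

Moving the match to the start of $subject, or shrinking the filler, is a quick way to see how much the ranking depends on the input.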
Anonymous at 01 Feb, 2008 11:34
My GOD what a bad thread. Why would someone think that just because Perl's replace operator can take regular expressions it will be slow if you DON'T use one??? And then write that idiotic code to replace the s///....
Then another idiot tries to benchmark and selects a very small string that STARTS with the match target!!!
On any regular test you will see the OP's code being horrendously slow. The anonymous poster's "refined" code is a lot faster but of course not nearly as fast as s///.
Anonymous at 01 Feb, 2008 11:36
I even ran an example to show my point by adding junk (about 20 chars) before "Hello":
Benchmark: timing 1000000 iterations of str_replace_1, str_replace_2, str_replace_3...
str_replace_1: 1 wallclock secs ( 1.96 usr + 0.02 sys = 1.98 CPU) @ 505050.51/s (n=1000000)
str_replace_2: 26 wallclock secs (24.74 usr + 0.09 sys = 24.83 CPU) @ 40273.86/s (n=1000000)
str_replace_3: 4 wallclock secs ( 3.05 usr + 0.01 sys = 3.06 CPU) @ 326797.39/s (n=1000000)
Anonymous at 13 Mar, 2008 07:33
Can you Anonymous people differentiate yourselves? I can't keep track!
Anonymous #3 or 4 at 14 Mar, 2008 04:54
PS It didn't work for me. But I picked out the 2 lines I needed.

$idx = index( $_, $search, 0);
substr($_, $idx, length($search) ) = $replace if $idx != -1; # skip when $search is not found

I need to replace "|P|2.x" with "T|P|2.x" - Once.

That worked for me.
Thanks!
Anonymous #3 or 4 at 14 Mar, 2008 04:55
Special Prize for anyone who guesses context of above substitution


(note, the prize is the warm feeling inside :-)
Anonymous at 23 Apr, 2008 07:32
you should try to be constructive and contributing instead of exhibiting your sh@#thead epeen.