PHP
Last updated: Sat, 30 Jun 2007

preg_replace

(PHP 4, PHP 5)

preg_replace — Perform a regular expression search and replace

Description

mixed preg_replace ( mixed $pattern, mixed $replacement, mixed $subject [, int $limit [, int &$count]] )

Searches subject for matches to pattern and replaces them with replacement.

Parameters

pattern

The pattern to search for. It can be either a string or an array with strings.

The e modifier makes preg_replace() treat the replacement parameter as PHP code after the appropriate references substitution is done. Tip: make sure that replacement constitutes a valid PHP code string, otherwise PHP will complain about a parse error at the line containing preg_replace().

replacement

The string or an array with strings to replace. If this parameter is a string and the pattern parameter is an array, all patterns will be replaced by that string. If both pattern and replacement parameters are arrays, each pattern will be replaced by the replacement counterpart. If there are fewer elements in the replacement array than in the pattern array, any extra patterns will be replaced by an empty string.
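As a small illustration of the array rules above (the patterns and subject string here are made up for the example), a replacement array that is shorter than the pattern array leaves the extra patterns replaced by an empty string, and a single string replacement is applied to every pattern:

```php
<?php
// Three patterns but only two replacements: '/c/' has no counterpart,
// so its matches are replaced by an empty string.
$patterns     = array('/a/', '/b/', '/c/');
$replacements = array('1', '2');
$short = preg_replace($patterns, $replacements, 'abc');
echo $short, "\n"; // 12

// A string replacement with an array of patterns: every pattern uses it.
$same = preg_replace(array('/a/', '/b/'), '-', 'abc');
echo $same, "\n"; // --c
```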

replacement may contain references of the form \\n or (since PHP 4.0.4) $n, with the latter form being the preferred one. Every such reference will be replaced by the text captured by the n'th parenthesized pattern. n can be from 0 to 99, and \\0 or $0 refers to the text matched by the whole pattern. Opening parentheses are counted from left to right (starting from 1) to obtain the number of the capturing subpattern.

When working with a replacement pattern where a backreference is immediately followed by another number (i.e. placing a literal digit immediately after a matched pattern), you cannot use the familiar \\1 notation for your backreference. \\11, for example, would confuse preg_replace() since it does not know whether you want the \\1 backreference followed by a literal 1, or the \\11 backreference followed by nothing. In this case the solution is to use ${1}1 (written \${1}1 inside a double-quoted PHP string). This creates an isolated $1 backreference, leaving the 1 as a literal.

When using the e modifier, this function escapes some characters (namely ', ", \ and NULL) in the strings that replace the backreferences. This is done to ensure that no syntax errors arise from backreference usage with either single or double quotes (e.g. 'strlen(\'$1\')+strlen("$2")'). Make sure you are aware of PHP's string syntax to know exactly what the interpreted string will look like.

subject

The string or an array with strings to search and replace.

If subject is an array, then the search and replace is performed on every entry of subject, and the return value is an array as well.
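A minimal sketch of the array case, using an invented two-element subject array. Each entry is processed independently, and entries without a match come back unchanged:

```php
<?php
// Hypothetical input: one entry contains digits, the other does not.
$subjects = array('first' => 'room 101', 'second' => 'no digits');

// The return value is an array with the same keys as $subjects.
$result = preg_replace('/\d+/', '#', $subjects);
print_r($result);
```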

limit

The maximum possible replacements for each pattern in each subject string. Defaults to -1 (no limit).

count

If specified, this variable will be filled with the number of replacements done.

Return Values

preg_replace() returns an array if the subject parameter is an array, or a string otherwise.

If matches are found, the new subject will be returned, otherwise subject will be returned unchanged.

ChangeLog

Version   Description
5.1.0     Added the count parameter
4.0.4     Added the '$n' form for the replacement parameter
4.0.2     Added the limit parameter

Examples

Example 1695. Using backreferences followed by numeric literals

<?php
$string = 'April 15, 2003';
$pattern = '/(\w+) (\d+), (\d+)/i';
$replacement = '${1}1,$3';
echo preg_replace($pattern, $replacement, $string);
?>

The above example will output:


April1,2003

    

Example 1696. Using indexed arrays with preg_replace()

<?php
$string = 'The quick brown fox jumped over the lazy dog.';
$patterns[0] = '/quick/';
$patterns[1] = '/brown/';
$patterns[2] = '/fox/';
$replacements[2] = 'bear';
$replacements[1] = 'black';
$replacements[0] = 'slow';
echo preg_replace($patterns, $replacements, $string);
?>

The above example will output:


The bear black slow jumped over the lazy dog.

    

By ksorting patterns and replacements, we should get what we wanted.

<?php
ksort($patterns);
ksort($replacements);
echo preg_replace($patterns, $replacements, $string);
?>

The above example will output:


The slow black bear jumped over the lazy dog.

    

Example 1697. Replacing several values

<?php
$patterns = array ('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/',
                   '/^\s*{(\w+)}\s*=/');
$replace = array ('\3/\4/\1\2', '$\1 =');
echo preg_replace($patterns, $replace, '{startDate} = 1999-5-27');
?>

The above example will output:


$startDate = 5/27/1999

    

Example 1698. Using the 'e' modifier

<?php
preg_replace("/(<\/?)(\w+)([^>]*>)/e",
             "'\\1'.strtoupper('\\2').'\\3'",
             $html_body);
?>

This would capitalize all HTML tags in the input text.


Example 1699. Strip whitespace

This example strips excess whitespace from a string.

<?php
$str = 'foo   o';
$str = preg_replace('/\s\s+/', ' ', $str);
// This will be 'foo o' now
echo $str;
?>

Example 1700. Using the count parameter

<?php
$count = 0;

echo preg_replace(array('/\d/', '/\s/'), '*', 'xp 4 to', -1 , $count);
echo $count; //3
?>

The above example will output:


xp***to
3

    

Notes

Note: When using arrays with pattern and replacement, the keys are processed in the order they appear in the array. This is not necessarily the same as the numerical index order. If you use indexes to identify which pattern should be replaced by which replacement, you should perform a ksort() on each array prior to calling preg_replace().

See Also

preg_match()
preg_replace_callback()
preg_split()



User Contributed Notes
preg_replace
anzenews at volja dot net
13-Jul-2007 05:38
As steven -a-t- acko dot net explained, using the /e modifier is tricky if strings have quotes in them. However, the solution might be easier than manually coding some sort of stripslashes: just use preg_replace_callback instead of preg_replace. It is safer anyway.
anon
12-Jun-2007 05:29
For the benefit of perl coders,

$s =~ s/PATTERN/REPLACEMENT/g;

becomes:

<?php
$s = preg_replace('/PATTERN/', 'REPLACEMENT', $s);
?>

Note that you have to assign the result back to $s.  If your preg_replace doesn't seem to be working, you may have merely forgotten to assign the return to $s.
igasparetto at hotmail dot com
26-Apr-2007 01:21
Displaying results of a search engine:

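$patterns = array();
$replaces = array();
$words = explode(" ", $_POST['query']);
foreach ($words as $word) {
    // escape regex metacharacters in the user-supplied word
    $patterns[] = '/' . preg_quote($word, '/') . '/i';
    $replaces[] = '<span class="textFound">$0</span>';
}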

// run sql

$display_results="";

foreach($res as $row)
    $display_results .= "<p>" . preg_replace($patterns, $replaces, nl2br(htmlentities( $row['field'] ) ) ) . "</p>\n";

echo $display_results;
ismith at nojunk dot motorola dot com
21-Mar-2007 10:47
Be aware that when using the "/u" modifier, if your input text contains any bad UTF-8 code sequences, then preg_replace will return an empty string, regardless of whether there were any matches.

This is due to the PCRE library returning an error code if the string contains bad UTF-8.
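One way to notice this failure instead of silently getting an empty result is to check preg_last_error() after the call. The sketch below assumes PHP 5.2.0 or later, where preg_last_error() is available; the helper name replace_or_fail is made up for illustration, and depending on the PHP version the failed call itself may return an empty string or NULL.

```php
<?php
// Hypothetical helper: returns the replaced string, or false if PCRE
// reported an error (for example malformed UTF-8 with the /u modifier).
function replace_or_fail($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null || preg_last_error() !== PREG_NO_ERROR) {
        return false;
    }
    return $result;
}

$good = "caf\xc3\xa9"; // valid UTF-8
$bad  = "caf\xe9";     // lone 0xE9 byte, not valid UTF-8

var_dump(replace_or_fail('/f/u', 'F', $good));
var_dump(replace_or_fail('/f/u', 'F', $bad));
```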
mrozenoer at overstream dot net
06-Mar-2007 08:30
I could not find a function to unescape javascript unicode escapes anywhere (e.g., "\u003c"=>"<").

<?php
function js_uni_decode($s) {
    return preg_replace('/\\\u([0-9a-f]{4})/ie', "chr(hexdec('\\1'))", $s);
}
echo js_uni_decode("\u003c");
?>
dani dot church at gmail dot youshouldknowthisone
07-Feb-2007 11:09
Note that it is in most cases much more efficient to use preg_replace_callback(), with a named function or an anonymous function created with create_function(), instead of the /e modifier.  When preg_replace() is called with the /e modifier, the interpreter must parse the replacement string into PHP code once for every replacement made, while preg_replace_callback() uses a function that only needs to be parsed once.
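As a rough sketch of that approach, here is the manual's tag-capitalizing example rewritten with preg_replace_callback() and a named function. The function name uppercase_tag_name and the sample $html_body are invented for this example.

```php
<?php
// Called once per match; $m[0] is the whole match and $m[1]..$m[3]
// are the three capturing groups. No quoting or escaping is involved.
function uppercase_tag_name($m)
{
    return $m[1] . strtoupper($m[2]) . $m[3];
}

$html_body = '<p class="x">Hello <b>world</b></p>';
$upper = preg_replace_callback('/(<\/?)(\w+)([^>]*>)/', 'uppercase_tag_name', $html_body);
echo $upper;
```

Because the captured text arrives as ordinary array elements, quotes and dollar signs in the matched data need no special handling.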
JON
31-Jan-2007 03:41
For a URL explode I would suggest parse_url($url). It's far simpler than the list of preg_replaces used.
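A short sketch of what parse_url() gives you, using an example URL chosen for illustration:

```php
<?php
// parse_url() returns an associative array with the components it finds.
$parts = parse_url('http://us3.php.net/preg_replace?lang=en#notes');

echo $parts['scheme'], "\n";   // http
echo $parts['host'], "\n";     // us3.php.net
echo $parts['path'], "\n";     // /preg_replace
echo $parts['query'], "\n";    // lang=en
echo $parts['fragment'], "\n"; // notes
```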
Alexey Lebedev
07-Sep-2006 02:21
Wasted several hours because of this:

$str='It&#039;s a string with HTML entities';
preg_replace('~&#(\d+);~e', 'code2utf($1)', $str);

This code is meant to convert numeric HTML entities to UTF-8, and it does, with one exception: it mishandles codes starting with &#0.

The reason is that code2utf will be called with the leading zero, exactly as the pattern matched it: code2utf(039).
And it does matter! PHP treats a number literal with a leading zero as octal.
Try print(011);

Solution:
preg_replace('~&#0*(\d+);~e', 'code2utf($1)', $str);
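An alternative sketch that sidesteps the problem entirely: with preg_replace_callback() the captured digits arrive as a string, never as PHP source, so they can be converted explicitly as decimal. The helper name decode_numeric_entity is made up here, and html_entity_decode() stands in for the code2utf() function from the note, whose source is not shown.

```php
<?php
// Hypothetical helper: decodes one matched numeric entity such as &#039;.
function decode_numeric_entity($m)
{
    // intval(..., 10) reads the digits as decimal, so "039" becomes 39.
    $code = intval($m[1], 10);
    return html_entity_decode('&#' . $code . ';', ENT_QUOTES, 'UTF-8');
}

$str = 'It&#039;s a string with HTML entities';
$decoded = preg_replace_callback('~&#(\d+);~', 'decode_numeric_entity', $str);
echo $decoded;
```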
tim at t-network dot nl
18-Jul-2006 10:58
This function has a little quirk.

When you are trying to use backreferences in the pattern, you MUST use \\n, and not $n. $n doesn't work.
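A small example of the difference, with a made-up input string: \1 inside the pattern refers back to what group 1 matched, while $1 in the replacement inserts that text.

```php
<?php
// Collapse an immediately repeated word ("is is", "test test") into one.
$deduped = preg_replace('/\b(\w+) \1\b/', '$1', 'this is is a test test');
echo $deduped; // this is a test
```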
robvdl at gmail dot com
21-Apr-2006 05:15
For those of you that have ever had the problem where clients paste text from msword into a CMS, where word has placed all those fancy quotes throughout the text, breaking the XHTML validator... I have created a nice regular expression, that replaces ALL high UTF-8 characters with HTML entities, such as &#8217;.

Note that most user examples on php.net I have read only replace selected characters, such as single and double quotes. This replaces all high characters, including Greek characters, Arabic characters, smilies, whatever.

It took me ages to get it down to just two regular expressions, but it handles all high-level characters properly.

$text = preg_replace('/([\xc0-\xdf].)/se', "'&#' . ((ord(substr('$1', 0, 1)) - 192) * 64 + (ord(substr('$1', 1, 1)) - 128)) . ';'", $text);
$text = preg_replace('/([\xe0-\xef]..)/se', "'&#' . ((ord(substr('$1', 0, 1)) - 224) * 4096 + (ord(substr('$1', 1, 1)) - 128) * 64 + (ord(substr('$1', 2, 1)) - 128)) . ';'", $text);
Eric
09-Apr-2006 11:54
Recently I needed a way to replace links (<a href="blah.com/blah.php">Blah</a>) with their anchor text, in this case Blah. It might seem simple enough for some, or most, but for the benefit of others:

<?php

$value = '<a href="http://www.domain.com/123.html">123</a>';

echo preg_replace('/<a href="(.*?)">(.*?)<\\/a>/i', '$2', $value);

//Output
// 123

?>
ae at instinctive dot de
28-Mar-2006 07:40
Something innovative for a change ;-) For a news system, I have a special format for links:

"Go to the [Blender3D Homepage|http://www.blender3d.org] for more Details"

To get this into a link, use:

$new = preg_replace('/\[(.*?)\|(.*?)\]/', '<a href="$2" target="_blank">$1</a>', $new);
SG_01
19-Jan-2006 04:43
Re: wcc at techmonkeys dot org

You could put this in 1 replace for faster execution as well:

<?php

/*
 * Removes all blank lines from a string.
 */
function removeEmptyLines($string)
{
    return preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $string);
}

?>
kyle at vivahate dot com
22-Dec-2005 12:08
Here is a regular expression to "slashdotify" html links.  This has worked well for me, but if anyone spots errors, feel free to make corrections.

<?php
$url = '<a attr="garbage" href="http://us3.php.net/preg_replace">preg_replace - php.net</a>';
$url = preg_replace( '/<.*href="?(.*:\/\/)?([^ \/]*)([^ >"]*)"?[^>]*>(.*)(<\/a>)/', '<a href="$1$2$3">$4</a> [$2]', $url );
?>

Will output:

<a href="http://us3.php.net/preg_replace">preg_replace - php.net</a> [us3.php.net]
istvan dot csiszar at weblab dot hu
21-Dec-2005 01:53
This is an addition to the previously sent removeEvilTags function. If you don't want to remove the style tag entirely, just certain style attributes within that, then you might find this piece of code useful:

<?php

function removeEvilStyles($tagSource)
{
    // this will leave everything else, but:
    $evilStyles = array('font', 'font-family', 'font-face', 'font-size', 'font-size-adjust', 'font-stretch', 'font-variant');

    $find = array();
    $replace = array();

    foreach ($evilStyles as $v)
    {
        $find[]    = "/$v:.*?;/";
        $replace[] = '';
    }

    return preg_replace($find, $replace, $tagSource);
}

function removeEvilTags($source)
{
    $allowedTags = '<h1><h2><h3><h4><h5><a><img><label>'.
        '<p><br><span><sup><sub><ul><li><ol>'.
        '<table><tr><td><th><tbody><div><hr><em><b><i>';
    $source = strip_tags(stripslashes($source), $allowedTags);
    return trim(preg_replace('/<(.*?)>/ie', "'<'.removeEvilStyles('\\1').'>'", $source));
}

?>
jhm at cotren dot net
18-Feb-2005 02:04
It took me a while to figure this one out, but here is a nice way to use preg_replace to convert a hex encoded string back to clear text

<?php
    $text = "PHP rocks!";
    $encoded = preg_replace(
          "'(.)'e"
         ,"dechex(ord('\\1'))"
         ,$text
    );
    print "ENCODED: $encoded\n";
?>
ENCODED: 50485020726f636b7321
<?php
    print "DECODED: ".preg_replace(
       "'([\S,\d]{2})'e"
      ,"chr(hexdec('\\1'))"
      ,$encoded)."\n";
?>
DECODED: PHP rocks!
gabe at mudbuginfo dot com
18-Oct-2004 01:39
It is useful to note that the 'limit' parameter, when used with 'pattern' and 'replace' which are arrays, applies to each individual pattern in the patterns array, and not the entire array.
<?php

$pattern = array('/one/', '/two/');
$replace = array('uno', 'dos');
$subject = "test one, one two, one two three";

echo preg_replace($pattern, $replace, $subject, 1);
?>

If limit were applied to the whole array (which it isn't), it would return:
test uno, one two, one two three

However, in reality this will actually return:
test uno, one dos, one two three
steven -a-t- acko dot net
08-Feb-2004 09:45
People using the /e modifier with preg_replace should be aware of the following weird behaviour. It is not a bug per se, but can cause bugs if you don't know it's there.

The example in the docs for /e suffers from this mistake in fact.

With /e, the replacement string is a PHP expression. So when you use a backreference in the replacement expression, you need to put the backreference inside quotes, or otherwise it would be interpreted as PHP code. Like the example from the manual for preg_replace:

preg_replace("/(<\/?)(\w+)([^>]*>)/e",
             "'\\1'.strtoupper('\\2').'\\3'",
             $html_body);

To make this easier, the data in a backreference with /e is run through addslashes() before being inserted in your replacement expression. So if you have the string

 He said: "You're here"

It would become:

 He said: \"You\'re here\"

...and be inserted into the expression.
However, if you put this inside a set of single quotes, PHP will not strip away all the slashes correctly! Try this:

 print ' He said: \"You\'re here\" ';
 Output: He said: \"You're here\"

This is because the sequence \" inside single quotes is not recognized as anything special, and it is output literally.

Using double-quotes to surround the string/backreference will not help either, because inside double-quotes, the sequence \' is not recognized and also output literally. And in fact, if you have any dollar signs in your data, they would be interpreted as PHP variables. So double-quotes are not an option.

The 'solution' is to manually fix it in your expression. It is easiest to use a separate processing function, and do the replacing there (i.e. use "my_processing_function('\\1')" or something similar as replacement expression, and do the fixing in that function).

If you surrounded your backreference by single-quotes, the double-quotes are corrupt:
$text = str_replace('\"', '"', $text);

People using preg_replace with /e should at least be aware of this.

I'm not sure how it would be best fixed in preg_replace. Because double-quotes are a really bad idea anyway (due to the variable expansion), I would suggest that preg_replace's auto-escaping is modified to suit the placement of backreferences inside single-quotes (which seemed to be the intention from the start, but was incorrectly applied).
