Skip to content

Conversation

@jordikroon
Copy link

@jordikroon jordikroon commented Dec 21, 2025

This is a fix/implementation for #20589

Add a new flag PREG_REPLACE_COUNT_CHANGES to preg_replace() and preg_replace_callback() that counts only effective changes rather than all matched replacements. When this flag is set, the replacement count is incremented only when the replacement differs from the original matched text, allowing callers to distinguish between identity replacements and actual content changes.

  • Add PREG_REPLACE_COUNT_CHANGES constant definition
  • Extend php_pcre_replace() and php_pcre_replace_impl() signatures with flags parameter
  • Implement change detection logic comparing replacement with original match
  • Update function signatures throughout the codebase (php_pcre_replace_array, php_replace_in_subject, _preg_replace_common)
  • Add test coverage for identity replacements, backreferences, and callbacks
  • Update stub definitions and internal function signatures

…ments

Add a new flag PREG_REPLACE_COUNT_CHANGES to preg_replace() and preg_replace_callback() that counts only effective changes rather than all matched replacements. When this flag is set, the replacement count is incremented only when the replacement differs from the original matched text, allowing callers to distinguish between identity replacements and actual content changes.

- Add PREG_REPLACE_COUNT_CHANGES constant definition
- Extend php_pcre_replace() and php_pcre_replace_impl() signatures with flags parameter
- Implement change detection logic comparing replacement with original match
- Update function signatures throughout the codebase (php_pcre_replace_array, php_replace_in_subject, _preg_replace_common)
- Add test coverage for identity replacements, backreferences, and callbacks
- Update stub definitions and internal function signatures
…ments

Add a new flag PREG_REPLACE_COUNT_CHANGES to preg_replace() and preg_replace_callback() that counts only effective changes rather than all matched replacements. When this flag is set, the replacement count is incremented only when the replacement differs from the original matched text, allowing callers to distinguish between identity replacements and actual content changes.

- Add PREG_REPLACE_COUNT_CHANGES constant definition
- Extend php_pcre_replace() and php_pcre_replace_impl() signatures with flags parameter
- Implement change detection logic comparing replacement with original match
- Update function signatures throughout the codebase (php_pcre_replace_array, php_replace_in_subject, _preg_replace_common)
- Add test coverage for identity replacements, backreferences, and callbacks
- Update stub definitions and internal function signatures
@jordikroon jordikroon changed the title feat: add PREG_REPLACE_COUNT_CHANGES flag for counting actual replace Add PREG_REPLACE_COUNT_CHANGES flag for counting actual replace Dec 21, 2025
@jordikroon jordikroon changed the title Add PREG_REPLACE_COUNT_CHANGES flag for counting actual replace Fix GH-20589: Add PREG_REPLACE_COUNT_CHANGES flag for counting actual replace Dec 21, 2025
Two calls to php_preg_replace in win32/sendmail.c were missing the
final parameter. Add 0 as the count parameter to fix the function calls.
Add new PREG_REPLACE_COUNT_CHANGES flag to preg_replace() and
preg_replace_callback() functions that counts only effective changes
instead of all matched replacements. When this flag is set, the count
parameter will only increment if the replacement actually differs from
the original matched text, allowing developers to distinguish between
matches that were replaced with identical content versus actual changes.

Also add TCP_USER_TIMEOUT socket constant for Linux to set the maximum
time in milliseconds that transmitted data can remain unacknowledged.
…stant and preg_replace flags parameter

Add expected output for the new PREG_REPLACE_COUNT_CHANGES constant and the new optional $flags parameter in preg_replace function signature.
Update test expectations to reflect changes in the PCRE extension:
- Constants count increased from 19 to 20
- preg_replace() now has 6 parameters instead of 5
@ndossche
Copy link
Member

I don't like this because this won't work in general. Simple proof below.

<?php
function callback($arr) {
    return match ($arr[0]) {
        'a' => 'ab',
        'bba' => 'ba',
    };
}

$after = preg_replace_callback('/^a|bba/', 'callback', 'abba', -1, $count, PREG_REPLACE_COUNT_CHANGES);
echo $after,"\n"; // Identical to input
var_dump($count); // int(2)

Track local replacement counts separately and only report actual changes if the
final output differs from the input. This ensures PREG_REPLACE_COUNT_CHANGES
correctly returns 0 when individual replacements occur but the final result is
identical to the original string.
Add comprehensive test cases for PREG_REPLACE_COUNT_CHANGES flag covering:
- Limit parameter behavior with cancel-out patterns
- Zero-length matches (lookahead assertions)
- Empty matches at every position
- Subject arrays (per-element counting)
- UTF-8 byte comparison correctness

Also improve comment clarity by removing "Edge case" prefix.
@jordikroon
Copy link
Author

Nice catch @ndossche. Didn't think of that. I have updated the implementation to cover that. Also added some more tests that I could think of.

@ndossche
Copy link
Member

I'm not sure this can be fixed in general. See this small change that results in a change count of 2, while you could argue that the result should be 1:

<?php
function callback($arr) {
    return match ($arr[0]) {
        'a' => 'ab',
        'bba' => 'baa',
    };
}

$after = preg_replace_callback('/^a|bba/', 'callback', 'abba', -1, $count, PREG_REPLACE_COUNT_CHANGES);
echo $after,"\n";
var_dump($count);

I think the main problem is that it's going to be hard to make the change count well-defined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants