  1. #1
    Regular Coder
    Join Date
    Jan 2004
    Posts
    107
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Wiki-Style Highlights

    I've been puzzling over how to do this efficiently for almost a week now:

    How would you compare two large paragraphs and highlight the differences efficiently, similar to the way you can view the differences in a wiki article after it's edited?

    E.g.:
    http://en.wikipedia.org/w/index.php?...oldid=59809982

    Thank you for any help!

  • #2
    Regular Coder
    Join Date
    Feb 2005
    Location
    Texas
    Posts
    472
    Thanks
    1
    Thanked 0 Times in 0 Posts
    I'd like to know this too.
    If you're reading this, it may already be too late!

  • #3
    Regular Coder
    Join Date
    Jun 2004
    Posts
    565
    Thanks
    0
    Thanked 18 Times in 18 Posts
    There are free libraries out there to provide a diff between two files.

    I've googled a bit and found this: http://phpwiki.cvs.sourceforge.net/p...hp?view=markup
    Example output: http://phpwiki.sourceforge.net/phpwi...ff=MostPopular

    If you don't like this, you may also try out my partial port of Python's difflib module, though it is barely tested, undocumented and incomplete:
    PHP Code:
    <?php
    /*
    This is a port of a part of the difflib module in Python's standard
    library to PHP (Python version 2.5).

    The original file seems to be covered by the
    Python license (http://www.python.org/download/releases/2.5/license/).

    */

    function Match_create($i, $j, $length)
    {
        return array('aStart' => $i, 'bStart' => $j, 'length' => $length);
    }
    function Match_cmp($matchA, $matchB)
    {
        $result = $matchA['aStart'] - $matchB['aStart'];
        if($result)
        {
            return $result;
        }
        $result = $matchA['bStart'] - $matchB['bStart'];
        if($result)
        {
            return $result;
        }
        return $matchA['length'] - $matchB['length'];
    }
    function Match_toString($match)
    {
        return 'Match('.$match['aStart'].', '.$match['bStart'].', '.$match['length'].')';
    }

    define('TAG_EQUAL', 0);
    define('TAG_INSERT', 1);
    define('TAG_DELETE', 2);
    define('TAG_REPLACE', 3);

    function OpCode_create($tag, $aStart, $aEnd, $bStart, $bEnd)
    {
        return array('tag' => $tag, 'aStart' => $aStart, 'aEnd' => $aEnd, 'bStart' => $bStart, 'bEnd' => $bEnd);
    }
    function OpCode_toString($opCode)
    {
        return 'OpCode("'.Tag_toString($opCode['tag']).'", '.$opCode['aStart'].', '.
            $opCode['aEnd'].', '.$opCode['bStart'].', '.$opCode['bEnd'].')';
    }
    function Tag_toString($tag)
    {
        switch($tag)
        {
            case TAG_EQUAL: return 'equal';
            case TAG_INSERT: return 'insert';
            case TAG_DELETE: return 'delete';
            case TAG_REPLACE: return 'replace';
            default: trigger_error('Tag_toString: $tag is not a valid tag, given: '.$tag, E_USER_ERROR);
        }
    }

    function QueueEntry_create($aStart, $aEnd, $bStart, $bEnd)
    {
        return array($aStart, $aEnd, $bStart, $bEnd);
    }

    class SequenceMatcher
    {
        protected
            $seqA = '', $seqB = '',
            $isJunk = NULL,
            $count,
            $popular, $junk,
            $elemPosB,
            $matches,
            $opCodes,
            $elemCountB;

        public function __construct($seqA = '', $seqB = '', $isJunk = NULL, $count = 'strlen')
        {
            $this->setCounter($count);
            $this->setJunkMatcher($isJunk);
            $this->setSequences($seqA, $seqB);
        }
        public function setSequences($seqA, $seqB)
        {
            $this->setSequenceA($seqA);
            $this->setSequenceB($seqB);
        }
        public function setSequenceA($seqA)
        {
            if($seqA == $this->seqA)
            {
                return;
            }
            $this->seqA = $seqA;
            $this->matches = array();
            $this->opCodes = array();
        }
        public function setSequenceB($seqB)
        {
            if($seqB == $this->seqB)
            {
                return;
            }
            $this->seqB = $seqB;
            $this->matches = array();
            $this->opCodes = array();
            // reset the cached element counts used by quickRatio()
            // (the property is named elemCountB, not elementCountB)
            $this->elemCountB = array();
            $this->chainB();
        }
        public function setJunkMatcher($isJunk)
        {
            if($isJunk == $this->isJunk)
            {
                return;
            }
            $this->isJunk = $isJunk;
            if($this->seqB)
            {
                $this->chainB();
            }
        }
        public function setCounter($count)
        {
            if(!is_callable($count))
            {
                trigger_error('SequenceMatcher::setCounter: count must be callable, given: '.((string) $count), E_USER_ERROR);
            }
            $this->count = $count;
        }
        public function chainB()
        {
            $bCount = call_user_func($this->count, $this->seqB);
            $elemPosB = array();  // element => list of positions in B
            $popular = array();
            $junk = array();

            for($i = 0; $i < $bCount; ++$i)
            {
                $elem = $this->seqB[$i];
                if(isset($elemPosB[$elem]))
                {
                    // elements making up more than 1% of a long sequence count as "popular"
                    if($bCount >= 200 && count($elemPosB[$elem]) * 100 > $bCount)
                    {
                        $popular[$elem] = 1;
                    }
                    else
                    {
                        $elemPosB[$elem][] = $i;
                    }
                }
                else
                {
                    $elemPosB[$elem] = array($i);
                }
            }
            $popularKeys = array_keys($popular);
            foreach($popularKeys as $key)
            {
                unset($elemPosB[$key]);
            }
            if(is_callable($this->isJunk))
            {
                $isJunk = $this->isJunk;
                foreach($popularKeys as $key)
                {
                    if(call_user_func($isJunk, $key))
                    {
                        // write to the local $junk; $this->junk is overwritten below
                        $junk[$key] = 1;
                        unset($popular[$key]);
                    }
                }
                foreach(array_keys($elemPosB) as $key)
                {
                    if(call_user_func($isJunk, $key))
                    {
                        $junk[$key] = 1;
                        unset($elemPosB[$key]);
                    }
                }
            }

            $this->elemPosB = $elemPosB;
            $this->popular = $popular;
            $this->junk = $junk;
        }
        public function isJunkInB($elem)
        {
            return (isset($this->junk[$elem]));
        }
        public function isPopularInB($elem)
        {
            return (isset($this->popular[$elem]));
        }
        public function findLongestMatch($aLo, $aHi, $bLo, $bHi)
        {
            $bestI = $aLo;
            $bestJ = $bLo;
            $bestSize = 0;
            $j2Len = array();  // $j2Len[$j] = length of the match ending at A[$i - 1], B[$j]
            $newJ2Len = array();
            $k = 0;

            for($i = $aLo; $i < $aHi; ++$i)
            {
                $newJ2Len = array();
                if(isset($this->elemPosB[$this->seqA[$i]]))
                {
                    foreach($this->elemPosB[$this->seqA[$i]] as $j)
                    {
                        if($j < $bLo)
                        {
                            continue;
                        }
                        if($j >= $bHi)
                        {
                            break;
                        }
                        $k = $newJ2Len[$j] = (isset($j2Len[$j - 1])) ? $j2Len[$j - 1] + 1 : 1;
                        if($k > $bestSize)
                        {
                            $bestI = $i - $k + 1;
                            $bestJ = $j - $k + 1;
                            $bestSize = $k;
                        }
                    }
                }
                $j2Len = $newJ2Len;
            }

            // extend the match backwards: first over non-junk, then over junk elements
            --$bestI;
            --$bestJ;
            while($bestI >= $aLo && $bestJ >= $bLo &&
                    !$this->isJunkInB($this->seqB[$bestJ]) &&
                    $this->seqA[$bestI] == $this->seqB[$bestJ])
            {
                --$bestI;
                --$bestJ;
                ++$bestSize;
            }
            while($bestI >= $aLo && $bestJ >= $bLo &&
                    $this->isJunkInB($this->seqB[$bestJ]) &&
                    $this->seqA[$bestI] == $this->seqB[$bestJ])
            {
                --$bestI;
                --$bestJ;
                ++$bestSize;
            }
            ++$bestI;
            ++$bestJ;

            // extend the match forwards in the same way
            $toIndexI = $bestI + $bestSize;
            $toIndexJ = $bestJ + $bestSize;
            while($toIndexI < $aHi && $toIndexJ < $bHi &&
                    !$this->isJunkInB($this->seqB[$toIndexJ]) &&
                    $this->seqA[$toIndexI] == $this->seqB[$toIndexJ])
            {
                ++$toIndexI;
                ++$toIndexJ;
                ++$bestSize;
            }
            while($toIndexI < $aHi && $toIndexJ < $bHi &&
                    $this->isJunkInB($this->seqB[$toIndexJ]) &&
                    $this->seqA[$toIndexI] == $this->seqB[$toIndexJ])
            {
                ++$toIndexI;
                ++$toIndexJ;
                ++$bestSize;
            }
            return Match_create($bestI, $bestJ, $bestSize);
        }
        public function getMatchingBlocks()
        {
            if(!empty($this->matches))
            {
                return $this->matches;
            }

            $lengthA = call_user_func($this->count, $this->seqA);
            $lengthB = call_user_func($this->count, $this->seqB);
            $queue = array(QueueEntry_create(0, $lengthA, 0, $lengthB));
            while(count($queue) > 0)
            {
                list($aLo, $aHi, $bLo, $bHi) = array_pop($queue);

                $x = $this->findLongestMatch($aLo, $aHi, $bLo, $bHi);
                if($x['length'] > 0)
                {
                    $this->matches[] = $x;
                    if($aLo < $x['aStart'] && $bLo < $x['bStart'])
                    {
                        $queue[] = QueueEntry_create($aLo, $x['aStart'], $bLo, $x['bStart']);
                    }
                    if($x['aStart'] + $x['length'] < $aHi && $x['bStart'] + $x['length'] < $bHi)
                    {
                        $queue[] = QueueEntry_create($x['aStart'] + $x['length'], $aHi, $x['bStart'] + $x['length'], $bHi);
                    }
                }
            }
            usort($this->matches, 'Match_cmp');

            // collapse adjacent matching blocks into one
            $i1 = $j1 = $k1 = 0;

            $nonAdjacent = array();
            foreach($this->matches as $m)
            {
                if($i1 + $k1 == $m['aStart'] && $j1 + $k1 == $m['bStart'])
                {
                    $k1 += $m['length'];
                }
                else
                {
                    if($k1 > 0)
                    {
                        $nonAdjacent[] = Match_create($i1, $j1, $k1);
                    }
                    $i1 = $m['aStart'];
                    $j1 = $m['bStart'];
                    $k1 = $m['length'];
                }
            }
            if($k1 > 0)
            {
                $nonAdjacent[] = Match_create($i1, $j1, $k1);
            }

            // zero-length sentinel block at the end of both sequences
            $nonAdjacent[] = Match_create($lengthA, $lengthB, 0);
            return $this->matches = $nonAdjacent;
        }
        public function getOpCodes()
        {
            if(!empty($this->opCodes))
            {
                return $this->opCodes;
            }

            $i = $j = 0;
            $answer = array();

            foreach($this->getMatchingBlocks() as $m)
            {
                // the gap before each matching block becomes a replace, delete or insert
                $tag = 0;
                if($i < $m['aStart'] && $j < $m['bStart'])
                {
                    $tag = TAG_REPLACE;
                }
                elseif($i < $m['aStart'])
                {
                    $tag = TAG_DELETE;
                }
                elseif($j < $m['bStart'])
                {
                    $tag = TAG_INSERT;
                }
                if($tag)
                {
                    $answer[] = OpCode_create($tag, $i, $m['aStart'], $j, $m['bStart']);
                }
                $i = $m['aStart'] + $m['length'];
                $j = $m['bStart'] + $m['length'];

                if($m['length'])
                {
                    $answer[] = OpCode_create(TAG_EQUAL, $m['aStart'], $i, $m['bStart'], $j);
                }
            }
            return $this->opCodes = $answer;
        }
        public function getGroupedOpCodes($n = 3)
        {
            $codes = $this->getOpCodes();
            if(empty($codes))
            {
                $codes = array(OpCode_create(TAG_EQUAL, 0, 1, 0, 1));
            }
            if($codes[0]['tag'] == TAG_EQUAL)
            {
                $codes[0]['aStart'] = max($codes[0]['aStart'], $codes[0]['aEnd'] - $n);
                $codes[0]['bStart'] = max($codes[0]['bStart'], $codes[0]['bEnd'] - $n);
            }
            $top = count($codes) - 1;
            if($codes[$top]['tag'] == TAG_EQUAL)
            {
                $codes[$top]['aEnd'] = min($codes[$top]['aEnd'], $codes[$top]['aStart'] + $n);
                $codes[$top]['bEnd'] = min($codes[$top]['bEnd'], $codes[$top]['bStart'] + $n);
            }
            $nn = $n + $n;
            $group = array();
            $result = array();
            foreach($codes as $opcode)
            {
                if($opcode['tag'] == TAG_EQUAL && $opcode['aEnd'] - $opcode['aStart'] > $nn)
                {
                    $group[] = OpCode_create(
                        $opcode['tag'],
                        $opcode['aStart'],
                        min($opcode['aEnd'], $opcode['aStart'] + $n),
                        $opcode['bStart'],
                        min($opcode['bEnd'], $opcode['bStart'] + $n)
                    );
                    $result[] = $group;
                    $group = array();
                    $opcode['aStart'] = max($opcode['aStart'], $opcode['aEnd'] - $n);
                    $opcode['bStart'] = max($opcode['bStart'], $opcode['bEnd'] - $n);
                }
                $group[] = OpCode_create($opcode['tag'], $opcode['aStart'], $opcode['aEnd'], $opcode['bStart'], $opcode['bEnd']);
            }
            if(!empty($group) && !(1 == count($group) && $group[0]['tag'] == TAG_EQUAL))
            {
                $result[] = $group;
            }
            return $result;
        }
        protected static function calculateRatio($totalMatchLength, $length)
        {
            // 2 * M / T as in Python's difflib; two empty sequences count as identical
            return ($length) ? 2 * $totalMatchLength / $length : 1;
        }
        public function ratio()
        {
            $matches = 0;
            foreach($this->getMatchingBlocks() as $m)
            {
                $matches += $m['length'];
            }
            return self::calculateRatio($matches, call_user_func($this->count, $this->seqA) + call_user_func($this->count, $this->seqB));
        }
        public function quickRatio()
        {
            $lengthB = call_user_func($this->count, $this->seqB);
            if(empty($this->elemCountB))
            {
                $elemCountB = array();
                for($i = 0; $i < $lengthB; ++$i)
                {
                    $elem = $this->seqB[$i];
                    if(isset($elemCountB[$elem]))
                    {
                        ++$elemCountB[$elem];
                    }
                    else
                    {
                        $elemCountB[$elem] = 1;
                    }
                }
            }
            else
            {
                $elemCountB = $this->elemCountB;
            }

            $available = array();
            $matches = 0;

            for($i = 0, $lengthA = call_user_func($this->count, $this->seqA); $i < $lengthA; ++$i)
            {
                $elem = $this->seqA[$i];

                $number = (isset($available[$elem])) ?
                    $available[$elem] :
                    ((isset($elemCountB[$elem])) ?
                        $elemCountB[$elem] :
                        0);

                $available[$elem] = $number - 1;
                if($number > 0)
                {
                    ++$matches;
                }
            }

            $this->elemCountB = $elemCountB;
            return self::calculateRatio($matches, $lengthA + $lengthB);
        }
        public function realQuickRatio() // optimise
        {
            $lengthA = call_user_func($this->count, $this->seqA);
            $lengthB = call_user_func($this->count, $this->seqB);
            return self::calculateRatio(min($lengthA, $lengthB), $lengthA + $lengthB);
        }
        protected static function sequenceToString($sequence, $start, $end)
        {
            $str = '';
            for(; $start < $end; ++$start)
            {
                $str .= (string) $sequence[$start];
            }
            return $str;
        }
        public function __toString()
        {
            $str = '<pre>SequenceMatcher(`';
            foreach($this->getOpCodes() as $code)
            {
                switch($code['tag'])
                {
                    case TAG_EQUAL:
                        $str .= htmlspecialchars(self::sequenceToString($this->seqA, $code['aStart'], $code['aEnd']));
                        break;
                    case TAG_INSERT:
                        $str .= '<ins style="color:green">'.htmlspecialchars(self::sequenceToString($this->seqB, $code['bStart'], $code['bEnd'])).'</ins>';
                        break;
                    case TAG_DELETE:
                        $str .= '<del style="color:red">'.htmlspecialchars(self::sequenceToString($this->seqA, $code['aStart'], $code['aEnd'])).'</del>';
                        break;
                    case TAG_REPLACE:
                        $str .= '<span style="color:blue">(</span><del style="color:red">'.htmlspecialchars(self::sequenceToString($this->seqA, $code['aStart'], $code['aEnd'])).'</del><ins style="color:green">'.htmlspecialchars(self::sequenceToString($this->seqB, $code['bStart'], $code['bEnd'])).'</ins><span style="color:blue">)</span>';
                        break;
                }
            }
            return $str.'` => '.(round($this->ratio() * 100) / 100).')</pre>';
        }
        protected static function sortCloseMatches($matchA, $matchB)
        {
            // sort by ratio, best match first
            return (($diff = $matchB[1] - $matchA[1]) < 0) ?
                -1 :
                (($diff > 0) ?
                    1 :
                    0);
        }
        public function getCloseMatches($sequence, $possibilities, $showRatio = TRUE, $maxMatches = 3, $minRatio = 0.6)
        {
            if(0 >= $maxMatches)
            {
                trigger_error('SequenceMatcher::getCloseMatches: $maxMatches must be > 0, given: '.$maxMatches.'; defaulting to 3', E_USER_WARNING);
                $maxMatches = 3;
            }
            if($minRatio < 0 || $minRatio > 1)
            {
                trigger_error('SequenceMatcher::getCloseMatches: $minRatio must be between 0.0 and 1.0, given: '.$minRatio.'; defaulting to 0.6', E_USER_WARNING);
                $minRatio = 0.6;
            }
            $result = array();
            $this->setSequenceB($sequence);
            foreach($possibilities as $possibility)
            {
                $this->setSequenceA($possibility);
                // cheap upper bounds first, the exact ratio last
                if($this->realQuickRatio() >= $minRatio && $this->quickRatio() >= $minRatio &&
                    ($ratio = $this->ratio()) >= $minRatio)
                {
                    $result[] = array($possibility, $ratio);
                }
            }
            usort($result, array(__CLASS__, 'sortCloseMatches'));
            // slice first, so fewer than $maxMatches results cannot cause undefined offsets
            $result = array_slice($result, 0, $maxMatches);
            if(!$showRatio)
            {
                $res = array();
                foreach($result as $match)
                {
                    $res[] = $match[0];
                }
                return $res;
            }
            return $result;
        }
        protected static function dumpInto(&$result, $tag, $seq, $lo, $hi)
        {
            for(; $lo < $hi; ++$lo)
            {
                $result[] = $tag.$seq[$lo];
            }
        }
        public function getUnifiedDiff($fileNameFrom = '', $fileDateFrom = '', $fileNameTo = '', $fileDateTo = '', $contextLines = 3, $lineDelimiter = "\n")
        {
            $this->setCounter('count');  // just in case the user forgot it
            $result = array(
                '--- '.$fileNameFrom.' '.$fileDateFrom.$lineDelimiter,
                '+++ '.$fileNameTo.' '.$fileDateTo.$lineDelimiter
            );
            foreach($this->getGroupedOpCodes($contextLines) as $group)
            {
                $top = count($group) - 1;
                $result[] = '@@ -'.($group[0]['aStart'] + 1).','.($group[$top]['aEnd'] - $group[0]['aStart']).
                    ' +'.($group[0]['bStart'] + 1).','.($group[$top]['bEnd'] - $group[0]['bStart']).' @@'.$lineDelimiter;
                foreach($group as $opCode)
                {
                    switch($opCode['tag'])
                    {
                        case TAG_EQUAL:
                            self::dumpInto($result, ' ', $this->seqA, $opCode['aStart'], $opCode['aEnd']);
                            break;
                        case TAG_REPLACE:
                            self::dumpInto($result, '-', $this->seqA, $opCode['aStart'], $opCode['aEnd']);
                            // no break: a replace is a delete followed by an insert
                        case TAG_INSERT:
                            self::dumpInto($result, '+', $this->seqB, $opCode['bStart'], $opCode['bEnd']);
                            break;
                        case TAG_DELETE:
                            self::dumpInto($result, '-', $this->seqA, $opCode['aStart'], $opCode['aEnd']);
                    }
                }
            }
            return $result;
        }
    }
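
    For the wiki-style highlighting asked about in the first post, here is a minimal word-level sketch. It is not part of the port above and does not use its classes: it builds a plain longest-common-subsequence table instead of difflib's matching-block search, and the function names wordDiff and renderDiffHtml are made up for illustration. The table needs memory proportional to the product of the two word counts, so treat it as a starting point for paragraphs rather than for whole files.

```php
<?php
// Hypothetical helper (not from the port above): word-level diff via an LCS table.
function wordDiff($old, $new)
{
    $a = preg_split('/\s+/', trim($old), -1, PREG_SPLIT_NO_EMPTY);
    $b = preg_split('/\s+/', trim($new), -1, PREG_SPLIT_NO_EMPTY);
    $n = count($a);
    $m = count($b);

    // $lcs[$i][$j] = length of the longest common subsequence of $a[$i..] and $b[$j..]
    $lcs = array_fill(0, $n + 1, array_fill(0, $m + 1, 0));
    for ($i = $n - 1; $i >= 0; --$i) {
        for ($j = $m - 1; $j >= 0; --$j) {
            $lcs[$i][$j] = ($a[$i] === $b[$j])
                ? $lcs[$i + 1][$j + 1] + 1
                : max($lcs[$i + 1][$j], $lcs[$i][$j + 1]);
        }
    }

    // Walk the table from the start, emitting array(tag, word) pairs.
    $ops = array();
    $i = $j = 0;
    while ($i < $n && $j < $m) {
        if ($a[$i] === $b[$j]) {
            $ops[] = array('equal', $a[$i]);
            ++$i;
            ++$j;
        } elseif ($lcs[$i + 1][$j] >= $lcs[$i][$j + 1]) {
            $ops[] = array('delete', $a[$i++]);
        } else {
            $ops[] = array('insert', $b[$j++]);
        }
    }
    while ($i < $n) { $ops[] = array('delete', $a[$i++]); }
    while ($j < $m) { $ops[] = array('insert', $b[$j++]); }
    return $ops;
}

// Hypothetical helper: render the pairs with <ins>/<del> markup.
function renderDiffHtml(array $ops)
{
    $out = array();
    foreach ($ops as $op) {
        list($tag, $word) = $op;
        $word = htmlspecialchars($word);
        if ($tag === 'insert') {
            $out[] = '<ins>'.$word.'</ins>';
        } elseif ($tag === 'delete') {
            $out[] = '<del>'.$word.'</del>';
        } else {
            $out[] = $word;
        }
    }
    return implode(' ', $out);
}

echo renderDiffHtml(wordDiff('PHP runs on a web server', 'PHP runs on a server'));
// prints: PHP runs on a <del>web</del> server
```

    Splitting on whitespace discards the original spacing and line breaks; if those matter, the same table walk works on lines or characters instead of words.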
    dumpfi

  • #4
    Regular Coder
    Join Date
    Jun 2004
    Posts
    565
    Thanks
    0
    Thanked 18 Times in 18 Posts
    I had to split the post because it was too long. Here's the second part of the port:
    PHP Code:
    <?php
    class Differ
    {
        protected
            $s,
            $fr,
            $lineJunk,
            $charJunk;

        public function __construct($charJunk = NULL, $lineJunk = NULL)
        {
            $this->charJunk = $charJunk;
            $this->lineJunk = $lineJunk;
            // $s compares arrays of lines, $fr compares two single lines character by character
            $this->s = new SequenceMatcher(NULL, NULL, NULL, 'count');
            $this->fr = new SequenceMatcher(NULL, NULL, $charJunk);
        }
        protected static function dump($tag, $seq, $lo, $hi)
        {
            $result = array();
            for(; $lo < $hi; ++$lo)
            {
                $result[] = $tag.' '.$seq[$lo];
            }
            return $result;
        }
        public function compare($seqA, $seqB)
        {
            $result = array();
            $this->s->setSequences($seqA, $seqB);
            foreach($this->s->getOpCodes() as $opCode)
            {
                switch($opCode['tag'])
                {
                    case TAG_EQUAL:
                        $result = array_merge($result, self::dump(' ', $seqA, $opCode['aStart'], $opCode['aEnd']));
                        break;
                    case TAG_INSERT:
                        $result = array_merge($result, self::dump('+', $seqB, $opCode['bStart'], $opCode['bEnd']));
                        break;
                    case TAG_DELETE:
                        $result = array_merge($result, self::dump('-', $seqA, $opCode['aStart'], $opCode['aEnd']));
                        break;
                    case TAG_REPLACE:
                        $result = array_merge($result, $this->fancyReplace($seqA, $opCode['aStart'], $opCode['aEnd'], $seqB, $opCode['bStart'], $opCode['bEnd']));
                        break;
                    default:
                        trigger_error('Differ::compare: unknown tag returned by SequenceMatcher::getOpCodes()', E_USER_ERROR);
                }
            }
            return $result;
        }
        protected static function plainReplace($seqA, $aLo, $aHi, $seqB, $bLo, $bHi)
        {
            // pass an expression, not a string: string assertions no longer work in PHP 8
            assert($aLo < $aHi && $bLo < $bHi);
            return array_merge(self::dump('-', $seqA, $aLo, $aHi), self::dump('+', $seqB, $bLo, $bHi));
        }
        protected function fancyHelper($seqA, $aLo, $aHi, $seqB, $bLo, $bHi)
        {
            return ($aLo < $aHi) ?
                (($bLo < $bHi) ?
                    $this->fancyReplace($seqA, $aLo, $aHi, $seqB, $bLo, $bHi) :
                    self::dump('-', $seqA, $aLo, $aHi)) :
                self::dump('+', $seqB, $bLo, $bHi);
        }
        protected function fancyReplace($seqA, $aLo, $aHi, $seqB, $bLo, $bHi)
        {
            $bestRatio = 0.74;
            $minRatio = 0.75;
            // NULL rather than 0 marks "no identical pair found", because 0 is a valid index
            $equalAStart = $equalBStart = NULL;
            $fr = $this->fr;
            for($j = $bLo; $j < $bHi; ++$j)
            {
                $elemB = $seqB[$j];
                $fr->setSequenceB($elemB);
                for($i = $aLo; $i < $aHi; ++$i)
                {
                    $elemA = $seqA[$i];
                    if($elemB == $elemA)
                    {
                        if($equalAStart === NULL)
                        {
                            $equalAStart = $i;
                            $equalBStart = $j;
                        }
                        continue;
                    }
                    $fr->setSequenceA($elemA);
                    if($fr->realQuickRatio() >= $bestRatio && $fr->quickRatio() >= $bestRatio &&
                        ($ratio = $fr->ratio()) >= $bestRatio)
                    {
                        $bestRatio = $ratio;
                        $bestI = $i;
                        $bestJ = $j;
                    }
                }
            }
            if($bestRatio < $minRatio)
            {
                // no "pretty close" pair of lines
                if($equalAStart === NULL)
                {
                    // no identical pair either: treat it as a straight replace
                    return self::plainReplace($seqA, $aLo, $aHi, $seqB, $bLo, $bHi);
                }
                // synchronise on the identical pair instead
                $bestRatio = 1.0;
                $bestI = $equalAStart;
                $bestJ = $equalBStart;
            }
            else
            {
                // there is a close pair, so forget the identical pair (if any)
                $equalAStart = NULL;
            }
            // diffs before the synch point
            $result = $this->fancyHelper($seqA, $aLo, $bestI, $seqB, $bLo, $bestJ);
            $elemA = $seqA[$bestI];
            $elemB = $seqB[$bestJ];
            if($equalAStart === NULL)
            {
                // intraline marking on the synch pair
                $aTags = $bTags = '';
                $fr->setSequences($elemA, $elemB);
                foreach($fr->getOpCodes() as $opCode)
                {
                    switch($opCode['tag'])
                    {
                        case TAG_REPLACE:
                            $aTags .= str_repeat('^', $opCode['aEnd'] - $opCode['aStart']);
                            $bTags .= str_repeat('^', $opCode['bEnd'] - $opCode['bStart']);
                            break;
                        case TAG_DELETE:
                            $aTags .= str_repeat('-', $opCode['aEnd'] - $opCode['aStart']);
                            break;
                        case TAG_INSERT:
                            // insertions are marked on the B line, as in Python's difflib
                            $bTags .= str_repeat('+', $opCode['bEnd'] - $opCode['bStart']);
                            break;
                        case TAG_EQUAL:
                            $str = str_repeat(' ', $opCode['aEnd'] - $opCode['aStart']);
                            $aTags .= $str;
                            $bTags .= $str;
                            break;
                        default:
                            trigger_error('Differ::fancyReplace: unknown tag returned by SequenceMatcher::getOpCodes()', E_USER_ERROR);
                    }
                }
                $result = array_merge($result, self::qFormat($elemA, $elemB, $aTags, $bTags));
            }
            else
            {
                // the synch pair is identical
                $result[] = '  '.$elemA;
            }
            // diffs after the synch point
            return array_merge($result, $this->fancyHelper($seqA, $bestI + 1, $aHi, $seqB, $bestJ + 1, $bHi));
        }
        protected static function countLeading($str, $char)
        {
            for($i = 0, $length = strlen($str); $i < $length; ++$i)
            {
                if($str[$i] != $char)
                {
                    return $i;
                }
            }
            return $i;
        }
        protected static function qFormat($aLine, $bLine, $aTags, $bTags)
        {
            // keep leading tabs aligned between the lines and their "?" marker lines
            $common = min(self::countLeading($aLine, "\t"), self::countLeading($bLine, "\t"));
            $common = min($common, self::countLeading(substr($aTags, 0, $common), ' '));
            $aTags = rtrim(substr($aTags, $common));
            $bTags = rtrim(substr($bTags, $common));
            $result = array('- '.$aLine);
            if($aTags)
            {
                $result[] = '? '.str_repeat("\t", $common).$aTags."\n";
            }
            $result[] = '+ '.$bLine;
            if($bTags)
            {
                $result[] = '? '.str_repeat("\t", $common).$bTags."\n";
            }
            return $result;
        }
    }
    // Example

    $textA = 'PHP generally runs on a web server, taking PHP code as its input and creating Web pages as output.

    When running server-side, the PHP model can be seen as an alternative to Microsoft\'s ASP.NET/C#/VB.NET system, Macromedia\'s ColdFusion, Sun Microsystems\' JSP, Zope, mod_perl and the Ruby on Rails framework. To more directly compete with the "framework" approach taken by these systems, Zend are working on the Zend Framework - an emerging (as of June 2006) set of PHP building blocks and best practices.

    The LAMP architecture has become popular in the Web industry as a way of deploying inexpensive, reliable, scalable, secure web applications. PHP is commonly used as the P in this bundle alongside Linux, Apache and MySQL. PHP can be used with a large number of relational database management systems, runs on all of the most popular web servers and is available for many different operating systems. This flexibility means that PHP has a wide installation base across the Internet; PHP is one of the most popular programming languages for implementing websites with over 20 million Internet domains using PHP[2].

    Examples of popular server-side PHP applications include phpBB, Wordpress and MediaWiki. PHP can even be used to access files for editing, copying, deleting, and more.

    More recently, PHP has been adapted to provide a command line interface, as well as GUI libraries such as GTK+ and text mode libraries like ncurses in order to facilitate development of a broader range of software. As PHP is higher-level than shell scripting, its use on the command line is desirable for some automation tasks that shell scripting has traditionally been used for.
    '
    ;

    $textB = 'PHP generally runs on a server, taking PHP code as its input and creating Web pages as output.

    When running server-side, the PHP model can be seen as an alternative to Microsoft\'s ASP.NET/C#/VB.NET system, Macromedia\'s ColdFusion, Sun Microsystems\' JSP, Zope, mod_perl and the Ruby on Rails framework. To more directly compete with the "framework" approach taken by these systems, Zend are working on the Zend Framework - an emerging (as of June 2006) set of PHP building blocks and best practices.

    The LAMP architecture has become popular in the Web industry as a way of deploying inexpensive, reliable, scalable, secure web applications. PHP is commonly used as the P in this bundle alongside Linux, Apache and MySQL. PHP can be used with a large number of relational database management systems, runs on all of the most popular web servers and is available for many different operating systems. This flexibility means that PHP has a wide installation base across the Internet; PHP is one of the most popular programming languages for implementing websites with over 20 million Internet domains using PHP[2].

    Examples of popular server-side PHP applications include phpBB, Wordpress and MediaWiki.

    More recently, PHP has been adapted to provide a command line interface, as well as GUI libraries such as GTK+ and text mode libraries like ncurses in order to facilitate development of a broader range of software. As PHP is higher-level than shell scripting, its use on the command line is desirable for some automation tasks that shell scripting has traditionally been used for.
    '
    ;

    $d = new Differ();
    $diff = $d->compare(explode("\n", $textA), explode("\n", $textB));
    ?>
    <html>
        <head>
            <style type="text/css">
            table
            {
                empty-cells:show;
            }
            table, th, td, thead, tr
            {
                border-collapse:collapse;
                border:1px solid black
            }
            </style>
        </head>
        <body>
        <?php
        
        // Print one table row per diff line: the two-character tag, then the text.
        echo '<table><thead><tr><th>Tag</th><th>Text</th></tr></thead><tbody>';
        foreach($diff as $line)
        {
            echo '<tr><td>', substr($line, 0, 2), '</td><td><pre>', substr($line, 2), '&nbsp;</pre></td></tr>';
        }
        echo '</tbody></table>';
        
    ?>
        </body>
    </html>
    dumpfi

    Edit: Here's the documentation of Python's difflib module: http://www.python.org/doc/2.5/lib/module-difflib.html
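
    To get closer to the wiki-style highlighting asked about in the first post, the table loop in the example could give each row a CSS class based on its two-character tag. The following is only a minimal sketch, not part of the port: the function name renderDiffRows, the class names (removed, added, hint, same) and the sample lines are all made up for illustration. It assumes the diff lines start with '- ', '+ ', '? ' or two spaces, as in the example above.

```php
<?php
// Hypothetical helper (not part of the port above): turns Differ-style
// output lines into HTML table rows, one CSS class per line type.
function renderDiffRows(array $diff)
{
    // Two-character prefixes as produced in the example above.
    // The class names are assumptions; pick whatever your stylesheet uses.
    $classes = array(
        '- ' => 'removed',
        '+ ' => 'added',
        '? ' => 'hint',
        '  ' => 'same',
    );
    $html = '';
    foreach ($diff as $line) {
        $tag = substr($line, 0, 2);
        // Cast to string because substr() can return false on old PHP versions.
        $text = rtrim((string) substr($line, 2), "\n");
        $class = isset($classes[$tag]) ? $classes[$tag] : 'same';
        // Escape the text so characters like < and & display literally.
        $html .= '<tr class="' . $class . '"><td>' . htmlspecialchars($tag)
            . '</td><td><pre>' . htmlspecialchars($text) . '</pre></td></tr>' . "\n";
    }
    return $html;
}

// Made-up sample input shaped like the output of Differ::compare().
$sample = array(
    '- PHP runs on a web server.',
    '? ' . str_repeat(' ', 14) . '----' . "\n",
    '+ PHP runs on a server.',
    '  Unchanged <line>.',
);
echo '<table>', "\n", renderDiffRows($sample), '</table>', "\n";
```

    With rules such as tr.removed { background:#fdd } and tr.added { background:#dfd } in the stylesheet, removed and added lines would stand out; the '? ' hint rows mark which characters changed inside a line.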
    Last edited by dumpfi; 12-26-2006 at 12:39 PM.
    "Failure is not an option. It comes bundled with the software."

