PHP Classes

Fuse: Fuzzy search of arrays using the Bitap algorithm

Last updated: 2023-05-24
Ratings: Not enough user ratings
Unique user downloads: 108 total, 1 this week
Download rankings: 9,654 all time, 571 this week (up)
Version: fuse 1.0
License: Custom (specified...
PHP version: 5
Categories: Algorithms, PHP 5, Searching
Description 

This package can perform fuzzy search of arrays using the Bitap algorithm.

It can take an array of data elements that contain arrays of property values.

It can index data elements by given keys so it can perform search for elements with property values with exact or similar text.

This package is a port of the fuse.js JavaScript library.

Innovation Award
PHP Programming Innovation award nominee
January 2017
Number 18
Fuzzy search allows determining if a text string is similar to another. Bitap is a fuzzy search algorithm.

This package provides a pure PHP implementation of the Bitap algorithm to match strings in arrays.

Manuel Lemos
Name: AccountKiller <contact>
Classes: 1 package by
All time rank: 4388
Week rank: 420 Up
Innovation award nominee: 1x

Documentation

Fuse

_A fuzzy search library for PHP_

This is a PHP port of the awesome Fuse.js project and aims to provide full API compatibility wherever possible.

Check out their demo and examples to get a good feel for what this library is capable of.

> Latest compatible Fuse.js version: 6.6.2

Installation

This package is available via Composer. To add it to your project, just run:

composer require loilo/fuse

Note that at least PHP 7.4 is needed to use Fuse.

Usage

Here's a simple usage example:

<?php
require_once 'vendor/autoload.php';

$list = [
    [
        'title' => "Old Man's War",
        'author' => 'John Scalzi',
    ],
    [
        'title' => 'The Lock Artist',
        'author' => 'Steve Hamilton',
    ],
    [
        'title' => 'HTML5',
        'author' => 'Remy Sharp',
    ],
    [
        'title' => 'Right Ho Jeeves',
        'author' => 'P.D Woodhouse',
    ],
];

$options = [
    'keys' => ['title', 'author'],
];

$fuse = new \Fuse\Fuse($list, $options);

$fuse->search('hamil');

This leads to the following results (where each result's item refers to the matched entry itself and refIndex provides the item's position in the original $list):

[
    [
        'item' => [
            'title' => 'The Lock Artist',
            'author' => 'Steve Hamilton',
        ],
        'refIndex' => 1,
    ],
    [
        'item' => [
            'title' => 'HTML5',
            'author' => 'Remy Sharp',
        ],
        'refIndex' => 2,
    ],
];

Options

Fuse has a lot of options to refine your search:

Basic Options

isCaseSensitive

  • Type: `bool`
  • Default: `false`

Indicates whether comparisons should be case sensitive.

includeScore

  • Type: `bool`
  • Default: `false`

Whether the score should be included in the result set. A score of 0 indicates a perfect match, while a score of 1 indicates a complete mismatch.
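
Because lower scores mean closer matches, a scored result list can be trimmed with an ordinary array filter. The sketch below does not call Fuse at all: the `score` key name is assumed to mirror Fuse.js, the score values are invented placeholders rather than real output, and `keepCloseMatches` is a hypothetical helper.

```php
<?php
// Hypothetical results in the shape assumed for 'includeScore' => true.
// The scores are illustrative placeholders, not actual Fuse output.
$results = [
    ['item' => ['title' => 'The Lock Artist'], 'refIndex' => 1, 'score' => 0.12],
    ['item' => ['title' => 'HTML5'], 'refIndex' => 2, 'score' => 0.55],
];

/**
 * Keep only results whose score is at or below $maxScore
 * (0 is a perfect match, 1 a complete mismatch).
 */
function keepCloseMatches(array $results, float $maxScore): array
{
    return array_values(array_filter(
        $results,
        fn(array $result): bool => $result['score'] <= $maxScore
    ));
}

$close = keepCloseMatches($results, 0.3);
echo count($close), PHP_EOL; // 1
```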

includeMatches

  • Type: `bool`
  • Default: `false`

Whether the matches should be included in the result set. When true, each record in the result set will include the indices of the matched characters. These can consequently be used for highlighting purposes.
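
As a rough illustration of the highlighting use case, the helper below wraps matched ranges in `<mark>` tags. It assumes the indices arrive as a list of `[start, end]` pairs with an inclusive end, as in Fuse.js; that shape is an assumption here, and `highlight` is a hypothetical helper, not part of the library.

```php
<?php
/**
 * Wrap matched character ranges in <mark> tags.
 *
 * @param string $text    The field value that was matched.
 * @param array  $indices List of [start, end] pairs (assumed inclusive and non-overlapping).
 */
function highlight(string $text, array $indices): string
{
    // Process the ranges from left to right.
    usort($indices, fn(array $a, array $b): int => $a[0] <=> $b[0]);

    $result = '';
    $cursor = 0;

    foreach ($indices as [$start, $end]) {
        // Copy the unmatched text before this range, escaped for HTML.
        $result .= htmlspecialchars(substr($text, $cursor, $start - $cursor));
        // Wrap the matched range; the end index is treated as inclusive.
        $result .= '<mark>' . htmlspecialchars(substr($text, $start, $end - $start + 1)) . '</mark>';
        $cursor = $end + 1;
    }

    // Append whatever follows the last match.
    return $result . htmlspecialchars(substr($text, $cursor));
}

echo highlight('Steve Hamilton', [[6, 10]]), PHP_EOL;
// Steve <mark>Hamil</mark>ton
```

Note that `substr()` counts bytes; for multibyte text, `mb_substr()` would be the safer choice if the indices count characters.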

minMatchCharLength

  • Type: `int`
  • Default: `1`

Only the matches whose length exceeds this value will be returned. (For instance, if you want to ignore single character matches in the result, set it to 2).

shouldSort

  • Type: `bool`
  • Default: `true`

Whether to sort the result list by score.

findAllMatches

  • Type: `bool`
  • Default: `false`

When true, the matching function will continue to the end of a search pattern even if a perfect match has already been located in the string.

keys

  • Type: `array`
  • Default: `[]`

List of keys that will be searched. This supports nested paths, weighted search, searching in arrays of strings and objects.
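
A possible `keys` configuration combining nested paths and weights might look as follows. The `name`/`weight` entry shape is assumed to mirror Fuse.js, and the weight values and the `author.lastName` path are purely illustrative.

```php
<?php
// Sketch of an options array; pass it as the second constructor argument.
$options = [
    'keys' => [
        // Weighted key (assumed shape: 'name' plus 'weight', as in Fuse.js)
        ['name' => 'title', 'weight' => 0.7],
        // Nested path in dot notation, here with a lower weight
        ['name' => 'author.lastName', 'weight' => 0.3],
    ],
];
```

Under this assumption, matches in `title` would count more towards the relevance score than matches in `author.lastName`.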

Fuzzy Matching Options

location

  • Type: `int`
  • Default: `0`

Determines approximately where in the text the pattern is expected to be found.

threshold

  • Type: `float`
  • Default: `0.6`

The point at which the match algorithm gives up. A threshold of 0.0 requires a perfect match (of both letters and location), while a threshold of 1.0 would match anything.

distance

  • Type: `int`
  • Default: `100`

Determines how close the match must be to the fuzzy location (specified by location). An exact letter match which is distance characters away from the fuzzy location would score as a complete mismatch. A distance of 0 requires the match be at the exact location specified. A distance of 1000 would require a perfect match to be within 800 characters of the location to be found using a threshold of 0.8.

ignoreLocation

  • Type: `bool`
  • Default: `false`

When true, search will ignore location and distance, so it won't matter where in the string the pattern appears.

> Tip: The default options only search the first 60 characters. This should suffice if it is reasonably expected that the match is within this range. To modify this behavior, set the appropriate combination of location, threshold, distance (or ignoreLocation).
>
> To better understand how these options work together, read about Fuse.js' Scoring Theory.
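
The 60-character figure follows from multiplying threshold by distance. The sketch below reproduces that arithmetic, assuming the location penalty of an otherwise perfect match grows linearly as offset / distance; `maxPerfectMatchOffset` is a hypothetical helper for illustration only.

```php
<?php
/**
 * Furthest offset from `location` at which an otherwise perfect match
 * still passes, assuming its score is offset / distance and a match
 * is accepted while the score does not exceed the threshold.
 */
function maxPerfectMatchOffset(float $threshold, int $distance): int
{
    return (int) round($threshold * $distance);
}

echo maxPerfectMatchOffset(0.6, 100), PHP_EOL;  // 60  (the default options)
echo maxPerfectMatchOffset(0.8, 1000), PHP_EOL; // 800 (the example given for distance)
```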

Advanced Options

useExtendedSearch

  • Type: `bool`
  • Default: `false`

When true, it enables the use of unix-like search commands. See example.
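
A minimal sketch of extended search is shown below. The token syntax (`^` for a prefix match, `'` for an include match) is assumed to be the same as in Fuse.js; the book list is the one from the Usage section, shortened.

```php
<?php
require_once 'vendor/autoload.php';

$books = [
    ['title' => "Old Man's War", 'author' => 'John Scalzi'],
    ['title' => 'The Lock Artist', 'author' => 'Steve Hamilton'],
    ['title' => 'HTML5', 'author' => 'Remy Sharp'],
];

$fuse = new \Fuse\Fuse($books, [
    'keys' => ['title', 'author'],
    'useExtendedSearch' => true,
]);

// Assumed Fuse.js token syntax:
//   ^old     values that start with "old" (prefix-exact match)
//   'artist  values that include "artist" (include match)
$prefixMatches = $fuse->search('^old');
$includeMatches = $fuse->search("'artist");
```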

getFn

  • Type: `callable`
  • Default: source

The function to use to retrieve an object's value at the provided path. The default will also search nested paths.

sortFn

  • Type: `callable`
  • Default: source

The function to use to sort all the results. The default will sort by ascending relevance score, ascending index.

ignoreFieldNorm

  • Type: `bool`
  • Default: `false`

When true, the calculation for the relevance score (used for sorting) will ignore the field-length norm.

> Tip: The only time it makes sense to set ignoreFieldNorm to true is when it does not matter how many terms there are, but only that the query term exists.

fieldNormWeight

  • Type: `float`
  • Default: `1`

Determines how much the field-length norm affects scoring. A value of 0 is equivalent to ignoring the field-length norm. A value of 0.5 will greatly reduce the effect of field-length norm, while a value of 2.0 will greatly increase it.

Global Config

You can access and manipulate default values of all options above via the config method:

// Get an associative array of all options listed above
Fuse::config();

// Merge associative array of options into default config
Fuse::config(['shouldSort' => false]);

// Get single default option
Fuse::config('shouldSort');

// Set single default option
Fuse::config('shouldSort', false);

Methods

The following methods are available on each Fuse\Fuse instance:

search

Searches the entire collection of documents, and returns a list of search results.

public function search(mixed $pattern, ?array $options): array

The $pattern can be one of:

The $options:

  • `limit` (type: `int`): Denotes the max number of returned search results.
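
For instance, the `limit` option could be used to keep only the best-ranked result. This is a small sketch that reuses a shortened version of the book list from the Usage section.

```php
<?php
require_once 'vendor/autoload.php';

$books = [
    ['title' => "Old Man's War", 'author' => 'John Scalzi'],
    ['title' => 'The Lock Artist', 'author' => 'Steve Hamilton'],
    ['title' => 'HTML5', 'author' => 'Remy Sharp'],
];

$fuse = new \Fuse\Fuse($books, ['keys' => ['title', 'author']]);

// Cap the result list at a single entry.
$top = $fuse->search('hamil', ['limit' => 1]);
```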

setCollection

Set/replace the entire collection of documents. If no index is provided, one will be generated.

public function setCollection(array $docs, ?\Fuse\Core\FuseIndex $index): void

Example:

$fruits = ['apple', 'orange'];
$fuse = new Fuse($fruits);

$fuse->setCollection(['banana', 'pear']);

add

Adds a doc to the collection and updates the index accordingly.

public function add(mixed $doc): void

Example:

$fruits = ['apple', 'orange'];
$fuse = new Fuse($fruits);

$fuse->add('banana');

sizeof($fuse->getCollection()); // => 3

remove

Removes all documents from the list which the predicate returns truthy for, and returns an array of the removed docs. The predicate is invoked with two arguments: ($doc, $index).

public function remove(?callable $predicate): array

Example:

$fruits = ['apple', 'orange', 'banana', 'pear'];
$fuse = new Fuse($fruits);

$results = $fuse->remove(fn($doc) => $doc === 'banana' || $doc === 'pear');
sizeof($fuse->getCollection()); // => 2
$results; // => ['banana', 'pear']

removeAt

Removes the doc at the specified index.

public function removeAt(int $index): void

Example:

$fruits = ['apple', 'orange', 'banana', 'pear'];
$fuse = new Fuse($fruits);

$fuse->removeAt(1);

$fuse->getCollection(); // => ['apple', 'banana', 'pear']

getIndex

Returns the generated Fuse index.

public function getIndex(): \Fuse\Core\FuseIndex

Example:

$fruits = ['apple', 'orange', 'banana', 'pear'];
$fuse = new Fuse($fruits);

$fuse->getIndex()->size(); // => 4

Indexing

The following static methods are available on the Fuse\Fuse class:

Fuse::createIndex

Pre-generate the index from the list, and pass it directly into the Fuse instance. If the list is (considerably) large, it speeds up instantiation.

public static function createIndex(array $keys, array $docs, array $options = []): \Fuse\Core\FuseIndex

Example:

$list = [ ... ]; // See the example from the 'Usage' section
$options = [ 'keys' => [ 'title', 'author.firstName' ] ];

// Create the Fuse index
$myIndex = Fuse::createIndex($options['keys'], $list);

// Initialize Fuse with the index
$fuse = new Fuse($list, $options, $myIndex);

Fuse::parseIndex

Parses a JSON-serialized Fuse index.

public static function parseIndex(array $data, array $options = []): \Fuse\Core\FuseIndex

Example:

// (1) When the data is collected

$list = [ ... ]; // See the example from the 'Usage' section
$options = [ 'keys' => [ 'title', 'author.firstName' ] ];

// Create the Fuse index
$myIndex = Fuse::createIndex($options['keys'], $list);

// Serialize and save it
file_put_contents('fuse-index.json', json_encode($myIndex));


// (2) When the search is needed

// Load and deserialize index to an array
$fuseIndex = json_decode(file_get_contents('fuse-index.json'), true);
$myIndex = Fuse::parseIndex($fuseIndex);

// Initialize Fuse with the index
$fuse = new Fuse($list, $options, $myIndex);

Differences with Fuse.js

| | Fuse.js | PHP Fuse |
|-|-|-|
| Get Fuse Version | Fuse.version | - |
| Access global configuration | Fuse.config property | Fuse::config method |
| List modification | Using fuse.add() etc. modifies the original list passed to the new Fuse constructor. | In PHP, arrays are a primitive data type, which means that your original list is never modified by Fuse. To receive the current list after adding/removing items, the $fuse->getCollection() method can be used. |

Development

Project Scope

Please note that I'm striving for feature parity with Fuse.js and therefore will add neither features nor fixes to the search logic that are not reflected in Fuse.js itself.

If you have any issues with search results that are _not_ obviously bugs in this PHP port, and you happen to know JavaScript, please check if your use case works correctly in the online demo of Fuse.js as that is the canonical Fuse implementation. If the issue appears there as well, please open an issue in their repo.

Setup

> To start development on Fuse, you need git, PHP (≥ 7.4) and Composer.
>
> Since code is formatted using Prettier, it's also recommended to have Node.js/npm installed as well as using an editor which supports Prettier formatting.

Clone the repository and cd into it:

git clone https://github.com/loilo/fuse.git
cd fuse

Install Composer dependencies:

composer install

Install npm dependencies (optional but recommended). This is only needed for code formatting as npm dependencies include Prettier plugins used by this project.

npm ci

Quality Assurance

There are different kinds of code checks in place for this project. All of these are run when a pull request is submitted but can also be run locally:

| Command | Purpose | Description |
|-|-|-|
| vendor/bin/phpcs | check code style | Run PHP_CodeSniffer to verify that the Fuse source code abides by the PSR-12 coding style. |
| vendor/bin/psalm | static analysis | Run Psalm against the codebase to avoid type-related errors and unsafe coding patterns. |
| vendor/bin/phpunit | check program logic | Run all PHPUnit tests from the test folder. |

Contributing

Before submitting a pull request, please add relevant tests to the test folder.


Files

.github/
    workflows/
        test.yml
    dependabot.yml
.vscode/
    settings.json
src/
    Core/
        computeScore.php
        config.php
        format.php
        KeyType.php
        LogicalOperator.php
        parse.php
        Register.php
    Exception/
        IncorrectSearcherTypeException.php
        InvalidConfigKeyException.php
        InvalidKeyWeightValueException.php
        LogicalSearchInval...ForKeyException.php
        MissingKeyPropertyException.php
        PatternLengthTooLargeException.php
    Helpers/
        get.php
        sort.php
        types.php
    Search/
        Bitap/
            BitapSearch.php
            computeScore.php
            Constants.php
            convertMaskToIndices.php
            createPatternAlphabet.php
            search.php
        Extended/
            BaseMatch.php
            ExactMatch.php
            FuzzyMatch.php
            IncludeMatch.php
            InverseExactMatch.php
            InversePrefixExactMatch.php
            InverseSuffixExactMatch.php
            parseQuery.php
            PrefixExactMatch.php
            SuffixExactMatch.php
        SearchInterface.php
    Tools/
        FuseIndex.php
        KeyStore.php
        Norm.php
    Transform/
        transformMatches.php
        transformScore.php
    Fuse.php
test/
    fixtures/
        books.php
    FuzzySearch/
        BreakingValuesTest.php
        ConsiderFieldLengthTest.php
        CustomSearchFunctionTest.php
        DeepKeyTest.php
        DefaultOptionsTest.php
        FindAllMatchesTest.php
        FlatListTest.php
        IgnoreLocationAndFieldLengthNormTest.php
        IncludeIdAndScoreTest.php
        IncludeScoreTest.php
        LargeSearchStringsTest.php
        MinCharLengthAndLargePatternTest.php
        MinCharLengthTest.php
        NumericIdsTest.php
        ObjectValuesTest.php
        RecurseIntoArraysTest.php
        RecurseIntoObjectsInArraysTest.php
        RecurseIntoObjects...jectInArrayTest.php
        SearchLocationTest.php
        SetNewListTest.php
        SortedSearchResultsTest.php
        StandardDottedKeysTest.php
        WeightedSearchTest.php
    LogicalSearch/
        NestedConditionsTest.php
        ParserTest.php
        SearchTest.php
        SearchWithDottedKeysTest.php
    ExtendedSearchTest.php
    PhpSpecificTest.php
    ScoringTest.php
.editorconfig
.prettierignore
.prettierrc
composer.json
composer.lock
fuse.svg
LICENSE
package-lock.json
package.json
phpcs.xml
phpunit.xml
psalm.xml
README.md