๐Ÿ‡ New course: RabbitMQ + M2

Learn more

How to Use Multiple Collections in Magento 2

Learn how to properly use multiple collection instances in Magento 2 including creating new collections with factories, clearing & resetting, and PHP's clone keyword.

Mark Shust

Updated 12 min read

Magento is powered by some pretty impressive architectural components. These are the parts that allow Magento to scale as well as it does.

One of the design patterns behind this architecture is the collection factory, which creates the collection objects Magento uses to retrieve data from the database efficiently. But you may encounter some inconsistencies if you try to reuse a single collection instance for multiple queries within one function.

What's a Collection Factory in Magento?

Before we dive into the details, let's talk about what a collection factory is. A collection represents a set of data entities of one type, such as products or orders, loaded from the database in an organized way. These entities can be filtered and sorted so they are easy to find, rather than just being scattered around in a pile.

The factory is the class that hands you new collection instances through its create() method, and those instances are what you use to retrieve data from the database. You can think of a collection as a massive library, where the data entities are the books. All the books are categorized, which makes it easier to find the one you want.

In Magento, this goes a step further, as you can also apply filters to narrow down your search based on the attributes or fields of these entities.
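To make the factory idea concrete, here is a minimal plain-PHP sketch. DemoCollection and DemoCollectionFactory are hypothetical stand-ins, not Magento classes; Magento's generated factories delegate to the object manager instead of calling new directly, but the visible effect of create() is the same: every call returns a separate object.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins (not Magento classes) that only mimic the factory pattern.
final class DemoCollection
{
    /** @var string[] Filter conditions recorded on this instance. */
    public array $filters = [];
}

final class DemoCollectionFactory
{
    // Each call builds and returns a brand-new collection object.
    public function create(): DemoCollection
    {
        return new DemoCollection();
    }
}

$factory = new DemoCollectionFactory();
$first = $factory->create();
$second = $factory->create();

// Filtering the first instance leaves the second one untouched.
$first->filters[] = "type_id = 'simple'";

var_dump($first === $second);      // bool(false): two separate objects
var_dump(count($second->filters)); // int(0)
```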

Here's what fetching data from Magento's collection factory looks like:

<?php

declare(strict_types=1);

namespace MyVendor\MyModule\Model;

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory as ProductCollectionFactory;
use Magento\Catalog\Model\Product\Attribute\Source\Status;
use Magento\Catalog\Model\Product\Visibility;

class SomeCustomModel
{
    public function __construct(
        private ProductCollectionFactory $productCollectionFactory,
    ) {}
    
    public function getProductIds(): array
    {
        $ids = [];
        $productCollection = $this->productCollectionFactory->create();
        $productCollection
            ->addAttributeToFilter('status', ['eq' => Status::STATUS_ENABLED])
            ->addAttributeToFilter('visibility', ['in' => [
                Visibility::VISIBILITY_IN_CATALOG,
                Visibility::VISIBILITY_IN_SEARCH,
                Visibility::VISIBILITY_BOTH,
            ]]);
        
        foreach ($productCollection as $product) {
            $ids[] = $product->getEntityId();
        }
        
        return $ids;
    }
}

The lines above will retrieve product IDs, but only for products that are enabled and visible in the catalog, in search, or both.

There's a Catch!

...well, kinda. The code above works perfectly, but you'll start to encounter problems when you try to reuse a single collection instance for several different queries within the same function.

It's sort of like trying to put a hold at the library for two copies of the same book. The library probably won't let you do this, and if they do, there will be some confusion 🤷‍♂️.

Just like our library, Magento will get confused when you try to retrieve different sets of items from the same collection instance. The primary issue is that filter methods modify the collection object itself, and nothing resets it before the next query. This can lead to duplicate data or incorrect results.

But we can still fetch multiple collection instances within the same process. There are a few different options, so let's go over each.

The Problem With Collections

Let's modify the example above a bit. In this updated code, we want to get all simple products and all configurable products, assign each set to its own variable, and return the IDs of both in a keyed array:

<?php

declare(strict_types=1);

namespace MyVendor\MyModule\Model;

use Magento\Catalog\Model\Product\Type;
use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory as ProductCollectionFactory;
use Magento\ConfigurableProduct\Model\Product\Type\Configurable;

class SomeCustomModel
{
    public function __construct(
        private ProductCollectionFactory $productCollectionFactory,
    ) {}
    
    public function getProductIds(): array
    {
        $simpleProductIds = [];
        $configurableProductIds = [];
        $productCollection = $this->productCollectionFactory->create();
        // Both calls below filter the same $productCollection object
        $simpleProducts = $productCollection->addAttributeToFilter('type_id', ['eq' => Type::TYPE_SIMPLE]);
        $configurableProducts = $productCollection->addAttributeToFilter('type_id', ['eq' => Configurable::TYPE_CODE]);
        
        foreach ($simpleProducts as $product) {
            $simpleProductIds[] = $product->getEntityId();
        }
        
        foreach ($configurableProducts as $product) {
            $configurableProductIds[] = $product->getEntityId();
        }
        
        return [
            'simple' => $simpleProductIds,
            'configurable' => $configurableProductIds,
        ];
    }
}

Looks correct, right? Well, it's not.

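The $simpleProducts and $configurableProducts variables will always hold the same results, because they are the same object. addAttributeToFilter() modifies the collection it is called on and returns that same collection, so both variables are just two names for $productCollection.

Both filters are therefore stacked onto one query, and Magento doesn't reset anything between the two calls. Since filters added this way are combined with AND, neither variable gives you the list you asked for; here the combined query will most likely match no products at all.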
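The aliasing behind this problem can be reproduced without Magento. The sketch below uses a hypothetical FluentCollection class whose addFilter() method mutates the object and returns $this, which is assumed here to mirror how addAttributeToFilter() behaves; it only records conditions and does not talk to a database.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a collection: it only records the filters it is given.
final class FluentCollection
{
    /** @var string[] */
    private array $conditions = [];

    // Fluent method: mutates this object and returns it.
    public function addFilter(string $condition): self
    {
        $this->conditions[] = $condition;
        return $this;
    }

    // Joins every recorded condition the way stacked filters are combined.
    public function toWhere(): string
    {
        return implode(' AND ', $this->conditions);
    }
}

$collection = new FluentCollection();
$simple = $collection->addFilter("type_id = 'simple'");
$configurable = $collection->addFilter("type_id = 'configurable'");

var_dump($simple === $configurable); // bool(true): one object, two variable names
echo $configurable->toWhere(), PHP_EOL;
// type_id = 'simple' AND type_id = 'configurable'
```

Assigning the return value of a fluent method to a new variable never creates a new object, which is why each of the solutions that follow starts by obtaining a genuinely separate or freshly reset collection.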

So at this point, you're probably a bit stumped on how to handle this 😅. But there are actually a few potential solutions that can properly resolve this situation, and we'll go over each one.

Solution 1: Create a New Instance of the Collection

The first solution is to create a second instance of the collection. This basically creates a completely new instance from the factory class, which you can then use to fetch a second collection without any issues.

Since it's a completely separate instance, changes to the first collection will not affect the second collection.

Let's rewrite our existing code:

$productCollection = $this->productCollectionFactory->create();
$simpleProducts = $productCollection->addAttributeToFilter('type_id', ['eq' => Type::TYPE_SIMPLE]);
$configurableProducts = $productCollection->addAttributeToFilter('type_id', ['eq' => Configurable::TYPE_CODE]);

...by adding in a second create() call, which will create a new instance of the collection from the factory:

$simpleProductCollection = $this->productCollectionFactory->create();

// Create a new instance of the collection
$configurableProductCollection = $this->productCollectionFactory->create();

$simpleProducts = $simpleProductCollection->addAttributeToFilter('type_id', ['eq' => Type::TYPE_SIMPLE]);
$configurableProducts = $configurableProductCollection->addAttributeToFilter('type_id', ['eq' => Configurable::TYPE_CODE]);

This is the most straightforward solution and is usually the recommended approach, since it's very simple and easy to understand.

Solution 2: Clear and Reset the Collection

We needed to create a new instance of the collection from the factory because Magento doesn't have an official way to fully reset a collection. But the core of Magento is built on top of Zend Framework, whose select object does let you reset parts of a query.

The first step is to clear the collection, which discards any items it has already loaded so that it will query the database again. This is done by calling the clear() method directly on the collection object. It does not remove the filters, though.

So we're not done yet. The second step is to reset the query linked to the collection.

This is done by first calling getSelect() to retrieve the select statement object, and then calling its reset() method with the \Zend_Db_Select::WHERE constant. This removes the WHERE conditions added so far and leaves the rest of the query untouched:

$productCollection = $this->productCollectionFactory->create();
$simpleProducts = $productCollection->addAttributeToFilter('type_id', ['eq' => Type::TYPE_SIMPLE]);

// Read the simple products here (for example, collect their IDs), because
// $simpleProducts is the same object that gets cleared and reset below

// First step: clear the loaded items
$productCollection->clear();

// Second step: reset the WHERE conditions of the query
$productCollection->getSelect()->reset(\Zend_Db_Select::WHERE);

$configurableProducts = $productCollection->addAttributeToFilter('type_id', ['eq' => Configurable::TYPE_CODE]);

Now the $configurableProducts variable will only contain configurable products, and not simple products. Keep in mind that $simpleProducts still points to that same collection, so read the simple products before you clear and reset it.

Since this method requires two function calls, it is more prone to human error: you need to call both clear() and reset() for it to work properly.

For this reason, the first solution is usually recommended as it's more straightforward. However, this clear & reset approach requires you to only create a single instance from the collection factory, which could be useful and more performant in some situations.
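The toy model below illustrates why both steps matter. LazyCollection is a hypothetical stand-in written for this example, not Magento code: clear() drops only the cached rows, resetWhere() (an assumed name standing in for getSelect()->reset(\Zend_Db_Select::WHERE)) drops only the filters, and rows are "queried" from an in-memory array.

```php
<?php

declare(strict_types=1);

// Toy model of a lazy-loading collection; a hypothetical stand-in, not Magento code.
final class LazyCollection
{
    /** @var string[] Active type filters, combined with AND. */
    private array $where = [];

    /** @var int[]|null Cached result IDs; null means "not loaded yet". */
    private ?array $items = null;

    /** @param array<int, string> $rows Map of product ID => type_id. */
    public function __construct(private array $rows)
    {
    }

    public function addTypeFilter(string $typeId): self
    {
        $this->where[] = $typeId;
        return $this;
    }

    // Drops cached rows only; the filters stay in place.
    public function clear(): self
    {
        $this->items = null;
        return $this;
    }

    // Drops filters only; cached rows stay in place.
    public function resetWhere(): self
    {
        $this->where = [];
        return $this;
    }

    /** @return int[] IDs of the rows that satisfy every active filter. */
    public function getIds(): array
    {
        if ($this->items === null) {
            $matches = array_filter(
                $this->rows,
                fn (string $type): bool => count(array_diff($this->where, [$type])) === 0
            );
            $this->items = array_keys($matches);
        }

        return $this->items;
    }
}

$collection = new LazyCollection([1 => 'simple', 2 => 'configurable', 3 => 'simple']);

$simpleIds = $collection->addTypeFilter('simple')->getIds(); // [1, 3]

// clear() alone: rows are re-queried, but the 'simple' filter is still active.
$onlyCleared = $collection->clear()->addTypeFilter('configurable')->getIds(); // []

// clear() plus a filter reset gives the second query a clean slate.
$configurableIds = $collection->clear()->resetWhere()->addTypeFilter('configurable')->getIds(); // [2]
```

Skipping clear() in this model would return the cached simple IDs again, and skipping the filter reset leaves two contradictory conditions in place, so the two calls are only useful as a pair.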

Solution 3: Use PHP's clone Keyword

There is a third method that is not too commonly known, but is actually used throughout the Magento core code: using the clone keyword to create a copy of the collection instance.

This approach creates a copy of the object while keeping the two objects independent of each other, so that each can be modified further. PHP's clone is shallow by default, but Magento's database collections implement __clone() to copy the underlying select object as well, which is why filters added to one copy don't leak into the other.

Using clone also eliminates the need to create a second instance of the collection from the factory. This may make it the preferred approach in some very specific situations where optimal performance is absolutely critical.

It could also be useful if you would like to modify the collection object, but still be able to refer to the original collection instance later on in your code.
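How independent a clone really is depends on the class being cloned. The plain-PHP sketch below uses hypothetical stand-ins (DemoSelect, ShallowCollection, IndependentCollection) to contrast PHP's default shallow clone with a class that defines __clone() to copy its nested query object, the technique a collection needs for its copies to stay separate.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins: a query object and two collections that wrap it.
final class DemoSelect
{
    /** @var string[] */
    public array $where = [];
}

class ShallowCollection
{
    public DemoSelect $select;

    public function __construct()
    {
        $this->select = new DemoSelect();
    }
}

final class IndependentCollection extends ShallowCollection
{
    // Copy the nested query object too, so the clone gets its own conditions.
    public function __clone()
    {
        $this->select = clone $this->select;
    }
}

// Default clone: both collections still share one DemoSelect instance.
$shallow = new ShallowCollection();
$shallowCopy = clone $shallow;
$shallowCopy->select->where[] = "type_id = 'configurable'";
var_dump(count($shallow->select->where)); // int(1): the filter leaked into the original

// With __clone(): each collection owns its own DemoSelect.
$original = new IndependentCollection();
$copy = clone $original;
$copy->select->where[] = "type_id = 'configurable'";
var_dump(count($original->select->where)); // int(0): the original is untouched
```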

$productCollection = $this->productCollectionFactory->create();

// Create a copy of the collection before adding any filters
$secondProductCollection = clone $productCollection;

// You can now modify the second collection without affecting the first collection
$simpleProducts = $productCollection->addAttributeToFilter('type_id', ['eq' => Type::TYPE_SIMPLE]);
$configurableProducts = $secondProductCollection->addAttributeToFilter('type_id', ['eq' => Configurable::TYPE_CODE]);

// You can also still refer to the original collection later on in your code

This is the approach the Magento core code takes in quite a few places, such as in the getSuggestedCategoriesJson function of the Magento\Catalog\Block\Adminhtml\Category\Tree class:

Magento\Catalog\Block\Adminhtml\Category\Tree
...
public function getSuggestedCategoriesJson($namePart)
{
    $storeId = $this->getRequest()->getParam('store', $this->_getDefaultStoreId());

    /* @var $collection Collection */
    $collection = $this->_categoryFactory->create()->getCollection();

    $matchingNamesCollection = clone $collection;
    $escapedNamePart = $this->_resourceHelper->addLikeEscape(
        $namePart,
        ['position' => 'any']
    );
    $matchingNamesCollection->addAttributeToFilter(
        'name',
        ['like' => $escapedNamePart]
    )->addAttributeToFilter(
        'entity_id',
        ['neq' => Category::TREE_ROOT_ID]
    )->addAttributeToSelect(
        'path'
    )->setStoreId(
        $storeId
    );

    $shownCategoriesIds = [];
    foreach ($matchingNamesCollection as $category) {
        foreach (explode('/', $category->getPath() ?: '') as $parentId) {
            $shownCategoriesIds[$parentId] = 1;
        }
    }

    $collection->addAttributeToFilter(
        'entity_id',
        ['in' => array_keys($shownCategoriesIds)]
    )->addAttributeToSelect(
        ['name', 'is_active', 'parent_id']
    )->setStoreId(
        $storeId
    );

    $categoryById = [
        Category::TREE_ROOT_ID => [
            'id' => Category::TREE_ROOT_ID,
            'children' => [],
        ],
    ];
    foreach ($collection as $category) {
        foreach ([$category->getId(), $category->getParentId()] as $categoryId) {
            if (!isset($categoryById[$categoryId])) {
                $categoryById[$categoryId] = ['id' => $categoryId, 'children' => []];
            }
        }
        $categoryById[$category->getId()]['is_active'] = $category->getIsActive();
        $categoryById[$category->getId()]['label'] = $category->getName();
        $categoryById[$category->getParentId()]['children'][] = & $categoryById[$category->getId()];
    }

    return $this->_jsonEncoder->encode($categoryById[Category::TREE_ROOT_ID]['children']);
}
...

In this code snippet, you'll notice that the category collection is cloned into the $matchingNamesCollection variable. This is done to provide the ability to run additional filtering on the collection object, while still retaining the ability to reference the original collection object later on in the code.

The same result could have been achieved by calling the create() method twice, as in our first solution. Using clone can be a bit more efficient, though, since it copies the existing collection object rather than asking the factory to build a second one from scratch.

Magento most likely chose this approach throughout core code to optimize performance, especially in functions that are called frequently, and to be able to reference the original collection object later on in the code.

Following Best Practices

One problem, three solutions 🤓. Which you choose is up to you!

All the above approaches work and are valid solutions, but it's important to understand the differences between them so that you can choose the best one for your specific situation. The "best practices" will be a bit fluid depending on the context.
