Magento is powered by some pretty impressive architectural components. These are the parts that allow Magento to scale as well as it does.
One of the design patterns that paves the way for this architecture is the collection factory. This is a powerful tool that allows Magento to retrieve data from the database in a very efficient way. But you may encounter some inconsistencies if you try to reuse a single collection created by that factory multiple times within a single function.
What's a Collection Factory in Magento?
Before we dive into the details, let's talk about what a collection factory is first. A collection can be thought of as a group of data entities organized in a specific way, almost like a data warehouse. These entities are neatly categorized and sorted so they can be easily found, rather than just being scattered around in a pile.
The factory's job is to create new instances of a collection through its create() method, and those instances are then used to retrieve data from the database. You can think of it as a massive library, where the data entities are represented by books. All the books are specifically categorized, which makes it easier to find the one you want.
In Magento, this goes a step further, as you can also use filters to narrow down your search based on attributes or fields of these entities.
Here's what fetching data from Magento's collection factory looks like:
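A minimal sketch of such a fetch, assuming a class that receives the generated product CollectionFactory through its constructor. The namespace and class name are made up for illustration, and "public visibility" is read here as "anything other than Not Visible Individually":

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical module namespace

use Magento\Catalog\Model\Product\Attribute\Source\Status;
use Magento\Catalog\Model\Product\Visibility;
use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

class VisibleProductIds // hypothetical class name
{
    private CollectionFactory $productCollectionFactory;

    public function __construct(CollectionFactory $productCollectionFactory)
    {
        $this->productCollectionFactory = $productCollectionFactory;
    }

    public function execute(): array
    {
        // create() returns a brand-new product collection.
        $productCollection = $this->productCollectionFactory->create();

        // Keep only enabled products...
        $productCollection->addAttributeToFilter('status', Status::STATUS_ENABLED);

        // ...that are visible somewhere (assumed meaning of "public").
        $productCollection->addAttributeToFilter(
            'visibility',
            ['neq' => Visibility::VISIBILITY_NOT_VISIBLE]
        );

        // Fetch just the entity IDs instead of loading full product models.
        return $productCollection->getAllIds();
    }
}
```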
The lines above will retrieve product IDs, but only for products that are enabled and publicly visible.
There's a Catch!
...well, kinda. The code above works perfectly, but you'll start to encounter problems when you try to reuse a single collection from the factory multiple times within the same function.
It's sort of like trying to put a hold at the library for two copies of the same book. The library probably won't let you do this, and if they do, there will be some confusion 🤷‍♂️.
Just like our library, Magento will get confused when you're trying to retrieve different items from the same collection multiple times. The primary issue is that a collection object holds on to its filters and loaded items after the first fetch, and nothing resets them before the next one. This can lead to duplicate data or incorrect results.
But we can still fetch multiple collection instances within the same process. There are a few different options, so let's go over each.
The Problem With Collections
Let's modify our example above a bit. In this updated code, we want to get all simple products and all configurable products, assign each set to its own variable, and then return both as the values of a keyed array:
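A sketch of what such a method might look like. The class name is hypothetical, and calling getItems() to load each result is an assumption about how the data is read:

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical module namespace

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

class ProductsByTypeReused // hypothetical class name
{
    private CollectionFactory $productCollectionFactory;

    public function __construct(CollectionFactory $productCollectionFactory)
    {
        $this->productCollectionFactory = $productCollectionFactory;
    }

    public function execute(): array
    {
        // A single collection instance, used for both fetches.
        $productCollection = $this->productCollectionFactory->create();

        // First fetch: filter to simple products and load them.
        $simpleProducts = $productCollection
            ->addFieldToFilter('type_id', 'simple')
            ->getItems();

        // Second fetch on the SAME object: the 'simple' filter and the
        // already-loaded items are still in place at this point.
        $configurableProducts = $productCollection
            ->addFieldToFilter('type_id', 'configurable')
            ->getItems();

        return [
            'simple' => $simpleProducts,
            'configurable' => $configurableProducts,
        ];
    }
}
```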
Looks correct, right? Well, it's not.
While the $configurableProducts variable is meant to hold only configurable products, it won't be a clean list: the filter and results from the simple-product fetch carry over into it!
This is because the single $productCollection variable is used for both fetches, and Magento doesn't reset the collection's query or its loaded items before the second one.
So at this point, you're probably a bit stumped on how to handle this 😅. But there are actually a few potential solutions that can properly resolve this situation, and we'll go over each one.
Solution 1: Create a New Instance of the Collection
The first solution is to create a second instance of the collection. This basically creates a completely new instance from the factory class, which you can then use to fetch a second collection without any issues.
Since it's a completely separate instance, changes to the first collection will not affect the second collection.
Let's rewrite our existing code:
...by adding in a second create() call, which will create a new instance of the collection from the factory:
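One way this could look, as a sketch with a hypothetical class name. Each query gets its own collection straight from the factory:

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical module namespace

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

class ProductsByTypeFresh // hypothetical class name
{
    private CollectionFactory $productCollectionFactory;

    public function __construct(CollectionFactory $productCollectionFactory)
    {
        $this->productCollectionFactory = $productCollectionFactory;
    }

    public function execute(): array
    {
        // One collection per query: every create() call starts from a clean state.
        $simpleProducts = $this->productCollectionFactory->create()
            ->addFieldToFilter('type_id', 'simple')
            ->getItems();

        $configurableProducts = $this->productCollectionFactory->create()
            ->addFieldToFilter('type_id', 'configurable')
            ->getItems();

        return [
            'simple' => $simpleProducts,
            'configurable' => $configurableProducts,
        ];
    }
}
```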
This is the most straightforward solution and is usually the recommended approach, since it's very simple and easy to understand.
Solution 2: Clear and Reset the Collection
The reason we needed to create a new instance of the collection from the factory is that Magento doesn't provide a single official method that fully resets a collection. But Magento's database layer is built on top of Zend Framework, whose select object does provide a way to reset parts of a query.
The first step is to clear the collection by calling the clear() method directly on the collection object. This discards the items that were already loaded and marks the collection as not loaded, so it will run its query again the next time it is read. The filters already applied to its query stay in place:
But, we're not yet done. There's a second step involved in this approach, and that is to reset the query linked to the collection.
Note that this is done by first calling getSelect() to retrieve the select statement object, and then calling the reset() method, passing in the \Zend_Db_Select::WHERE constant. This removes the WHERE conditions from the query, including the filter left over from the first fetch:
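Putting the two steps together, a sketch might look like the following. The class name is hypothetical, and the example assumes the only WHERE condition on the query is the type_id filter added by the first fetch:

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical module namespace

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

class ProductsByTypeReset // hypothetical class name
{
    private CollectionFactory $productCollectionFactory;

    public function __construct(CollectionFactory $productCollectionFactory)
    {
        $this->productCollectionFactory = $productCollectionFactory;
    }

    public function execute(): array
    {
        $productCollection = $this->productCollectionFactory->create();

        $simpleProducts = $productCollection
            ->addFieldToFilter('type_id', 'simple')
            ->getItems();

        // Step 1: drop the loaded items so the collection queries again.
        $productCollection->clear();

        // Step 2: remove the WHERE conditions left over from the first fetch.
        $productCollection->getSelect()->reset(\Zend_Db_Select::WHERE);

        $configurableProducts = $productCollection
            ->addFieldToFilter('type_id', 'configurable')
            ->getItems();

        return [
            'simple' => $simpleProducts,
            'configurable' => $configurableProducts,
        ];
    }
}
```

Keep in mind that resetting WHERE removes every condition on the query, not just the last one, so any filter that should apply to both fetches has to be added again after the reset.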
Now the $configurableProducts variable will only contain configurable products, and not simple products.
Since this approach requires two method calls, it is more prone to human error, so you need to be sure to call both clear() and reset() for it to work properly.
For this reason, the first solution is usually recommended as it's more straightforward. However, this clear & reset approach only requires a single instance from the collection factory, which could be useful and more performant in some situations.
Solution 3: Use PHP's clone Keyword
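There is a third method that is not too commonly known, but is actually used throughout the Magento core code. It is to use the clone keyword to create a copy of the collection instance. Keep in mind that PHP's clone is shallow by default: nested objects are only copied if the class implements __clone().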
This approach lets you copy an object while keeping the two copies independent of each other, so that each can be modified further.
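How independent the copies really are depends on how the class handles cloning. The plain-PHP sketch below uses stand-in classes (none of them are Magento or Zend classes) to show the difference between the default shallow clone and a class that copies its nested query object in __clone(). Magento's database-backed collections appear to take the second route by cloning their select object, though that is worth confirming against the version you run:

```php
<?php
declare(strict_types=1);

// Stand-in for a query object; not a Magento or Zend class.
final class DemoSelect
{
    public array $where = [];
}

// Default clone behavior: the nested DemoSelect is shared between copies.
class ShallowCollection
{
    public DemoSelect $select;

    public function __construct()
    {
        $this->select = new DemoSelect();
    }
}

// With __clone(): each copy gets its own DemoSelect.
class IndependentCollection extends ShallowCollection
{
    public function __clone()
    {
        $this->select = clone $this->select;
    }
}

$shallowOriginal = new ShallowCollection();
$shallowCopy = clone $shallowOriginal;
$shallowCopy->select->where[] = "type_id = 'configurable'";
// The condition also shows up on $shallowOriginal, because both share one DemoSelect.

$independentOriginal = new IndependentCollection();
$independentCopy = clone $independentOriginal;
$independentCopy->select->where[] = "type_id = 'configurable'";
// $independentOriginal->select->where is still empty.
```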
Using clone also eliminates the need to create a second instance of the collection from the factory. This may make it the preferred approach in some very specific situations where optimal performance is absolutely critical.
It could also be useful if you would like to modify the collection object, but still be able to refer to the original collection instance later on in your code.
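Applied to the simple and configurable products example, a sketch could look like this. The class name is hypothetical, selecting the name attribute is just a stand-in for shared setup, and the code assumes that cloning the collection gives each copy its own select object:

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical module namespace

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

class ProductsByTypeCloned // hypothetical class name
{
    private CollectionFactory $productCollectionFactory;

    public function __construct(CollectionFactory $productCollectionFactory)
    {
        $this->productCollectionFactory = $productCollectionFactory;
    }

    public function execute(): array
    {
        // Shared setup that both queries need. This collection is never
        // loaded itself, so the clones start without any loaded items.
        $baseCollection = $this->productCollectionFactory->create();
        $baseCollection->addAttributeToSelect('name');

        // Clone before filtering so each copy adds its own condition.
        $simpleCollection = clone $baseCollection;
        $simpleCollection->addFieldToFilter('type_id', 'simple');

        $configurableCollection = clone $baseCollection;
        $configurableCollection->addFieldToFilter('type_id', 'configurable');

        return [
            'simple' => $simpleCollection->getItems(),
            'configurable' => $configurableCollection->getItems(),
        ];
    }
}
```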
This is the approach the Magento core code takes in quite a few places, such as in the getSuggestedCategoriesJson method of the Magento\Catalog\Block\Adminhtml\Category\Tree class:
In this code snippet, you'll notice that the category collection is cloned into the $matchingNamesCollection variable. This makes it possible to run additional filtering on the cloned collection, while still being able to reference the original collection object later on in the code.
The same result could also have been achieved by calling the create() method twice, as in our first solution. However, using clone is a bit more efficient since it does not have to go back through the factory to build a second collection.
Magento most likely chose to use this approach throughout core code in order to optimize performance, especially with functions that are called frequently, and to be able to reference the original collection object later on in the code.
Following Best Practices
One problem, three solutions 🤓. Which you choose is up to you!
All the above approaches work and are valid solutions, but it's important to understand the differences between them so that you can choose the best one for your specific situation. The "best practices" will be a bit fluid depending on the context.