Magento – Products showing in the wrong category

I came across something odd a while back, and until recently not had time to investigate the issue.

Products were showing up in categories that they were not explicitly in.

In summary, when you reindex catalog_category_product Magento goes and looks at all the products and the categories that they are in. For the categories that the product are in, it gets the category tree of that category.

So consider this category hierarchy:

  1. Category A
    1. Sub Category A 1
      1. Bottom Category 1
      2. Bottom Category 2
    2. Sub Category A 2
      1. Bottom Category 3
      2. Bottom Category 4
    3. Sub Category A 3
      1. Bottom Category 5
      2. Bottom Category 6
  2. Category B
    1. Sub Category B 1
      1. Bottom Category 7
      2. Bottom Category 8
    2. Sub Category B 2
      1. Bottom Category 9
      2. Bottom Category 10
    3. Sub Category B 3
      1. Bottom Category 11
      2. Bottom Category 12

Now you create a product and put it in Bottom Category 9 and Bottom Category 11. When you reindex it’s put into the list to show up in the category and the parents on that tree, so Category B and Category B 2, and so on for Bottom Category 11.

Now over time we add more products and more categories, and several months later we decide to rearrange our category tree, as we’ve gotten too wide and our needs have changed, or some other reason. Say to something like this, a strikethru indicates a disabled category (as in is_active is 0 (it’s a boolean integer))

  1. Category A
    1. Sub Category A 1
      1. Bottom Category 1
      2. Bottom Category 2
      3. Bottom Category 3
      4. Bottom Category 4
    2. Sub Category A 2
    3. Sub Category A 3
      1. Bottom Category 5
      2. Bottom Category 6
  2. Category B
    1. Sub Category B 1
      1. Bottom Category 7
      2. Bottom Category 8
    2. Sub Category B 2
      1. Bottom Category 9
      2. Bottom Category 10
    3. Sub Category B 3
      1. Bottom Category 11
      2. Bottom Category 12

Now of course we reindex, and the problems start.

The Problem

We have a problem, the Product correctly shows in Bottom Category 11 and it’s parents, but it is still showing up in Sub Category A 1 and it’s parents in error.

It seems that the reindexer takes categories that are disabled (as in is_active is false), and their tree, into account when building the list of category pages that a product can be displayed on.

So in order to fix this I went deep into Magento core to go and see which query was at fault. Turns out that it’s not exactly broken, it’s just Magento doesn’t check for it a give category is active or not when it builds the list.

The query at fault is:

INSERT INTO `catalog_category_anc_products_index_idx` (`category_id`, `product_id`, `position`) SELECT STRAIGHT_JOIN DISTINCT `ca`.`category_id`, `cp`.`product_id`, MIN(IF(ca.category_id = ce.entity_id, `cp`.`position`, (`ce`.`position` + 1) * (`ce`.`level` + 1 * 10000) + `cp`.`position`)) AS `position` FROM `catalog_category_anc_categs_index_idx` AS `ca`
 INNER JOIN `catalog_category_entity` AS `ce` ON `ce`.`path` LIKE `ca`.`path` OR ce.entity_id = ca.category_id
 INNER JOIN `catalog_category_product` AS `cp` ON cp.category_id = ce.entity_id
 INNER JOIN `catalog_category_product_index_enbl_idx` AS `pv` ON pv.product_id = cp.product_id GROUP BY `ca`.`category_id`,
	`cp`.`product_id`

As you can see it just collects from catalog_category_entity, which has a table structure of:

mysql> describe catalog_category_entity;
+------------------+----------------------+------+-----+---------+----------------+
| Field            | Type                 | Null | Key | Default | Extra          |
+------------------+----------------------+------+-----+---------+----------------+
| entity_id        | int(10) unsigned     | NO   | PRI | NULL    | auto_increment |
| entity_type_id   | smallint(5) unsigned | NO   |     | 0       |                |
| attribute_set_id | smallint(5) unsigned | NO   |     | 0       |                |
| parent_id        | int(10) unsigned     | NO   |     | 0       |                |
| created_at       | timestamp            | YES  |     | NULL    |                |
| updated_at       | timestamp            | YES  |     | NULL    |                |
| path             | varchar(255)         | NO   | MUL | NULL    |                |
| position         | int(11)              | NO   |     | 0       |                |
| level            | int(11)              | NO   | MUL | 0       |                |
| children_count   | int(11)              | NO   |     | 0       |                |
+------------------+----------------------+------+-----+---------+----------------+
10 rows in set (0.00 sec)

As you can see that table structure is somewhat limited. Since this is the bare bones basics of a category. And thus the query to collect categories of a product is in, considers inactive categories as if doesn’t have this information.

The Solution

The fix here is simple. Find where the query is defined and take into account the is_active category attribute, and we’ll do it the Magento way. Rather than hardcoding entity_id’s.

A quick note about how I found the query. I opened up public_html/lib/Varien/Db/Adapter/Pdo/Mysql.php and changed the variable $_debug from false to true, it’s around about line 103. This will output all the SQL queries to public_html/var/debug/pdo_mysql.log.

After that I just grep’ed the source catalog_category_anc_products_index_idx which told me where the table was defined in a xml config, and then grep’ed again for the internal reference name, (there was much much grep’ing for bit’s and pieces).

The query runs under the reindexAll function of Mage_Catalog_Model_Resource_Category_Indexer_Product. So basically take the ENTIRE contents of this function and setup a Model Resource override, and fix the query.

Module Basics

Creating our Module in the directory local/Company/Catalog

Normally I wouldn’t cover this, but we are overriding a resource which is a little different from just overriding a model.

etc/config.xml
<?xml version="1.0"?>
<config>
    <modules>
        <Company_Catalog>
            <version>0.0.1</version>
        </Company_Catalog>
    </modules>

    <global>
        <models>
            <companycatalog>
                <class>Company_Catalog_Model</class>
            </companycatalog>

            <catalog_resource_eav_mysql4>
                <rewrite>
                    <category_indexer_product>Company_Catalog_Model_Resource_Category_Indexer_Product</category_indexer_product>
                </rewrite>
            </catalog_resource_eav_mysql4>
        </models>
    </global>
</config>
Model/Resource/Category/Indexer/Product.php

Literally just a copy of the contents of reindexAll with some modifications. Starting at // Query Correction and ending at // end Query Correction.

<?php

class Company_Catalog_Model_Resource_Category_Indexer_Product extends Mage_Catalog_Model_Resource_Category_Indexer_Product
{
    /**
     * Rebuild all index data
     *
     * @return Mage_Catalog_Model_Resource_Category_Indexer_Product
     */
    public function reindexAll()
    {
        $this->useIdxTable(true);
        $this->beginTransaction();
        try {
            $this->clearTemporaryIndexTable();
            $idxTable = $this->getIdxTable();
            $idxAdapter = $this->_getIndexAdapter();
            $stores = $this->_getStoresInfo();
            /**
             * Build index for each store
             */
            foreach ($stores as $storeData) {
                $storeId    = $storeData['store_id'];
                $websiteId  = $storeData['website_id'];
                $rootPath   = $storeData['root_path'];
                $rootId     = $storeData['root_id'];
                /**
                 * Prepare visibility for all enabled store products
                 */
                $enabledTable = $this->_prepareEnabledProductsVisibility($websiteId, $storeId);
                /**
                 * Select information about anchor categories
                 */
                $anchorTable = $this->_prepareAnchorCategories($storeId, $rootPath);
                /**
                 * Add relations between not anchor categories and products
                 */
                $select = $idxAdapter->select();
                /** @var $select Varien_Db_Select */
                $select->from(
                    array('cp' => $this->_categoryProductTable),
                    array('category_id', 'product_id', 'position', 'is_parent' => new Zend_Db_Expr('1'),
                        'store_id' => new Zend_Db_Expr($storeId))
                )
                ->joinInner(array('pv' => $enabledTable), 'pv.product_id=cp.product_id', array('visibility'))
                ->joinLeft(array('ac' => $anchorTable), 'ac.category_id=cp.category_id', array())
                ->where('ac.category_id IS NULL');

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );
                $idxAdapter->query($query);

                /**
                 * Assign products not associated to any category to root category in index
                 */

                $select = $idxAdapter->select();
                $select->from(
                    array('pv' => $enabledTable),
                    array(new Zend_Db_Expr($rootId), 'product_id', new Zend_Db_Expr('0'), new Zend_Db_Expr('1'),
                        new Zend_Db_Expr($storeId), 'visibility')
                )
                ->joinLeft(array('cp' => $this->_categoryProductTable), 'pv.product_id=cp.product_id', array())
                ->where('cp.product_id IS NULL');

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );
                $idxAdapter->query($query);

                /**
                 * Prepare anchor categories products
                 */
                $anchorProductsTable = $this->_getAnchorCategoriesProductsTemporaryTable();
                $idxAdapter->delete($anchorProductsTable);

                $position = 'MIN('.
                    $idxAdapter->getCheckSql(
                        'ca.category_id = ce.entity_id',
                        $idxAdapter->quoteIdentifier('cp.position'),
                        '('.$idxAdapter->quoteIdentifier('ce.position').' + 1) * '
                        .'('.$idxAdapter->quoteIdentifier('ce.level').' + 1 * 10000)'
                        .' + '.$idxAdapter->quoteIdentifier('cp.position')
                    )
                .')';

                // Query Correction

                // load the is_active category attribute
                $attributeModel = Mage::getModel('eav/entity_attribute')->loadByCode(Mage_Catalog_Model_Category::ENTITY, 'is_active');
                $attributeTable = 'catalog_category_entity_' . $attributeModel->getBackendType();

                $select = $idxAdapter->select()
                ->useStraightJoin(true)
                ->distinct(true)
                ->from(array('ca' => $anchorTable), array('category_id'))
                ->joinInner(
                    array('ce' => $this->_categoryTable),
                    $idxAdapter->quoteIdentifier('ce.path') . ' LIKE ' .
                    $idxAdapter->quoteIdentifier('ca.path') . ' OR ce.entity_id = ca.category_id',
                    array()
                )

                ->joinInner(
                    array('cp' => $this->_categoryProductTable),
                    'cp.category_id = ce.entity_id',
                    array('product_id')
                )

                // left join the attribute
                ->joinInner(
                    array('ccei' => $attributeTable),
                    'ccei.entity_id = cp.category_id',
                    array()//select nothing
                )

                ->joinInner(
                    array('pv' => $enabledTable),
                    'pv.product_id = cp.product_id',
                    array('position' => $position)
                )

                ->where('ccei.attribute_id=' . $attributeModel->getAttributeId())//is_active
                ->where('ccei.value=1')

                ->group(array('ca.category_id', 'cp.product_id'));
                $query = $select->insertFromSelect($anchorProductsTable,
                    array('category_id', 'product_id', 'position'), false);
                $idxAdapter->query($query);

                // end Query Correction

                /**
                 * Add anchor categories products to index
                 */
                $select = $idxAdapter->select()
                ->from(
                    array('ap' => $anchorProductsTable),
                    array('category_id', 'product_id',
                        'position', // => new Zend_Db_Expr('MIN('. $idxAdapter->quoteIdentifier('ap.position').')'),
                        'is_parent' => $idxAdapter->getCheckSql('cp.product_id > 0', 1, 0),
                        'store_id' => new Zend_Db_Expr($storeId))
                )
                ->joinLeft(
                    array('cp' => $this->_categoryProductTable),
                    'cp.category_id=ap.category_id AND cp.product_id=ap.product_id',
                    array()
                )
                ->joinInner(array('pv' => $enabledTable), 'pv.product_id = ap.product_id', array('visibility'));

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );
                $idxAdapter->query($query);

                $select = $idxAdapter->select()
                    ->from(array('e' => $this->getTable('catalog/product')), null)
                    ->join(
                        array('ei' => $enabledTable),
                        'ei.product_id = e.entity_id',
                        array())
                    ->joinLeft(
                        array('i' => $idxTable),
                        'i.product_id = e.entity_id AND i.category_id = :category_id AND i.store_id = :store_id',
                        array())
                    ->where('i.product_id IS NULL')
                    ->columns(array(
                        'category_id'   => new Zend_Db_Expr($rootId),
                        'product_id'    => 'e.entity_id',
                        'position'      => new Zend_Db_Expr('0'),
                        'is_parent'     => new Zend_Db_Expr('1'),
                        'store_id'      => new Zend_Db_Expr($storeId),
                        'visibility'    => 'ei.visibility'
                    ));

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );

                $idxAdapter->query($query, array('store_id' => $storeId, 'category_id' => $rootId));
            }

            $this->syncData();

            /**
             * Clean up temporary tables
             */
            $this->clearTemporaryIndexTable();
            $idxAdapter->delete($enabledTable);
            $idxAdapter->delete($anchorTable);
            $idxAdapter->delete($anchorProductsTable);
            $this->commit();
        } catch (Exception $e) {
            $this->rollBack();
            throw $e;
        }
        return $this;
    }
}

The fix in depth

The first thing done here is to Magento Safe load the Category Attribute details for is_active.

                // load the is_active category attribute
                $attributeModel = Mage::getModel('eav/entity_attribute')->loadByCode(Mage_Catalog_Model_Category::ENTITY, 'is_active');
                $attributeTable = 'catalog_category_entity_' . $attributeModel->getBackendType();

After this join,

                ->joinInner(
                    array('cp' => $this->_categoryProductTable),
                    'cp.category_id = ce.entity_id',
                    array('product_id')
                )

we need to add another join, in this case we are joining to the table that contains the is_active information, it’s usually catalog_category_entity_int but we Magento Safe loaded it above.

                // left join the attribute
                ->joinInner(
                    array('ccei' => $attributeTable),
                    'ccei.entity_id = cp.category_id',
                    array()//select nothing
                )

Finally we need to adjust to add a where to clean up the join and select active categories only, after the last join and before the group function call:

                ->where('ccei.attribute_id=' . $attributeModel->getAttributeId())//is_active
                ->where('ccei.value=1')

This changes the query to:

INSERT INTO `catalog_category_anc_products_index_idx` (`category_id`, `product_id`, `position`) SELECT STRAIGHT_JOIN DISTINCT `ca`.`category_id`, `cp`.`product_id`, MIN(IF(ca.category_id = ce.entity_id, `cp`.`position`, (`ce`.`position` + 1) * (`ce`.`level` + 1 * 10000) + `cp`.`position`)) AS `position` FROM `catalog_category_anc_categs_index_idx` AS `ca`
 INNER JOIN `catalog_category_entity` AS `ce` ON `ce`.`path` LIKE `ca`.`path` OR ce.entity_id = ca.category_id
 INNER JOIN `catalog_category_product` AS `cp` ON cp.category_id = ce.entity_id
 INNER JOIN `catalog_category_entity_int` AS `ccei` ON ccei.entity_id = cp.category_id
 INNER JOIN `catalog_category_product_index_enbl_idx` AS `pv` ON pv.product_id = cp.product_id WHERE (ccei.attribute_id=119) AND (ccei.value=1) GROUP BY `ca`.`category_id`,
	`cp`.`product_id`

And Finally

Now with this module in place (remember to activate it in etc/modules) clear your cache and reindex. And hey presto, the product is no longer showing up in a category that is a parent of a category that the product is in that is now disabled. (Bit of a mouthful that)

In Summary

Much nicer, and not too much of a performance hit really.

Overall it means your products only show in the right active categories.

Generally it is only a issue when you disable a category and don’t delete it and then rearrange things leaving remnants of the old category tree floating about, so basically anything a common user would do.

I don’t tend to delete stuff myself, in case things get reactivated, and I’ll reuse the records.

Hopefully this is of help, rather than deleting records!

Mac, OBS and Streaming Audio Right

OBS has for some time been working on it’s new version of OBS which is called OBS studio, and is cross platform. Finally we are free from having to faff about with WireCast, we can now just plug and play and Stream away. BUT WAIT it’s not just that simple!

If you ever tried Minecraft’s in game streaming it will at least (force) prompt you to download additional required software to get you going.

This software (whilst not needed, (well it’s not but you are a idiot if you don’t setup your audio properly)) will get your Audio going properly on OBS streaming when you are not using additional Hardware.

This piece of software is called Soundflower, and is an example of Virtual Audio Cabling, which basically creates a software based Sound Card inside your computer in order to route audio more usefully!

Installing

So first things first, grab yourself a copy of OBS Studio, you can find the latest version on GitHub here. Download and install!

Next we’ll grab a copy of SoundFlower which is on their site here or GitHub here.

SoundFlower will need you to restart your Computer.

Getting Setup

Lets fire up OBS Studio:

OBS 0.9.1

I’ve got a few extra scenes setup, but this is the layout you’ll be presented with!

We are going to add a source, (first make a scene of course) in this case a “Audio Output Capture”

Audio Selection

And then, to keep things simple, just select the SoundFlower 2 Channel. We don’t need to do anything wacky.

Input Select

Finally we need to change the Sound Settings under System Preferences. So open System Preferences and Sound. And select Soundflower (2ch) as the Output device.

System Prefs Sound

Now suddenly everything will go quiet as all your audio is being sent to OBS/Soundflower: (it’s good to check that OBS is actually detecting at this point)

Audio Monitoring

We need to route the Audio from your computer via Soundflower back to Speakers/headphones. For this open SoundflowerBed:

SoundFlowerBed

It’s under Applications/Soundflower, this fires up a Menu Item in the Top Dock

Top Menu

Just click that, and send the Soundflower (2ch) back to “Built in Output/headphones” And now both you and OBS can hear everything.

Soundflower Routing

Your computer’s audio is now sent to this device and captured by OBS.

(You can tell as my mic here is picking up my speakers):

Preview

And there we go 🙂

BONUS

(Billy May’s Voice) BUT WAIT THERES MORE!

Xbox 360 Controller on Mac under Yosemite?

Unofficial Release 0.14 works fine and it’s on GitHub here

BONUS BONUS

Now what do you do if you get Beachball of Death when trying to use the SoundflowerBed Icon from the menu?

First open Terminal (sorry).
Then type:

ps aux | grep Soundflowerbed.app

Then you’ll see something like this:

SoundflowerBed Hunt

Then type:

kill 9331

Where 9331 is replaced with the number here:

SoundflowerBed Kill

It is VERY important you kill the right thing. As otherwise you might break something and have to restart the whole computer.

Once you have kill’ed SoundflowerBed, just reopen and all should be good.

Further Reading: Some Tips from Tuts+

Magento – The missing view type!

Magento has four options when it comes to display products on your e-commerce website:

  1. Not Visible Individually
  2. Catalog
  3. Search
  4. Catalog, Search

Now, normally that about covers most use cases. But if you have configurable products it is somewhat more common place to set the children to Not Visible Individually.

This is great and all, but what if you are using Google Shopping real-time attribute/schema scraping which requires the Child to be visible individually to read those tags (and can be handy for direct to a page when SKU searching on the front end), and you don’t want the children to show in either your category or search grid pages?

This is where an additional catalog visibility type comes in useful, and it is actually very straight forward to achieve!

First setup a basic Magento module/plugin, which I won’t bore you with the complete details of, but we are binding to a model of the Mage Catalog Module under a rewrite!

Set up your

local/Company/Catalog/etc/config.xml

as follows:

<?xml version="1.0"?>
<config>
    <modules>
        <Company_Catalog>
            <version>0.0.1</version>
        </Company_Catalog>
    </modules>
    <global>
        <models>
            <catalog>
                <rewrite>
                    <product_visibility>Company_Catalog_Model_Product_Visibility</product_visibility>
               </rewrite>
            </catalog>
        </models>
    </global>
</config>

Fairly straight forward here, setup a basic module and override Mage/Catalog/Product/Visibility.

Next setup the Model

local/Company/Catalog/Model/Product/Visibility.php
class Company_Catalog_Model_Product_Visibility extends Mage_Catalog_Model_Product_Visibility {
    const VISIBILITY_PAGE = 5;
    
    public function getVisibleInSiteIds()
    {
        return array(self::VISIBILITY_IN_SEARCH, self::VISIBILITY_IN_CATALOG, self::VISIBILITY_BOTH, self::VISIBILITY_PAGE);
    }
    
    static public function getOptionArray()
    {
        return array(
            self::VISIBILITY_NOT_VISIBLE=> Mage::helper('catalog')->__('Not Visible Individually'),
            self::VISIBILITY_IN_CATALOG => Mage::helper('catalog')->__('Catalog'),
            self::VISIBILITY_IN_SEARCH  => Mage::helper('catalog')->__('Search'),
            self::VISIBILITY_BOTH       => Mage::helper('catalog')->__('Catalog, Search'),
            self::VISIBILITY_PAGE       => Mage::helper('catalog')->__('Own Page Only')
        );
    }
    
    /**
     * Retrieve all options
     *
     * @return array
     */
    static public function getAllOption()
    {
        $options = self::getOptionArray();
        array_unshift($options, array('value'=>'', 'label'=>''));
        return $options;
    }
    
    /**
     * Retireve all options
     *
     * @return array
     */
    static public function getAllOptions()
    {
        $res = array();
        $res[] = array('value'=>'', 'label'=> Mage::helper('catalog')->__('-- Please Select --'));
        foreach (self::getOptionArray() as $index => $value) {
            $res[] = array(
               'value' => $index,
               'label' => $value
            );
        }
        return $res;
    }
    
    /**
     * Retrieve option text
     *
     * @param int $optionId
     * @return string
     */
    static public function getOptionText($optionId)
    {
        $options = self::getOptionArray();
        return isset($options[$optionId]) ? $options[$optionId] : null;
    }
}

First of all we add a new Constant, logically named

VISIBILITY_PAGE

and we give it the next free “id” of 5. (Check core/Mage/Catalog/Model/Product/Visibility.php for the base list).

Next the important function

getVisibleInSiteIds

tells Magento which Products to allow to be shown individually, so we just add our constant to the array here.

All the other functions deal with rendering the new Visibility type in the admin interface for editing products (as otherwise the parent functions are used and they can’t “see” the new visibility type).

As per usual when you add/edit a new module make sure to clear the cache accordingly!

That is it all done, short and sweet!

Fun with WP.me and Twitter Cards

I was trying to figure out why my Twitter Cards weren’t working when I was tinkering with someone else’s WordPress install.

Turns out at some point Twitter decided to follow the WP.me redirect and grab the page behind is differently.

So, now you need to whitelist your domain rather than WP.me.

To do that pop along to The Card Validator with one of your none WP.me Blog Post URL’s in hand and pop it in the validator box.

twitter_card_validation

Click request approval, fill in the little form and wack submit! It might throw an error but retest your page and all should be good!

pending

Give it a few minutes:

waiting

And Finally:

finally

A working Tweet!

Updated like so:

Barry_Carlyon_on_Twitter___Fun_with_WP_me_and_Twitter_Cards_http___t_co_WYcrgwtMgG_

A Week of Commuting

So today ends my first week of my Manchester based job at Fred Aldous. Currently this means I’ve been commuting to and from Manchester every day.

Currently my journey goes

  • My House
  • Burley Park
  • Leeds train station
  • Manchester Piccadilly
  • Starbucks
  • Short walk

And of course the reverse!

So far it’s been difficult to get a seat but I don’t mind standing since it is some form of exercise 😛

Anyway, what really annoys me is when a train arrives at a station and people gravitate towards the doors.

They do it in such a way so that it makes it difficult for people to get off. Which aside from being a pain in the arse do them, the moment there is a gap people start piling on, and of course at peak times there is always a person or two stuck in the aisle trying to get off whilst people are piling on the train.

Inevitably this causes delays as the piling on lot have to stop and clear a route for those getting off to get off.

/Begin Rant/
Why can’t people just take a step or to back and not be in such a rush! If you let people get off and just give everyone some space. I’m sure it will be better for everyone involved and potentially help keep the trains on time at these peak times!
/End Rant/

Anyway my first week hasn’t been to bad. But more on my new job in another post. Currently standing on a train! Trying to publish this blog post in the little windows of coverage I get! And we appear to be following a slow moving train!