How does Twitch’s new EventSub work?

Preamble

In the beginning, Twitch created Webhooks. And the world rejoiced, finally we no longer have to long poll for things such as followers! Hurrah! But it came with a gotcha, since Twitch Webhooks is based on the Websub specification, each and every Webhook you wanted to listen to had to be renewed, now this is limited to 10 days (max) or the time remainging on the Token you authenticate with (when asking to listen to priviledge topics such as subscribers), which generally meant you had to remake certain subscriptions every 4 hours, after of course renewing the token with the refresh token. Since the maximum length of a Twitch user token is 4 hours. (Implicit tokens use 60 days, but you can’t refresh those).

This wasn’t a good solution for developers, since it creates siginificant load, and “wasting” of your rate limits with Helix. Even more so when working with multiiple streamers.

Twitch has acknowledged this as a problem, which leads us to the new product of EventSub.

Aside from ignoring the need to remake your subscriptions periodically, EventSub also doesn’t need a correspoding Helix endpoint to exist for the topic you wish to listen on (this also means a different format for the data payloads), and means EventSub can make new topic types without waiting for Helix to create the endpoint first.

So, now onwards to the meat of this post!

EventSub, hows does it work?

The long and short of it, is that everything is sent via HTTP Post requests, to your SSL protected endpoint.

Unlike Webhooks, Eventsub only authenticates using App Access Tokens (for server to server requests), but how does this work when you are attempting subscribe to a channels new Subscribers topic, or the Ban/timeout events topics?

Authentication

Well, when you make a subscription request to EventSub, Twitch looks at your App Access token, then checks in the background if the requested broadcaster has connected to your Application at any point, with the relevant scopes, and not revoked that connection.

So, it’s a “two legged” approach to authentication. But means that you, the developer, don’t need to store the broadcasters access or refresh tokens, after they have authorised. though you generally would in order to perform “catch up” if your application goes offline/restarts for any reason.

Infrastructure

So to utilise EventSub, you need to create a portal that allows broadcasters to grant access to your Application to their account with the relevant scopes, using “regular” User Authentication.

Then on authentication you check/create eventsub subscriptions as needed, using your App Access Token. You should also store and use the returned User Acces token for use on catchups

And you need a server that can recieve HTTP Posts from Twitch for verification of the connection and accepting data payloads. (Generlly here you’ll HTTP 2xx OK as quickly as possible and send the data into background processes.

TLDR: for example if you can call Get Broadcaster Subscriptions (at the moment you obtained the Access Token) then your ClientID can subscribe to the channel.subscribe EventSub.

Summary

Hopefully this should help out anyone that is new to EventSub or getting involved with the Twitch Channel Points Hackathon.

Since many people may look at EventSub to recieve new Channel point redemptions but might trip up over the initial authentication setup.

There are a number of examples on my GitHub that might help, but if you need any further help, please join us on the TwitchDev Discord Server!

Just thought I would write up this post with a ramble of notes on how EventSub Authentication works, since theres gonna be a spike in people asking and probably getting stuck if they are new to working with Twitch.

Twitch API Examples

I spend a lot of time on the Twitch Developer forums and Discord helping out other third party developers. That among other things led to me being asked to become a Twitch Ambassador, which is probably a story for another post.

As part of spending a lot of time helping of Forums/Discord, it become useful to write up some examples in various languages for people to refer to, since some people prefer code examples over documentation, and it’s easier to demonstrate how to tie multiple calls/endpoints together for the desired result.

To that end my GitHub Repo at barrycarlyon/twitch_misc now exists and holds examples from Authentication flows (from Implicit to server access and regular user in-between), extension config/pubsub, and examples for Webhooks and the new Eventsub (which is worth a look!). So if you are looking for some examples do checkout the Repository. Some of the examples can even be tested on GitHub itself via GitHub pages, the examples available are listed in the readme and at the Github Pages site.

Twitch also recently made the requirement that all calls to helix (aka the New API) need to be Authenticated using a Bearer, which made it difficult for Extensions to get the viewers details. So to that end I created a basic example of how to do that in an Extension with a “User Profile Extension” example. Which is at BarryCarlyon/twitch_profile_extension. So this covers a good way to handle that flow.

Right now most of the examples are nodeJS, or PHP, but there are some in Python kicking about!

I’ll be looking at adding more examples and other examples in other languages as we go!

I’m usually really bad at commenting my code as I prefer reading the code, but I made a conscious effort to add useful code comments on these repos!

Google oAuth and offline access

Been doing a lot of various stuff and things for CohhCarnage and some of that stuff has involved building an achievements tracking system for the website.

One of those achievements, is for YouTube Subscription. Where the achievement is awarded to the logged in user, if the user has subscribed to a given YouTube channel, in this case Cohh’s YouTube.

In order to make sure that people can’t “cheat” the system, we ask them to link their Google/YouTube account with the website and use the relevant API to look up their Subscription status.

Initially this worked fine, but I ran into some issues where the oAuth token stored has expired and thus I can’t do a status check, for cases where the user links their YouTube to their Cohhilition Account then doesn’t subscribe on YouTube until after 24 hours later (or some caching issue with Google).

So, the simple fix using Googles PHP Library for oAuth’ing is to just do a

<?php

$client->setAccessType('offline')

Now, this works fine for the most part, you happily get a refresh token, and can thus renew your token.

Then comes a hiccup, if for whatever reason you have offline access type on, and the user has previously authorised the application and it’s offline permission, you DON’T get a refresh token in some cases. Some user cases include:

  • you’ve lost their token,
  • or got a bad one
  • or the user managed to find the authentication loop (again) when they shouldn’t, and thus a new code/token combo is generated

Normally you are using something like:

<?php

$client = new Google_Client();
$client->setClientId($client_id);
$client->setClientSecret($client_secret);
$client->setRedirectUri($redirect_uri);
$client->addScope("email");
$client->addScope("profile"); 
$client->setAccessType('offline');// last forever/give me a refresh

But in order to make sure that you get a refresh_token EVERY time someone goes through the authentication loop, you have to adjust as follows:

<?php

$client = new Google_Client();
$client->setClientId($client_id);
$client->setClientSecret($client_secret);
$client->setRedirectUri($redirect_uri);
$client->addScope("email");
$client->addScope("profile"); 
$client->setAccessType('offline');// last forever/give me a refresh
$client->setApprovalPrompt('force');// force a refresh token return everytime

Apparently, using

'offline'

is supposed to imply

'force'

according to some Stack Overflows posts, but this doesn’t seem the case.

In the end my full Google_Client setup looks like:

<?php

        $client = new Google_Client();
        $client->setAuthConfig($consumer);
        $client->addScope('profile');
        $client->addScope('email');
        $client->addScope('https://www.googleapis.com/auth/youtube.readonly');
        $client->setAccessType('offline');// asks for a refresh token
        $client->setApprovalPrompt('force');// forces the refresh token being returned
        $client->setIncludeGrantedScopes(true);
        $client->setRedirectUri($callback);

Just an odd thing I came across recently that I thought I would write up. Most of the notes here are from Stack Overflow post on the subject

PHP5.6+ cURL and file uploads

Came across something odd today, and thought I’d condense down my Tweets on the subject into a blog post.

Basically, I use cURL and some wacky wacky stuff to upload files to a site over HTTP POST. And since I’d just grabbed PHP 7 from HomeBrew, it had overridden my PHP 5.5 install that comes as standard on OSX 10.10 and thus I cross checked the script with my MacPorts PHP 5.6 install and found the same. (Yes THREE different PHP versions for science…)

“Traditionally” the method for this would be something along the lines of:

<?php

    $ch = curl_init('SOMEURL');
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36');

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $data = array(
        'some_file' => '@' . $some_path
    );
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);

    $r = curl_exec($ch);
    curl_close($ch);

On PHP 5.5 and previous that works file, using a @ at the start of a POST entry would instruct PHP/cURL to treat the data/string as a File Path to Upload.

This behaviour is controlled by the PHP cURL constant of CURLOPT_SAFE_UPLOAD. In PHP 5.6 this constant changed from default FALSE to default TRUE, setting to TRUE means that a string starting @ is treated as a String and not a File Path to upload. The changes are documented on the PHP.net website, but the primary trip up is that most of us just use the defaults and we get tripped up when things change.

So, after trying to set this to FALSE under PHP 5.6 it still wasn’t working, and under PHP 7 you are thrown an error to indicate that you are not allowed to change this constant any more for security reasons, which is fine.

The solution is to use the CURLFile class, which is pretty straightforward:

<?php

    $ch = curl_init('SOMEURL');
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36');

    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $data = array(
        'some_file' => new CURLFile($some_path)
    );
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);

    $r = curl_exec($ch);
    curl_close($ch);

This is the truly lazy edition, just chuck a new CURLFile($path) at it, instead of the @. I’m sure CURLFile does more useful stuff, but this was enough to get me back up and running!

Thus endeth this blog post!

Magento – Products showing in the wrong category

I came across something odd a while back, and until recently not had time to investigate the issue.

Products were showing up in categories that they were not explicitly in.

In summary, when you reindex catalog_category_product Magento goes and looks at all the products and the categories that they are in. For the categories that the product are in, it gets the category tree of that category.

So consider this category hierarchy:

  1. Category A
    1. Sub Category A 1
      1. Bottom Category 1
      2. Bottom Category 2
    2. Sub Category A 2
      1. Bottom Category 3
      2. Bottom Category 4
    3. Sub Category A 3
      1. Bottom Category 5
      2. Bottom Category 6
  2. Category B
    1. Sub Category B 1
      1. Bottom Category 7
      2. Bottom Category 8
    2. Sub Category B 2
      1. Bottom Category 9
      2. Bottom Category 10
    3. Sub Category B 3
      1. Bottom Category 11
      2. Bottom Category 12

Now you create a product and put it in Bottom Category 9 and Bottom Category 11. When you reindex it’s put into the list to show up in the category and the parents on that tree, so Category B and Category B 2, and so on for Bottom Category 11.

Now over time we add more products and more categories, and several months later we decide to rearrange our category tree, as we’ve gotten too wide and our needs have changed, or some other reason. Say to something like this, a strikethru indicates a disabled category (as in is_active is 0 (it’s a boolean integer))

  1. Category A
    1. Sub Category A 1
      1. Bottom Category 1
      2. Bottom Category 2
      3. Bottom Category 3
      4. Bottom Category 4
    2. Sub Category A 2
    3. Sub Category A 3
      1. Bottom Category 5
      2. Bottom Category 6
  2. Category B
    1. Sub Category B 1
      1. Bottom Category 7
      2. Bottom Category 8
    2. Sub Category B 2
      1. Bottom Category 9
      2. Bottom Category 10
    3. Sub Category B 3
      1. Bottom Category 11
      2. Bottom Category 12

Now of course we reindex, and the problems start.

The Problem

We have a problem, the Product correctly shows in Bottom Category 11 and it’s parents, but it is still showing up in Sub Category A 1 and it’s parents in error.

It seems that the reindexer takes categories that are disabled (as in is_active is false), and their tree, into account when building the list of category pages that a product can be displayed on.

So in order to fix this I went deep into Magento core to go and see which query was at fault. Turns out that it’s not exactly broken, it’s just Magento doesn’t check for it a give category is active or not when it builds the list.

The query at fault is:

INSERT INTO `catalog_category_anc_products_index_idx` (`category_id`, `product_id`, `position`) SELECT STRAIGHT_JOIN DISTINCT `ca`.`category_id`, `cp`.`product_id`, MIN(IF(ca.category_id = ce.entity_id, `cp`.`position`, (`ce`.`position` + 1) * (`ce`.`level` + 1 * 10000) + `cp`.`position`)) AS `position` FROM `catalog_category_anc_categs_index_idx` AS `ca`
 INNER JOIN `catalog_category_entity` AS `ce` ON `ce`.`path` LIKE `ca`.`path` OR ce.entity_id = ca.category_id
 INNER JOIN `catalog_category_product` AS `cp` ON cp.category_id = ce.entity_id
 INNER JOIN `catalog_category_product_index_enbl_idx` AS `pv` ON pv.product_id = cp.product_id GROUP BY `ca`.`category_id`,
	`cp`.`product_id`

As you can see it just collects from catalog_category_entity, which has a table structure of:

mysql> describe catalog_category_entity;
+------------------+----------------------+------+-----+---------+----------------+
| Field            | Type                 | Null | Key | Default | Extra          |
+------------------+----------------------+------+-----+---------+----------------+
| entity_id        | int(10) unsigned     | NO   | PRI | NULL    | auto_increment |
| entity_type_id   | smallint(5) unsigned | NO   |     | 0       |                |
| attribute_set_id | smallint(5) unsigned | NO   |     | 0       |                |
| parent_id        | int(10) unsigned     | NO   |     | 0       |                |
| created_at       | timestamp            | YES  |     | NULL    |                |
| updated_at       | timestamp            | YES  |     | NULL    |                |
| path             | varchar(255)         | NO   | MUL | NULL    |                |
| position         | int(11)              | NO   |     | 0       |                |
| level            | int(11)              | NO   | MUL | 0       |                |
| children_count   | int(11)              | NO   |     | 0       |                |
+------------------+----------------------+------+-----+---------+----------------+
10 rows in set (0.00 sec)

As you can see that table structure is somewhat limited. Since this is the bare bones basics of a category. And thus the query to collect categories of a product is in, considers inactive categories as if doesn’t have this information.

The Solution

The fix here is simple. Find where the query is defined and take into account the is_active category attribute, and we’ll do it the Magento way. Rather than hardcoding entity_id’s.

A quick note about how I found the query. I opened up public_html/lib/Varien/Db/Adapter/Pdo/Mysql.php and changed the variable $_debug from false to true, it’s around about line 103. This will output all the SQL queries to public_html/var/debug/pdo_mysql.log.

After that I just grep’ed the source catalog_category_anc_products_index_idx which told me where the table was defined in a xml config, and then grep’ed again for the internal reference name, (there was much much grep’ing for bit’s and pieces).

The query runs under the reindexAll function of Mage_Catalog_Model_Resource_Category_Indexer_Product. So basically take the ENTIRE contents of this function and setup a Model Resource override, and fix the query.

Module Basics

Creating our Module in the directory local/Company/Catalog

Normally I wouldn’t cover this, but we are overriding a resource which is a little different from just overriding a model.

etc/config.xml
<?xml version="1.0"?>
<config>
    <modules>
        <Company_Catalog>
            <version>0.0.1</version>
        </Company_Catalog>
    </modules>

    <global>
        <models>
            <companycatalog>
                <class>Company_Catalog_Model</class>
            </companycatalog>

            <catalog_resource_eav_mysql4>
                <rewrite>
                    <category_indexer_product>Company_Catalog_Model_Resource_Category_Indexer_Product</category_indexer_product>
                </rewrite>
            </catalog_resource_eav_mysql4>
        </models>
    </global>
</config>
Model/Resource/Category/Indexer/Product.php

Literally just a copy of the contents of reindexAll with some modifications. Starting at // Query Correction and ending at // end Query Correction.

<?php

class Company_Catalog_Model_Resource_Category_Indexer_Product extends Mage_Catalog_Model_Resource_Category_Indexer_Product
{
    /**
     * Rebuild all index data
     *
     * @return Mage_Catalog_Model_Resource_Category_Indexer_Product
     */
    public function reindexAll()
    {
        $this->useIdxTable(true);
        $this->beginTransaction();
        try {
            $this->clearTemporaryIndexTable();
            $idxTable = $this->getIdxTable();
            $idxAdapter = $this->_getIndexAdapter();
            $stores = $this->_getStoresInfo();
            /**
             * Build index for each store
             */
            foreach ($stores as $storeData) {
                $storeId    = $storeData['store_id'];
                $websiteId  = $storeData['website_id'];
                $rootPath   = $storeData['root_path'];
                $rootId     = $storeData['root_id'];
                /**
                 * Prepare visibility for all enabled store products
                 */
                $enabledTable = $this->_prepareEnabledProductsVisibility($websiteId, $storeId);
                /**
                 * Select information about anchor categories
                 */
                $anchorTable = $this->_prepareAnchorCategories($storeId, $rootPath);
                /**
                 * Add relations between not anchor categories and products
                 */
                $select = $idxAdapter->select();
                /** @var $select Varien_Db_Select */
                $select->from(
                    array('cp' => $this->_categoryProductTable),
                    array('category_id', 'product_id', 'position', 'is_parent' => new Zend_Db_Expr('1'),
                        'store_id' => new Zend_Db_Expr($storeId))
                )
                ->joinInner(array('pv' => $enabledTable), 'pv.product_id=cp.product_id', array('visibility'))
                ->joinLeft(array('ac' => $anchorTable), 'ac.category_id=cp.category_id', array())
                ->where('ac.category_id IS NULL');

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );
                $idxAdapter->query($query);

                /**
                 * Assign products not associated to any category to root category in index
                 */

                $select = $idxAdapter->select();
                $select->from(
                    array('pv' => $enabledTable),
                    array(new Zend_Db_Expr($rootId), 'product_id', new Zend_Db_Expr('0'), new Zend_Db_Expr('1'),
                        new Zend_Db_Expr($storeId), 'visibility')
                )
                ->joinLeft(array('cp' => $this->_categoryProductTable), 'pv.product_id=cp.product_id', array())
                ->where('cp.product_id IS NULL');

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );
                $idxAdapter->query($query);

                /**
                 * Prepare anchor categories products
                 */
                $anchorProductsTable = $this->_getAnchorCategoriesProductsTemporaryTable();
                $idxAdapter->delete($anchorProductsTable);

                $position = 'MIN('.
                    $idxAdapter->getCheckSql(
                        'ca.category_id = ce.entity_id',
                        $idxAdapter->quoteIdentifier('cp.position'),
                        '('.$idxAdapter->quoteIdentifier('ce.position').' + 1) * '
                        .'('.$idxAdapter->quoteIdentifier('ce.level').' + 1 * 10000)'
                        .' + '.$idxAdapter->quoteIdentifier('cp.position')
                    )
                .')';

                // Query Correction

                // load the is_active category attribute
                $attributeModel = Mage::getModel('eav/entity_attribute')->loadByCode(Mage_Catalog_Model_Category::ENTITY, 'is_active');
                $attributeTable = 'catalog_category_entity_' . $attributeModel->getBackendType();

                $select = $idxAdapter->select()
                ->useStraightJoin(true)
                ->distinct(true)
                ->from(array('ca' => $anchorTable), array('category_id'))
                ->joinInner(
                    array('ce' => $this->_categoryTable),
                    $idxAdapter->quoteIdentifier('ce.path') . ' LIKE ' .
                    $idxAdapter->quoteIdentifier('ca.path') . ' OR ce.entity_id = ca.category_id',
                    array()
                )

                ->joinInner(
                    array('cp' => $this->_categoryProductTable),
                    'cp.category_id = ce.entity_id',
                    array('product_id')
                )

                // left join the attribute
                ->joinInner(
                    array('ccei' => $attributeTable),
                    'ccei.entity_id = cp.category_id',
                    array()//select nothing
                )

                ->joinInner(
                    array('pv' => $enabledTable),
                    'pv.product_id = cp.product_id',
                    array('position' => $position)
                )

                ->where('ccei.attribute_id=' . $attributeModel->getAttributeId())//is_active
                ->where('ccei.value=1')

                ->group(array('ca.category_id', 'cp.product_id'));
                $query = $select->insertFromSelect($anchorProductsTable,
                    array('category_id', 'product_id', 'position'), false);
                $idxAdapter->query($query);

                // end Query Correction

                /**
                 * Add anchor categories products to index
                 */
                $select = $idxAdapter->select()
                ->from(
                    array('ap' => $anchorProductsTable),
                    array('category_id', 'product_id',
                        'position', // => new Zend_Db_Expr('MIN('. $idxAdapter->quoteIdentifier('ap.position').')'),
                        'is_parent' => $idxAdapter->getCheckSql('cp.product_id > 0', 1, 0),
                        'store_id' => new Zend_Db_Expr($storeId))
                )
                ->joinLeft(
                    array('cp' => $this->_categoryProductTable),
                    'cp.category_id=ap.category_id AND cp.product_id=ap.product_id',
                    array()
                )
                ->joinInner(array('pv' => $enabledTable), 'pv.product_id = ap.product_id', array('visibility'));

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );
                $idxAdapter->query($query);

                $select = $idxAdapter->select()
                    ->from(array('e' => $this->getTable('catalog/product')), null)
                    ->join(
                        array('ei' => $enabledTable),
                        'ei.product_id = e.entity_id',
                        array())
                    ->joinLeft(
                        array('i' => $idxTable),
                        'i.product_id = e.entity_id AND i.category_id = :category_id AND i.store_id = :store_id',
                        array())
                    ->where('i.product_id IS NULL')
                    ->columns(array(
                        'category_id'   => new Zend_Db_Expr($rootId),
                        'product_id'    => 'e.entity_id',
                        'position'      => new Zend_Db_Expr('0'),
                        'is_parent'     => new Zend_Db_Expr('1'),
                        'store_id'      => new Zend_Db_Expr($storeId),
                        'visibility'    => 'ei.visibility'
                    ));

                $query = $select->insertFromSelect(
                    $idxTable,
                    array('category_id', 'product_id', 'position', 'is_parent', 'store_id', 'visibility'),
                    false
                );

                $idxAdapter->query($query, array('store_id' => $storeId, 'category_id' => $rootId));
            }

            $this->syncData();

            /**
             * Clean up temporary tables
             */
            $this->clearTemporaryIndexTable();
            $idxAdapter->delete($enabledTable);
            $idxAdapter->delete($anchorTable);
            $idxAdapter->delete($anchorProductsTable);
            $this->commit();
        } catch (Exception $e) {
            $this->rollBack();
            throw $e;
        }
        return $this;
    }
}

The fix in depth

The first thing done here is to Magento Safe load the Category Attribute details for is_active.

                // load the is_active category attribute
                $attributeModel = Mage::getModel('eav/entity_attribute')->loadByCode(Mage_Catalog_Model_Category::ENTITY, 'is_active');
                $attributeTable = 'catalog_category_entity_' . $attributeModel->getBackendType();

After this join,

                ->joinInner(
                    array('cp' => $this->_categoryProductTable),
                    'cp.category_id = ce.entity_id',
                    array('product_id')
                )

we need to add another join, in this case we are joining to the table that contains the is_active information, it’s usually catalog_category_entity_int but we Magento Safe loaded it above.

                // left join the attribute
                ->joinInner(
                    array('ccei' => $attributeTable),
                    'ccei.entity_id = cp.category_id',
                    array()//select nothing
                )

Finally we need to adjust to add a where to clean up the join and select active categories only, after the last join and before the group function call:

                ->where('ccei.attribute_id=' . $attributeModel->getAttributeId())//is_active
                ->where('ccei.value=1')

This changes the query to:

INSERT INTO `catalog_category_anc_products_index_idx` (`category_id`, `product_id`, `position`) SELECT STRAIGHT_JOIN DISTINCT `ca`.`category_id`, `cp`.`product_id`, MIN(IF(ca.category_id = ce.entity_id, `cp`.`position`, (`ce`.`position` + 1) * (`ce`.`level` + 1 * 10000) + `cp`.`position`)) AS `position` FROM `catalog_category_anc_categs_index_idx` AS `ca`
 INNER JOIN `catalog_category_entity` AS `ce` ON `ce`.`path` LIKE `ca`.`path` OR ce.entity_id = ca.category_id
 INNER JOIN `catalog_category_product` AS `cp` ON cp.category_id = ce.entity_id
 INNER JOIN `catalog_category_entity_int` AS `ccei` ON ccei.entity_id = cp.category_id
 INNER JOIN `catalog_category_product_index_enbl_idx` AS `pv` ON pv.product_id = cp.product_id WHERE (ccei.attribute_id=119) AND (ccei.value=1) GROUP BY `ca`.`category_id`,
	`cp`.`product_id`

And Finally

Now with this module in place (remember to activate it in etc/modules) clear your cache and reindex. And hey presto, the product is no longer showing up in a category that is a parent of a category that the product is in that is now disabled. (Bit of a mouthful that)

In Summary

Much nicer, and not too much of a performance hit really.

Overall it means your products only show in the right active categories.

Generally it is only a issue when you disable a category and don’t delete it and then rearrange things leaving remnants of the old category tree floating about, so basically anything a common user would do.

I don’t tend to delete stuff myself, in case things get reactivated, and I’ll reuse the records.

Hopefully this is of help, rather than deleting records!