BitMap Indexing in Database

What is bitmap indexing?

Bitmap Indexing is a type of database indexing that uses bitmaps.

Bitmap indexes provide greatest advantage when applied on columns having low cardinality, that is, columns in which the number of distinct values is small compared to the number of rows in the table.

matrix GIF

What is Cardinality?

Cardinality means uniqueness of data in column.

For example:

  • A column representing Gender column where gender can be Male or Female, will have cardinality as 2.
  • A column representing Category column where were we have 100 types of categories, will have cardinality as 100.

What is bitmap?

Bitmap is array of bits which uses sequence of bits to store information. 

Example: 001101111000110

I don’t understand anything you wrote above, can you tell me in layman language?

no way wtf GIF by Bounce

Let’s start with an example:

  • Consider you have a table to store Amazon inventory (Inventory table)
  • Your inventory table looks like:
idnamepricecategoryavailabilitystatus
1COFFEE100GROCERYYESAPPROVED
2COOKIES50GROCERYYESAPPROVED
3MOBILE10000ELECTRONICSNOIN_REVIEW
4TV20000ELECTRONICSYESAPPROVED
5MUSIC_SYSTEM5000ELECTRONICSYESREJECTED
6CAR500000BOOKNOREJECTED

In above table, following columns are good candidate for Bitmap Index:

  • Category [Cardinatlity is 3]
  • Availability [Cardinality is 2]
  • Status [Cardinality is 3]

In Bitmap indexing, number of bitmap required will be equal to cardinality for that column.

Let’s create Bitmap for Category and Status column as example.

Based on which ID have which category we set the bit to 1

Category/id123456
GROCERY110000
ELECTRONICS001110
BOOK000001

Based on which ID have which status we set the bit to 1:

category/id123456
APPROVED110100
IN_REVIEW001000
REJECTED000011

So, how this bitmap data helps when we query for any data?

Lets consider few examples:

what to find?how data is found?
Find all records with status as APPROVEDGo through Status bitmap, and find all ID which have corresponding APPROVED status bit value as 1
Find all records with status as APPROVED or IN_REVIEWWe have to do OR operation of Bitmap index of APPROVED and IN_REVIEW status.
Do a BITWISE OR of APPROVED status bitmap an IN_REVIEW status bitmap.
Find all records where Category is GROCERY and Status is APPROVEDWe have to do AND operation between GROCERY Category bitmap and APPROVED Status bitmap
Dj Khaled Yes GIF by VH1

What type of indexes are good for column with high cardinality columns and which one are good for low cardinality columns?

B-tree indexes are most effective for high-cardinality data: that is, data with many possible values.

BitMap are most effective for low cardinality data.

For example:

  • Cardinality of Gender column which only has two distinct values (male and female), is 2.
    • This is good candidate for bitmap indexes.
  • Cardinality of Item Name for table representing e-commerce inventory, where total rows are close to 100K and total columns close to 70K
    • This is not good candidate for bitmap indexes.

What is the problem of using B+ Tree with data with low cardinality?

B+ Tree builds a big tree, and no matter what the cardinality is, still tree is build.

This tree consumes a lot more memory if you compare it Bitmap index if the cardinality of the column is low.

The Office Jim GIF

In BitMap we uses very small amount of memory, but performance is same as B+ tree.

Can bitmap index have better performance than B-Tree indexes?

On a table with one million rows, a column with 10,000 distinct values is a candidate for a bitmap index. A bitmap index on this column can out-perform a B-tree index, particularly when this column is often queried in conjunction with other columns.

better GIF

Any query which I can use to find cardinality of column?

Well you can use this query:

SELECT column, COUNT(*) FROM table_name GROUP BY column;

Any special consideration while maintaining Bitmap indexes?

Maintaining a bitmap index takes a lot of resources (note we are talking about maintaining, not executing query using bitmap), therefore, bitmap indexes are only good for the read-only tables or tables that have infrequently updates. 

Therefore, you often find bitmap indexes are extensively used in the data warehouse environment.

Watching Mohawk Girls GIF by CBC

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s