Playing with Lego Data from Rebrickable


Source: This exploratory analysis originated from a DataCamp project.

Ah, the good old childhood days of playing with your Lego sets. I went down this memory lane last week when I visited a bookstore and found a Lego play area. I added bricks to someone else’s tower and built a car next to it. #opensource #ftw

Fun fact: “Lego” was derived from the Danish phrase leg godt, which means “play well”.

In this quick analysis, let's explore the brick colors and the parts & themes of the sets.


About the Data

Rebrickable provides a comprehensive database of Lego blocks over the years. The data is available as CSV files. The following schema gives us a sense of the available data.


So many colors!

I imagine there are many unique brick colors. Let’s start by looking at the first 5 rows of the colors data.

   id            name     rgb is_trans
0  -1         Unknown  0033B2        f
1   0           Black  05131D        f
2   1            Blue  0055BF        f
3   2           Green  237841        f
4   3  Dark Turquoise  008F9B        f

But exactly how many distinct brick colors are available?

135 colors
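
If you want to reproduce this step, here is a minimal pandas sketch. It inlines only the five sample rows shown above so that it runs on its own, so it prints 5 rather than 135. With the real download you would pass the path of the colors CSV to `pd.read_csv` instead (the exact file name is not given here, so treat any path as an assumption).

```python
import io
import pandas as pd

# The five sample rows shown above, inlined so the snippet is self-contained.
# With the real data, replace sample_csv with the path to the colors CSV.
sample_csv = io.StringIO(
    "id,name,rgb,is_trans\n"
    "-1,Unknown,0033B2,f\n"
    "0,Black,05131D,f\n"
    "1,Blue,0055BF,f\n"
    "2,Green,237841,f\n"
    "3,Dark Turquoise,008F9B,f\n"
)

# Read rgb as text so hex codes keep their leading zeros.
colors = pd.read_csv(sample_csv, dtype={"rgb": str})

# First rows of the table.
print(colors.head())

# Number of distinct color names.
num_colors = colors["name"].nunique()
print(f"{num_colors} colors")
```

Counting unique names is one reasonable choice. Counting rows or unique `rgb` values could give a slightly different number if two names share a hex code.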


Wow, that’s a lot of colors! They even have variations in the form of transparency.

The colors data has a column named is_trans that indicates whether a color is transparent (t) or not (f).

Let's explore the distribution of transparent vs. non-transparent colors.

           id  name  rgb
f         107   107  107
t          28    28   28
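
A summary like the one above can be sketched with a `groupby` on `is_trans`. The rows below are a tiny stand-in, not the real data: the three opaque colors come from the sample above, while the two transparent rows are made-up placeholders added only so that both groups appear.

```python
import pandas as pd

# Tiny stand-in for the colors table. The two "t" rows are made-up
# placeholders, not real Rebrickable entries.
colors = pd.DataFrame({
    "id": [0, 1, 2, 900, 901],
    "name": ["Black", "Blue", "Green", "Placeholder Trans A", "Placeholder Trans B"],
    "rgb": ["05131D", "0055BF", "237841", "FFFFFF", "EEEEEE"],
    "is_trans": ["f", "f", "f", "t", "t"],
})

# Count the non-null values of every other column within each group.
colors_summary = colors.groupby("is_trans").count()
print(colors_summary)
```

If you only need one number per group, `colors["is_trans"].value_counts()` is a shorter alternative.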

28 of the 135 colors are transparent. You remember these?


Parts in a Lego Set

Another interesting dataset available in this database is the sets data. It contains a comprehensive list of sets over the years and the number of parts that each of these sets contained.


Wait a second, there is a set with 471 parts and another with only 2 parts?

How many parts are there on average in a Lego set? Here are the averages for the first few years in the data.

1950  10.142857
1953  16.500000
1954  12.357143
1955  36.857143
1956  18.500000
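
Here is a rough sketch of how a per-year average like this could be computed. The column names `year` and `num_parts` are assumptions about the sets file, and the rows are made up for illustration.

```python
import pandas as pd

# Made-up rows standing in for the sets table; "year" and "num_parts"
# are assumed column names.
sets = pd.DataFrame({
    "year": [1950, 1950, 1953, 1953, 1953],
    "num_parts": [10, 20, 6, 12, 30],
})

# Average number of parts per set, computed separately for each year.
parts_by_year = sets.groupby("year")["num_parts"].mean()
print(parts_by_year)
```

With matplotlib installed, `parts_by_year.plot()` would draw this as a line chart over time.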

Also, let’s look at how this has changed over the years.

The short-run volatility and long-run upward trend look eerily similar to movements in the stock market. I imagine seasonality plays a big role in the themes that Lego comes out with, and the number of parts probably changes along with them. Let me know if you explore this!

Lego Themes over the years

Lego blocks ship under multiple themes.

Let's look into the themes dataset to explore how many themes Lego has shipped over the years.

1950         2
1953         1
1954         2
1955         4
1956         3
...        ...
2013        93
2014        92
2015        99
2016        88
2017        78

[66 rows x 1 columns]
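
One way to get a themes-per-year count is to count the distinct theme ids among the sets released each year. This is only a sketch: it assumes the year and a `theme_id` column sit together in the sets table, and the rows are made up.

```python
import pandas as pd

# Made-up rows; "year" and "theme_id" are assumed column names.
sets = pd.DataFrame({
    "year": [1950, 1950, 1950, 1953, 1954, 1954],
    "theme_id": [1, 1, 2, 3, 4, 5],
})

# Number of distinct themes among the sets released in each year.
themes_by_year = sets.groupby("year")["theme_id"].nunique()
print(themes_by_year)

# Look up a single year by its index label.
print(f"{themes_by_year.loc[1950]} themes released in 1950")
```

Using `nunique` rather than `count` matters here, because several sets in the same year can share a theme.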

Pause. Isn’t it amazing that we have our favorite toy’s data all the way from 1950?

The number of themes has also been increasing over the long run, like the number of parts, although it dips after 2015.

Taking a closer look at themes_by_year, we see:

71 themes released in 1999

Why look at 1999 in particular?

Star Wars: The Phantom Menace was released in 1999. Lego came out with their Star Wars special edition. This became a huge success!


Lego blocks offer an unlimited amount of fun across ages. There are many Lego enthusiasts who collect special editions the same way investors collect art. We only scratched the surface by exploring trends around the colors, parts, and themes of Lego bricks. The database is rich with more information for enthusiasts looking to dive deeper. Jump right in.

Update 10 Dec 2021: The Guardian ran a story on why investing in Lego is more lucrative than gold. The fun continues!


For more detail, code, and interactive graphs, visit the notebook.
